Skip to content

Support S7 extension of S4 classes and vice-versa#659

Open
lawremi wants to merge 79 commits into
mainfrom
issue-456-s4-extends-s7
Open

Support S7 extension of S4 classes and vice-versa#659
lawremi wants to merge 79 commits into
mainfrom
issue-456-s4-extends-s7

Conversation

@lawremi

@lawremi lawremi commented May 27, 2026

Copy link
Copy Markdown
Collaborator

Fixes #456.

First iteration on enabling an S7 class to extend an S4 class and vice-versa.

The concrete, fail-fast goal is to enable SummarizedExperiment, a complex, central package in Bioconductor, to be rewritten in S7, based on S4 classes defined lower in the stack, like those from S4Vectors and IRanges, while not breaking compatibility with dependent packages, like SingleCellExperiment, via S4 shims extending the S7 classes. SingleCellExperiment is complex and deeply dependent on SummarizedExperiment, so if it continues working, it is a very good indicator.

What has been achieved outside of these improvements to S7:

  • Fixed a bug in the methods package related to upcasting S4 objects derived old classes. Should be part of the next patch release, but it will cause release failures in the checks.
  • Modified S4Vectors devel branch to use a class1() helper to be explicit when it wants the scalar name of the object's class and to ensure that it is robust to S3-style class vectors. Also improved the setValidity2() helper so that validity functions are enclosed in the S4Vectors namespace, enabling S4 class objects to remain identical across lazy loading. This is important since S4 class objects end up in the S7 class object, which is carried by S7 objects, and we want them to pass identity checks across packages.
  • Rewrote SummarizedExperiment (branch) to use S7 classes, S7 generics and S7 syntax for defining methods on upstream S4 generics. Also defined S4 classes with the original names that derive from their corresponding S7 classes as a compatibility layer. It passes its tests, which were ported to S7 while preserving the original logic.
  • Ensured SingleCellExperiment passes tests without any modification.
  • The path to Bioconductor adoption of S7 is now wide open.

For S7 classes extending S4 classes, the current approach is to instantiate S7 objects, meaning that the S4 bit is not set. The justification is that there are S4 object invariants that an S7 object cannot satisfy, and vice versa. The big one being the return value of class(). Instead we have an S7 shim that does a reasonable job of mimicking an S4 object, including slot access, validity checking, and initialization, but not class() compatibility. See the evolving man page insert for more details.

When an S7 class is made to inherit from an S4 class, S4_register() is automatically called on the S7 class. This branch extends S4_register() to declare the S7 properties and S7_class attribute as S4 slots, which is helpful for making the S4 class more self-describing but critical in case the S7 class needs to be later extended by S4.

To extend an S7 class using setClass(), the S7 class first needs to be registered via S4_register() and S4_contains() should be called to obtain a suitable value for the contains= argument. S4_contains() is an abstraction that currently just requires the S7 properties to be static, ie not using getters and setters, since S4 code using slot() would bypass those. Any instances of an S7-derived S4 class are proper S4 objects, with the bit set and a scalar class vector.

Besides adding and changing hundreds of lines of logic around S4 classes, significant changes to support this in S7 include:

  • Made S7 property storage compatible with S4 NULL slot sentinels. Until now, setting an S7 property to NULL removed the attribute, and S4 does not like it when its slots disappear from the attributes.
  • The inheritance checks in prop.c need to be robust to scalar class vectors, using @.S3Class instead.
  • Added S4 method registration for internal generics when S7 method signatures include classes with S4 ancestry. S4 methods on internal generics take precedence over S3 methods, so when S7 defines a method on an internal generic, it should define an S4 method if any of the classes in the signature are S4-derived classes, since otherwise it might be overridden by an S4 method defined for a parent class. S7 must also support multi-dispatch methods on internal generics where S4 supports it. The dimnames<-() generic is an example.
  • @<-.S7_object() needs to support slots defined on S4 children that are not S7 properties, because S4 classes that extend S7_object will dispatch to the method, even when the S4 bit is set.

Resolving the two incompatible class() return values, scalar vs. non-scalar, is a larger issue. My proposal is that we export methods:::.class1() as methods::class1() and encourage all S4 code to use class1() to make the expectation of a scalar return value explicit. Practically, we could update the core of Bioconductor to adhere to that policy and thus enable new Bioconductor packages to extend the infrastructure with S7.

@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch 6 times, most recently from 87b33e2 to 5e5d36d Compare June 3, 2026 00:26
@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch from 5e5d36d to 3650492 Compare June 7, 2026 20:14
@lawremi lawremi changed the title Support S7 extension of S4 classes Support S7 extension of S4 classes and vice-versa Jun 7, 2026
@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch from 3650492 to eb1092b Compare June 7, 2026 21:40
@lawremi lawremi requested review from hadley and t-kalinowski June 7, 2026 21:42

@hadley hadley left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you know infinitely more about S4 than me 😄 so I focussed on docs and style.

Also need an update to NEWS.md?

Comment thread man/rmd/S4-compatibility.Rmd Outdated
Comment thread man/rmd/S4-compatibility.Rmd Outdated
Comment thread man/rmd/S4-compatibility.Rmd Outdated
old-class object, not as an S4-bit object, so `isS4(child)` is `FALSE`.
This avoids advertising full S4 object invariants that the object does not
satisfy.
* The S3 class vector is not scalar. S4 code that needs one primary class name

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How common is this use of scalar class()?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty common, unfortunately. I would guess most uses are in show() methods.

Comment thread man/rmd/S4-compatibility.Rmd Outdated
Comment thread man/rmd/S4-registration.Rmd Outdated
Comment thread R/method-register.R Outdated
Comment thread R/method-register.R Outdated
Comment thread R/S4.R Outdated
Comment thread tests/testthat/test-class.R Outdated
Comment thread tests/testthat/test-convert.R Outdated
})

it("can convert_up() an S4-derived S7 object to an S4 object ", {
on.exit(S4_remove_classes(c("ParentS4", "ChildS7")))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth considering extening local_S4_class() to handle more cases so we could skip the explicit cleanup steps (which are easy to forget)?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local_S4_class() is convenient and robust, but it obfuscates the code, in my opinion. I think readers expect to see setClass(), setOldClass(), and setClassUnion(). I'd prefer something like S4_remove_after(setClass(...)).

How about we defer this to a future change after merging this branch and discussing? Since the changes are so extensive, rebasing is becoming a chore. It might be better to iterate over multiple MRs.

@t-kalinowski t-kalinowski left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lawremi I took a cursory look and didn’t see any obvious issues. Could you please tag me again once you’ve had a chance to resolve Hadley’s comments? I’ll take a more thorough look then. That way I can avoid duplicating effort or leaving comments that might become stale.

@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch from 7a09607 to 238130f Compare June 20, 2026 00:37
@lawremi

lawremi commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

I think all of the major comments have been resolved, so @t-kalinowski can please give it a look.

@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch from a2cb084 to c88c241 Compare June 23, 2026 01:06
@hadley hadley requested a review from t-kalinowski June 24, 2026 12:44

@t-kalinowski t-kalinowski left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments in #720

@lawremi

lawremi commented Jun 27, 2026

Copy link
Copy Markdown
Collaborator Author

Since @t-kalinowski had the good idea of using AI to help review this branch, I asked an agent to generate this review guide that walks a reviewer through the diff. The links are to code, not diffs, so I'm not sure how helpful they are, but they do point to cohesive bits of logic. Please let me know if this helps and/or if you'd like to schedule a call for me to walk you through this.

Review Guide for PR #659: S4/S7 Inheritance

PR: #659
Full diff: https://github.com/RConsortium/S7/pull/659/files
Branch compare: main...issue-456-s4-extends-s7

This PR is large because the feature crosses object representation, S4 class
registration, construction, initialization, validation, property access, and
method registration. This guide is intended to make the review tractable by
separating the design invariants from the files that implement them.

What This PR Enables

The goal is two-way S4/S7 inheritance interop:

  1. An S7 class can extend an S4 class.

    • S4 slots are represented as S7 properties.
    • S4 dispatch and methods::is() see the S4 parent.
    • The resulting S7 object remains an old-class/S3-style object, so
      isS4() is still FALSE.
  2. An S4 class can extend an S7 class.

    • S7 properties are exposed as formal S4 slots on the reified old class.
    • S4 subclasses are real S4 objects.
    • S7 dispatch still sees the S7 class through _S7_class.
    • S4 validity can invoke S7 validation.
  3. S7 method registration works when S4/internal generics are involved.

    • Internal generics with S4 dispatch, such as dim() or dimnames<-, can
      get S4 methods when an S7 class has an S4 ancestor.
    • External-generic sentinels preserve enough information to register full
      S4 signatures.

Suggested Review Order

1. Start With The Public Contract

Read these first, before implementation details:

Questions to answer at this level:

  • Is the advertised behavior the right behavior?
  • Are the documented caveats explicit enough?
  • Is S4_contains() the right user-facing name for “this S7 class is suitable
    for methods::setClass(contains = )”?

2. Review The Core S4 Bridge

Most of the feature lives in R/S4.R:

Key invariants:

  • S4_register() always creates a reified S4 old class for an S7 class.
    This avoids a dependent package being stuck with a non-reified old class if it
    later wants to extend the S7 class from S4.
  • The S4-visible class carries a formal _S7_class slot whose value has S4
    class "S7_class".
  • S4_contains() does not register as a side effect; it asserts that the class
    is registered and that its properties can safely cross the S4 inheritance
    boundary.
  • Properties with custom getters or setters cannot be exposed as S4 slots.
  • S7 unions are not eagerly registered except when the user explicitly calls
    S4_register() on the union; when possible, existing S4 unions are reused.

3. Review Object Representation And Class Lookup

These are the representation-level changes reviewers should check carefully:

Key invariants:

  • S4 subclasses that inherit from S7 use a formal _S7_class slot
  • R code should use S7_class(object) or .Call(S7_class_, object), not
    direct slot(object, "S7_class") or attr(object, "S7_class").

4. Review Property Access Across The Boundary

Property access is intentionally not perfectly encapsulated once S4 code is
allowed to see S7 properties as slots. Important pieces:

Known compromise:

  • methods::slot() and methods::slot<-() can bypass S7 property setters.
    This is a consequence of exposing properties as S4 slots for compatibility.
    methods::validObject() should be used when S4 code mutates those slots.

5. Review Validation

The validation path is split between S7 and S4:

Key questions:

  • Does S4 validation call into S7 in the right places?
  • Does it avoid recursively validating the same S7 hierarchy repeatedly?
  • Are invalid S4/S7 mismatches reported as failures rather than silently
    accepted?

6. Review Dispatch And Method Registration Last

The dispatch-related change is smaller and mostly separable:

Key invariant:

  • Registering an S7 method on an internal generic should still use the S3 path
    for ordinary S3/S7 behavior, but additionally register an S4 method when the
    signature has an S4 ancestor and S4 will be the runtime dispatch path.

Tests Worth Reading

It is probably worth skimming some of the tests.

Most of the behavioral spec is in tests/testthat/test-S4.R.

Recommended test slices:

Lower-Risk / Mechanical Files

These are useful to skim after the core review:

  • NAMESPACE: exports/imports for the new API.
  • man/*.Rd: generated from roxygen, except worth checking wording.
  • NEWS.md: high-level feature note.
  • .lintr.R: local lintr defaults.
  • .Rbuildignore: vignette/review-support housekeeping.
  • Snapshot changes under tests/testthat/_snaps/.

Known Caveats To Confirm

These are intentional and documented:

  • S7 objects that directly extend S4 are not S4 objects: isS4(x) is FALSE.
    S4 inheritance should be checked with methods::is(x, "Parent").
  • class(x) for S7 objects remains non-scalar to preserve S3/S7 dispatch
    semantics. S4 code that needs a primary name should use class(x)[1L]
    or methods::class1(x) when available.
  • Exposing S7 properties as S4 slots means S4 code can bypass S7 setters with
    slot<-.
  • Properties with custom getters/setters are rejected by S4_contains().
  • S7 unions are registered with S4 only when explicitly requested.
  • _S7_class is the new storage name; S7_class is only a legacy read
    fallback for older serialized/lazy-loaded objects.

@hadley hadley left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's worth doing a little more style polishing but really I it's fine to merge as is. It doesn't need to be perfect, it just needs to advance S4 support and if we get it in quickly we'll avoid a bunch of merge conflicts.

Comment on lines +140 to +142
on.exit(S4_remove_classes(c("ParentS4", "ChildS7")))
setClass("ParentS4", slots = list(x = "numeric"))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use local_s4_class() instead?

Comment thread R/class-spec.R Outdated
Comment thread R/class.R Outdated
Comment thread R/convert.R Outdated
lawremi and others added 27 commits June 27, 2026 12:03
…ic is internal and an element of the signature has an S4 ancestor. This is needed because internal generics will favor S4 methods on S4 objects, so there is potential for inheriting overrides.
…ary S4 classes instead of old classes, because an old class implies an S3 instance of the object can exist, ie, there can be an S3Part().
… it to get stripped during S4 upcasting, breaking the S7 object
…in principle, there should only be one ::S4Slots shim for each S4 derivative of an S7 class
Co-authored-by: Hadley Wickham <h.wickham@gmail.com>
Co-authored-by: Hadley Wickham <h.wickham@gmail.com>
…S4 to inherit from S7; there's just two types of classes: the stub old class and the reified old class. The former for when only dispatch is needed, the latter for when inheritance (in either direction) is needed.
…ontains=TRUE used to do), which ensures that classes are always able to be used as S4 superclasses. Introduce a separate helper S4_contains() that just ensures the S7 class is a suitable parent of an S4 class.
@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch from 8da17d8 to 616de46 Compare June 27, 2026 20:08
@lawremi

lawremi commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator Author

@t-kalinowski is it OK if we merge this and then address concerns from #720? There are some important things to fix there, but I'd rather do it on top of this.

@hadley

hadley commented Jul 3, 2026

Copy link
Copy Markdown
Member

Tomasz is out on vacation so I’ll answer on his behalf 😀 yes it’s fine.

I’ve also had conversations with him about my sense that codex tends to make code more technically correct at the expense of readability/understandability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Mixing S4 and S7 inheritance hierarchies

3 participants