fix: inherit Core from BaseEstimator for sklearn compatibility (Issue…#564
Open
CRAFTYPROGRAMMER826 wants to merge 1 commit into
Open
fix: inherit Core from BaseEstimator for sklearn compatibility (Issue…#564CRAFTYPROGRAMMER826 wants to merge 1 commit into
CRAFTYPROGRAMMER826 wants to merge 1 commit into
Conversation
Author
|
@dswah could you please review and provide your valuable feedback on the changed classes and discovered gaps responsible for incompatibility |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #422
Overview
This PR begins the foundational work to make pyGAM compatible with the
scikit-learn estimator API (v1.7+), enabling pyGAM models to be used
inside sklearn Pipelines, GridSearchCV, and cross_validate workflows
without manual wrappers.
Research & Audit
A full compatibility audit was performed on the pyGAM source code against
the scikit-learn estimator contract. The findings are documented in
compatibility_log.mdwhich includes diagnostic test outputs, MRO analysis,signature inspection, and check_estimator failure traces. Reviewers are
strongly encouraged to go through it for full context.
Root Cause Analysis
The audit identified the following structural incompatibilities:
1. Missing BaseEstimator in Inheritance Chain (Fixed in this PR)
The foundational class
pygam.core.Coreinherited directly fromobject,meaning no pyGAM estimator was recognized by sklearn's internal tooling.
Refactoring
Coreto inherit fromsklearn.base.BaseEstimatorcorrectsthe MRO across all estimators (LinearGAM, LogisticGAM, PoissonGAM, etc.)
and provides
get_params(),set_params(), and__sklearn_tags__()through standard inheritance.
Verified after fix:
clone(LinearGAM())✅Pipeline([StandardScaler(), LinearGAM()]).fit(X, y)✅2.
**kwargsin__init__Signatures (Remaining — next PR)Child estimators like
LinearGAMpass parameters up toCorevia**kwargs. sklearn's introspection usesinspect.signature()to discoverparameters — anything inside
**kwargsis invisible to it. This meansparameters like
n_splinesandlamcannot be read byget_params(),cannot be set by
set_params(), and cannot be tuned byGridSearchCV.The fix requires replacing
**kwargswith explicit, default-valuedparameters in each estimator's
__init__.3. Non-Standard Parameter Storage (Remaining — next PR)
Core.__init__stores parameters with leading underscores(e.g.
self._name = name). sklearn requires a strict 1:1 mapping —a parameter named
namemust be stored asself.name. The underscorestorage causes failures during cloning and hyperparameter tuning as
sklearn cannot locate the stored values.
4. Custom
get_params/set_paramsConflict (Remaining — next PR)pyGAM defines its own
get_params()andset_params()with non-standardarguments (
force=False, custom filtering for user-facing vs non-user-facingattributes). These override the BaseEstimator versions and break nested
object handling inside Pipelines. These custom implementations need to be
removed or refactored to conform to the sklearn contract.
Status
This PR addresses the foundational inheritance issue. The remaining items
above are well-understood and scoped — they will be addressed in follow-up
PRs as part of the GSoC 2026 project proposal for Issue #422.
Please refer to
compatibility_log.mdfor full diagnostic outputs, test traces, and analysis.