Draft: Improved p-values for smooth terms (Wood 2013b). For #163. by vsl366 · Pull Request #586 · dswah/pyGAM

vsl366 · 2026-06-06T06:20:20Z

Draft PR. Replaces the current method of computing p values in PyGAM with the method mentioned in Simon Wood's "On p-values for smooth components of an extended generalized additive model" (Biometrika 100:221-228), addressing issue #163.

This is work in progress. A basic implementation has been finished with minimal optimization and limited testing, which includes seperate testing using monte carlo, and empirical validation against mgcv's reference, with many different categories of test cases.

Changes:

Added edf1_per_coef to model statistics: the effective degrees of freedom per coefficient, computed as 2 * diag(F) - diag(F^2) per Wood (2013) eq. 4. The existing 'edof' (sum of diag(F)) is unchanged and still used for AIC, GCV, CIs.
Added two helper methods on the GAM class:
- _liu2: chi-squared mixture tail probability using the moment matching approximation from Liu, Tang, Zhang (2009).
- _liu2_scaled_quadrature: integrates _liu2 over scaled chi-squared quantiles for the estimated-scale case.
Added _woodteststat, which implements the test statistic and reference distribution from the paper.
- Get R from Xj
- Stage A: eigendecompose R V_β R.T to get its principal directions and their variances. mgcv's sign convention is applied to the eigenvectors so the results match the reference implementation exactly.
- Stage B: split τ into an integer part k = floor(τ) and a fractional part ν = τ - k.
- Stage C: discard near-zero eigenvalues. If τ exceeds the usable rank, trim it down. If no usable directions remain (degenerate V_β), default with p = 1.
- Stage D: build the rank-τ pseudoinverse in factored form. The top k-1 directions get fully inverted (divided by sqrt of their eigenvalue). The last two directions are handled together as a boundary pair, partially inverted via the B-tilde block from the paper (a 2x2 matrix mixing them by a factor that depends on ν).
- Stage E: compute T as the squared norm of the projection of β̂ R onto these scaled directions. Because the matrix square-root of B-tilde has a sign ambiguity, two versions of T are computed (with the two sign orientations of the boundary block) and the resulting p-values are averaged, as in the paper.
- Stage F: compute the p-value. For fractional τ with known scale, the reference distribution is a weighted chi-squared mixture (weights 1, 1, ..., ν₁, ν₂ where ν₁ + ν₂ = ν + 1) and the tail is evaluated by _liu2. For estimated scale, _liu2_scaled_quadrature is used. For integer τ, the function falls through to the simpler chi-squared (known scale) or F (estimated scale) reference, since the boundary block isn't needed.
Modified _compute_p_value to use the above helper function and implement the changes.
Added new tests in the test_pvalues file, validating all created functions
Other small adjustments
Removed known bug warning

References:

Wood (2013) Biometrika 100:221-228
Liu, Tang, Zhang (2009) Comp Stat Data Analysis 53:853-856

vsl366 added 3 commits May 22, 2026 13:08

Add edf1 and liu2 helpers for Wood (2013) p-values

530d1ed

Add woodteststat, update compute_p_value, add more test cases

58d3ca5

Remove known bug warning

cb7fe98

vsl366 force-pushed the gsoc-vsl366 branch from 917fe49 to cb7fe98 Compare June 27, 2026 15:48

removed unnecessary coef centering

b2da86a

vsl366 force-pushed the gsoc-vsl366 branch from d3c2cb5 to c8c59a2 Compare June 27, 2026 16:11

Centered based on smooths, and decomposed Xj

a0065d0

vsl366 force-pushed the gsoc-vsl366 branch from c8c59a2 to a0065d0 Compare June 27, 2026 16:14

vsl366 added 3 commits June 27, 2026 21:57

Added QR Partially, T.B.D Testing, Unchecked Version

886689c

Completed QR, Added tests

6039204

Completed QR, Added tests

8469ff4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Draft: Improved p-values for smooth terms (Wood 2013b). For #163.#586

Draft: Improved p-values for smooth terms (Wood 2013b). For #163.#586
vsl366 wants to merge 8 commits into
dswah:mainfrom
vsl366:gsoc-vsl366

vsl366 commented Jun 6, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

vsl366 commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vsl366 commented Jun 6, 2026 •

edited

Loading