[WIP] Feature/var masking train stepper #1196
# Build a NaN-free view of input for the corrector and ocean model.
# When variable masking augmentation fills IC channels with NaN, those NaNs
# survive normalization's fill (which only applies to input_norm) and would
# propagate through area-weighted means in physics corrections, producing NaN
# outputs and zero gradients. Replacing with the normalizer's denormalized
# estimate (climatological mean for masked variables) keeps corrections valid.
I disagree with this take: the corrector is not valid if we give it climatological means instead of actual values. We can't apply a correction when the values it needs are missing. This avoids a NaN or a crash, but I think a crash is the right call in this scenario.
If input data is missing for the correction, then with the code as it is right now I think it should raise an exception and halt. Perhaps we could add configuration to apply certain corrections only when their values are available, and make the corrector mask-aware.
However, for just randomly masking inputs to the network, we could consider still using the input timestep data for the correction even though that input is masked from the network itself. This could work as a dropout-style approach to masking.
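To make the suggestion concrete, here is a minimal pure-Python sketch of the dropout-style idea. Everything in it is hypothetical and not taken from the PR or from `fme`: the function names (`mask_network_input`, `check_corrector_input`, `area_weighted_mean`), the variable names, and the choice of `0.0` as the fill value are all assumptions for illustration. The network gets a masked copy, the corrector keeps the real values, and a genuinely missing corrector input raises instead of being filled.

```python
import math
import random


def mask_network_input(data, mask_prob, rng, fill_value=0.0):
    """Return a masked copy of `data` for the network only.

    `data` maps variable name -> list of floats. Each variable is replaced
    by `fill_value` with probability `mask_prob`. The original dict is not
    modified, so corrections can still use the real values.
    """
    masked = {}
    for name, values in data.items():
        if rng.random() < mask_prob:
            masked[name] = [fill_value] * len(values)
        else:
            masked[name] = list(values)
    return masked


def check_corrector_input(data, required):
    """Raise if a variable the corrector needs is absent or contains NaN."""
    for name in required:
        if name not in data:
            raise ValueError(f"corrector input {name!r} is missing")
        if any(math.isnan(v) for v in data[name]):
            raise ValueError(f"corrector input {name!r} contains NaN")


def area_weighted_mean(values, weights):
    """Weighted mean; a single NaN in `values` makes the result NaN."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical initial condition with two variables on two grid cells.
ic = {"surface_pressure": [1000.0, 1010.0], "air_temperature": [280.0, 290.0]}
area = [1.0, 3.0]

# The network sees a masked copy (mask_prob=1.0 masks every variable).
network_input = mask_network_input(ic, mask_prob=1.0, rng=random.Random(0))

# The corrector validates and uses the unmasked data.
check_corrector_input(ic, required=["surface_pressure"])
mean_pressure = area_weighted_mean(ic["surface_pressure"], area)
```

With this split, masking never introduces NaN into the correction path, so a NaN reaching `check_corrector_input` means the data really is missing and halting is the intended behavior.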
Working on masking to train stepper