Pcs #41

Open
nathanchenseanwalter wants to merge 16 commits into master from pcs


Update critical components and bring the documentation up to date.

renierts and others added 6 commits April 14, 2026 19:03
This is reflected in the C implementation k2c_activations.c
This resolves test_LSTM1 and test_ProfilePredictorConv2D.
Made average pooling more robust: improper padding caused a division by zero. This is now resolved, which fixes test_AveragePooling1D2 and test_AveragePooling2D2.
Keras 3 renamed 'swish' to 'silu'. This fixes test_swish.
The graph construction of bidirectional layers changed from Keras 2 to Keras 3, which introduced extra layers and, in turn, numerical instabilities. This is now resolved.
Also includes smaller fixes in test_split_layers.py.
…s were not resolved properly.

This is now fixed, and tests are included in the test suite.
Added one more test for activation after a dense layer.
* Algorithmic C optimizations + batch norm folding

- affine_matmul: i-k-j loop reorder with bias init fusion
- batch_norm: precomputed scale/bias, eliminates integer division
- relu/ReLU: branchless ternary for vectorization
- Code generator: fold batch_norm into adjacent Dense/Conv1D layers
- Fix: BatchNorm negative axis handling for Keras 3
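
For readers unfamiliar with the first and third bullets, here is a minimal sketch of what an i-k-j loop order with bias init fusion and a branchless ReLU could look like. The function names `affine_matmul_ikj` and `relu_inplace`, their signatures, and the row-major layout are assumptions made for illustration; they are not taken from the code in this PR.

```c
#include <stddef.h>

/* Hypothetical helper (not the signature used in this PR).
 * Computes C = A * B + d for row-major matrices:
 *   A is outrows x innerdim, B is innerdim x outcols,
 *   d is a bias vector of length outcols, C is outrows x outcols. */
void affine_matmul_ikj(float *C, const float *A, const float *B,
                       const float *d, size_t outrows, size_t outcols,
                       size_t innerdim)
{
    for (size_t i = 0; i < outrows; ++i) {
        float *crow = &C[i * outcols];

        /* Bias init fusion: seed each output row with the bias instead of
         * zeroing it first and adding the bias in a separate pass. */
        for (size_t j = 0; j < outcols; ++j) {
            crow[j] = d[j];
        }

        /* i-k-j order: the innermost loop walks one row of B and one row
         * of C contiguously, which is friendlier to caches and to
         * auto-vectorization than striding down a column of B. */
        for (size_t k = 0; k < innerdim; ++k) {
            const float a = A[i * innerdim + k];
            const float *brow = &B[k * outcols];
            for (size_t j = 0; j < outcols; ++j) {
                crow[j] += a * brow[j];
            }
        }
    }
}

/* Hypothetical helper: ReLU written as a ternary select, a form that
 * compilers can typically lower to a max instruction rather than a branch. */
void relu_inplace(float *x, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        x[i] = (x[i] > 0.0f) ? x[i] : 0.0f;
    }
}
```

Whether the compiler actually vectorizes these loops depends on the target and the optimization flags, so any speedup should be confirmed by measurement.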

* Add __restrict and __attribute__((hot)) annotations (no flag changes)

* Add static const codegen + fix ReLU codegen bug

* Remove unneeded files.

* Add optimization tests and deep cloning.

---------

Co-authored-by: Matthew Waller <hello@cephalopod.studio>
* Algorithmic C optimizations + batch norm folding

- affine_matmul: i-k-j loop reorder with bias init fusion
- batch_norm: precomputed scale/bias, eliminates integer division
- relu/ReLU: branchless ternary for vectorization
- Code generator: fold batch_norm into adjacent Dense/Conv1D layers
- Fix: BatchNorm negative axis handling for Keras 3
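
As a rough illustration of the batch norm bullets: at inference time batch normalization is a per-channel affine map, y = scale * x + shift, with scale = gamma / sqrt(var + eps) and shift = beta - mean * scale. Once those two vectors are precomputed, they can be folded into a preceding Dense layer. The sketch below assumes `scale` and `shift` have already been computed by the code generator, and the names `batch_norm_apply` and `fold_batch_norm_into_dense` are hypothetical, not functions from this PR.

```c
#include <stddef.h>

/* Hypothetical helper: applies batch norm in its precomputed affine form.
 * x is row-major with `rows` rows and `channels` columns; scale and shift
 * each hold one value per channel. */
void batch_norm_apply(float *x, const float *scale, const float *shift,
                      size_t rows, size_t channels)
{
    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < channels; ++j) {
            x[i * channels + j] = x[i * channels + j] * scale[j] + shift[j];
        }
    }
}

/* Hypothetical helper: folds a batch norm that follows a Dense layer
 * (y = x * W + b, W row-major with shape in_dim x out_dim) into the
 * Dense parameters, in place:
 *   W'[k][j] = W[k][j] * scale[j]
 *   b'[j]    = b[j] * scale[j] + shift[j]
 * After folding, the separate batch norm step is no longer needed. */
void fold_batch_norm_into_dense(float *W, float *b, const float *scale,
                                const float *shift, size_t in_dim,
                                size_t out_dim)
{
    for (size_t k = 0; k < in_dim; ++k) {
        for (size_t j = 0; j < out_dim; ++j) {
            W[k * out_dim + j] *= scale[j];
        }
    }
    for (size_t j = 0; j < out_dim; ++j) {
        b[j] = b[j] * scale[j] + shift[j];
    }
}
```

This identity only holds when no nonlinearity sits between the Dense layer and the batch norm, so a folding pass would need to check for that before rewriting the weights.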

* Add __restrict and __attribute__((hot)) annotations (no flag changes)

* Add static const codegen + fix ReLU codegen bug

* Remove unneeded files.

* Add optimization tests and deep cloning.

* Add defensive callouts for threading.

---------

Co-authored-by: Matthew Waller <hello@cephalopod.studio>
5 participants