Qualcomm AI Engine Direct - Support SLC allocator feature #17302

shewu-quic · 2026-02-09T07:15:18Z

Summary:

- Support SLC allocator feature setting
- Support spill-fill buffer for hybrid mode
- Fixed redundant convert_op in 8w8a quantization config

Test Plan:

Check the debug log to confirm if the SLC allocator is enabled.

 I [Qnn ExecuTorch]: QnnDsp <V> SLC_ALLOCATOR is set to 1 for graph 0
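To automate this check, the log can be scanned for that message. The snippet below is a minimal sketch, not part of the PR: the helper name `slc_allocator_status` is hypothetical, the pattern is taken from the sample line above, and treating a value of 0 as "disabled" is an assumption.

```python
import re
from typing import Dict, Iterable

# Pattern copied from the sample debug line in the test plan.
_SLC_LINE = re.compile(r"SLC_ALLOCATOR is set to (\d+) for graph (\d+)")


def slc_allocator_status(log_lines: Iterable[str]) -> Dict[int, bool]:
    """Map each graph id found in the log to whether SLC_ALLOCATOR is non-zero.

    Assumption: a value of 0 means the allocator is disabled for that graph.
    """
    status: Dict[int, bool] = {}
    for line in log_lines:
        match = _SLC_LINE.search(line)
        if match:
            value = int(match.group(1))
            graph_id = int(match.group(2))
            status[graph_id] = value != 0
    return status


sample = [
    " I [Qnn ExecuTorch]: QnnDsp <V> SLC_ALLOCATOR is set to 1 for graph 0",
    " I [Qnn ExecuTorch]: an unrelated line",
]
print(slc_allocator_status(sample))  # {0: True}
```

An empty result means the message never appeared, which would suggest the option was not applied or that debug logging was not enabled.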

Check the debug log to confirm that the spill-fill buffer is configured for the LLM in hybrid mode.

 python examples/qualcomm/oss_scripts/llama/llama.py -b build-android -H ${HOST} -s ${DEVICE} -m SM8750 --temperature 0 --model_mode hybrid --max_seq_len 128 --prefill_ar_len 32 --decoder_model smollm2_135m --prompt "I would like to learn python, could you teach me with a simple example?" --artifact ${ARTIFACTS}

pytorch-bot · 2026-02-09T07:15:23Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17302

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures

As of commit a880a42 with merge base efe4f0c:

NEW FAILURES - The following jobs have failed:

pull / test-moshi-linux / linux-job (gh)
RuntimeError: Command docker exec -t 7bce8bd900667ce6c189f73ac3610982aa0d6d81762e9242618a733b616d4f06 /exec failed with exit code 1
pull / test-samsung-models-linux / linux-job (gh)
RuntimeError: Command docker exec -t 2cef151c11123ad08830ca6420215f61b96cf031fbf00d85262c545bef931bcd /exec failed with exit code 1
pull / test-samsung-quantmodels-linux / linux-job (gh)
RuntimeError: Command docker exec -t f75dd97f5f0c547e60d59fcc271add0d93d4164df361a4898ece150c6a306a1d /exec failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

shewu-quic · 2026-02-09T07:22:36Z

Hi @cccclai, @billmguo,

This PR introduces the use_slc_allocator option, which lets users enable the System Level Cache Allocator for a specific graph. This can help reduce overall memory bandwidth for that use case.

When you have time, could you help review this PR?
Thanks

billmguo · 2026-02-09T18:16:26Z

Looks good to me.

shewu-quic · 2026-02-10T06:54:44Z

@pytorchbot label "release notes: qualcomm"

cccclai · 2026-02-10T17:43:46Z

backends/qualcomm/serialization/qc_compiler_spec.fbs

+  /// Allows the user to enable the System Level Cache Allocator for a given graph.
+  /// It helps by reducing overall bandwidth for the use case.
+  /// The feature is only supported by specific SoCs.
+  use_slc_allocator:bool;


Looks like we're adding a new feature. Can we update the README to explain how to use it?

Possibly also add a bit more explanation of the System Level Cache Allocator.

shewu-quic · 2026-02-11

Yes, this is a new feature. To enable it, users only need to set use_slc_allocator=True in compile_spec.
https://github.com/pytorch/executorch/pull/17302/changes#diff-0439f6a7c1a3a3cfb222cd6409b6754f17a1ce782dd231de1d12bbf957d588f7R1000

The System Level Cache (SLC) is a shared cache at the system level of an SoC, serving as the last caching layer before external DDR memory. Its primary purpose is to reduce memory bandwidth usage, which can improve performance and lower power consumption. However, the benefit is model-dependent, so it is not guaranteed in all cases.
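To illustrate the opt-in behavior described above, here is a self-contained sketch. It does not call the ExecuTorch API: `SketchGraphOptions` and the graph name `"forward"` are hypothetical stand-ins, and only the field name `use_slc_allocator` comes from this PR. The real entry point is whatever the linked diff defines.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchGraphOptions:
    """Hypothetical stand-in for per-graph compile options (not the ExecuTorch API)."""

    graph_name: str
    # Mirrors the `use_slc_allocator:bool` schema field added in this PR.
    # A FlatBuffers bool without an explicit default is false, so the feature is opt-in.
    use_slc_allocator: bool = False


# Default options leave the allocator off.
default_options = SketchGraphOptions(graph_name="forward")

# Opting in for one graph only changes that graph's options.
slc_options = replace(default_options, use_slc_allocator=True)

print(default_options.use_slc_allocator)  # False
print(slc_options.use_slc_allocator)      # True
```

Because the flag is set per graph, a model compiled into several graphs could enable it for some graphs and not others. Whether it helps is model-dependent, so it seems worth measuring both settings.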

meta-codesync · 2026-02-10T17:50:59Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this in D92849545.

Qualcomm AI Engine Direct - Support SLC allocator feature

af45ed0

Summary: - Support SLC allocator feature setting - Support spill-fill buffer for hybrid mode

shewu-quic requested a review from cccclai as a code owner February 9, 2026 07:15

meta-cla bot added the CLA Signed label Feb 9, 2026

billmguo self-requested a review February 9, 2026 18:17

billmguo approved these changes Feb 9, 2026

Fixed rebase error

a880a42

pytorch-bot bot added the release notes: qualcomm label Feb 10, 2026

cccclai reviewed Feb 10, 2026
