Conversation

@shewu-quic
Collaborator

Summary:

  • Support SLC allocator feature setting
  • Support spill-fill buffer for hybrid mode
  • Fixed redundant convert_op in 8w8a quantization config

Test Plan:

  • Check the debug log to confirm if the SLC allocator is enabled.
 I [Qnn ExecuTorch]: QnnDsp <V> SLC_ALLOCATOR is set to 1 for graph 0
  • Check the debug log to confirm whether the spill-fill buffer is configured for the LLM in hybrid mode.
 python examples/qualcomm/oss_scripts/llama/llama.py -b build-android -H ${HOST} -s ${DEVICE} -m SM8750 --temperature 0 --model_mode hybrid --max_seq_len 128 --prefill_ar_len 32 --decoder_model smollm2_135m --prompt "I would like to learn python, could you teach me with a simple example?"  --artifact {ARTIFACTS}
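
The first check in the test plan can be scripted rather than read by eye. The sketch below scans captured log text for the `SLC_ALLOCATOR is set to ... for graph ...` message quoted above and reports the value per graph. The pattern is derived only from that single quoted line, and the helper name `slc_status_by_graph` is hypothetical; other log formats or message variants are not covered.

```python
import re
from typing import Dict

# Pattern derived from the one log line quoted in the test plan (assumption:
# other graphs log the same message with a different value and index).
SLC_PATTERN = re.compile(r"SLC_ALLOCATOR is set to (\d+) for graph (\d+)")


def slc_status_by_graph(log_text: str) -> Dict[int, bool]:
    """Map graph index -> True if the log reports SLC_ALLOCATOR set to 1."""
    status: Dict[int, bool] = {}
    for match in SLC_PATTERN.finditer(log_text):
        value = int(match.group(1))
        graph = int(match.group(2))
        status[graph] = value == 1
    return status


# Sample input copied from the test plan above.
sample = " I [Qnn ExecuTorch]: QnnDsp <V> SLC_ALLOCATOR is set to 1 for graph 0\n"
print(slc_status_by_graph(sample))  # {0: True}
```

An empty result means the message never appeared, which is worth treating as a failed check rather than as "disabled".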

@shewu-quic shewu-quic requested a review from cccclai as a code owner February 9, 2026 07:15
@pytorch-bot

pytorch-bot bot commented Feb 9, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17302

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures

As of commit a880a42 with merge base efe4f0c (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 9, 2026
@shewu-quic
Collaborator Author

Hi @cccclai, @billmguo,

This PR introduces the use_slc_allocator option, which lets users enable the System Level Cache Allocator for a specific graph. Enabling it can reduce overall memory bandwidth for that use case.

When you have time, could you help review this PR?
Thanks

@billmguo
Contributor

billmguo commented Feb 9, 2026

Looks good to me.

@billmguo billmguo self-requested a review February 9, 2026 18:17
@shewu-quic
Collaborator Author

@pytorchbot label "release notes: qualcomm"

@pytorch-bot pytorch-bot bot added the release notes: qualcomm Changes to the Qualcomm backend delegate label Feb 10, 2026
/// Allows the user to enable the System Level Cache Allocator for a given graph.
/// It can help by reducing overall bandwidth for the use case.
/// The feature is only supported by specific SoCs.
use_slc_allocator:bool;
Contributor

Looks like we're adding a new feature. Can we update the README to explain how to use it?

Contributor

Possibly add a bit more explanation of the System Level Cache Allocator as well.

Collaborator Author

Yes, this is a new feature. Users just need to set use_slc_allocator=True in compile_spec to enable it.
https://github.com/pytorch/executorch/pull/17302/changes#diff-0439f6a7c1a3a3cfb222cd6409b6754f17a1ce782dd231de1d12bbf957d588f7R1000

The System Level Cache (SLC) is a shared cache at the system level of an SoC, serving as the last caching layer before external DDR memory. Its primary purpose is to optimize memory bandwidth, which can improve performance and reduce power consumption. However, the benefit is model-dependent, so it is not guaranteed to be effective in all cases.
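
To illustrate the "per graph" aspect, here is a minimal, self-contained sketch of how a caller might track which graphs request the allocator. It does not use the ExecuTorch API: `GraphOptionsSketch`, `build_graph_options`, and the graph names `prefill` and `decode` are hypothetical stand-ins, and only the field name `use_slc_allocator` comes from this PR. The default of `False` is an assumption based on the option being opt-in.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class GraphOptionsSketch:
    """Hypothetical per-graph options record; NOT the ExecuTorch API."""

    graph_name: str
    # Field name taken from the PR; default of False is an assumption.
    use_slc_allocator: bool = False


def build_graph_options(
    graph_names: Iterable[str], slc_graphs: Iterable[str]
) -> List[GraphOptionsSketch]:
    """Return one record per graph, enabling SLC only where requested."""
    names = list(graph_names)
    requested = set(slc_graphs)
    unknown = requested - set(names)
    if unknown:
        # Fail early instead of silently ignoring a misspelled graph name.
        raise ValueError(f"unknown graph names: {sorted(unknown)}")
    return [GraphOptionsSketch(name, name in requested) for name in names]


# Hypothetical hybrid-mode setup with two graphs; only one opts in.
options = build_graph_options(["prefill", "decode"], slc_graphs=["decode"])
for opt in options:
    print(opt.graph_name, opt.use_slc_allocator)
```

Because the benefit is model-dependent, keeping the flag per graph makes it straightforward to measure each graph with and without it before deciding.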

@meta-codesync
Contributor

meta-codesync bot commented Feb 10, 2026

@cccclai has imported this pull request. If you are a Meta employee, you can view this in D92849545.
