Arm backend: Improve configuration of skipping quantization for nodes #16688
base: main
Conversation
Prior to this change, nodes excluded from quantization based on the global quantization config were not tagged properly. This resulted in the affected nodes being transformed for annotation (and possibly decomposed) by mistake even though they were not quantized. This patch fixes the issue. Signed-off-by: Martin Lindström <Martin.Lindstroem@arm.com> Change-Id: Ibc498ad49586a1c0fec99bb0ae40d2c2f02b9729
`TosaQuantizer._set_disallow_tfa_for_nodes` now checks `self.module_name_config` so that nodes can be excluded from "transform for annotation" when they are referred to via their owning modules' names. Signed-off-by: Martin Lindström <Martin.Lindstroem@arm.com> Change-Id: Ide52d1127bd77247f9938229326da88ef97eb92d
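As a rough illustration of what the two commits above change, here is a minimal, hedged sketch of tagging nodes with the disallow_tfa flag in order of precedence (global config, then module type config, then module name config, as summarized in the Copilot review further down). The filter predicates, the meta key value, and the function shape are assumptions for illustration, not the PR's actual implementation:

```python
# Hedged sketch of the tagging behaviour described above, not the PR's code.
# Assumptions: the meta key constant DISALLOW_TFA_META_KEY (visible in the
# quoted diff further down this page) stores the flag, its string value is
# "disallow_tfa" as named in the review comments, and the filters are
# simplified to plain predicates; the real logic in arm_quantizer.py differs.
from collections.abc import Callable

from torch.fx import GraphModule, Node

DISALLOW_TFA_META_KEY = "disallow_tfa"  # assumed value

NodeFilter = Callable[[Node], bool]


def tag_disallow_tfa(
    model: GraphModule,
    global_config: object | None,
    module_type_configs: list[tuple[NodeFilter, object | None]],
    module_name_configs: list[tuple[NodeFilter, object | None]],
) -> None:
    for node in model.graph.nodes:
        # Global config first: no global config means the node is excluded
        # from transform-for-annotation (and hence not decomposed).
        node.meta[DISALLOW_TFA_META_KEY] = global_config is None
        # Module type config overrides the global config.
        for matches, config in module_type_configs:
            if matches(node):
                node.meta[DISALLOW_TFA_META_KEY] = config is None
        # Module name config takes precedence over both.
        for matches, config in module_name_configs:
            if matches(node):
                node.meta[DISALLOW_TFA_META_KEY] = config is None
```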
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16688
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 1 Cancelled Job, 1 Unrelated Failure as of commit 847ea55 with merge base 7492d0d.
NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label ciflow/trunk
@pytorchbot label "partner: arm"
This PR needs a
Pull request overview
This PR improves the configuration mechanism for skipping quantization of specific nodes in the Arm backend. Prior to this change, nodes excluded from quantization based on the global config were not being tagged properly with the disallow_tfa flag, causing them to be incorrectly transformed for annotation when they should remain unquantized.
Changes:
- Refactored `_set_disallow_tfa_for_nodes` to properly handle global, module type, and module name configurations with correct precedence
- Added support for `None` as a global quantization config to exclude all nodes by default
- Added comprehensive test coverage for partial quantization scenarios
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| backends/arm/quantizer/arm_quantizer.py | Refactored `_set_disallow_tfa_for_nodes` to check global config, module type config, and module name config in order of precedence; removed unused `_get_composite_filter` function; added `_for_each_filtered_node` helper; updated `set_global` to accept `None` |
| backends/arm/test/quantizer/test_partial_quantization.py | Added new test models (`LinearSoftmaxModel`, `TwoSigmoidsModel`); updated `_run_quantization_pipeline` to support configurable global, module type, and module name configs; added `_assert_disallow_flags` helper (a hypothetical sketch of such a helper follows this table); added comprehensive tests for partial quantization scenarios including precedence rules |
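For readers unfamiliar with the test file, a hypothetical stand-in for what a helper like `_assert_disallow_flags` could check is sketched below. The real helper's signature and body are not shown on this page, and the meta key value is assumed from the quoted diff and review comments:

```python
# Hypothetical sketch only; the actual _assert_disallow_flags in
# backends/arm/test/quantizer/test_partial_quantization.py may differ.
from torch.fx import GraphModule

DISALLOW_TFA_META_KEY = "disallow_tfa"  # assumed value of the meta key


def assert_disallow_flags(model: GraphModule, expected: dict[str, bool]) -> None:
    """Check that each named node carries the expected disallow_tfa flag."""
    for node in model.graph.nodes:
        if node.name in expected:
            actual = node.meta.get(DISALLOW_TFA_META_KEY, False)
            assert actual == expected[node.name], (
                f"{node.name}: expected disallow_tfa={expected[node.name]}, got {actual}"
            )
```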
| node.meta[DISALLOW_TFA_META_KEY] = composite_filter(node) |
| node.meta[DISALLOW_TFA_META_KEY] = self.global_config is None |
| # Next, override using to module type config to take precendence over global config |
Copilot AI · Jan 19, 2026
Spelling error: "precendence" should be "precedence".
| # Next, override using to module type config to take precendence over global config |
| # Next, override using to module type config to take precedence over global config |
| """Set quantization_config for submodules not matched by other filters. | ||
| Args: |
Copilot AI · Jan 19, 2026
The documentation for the quantization_config parameter should be updated to reflect that it can now be None. When None, nodes not matched by other filters will be excluded from quantization and marked with disallow_tfa.
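One possible wording for that docstring update, assuming the quoted docstring belongs to `set_global` (which the review table above says now accepts None); this is only a suggestion, not the PR's final text, and the signature shown is a placeholder:

```python
def set_global(self, quantization_config):
    """Set quantization_config for submodules not matched by other filters.

    Args:
        quantization_config: Config applied to nodes not matched by any
            module type or module name filter. If None, those nodes are
            excluded from quantization and tagged with disallow_tfa so
            they are not transformed for annotation.
    """
    ...
```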
| unquantized_submodules: Iterable[type[torch.nn.Module]], |
| global_config: QuantizationConfig | None = None, |
| module_type_configs: ( |
| Iterable[tuple[type[torch.nn.Module], QuantizationConfig]] | None |
Copilot AI · Jan 19, 2026
The type annotation should allow None for QuantizationConfig in the tuples, as None is used to skip quantization for specific module types. The type should be Iterable[tuple[type[torch.nn.Module], QuantizationConfig | None]] | None to match the actual usage (e.g., line 123 passes None values).
| Iterable[tuple[type[torch.nn.Module], QuantizationConfig]] | None |
| Iterable[tuple[type[torch.nn.Module], QuantizationConfig | None]] | None |
| module_type_configs: ( |
| Iterable[tuple[type[torch.nn.Module], QuantizationConfig]] | None |
| ) = None, |
| module_name_configs: Iterable[tuple[str, QuantizationConfig]] | None = None, |
Copilot AI · Jan 19, 2026
The type annotation should allow None for QuantizationConfig in the tuples, as None is used to skip quantization for specific module names. The type should be Iterable[tuple[str, QuantizationConfig | None]] | None to match the actual usage (e.g., line 192 passes None values).
| module_name_configs: Iterable[tuple[str, QuantizationConfig]] | None = None, |
| module_name_configs: Iterable[tuple[str, QuantizationConfig | None]] | None = None, |
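To make the two suggestions above concrete, here are example values that the widened annotations would accept. The config object is a placeholder and the module types and names are illustrative, not taken from the PR's tests:

```python
import torch

quantization_config = object()  # stand-in for a real QuantizationConfig instance

# With the widened annotations, both the config and the None entries type-check:
module_type_configs = [
    (torch.nn.Linear, quantization_config),  # quantize Linear submodules
    (torch.nn.Softmax, None),                # None -> skip quantization for Softmax
]

module_name_configs = [
    ("linear", quantization_config),  # quantize the submodule named "linear"
    ("softmax", None),                # None -> skip the submodule named "softmax"
]
```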
| self.linear = torch.nn.Linear(10, 10) |
| self.softmax = torch.nn.Softmax(dim=1) |
| """Ensure a softmax and linear skipped for quantization are not decomposed |
| and have their disallow_tfa` set. |
Copilot AI · Jan 19, 2026
Missing opening backtick before disallow_tfa. The docstring should read `disallow_tfa` set (with both backticks) instead of disallow_tfa` set.
| and have their disallow_tfa` set. |
| and have their `disallow_tfa` set. |
| def _for_each_filtered_node( |
| model: GraphModule, |
| filter_fn: Callable[[Node], bool], |
| ): |
Copilot AI · Jan 19, 2026
The function is missing a return type annotation. Since this is a generator function that yields nodes, it should have a return type annotation like -> Generator[Node, None, None] (requires importing Generator from typing or collections.abc).
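A minimal sketch of the helper with the suggested annotation, assuming it simply yields the graph nodes that pass the filter (the actual body is not shown on this page):

```python
from collections.abc import Callable, Generator

from torch.fx import GraphModule, Node


def _for_each_filtered_node(
    model: GraphModule,
    filter_fn: Callable[[Node], bool],
) -> Generator[Node, None, None]:
    """Yield every node in `model` for which `filter_fn` returns True."""
    for node in model.graph.nodes:
        if filter_fn(node):
            yield node
```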
zingo left a comment
OK to merge after the other comments are handled (marked resolved) and Arm tests pass (OK to re-trigger 1-2 times if it's flaky, to prove it works).
cc @freddan80 @per @zingo @oscarandersson8218 @digantdesai