Arm backend: Improve configuration of skipping quantization for nodes #16688
base: main
Conversation
Prior to this change, nodes excluded from quantization based on the global quantization config were not tagged properly. This resulted in the affected nodes being transformed for annotation (and possibly decomposed) by mistake even though they were not quantized. This patch fixes the issue. Signed-off-by: Martin Lindström <Martin.Lindstroem@arm.com> Change-Id: Ibc498ad49586a1c0fec99bb0ae40d2c2f02b9729
`TosaQuantizer._set_disallow_tfa_for_nodes` now checks `self.module_name_config` so that nodes can be excluded from "transform for annotation" when they are referred to via their owning modules' names. Signed-off-by: Martin Lindström <Martin.Lindstroem@arm.com> Change-Id: Ide52d1127bd77247f9938229326da88ef97eb92d
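As a rough illustration of what the two commits above change, here is a minimal, hedged sketch of tagging nodes with the disallow_tfa flag in order of precedence (global config, then module type config, then module name config, as summarized in the Copilot review further down). The filter predicates, the meta key value, and the function shape are assumptions for illustration, not the PR's actual implementation:

```python
# Hedged sketch of the tagging behaviour described above, not the PR's code.
# Assumptions: the meta key constant DISALLOW_TFA_META_KEY (visible in the
# quoted diff further down this page) stores the flag, its string value is
# "disallow_tfa" as named in the review comments, and the filters are
# simplified to plain predicates; the real logic in arm_quantizer.py differs.
from collections.abc import Callable

from torch.fx import GraphModule, Node

DISALLOW_TFA_META_KEY = "disallow_tfa"  # assumed value

NodeFilter = Callable[[Node], bool]


def tag_disallow_tfa(
    model: GraphModule,
    global_config: object | None,
    module_type_configs: list[tuple[NodeFilter, object | None]],
    module_name_configs: list[tuple[NodeFilter, object | None]],
) -> None:
    for node in model.graph.nodes:
        # Global config first: no global config means the node is excluded
        # from transform-for-annotation (and hence not decomposed).
        node.meta[DISALLOW_TFA_META_KEY] = global_config is None
        # Module type config overrides the global config.
        for matches, config in module_type_configs:
            if matches(node):
                node.meta[DISALLOW_TFA_META_KEY] = config is None
        # Module name config takes precedence over both.
        for matches, config in module_name_configs:
            if matches(node):
                node.meta[DISALLOW_TFA_META_KEY] = config is None
```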
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16688
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 1 Cancelled Job, 1 Unrelated Failure as of commit 847ea55 with merge base 7492d0d.
NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label ciflow/trunk
@pytorchbot label "partner: arm"
This PR needs a
Pull request overview
This PR improves the configuration mechanism for skipping quantization of specific nodes in the Arm backend. Prior to this change, nodes excluded from quantization based on the global config were not being tagged properly with the disallow_tfa flag, causing them to be incorrectly transformed for annotation when they should remain unquantized.
Changes:
- Refactored `_set_disallow_tfa_for_nodes` to properly handle global, module type, and module name configurations with correct precedence
- Added support for `None` as a global quantization config to exclude all nodes by default
- Added comprehensive test coverage for partial quantization scenarios
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| backends/arm/quantizer/arm_quantizer.py | Refactored `_set_disallow_tfa_for_nodes` to check global config, module type config, and module name config in order of precedence; removed unused `_get_composite_filter` function; added `_for_each_filtered_node` helper; updated `set_global` to accept `None` |
| backends/arm/test/quantizer/test_partial_quantization.py | Added new test models (`LinearSoftmaxModel`, `TwoSigmoidsModel`); updated `_run_quantization_pipeline` to support configurable global, module type, and module name configs; added `_assert_disallow_flags` helper (a hypothetical sketch of such a helper follows this table); added comprehensive tests for partial quantization scenarios including precedence rules |
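For readers unfamiliar with the test file, a hypothetical stand-in for what a helper like `_assert_disallow_flags` could check is sketched below. The real helper's signature and body are not shown on this page, and the meta key value is assumed from the quoted diff and review comments:

```python
# Hypothetical sketch only; the actual _assert_disallow_flags in
# backends/arm/test/quantizer/test_partial_quantization.py may differ.
from torch.fx import GraphModule

DISALLOW_TFA_META_KEY = "disallow_tfa"  # assumed value of the meta key


def assert_disallow_flags(model: GraphModule, expected: dict[str, bool]) -> None:
    """Check that each named node carries the expected disallow_tfa flag."""
    for node in model.graph.nodes:
        if node.name in expected:
            actual = node.meta.get(DISALLOW_TFA_META_KEY, False)
            assert actual == expected[node.name], (
                f"{node.name}: expected disallow_tfa={expected[node.name]}, got {actual}"
            )
```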
| node.meta[DISALLOW_TFA_META_KEY] = composite_filter(node) |
| node.meta[DISALLOW_TFA_META_KEY] = self.global_config is None |
| # Next, override using to module type config to take precendence over global config |
Copilot AI · Jan 19, 2026
Spelling error: "precendence" should be "precedence".
| # Next, override using to module type config to take precendence over global config |
| # Next, override using to module type config to take precedence over global config |
| """Set quantization_config for submodules not matched by other filters. | ||
| Args: |
Copilot AI · Jan 19, 2026
The documentation for the quantization_config parameter should be updated to reflect that it can now be None. When None, nodes not matched by other filters will be excluded from quantization and marked with disallow_tfa.
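One possible wording for that docstring update, assuming the quoted docstring belongs to `set_global` (which the review table above says now accepts None); this is only a suggestion, not the PR's final text, and the signature shown is a placeholder:

```python
def set_global(self, quantization_config):
    """Set quantization_config for submodules not matched by other filters.

    Args:
        quantization_config: Config applied to nodes not matched by any
            module type or module name filter. If None, those nodes are
            excluded from quantization and tagged with disallow_tfa so
            they are not transformed for annotation.
    """
    ...
```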
| unquantized_submodules: Iterable[type[torch.nn.Module]], |
| global_config: QuantizationConfig | None = None, |
| module_type_configs: ( |
| Iterable[tuple[type[torch.nn.Module], QuantizationConfig]] | None |
Copilot AI · Jan 19, 2026
The type annotation should allow None for QuantizationConfig in the tuples, as None is used to skip quantization for specific module types. The type should be Iterable[tuple[type[torch.nn.Module], QuantizationConfig | None]] | None to match the actual usage (e.g., line 123 passes None values).
| Iterable[tuple[type[torch.nn.Module], QuantizationConfig]] | None |
| Iterable[tuple[type[torch.nn.Module], QuantizationConfig | None]] | None |
| module_type_configs: ( |
| Iterable[tuple[type[torch.nn.Module], QuantizationConfig]] | None |
| ) = None, |
| module_name_configs: Iterable[tuple[str, QuantizationConfig]] | None = None, |
Copilot AI · Jan 19, 2026
The type annotation should allow None for QuantizationConfig in the tuples, as None is used to skip quantization for specific module names. The type should be Iterable[tuple[str, QuantizationConfig | None]] | None to match the actual usage (e.g., line 192 passes None values).
| module_name_configs: Iterable[tuple[str, QuantizationConfig]] | None = None, |
| module_name_configs: Iterable[tuple[str, QuantizationConfig | None]] | None = None, |
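To make the two suggestions above concrete, here are example values that the widened annotations would accept. The config object is a placeholder and the module types and names are illustrative, not taken from the PR's tests:

```python
import torch

quantization_config = object()  # stand-in for a real QuantizationConfig instance

# With the widened annotations, both the config and the None entries type-check:
module_type_configs = [
    (torch.nn.Linear, quantization_config),  # quantize Linear submodules
    (torch.nn.Softmax, None),                # None -> skip quantization for Softmax
]

module_name_configs = [
    ("linear", quantization_config),  # quantize the submodule named "linear"
    ("softmax", None),                # None -> skip the submodule named "softmax"
]
```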
| self.linear = torch.nn.Linear(10, 10) |
| self.softmax = torch.nn.Softmax(dim=1) |
| """Ensure a softmax and linear skipped for quantization are not decomposed |
| and have their disallow_tfa` set. |
Copilot AI · Jan 19, 2026
Missing opening backtick before disallow_tfa. The docstring should read `disallow_tfa` set (with both backticks) instead of disallow_tfa` set.
| and have their disallow_tfa` set. |
| and have their `disallow_tfa` set. |
| def _for_each_filtered_node( |
| model: GraphModule, |
| filter_fn: Callable[[Node], bool], |
| ): |
Copilot AI · Jan 19, 2026
The function is missing a return type annotation. Since this is a generator function that yields nodes, it should have a return type annotation like -> Generator[Node, None, None] (requires importing Generator from typing or collections.abc).
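A minimal sketch of the helper with the suggested annotation, assuming it simply yields the graph nodes that pass the filter (the actual body is not shown on this page):

```python
from collections.abc import Callable, Generator

from torch.fx import GraphModule, Node


def _for_each_filtered_node(
    model: GraphModule,
    filter_fn: Callable[[Node], bool],
) -> Generator[Node, None, None]:
    """Yield every node in `model` for which `filter_fn` returns True."""
    for node in model.graph.nodes:
        if filter_fn(node):
            yield node
```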
zingo left a comment
OK to merge after the other comments are handled (marked resolved) and Arm tests pass (OK to re-trigger 1-2 times if it's flaky, to prove it works).
cc @freddan80 @per @zingo @oscarandersson8218 @digantdesai