
Conversation

@martinlsm
Collaborator

@martinlsm martinlsm commented Jan 19, 2026

Prior to this change, nodes excluded from quantization based on the global quantization config were not tagged properly. This resulted in affected nodes being transformed for annotation (and possibly decomposed) by mistake even though they were not quantized. This patch fixes the issue.

Also, TosaQuantizer._set_disallow_tfa_for_nodes now checks self.module_name_config so that nodes can be excluded from "transform for annotation" when they are referred to via their owning modules' names.
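A minimal usage sketch of the behavior this enables, assuming the quantizer's existing set_global/set_module_name API; the helper function and the submodule name "linear" are hypothetical:

def configure_partial_quantization(quantizer, quantization_config):
    """Quantize only the 'linear' submodule; leave everything else in float."""
    # With this patch, a global config of None excludes all nodes by default
    # and tags them with disallow_tfa so they are not transformed for
    # annotation (or decomposed).
    quantizer.set_global(None)
    # Module-name configs take precedence over the global default, so this
    # submodule is opted back in for quantization.
    quantizer.set_module_name("linear", quantization_config)
    return quantizer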

cc @freddan80 @per @zingo @oscarandersson8218 @digantdesai

Martin Lindström added 2 commits January 19, 2026 09:04
Prior to this change, nodes excluded from quantization based on the
global quantization config were not tagged properly. This resulted in
affected nodes being transformed for annotation (and possibly
decomposed) by mistake even though they were not quantized. This patch
fixes the issue.

Signed-off-by: Martin Lindström <Martin.Lindstroem@arm.com>
Change-Id: Ibc498ad49586a1c0fec99bb0ae40d2c2f02b9729
`TosaQuantizer._set_disallow_tfa_for_nodes` now checks
`self.module_name_config` such that nodes can be excluded for "transform
for annotation" when they are referred to via their owning modules'
names.

Signed-off-by: Martin Lindström <Martin.Lindstroem@arm.com>
Change-Id: Ide52d1127bd77247f9938229326da88ef97eb92d
Copilot AI review requested due to automatic review settings January 19, 2026 08:12
@pytorch-bot

pytorch-bot bot commented Jan 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16688

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Cancelled Job, 1 Unrelated Failure

As of commit 847ea55 with merge base 7492d0d:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Jan 19, 2026
@martinlsm
Collaborator Author

@pytorchbot label ciflow/trunk

@martinlsm
Collaborator Author

@pytorchbot label "partner: arm"

@pytorch-bot pytorch-bot bot added the partner: arm label Jan 19, 2026
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Contributor

Copilot AI left a comment


Pull request overview

This PR improves the configuration mechanism for skipping quantization of specific nodes in the Arm backend. Prior to this change, nodes excluded from quantization based on the global config were not being tagged properly with the disallow_tfa flag, causing them to be incorrectly transformed for annotation when they should remain unquantized.

Changes:

  • Refactored _set_disallow_tfa_for_nodes to properly handle global, module type, and module name configurations with correct precedence
  • Added support for None as a global quantization config to exclude all nodes by default
  • Added comprehensive test coverage for partial quantization scenarios
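
A hedged sketch of the precedence described above; DISALLOW_TFA_META_KEY and _for_each_filtered_node appear in the diff, while the two filter helpers below are placeholders for whatever module-matching the quantizer actually uses:

def _set_disallow_tfa_for_nodes(self, model):
    # 1. Global config: None means every node starts out excluded from
    #    "transform for annotation".
    for node in model.graph.nodes:
        node.meta[DISALLOW_TFA_META_KEY] = self.global_config is None

    # 2. Module type config overrides the global default for nodes owned by
    #    a module of the given type.
    for module_type, config in self.module_type_config.items():
        for node in _for_each_filtered_node(model, _module_type_filter(module_type)):
            node.meta[DISALLOW_TFA_META_KEY] = config is None

    # 3. Module name config takes precedence over both of the above.
    for module_name, config in self.module_name_config.items():
        for node in _for_each_filtered_node(model, _module_name_filter(module_name)):
            node.meta[DISALLOW_TFA_META_KEY] = config is None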

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description

backends/arm/quantizer/arm_quantizer.py: Refactored _set_disallow_tfa_for_nodes to check global config, module type config, and module name config in order of precedence; removed unused _get_composite_filter function; added _for_each_filtered_node helper; updated set_global to accept None

backends/arm/test/quantizer/test_partial_quantization.py: Added new test models (LinearSoftmaxModel, TwoSigmoidsModel); updated _run_quantization_pipeline to support configurable global, module type, and module name configs; added _assert_disallow_flags helper; added comprehensive tests for partial quantization scenarios including precedence rules
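
For reference, a hedged sketch of what one of the new test models might look like, reconstructed from the linear/softmax fields visible in the diff further down; the forward method is an assumption:

import torch


class LinearSoftmaxModel(torch.nn.Module):
    """Toy model for the partial-quantization tests (sketch, not the exact test code)."""

    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(10, 10)
        self.softmax = torch.nn.Softmax(dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.softmax(self.linear(x))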


node.meta[DISALLOW_TFA_META_KEY] = composite_filter(node)
node.meta[DISALLOW_TFA_META_KEY] = self.global_config is None

# Next, override using to module type config to take precendence over global config

Copilot AI Jan 19, 2026


Spelling error: "precendence" should be "precedence".

Suggested change
# Next, override using to module type config to take precendence over global config
# Next, override using to module type config to take precedence over global config

Comment on lines 385 to 387
"""Set quantization_config for submodules not matched by other filters.
Args:

Copilot AI Jan 19, 2026


The documentation for the quantization_config parameter should be updated to reflect that it can now be None. When None, nodes not matched by other filters will be excluded from quantization and marked with disallow_tfa.
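
A hedged sketch of the docstring update this comment asks for; the parameter name comes from the comment and the method body is elided:

def set_global(self, quantization_config):
    """Set quantization_config for submodules not matched by other filters.

    Args:
        quantization_config: Config applied to nodes not matched by any module
            type or module name filter, or None to exclude those nodes from
            quantization and tag them with disallow_tfa.
    """
    ...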

unquantized_submodules: Iterable[type[torch.nn.Module]],
global_config: QuantizationConfig | None = None,
module_type_configs: (
Iterable[tuple[type[torch.nn.Module], QuantizationConfig]] | None

Copilot AI Jan 19, 2026


The type annotation should allow None for QuantizationConfig in the tuples, as None is used to skip quantization for specific module types. The type should be Iterable[tuple[type[torch.nn.Module], QuantizationConfig | None]] | None to match the actual usage (e.g., line 123 passes None values).

Suggested change
Iterable[tuple[type[torch.nn.Module], QuantizationConfig]] | None
Iterable[tuple[type[torch.nn.Module], QuantizationConfig | None]] | None

module_type_configs: (
Iterable[tuple[type[torch.nn.Module], QuantizationConfig]] | None
) = None,
module_name_configs: Iterable[tuple[str, QuantizationConfig]] | None = None,

Copilot AI Jan 19, 2026


The type annotation should allow None for QuantizationConfig in the tuples, as None is used to skip quantization for specific module names. The type should be Iterable[tuple[str, QuantizationConfig | None]] | None to match the actual usage (e.g., line 192 passes None values).

Suggested change
module_name_configs: Iterable[tuple[str, QuantizationConfig]] | None = None,
module_name_configs: Iterable[tuple[str, QuantizationConfig | None]] | None = None,

self.linear = torch.nn.Linear(10, 10)
self.softmax = torch.nn.Softmax(dim=1)
"""Ensure a softmax and linear skipped for quantization are not decomposed
and have their disallow_tfa` set.

Copilot AI Jan 19, 2026


Missing backtick before disallow_tfa. Should be `disallow_tfa` set instead of disallow_tfa` set.

Suggested change
and have their disallow_tfa` set.
and have their `disallow_tfa` set.

Comment on lines +345 to 348
def _for_each_filtered_node(
model: GraphModule,
filter_fn: Callable[[Node], bool],
):

Copilot AI Jan 19, 2026


The function is missing a return type annotation. Since this is a generator function that yields nodes, it should have a return type annotation like -> Generator[Node, None, None] (requires importing Generator from typing or collections.abc).
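
A hedged sketch of the annotated helper this comment suggests; the body is assumed from the helper's name and the review context, not copied from the diff:

from collections.abc import Generator
from typing import Callable

from torch.fx import GraphModule, Node


def _for_each_filtered_node(
    model: GraphModule,
    filter_fn: Callable[[Node], bool],
) -> Generator[Node, None, None]:
    """Yield every node in the graph for which filter_fn returns True."""
    for node in model.graph.nodes:
        if filter_fn(node):
            yield node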

Collaborator

@zingo zingo left a comment


OK to merge after the other comments are handled (marked resolved) and Arm tests pass (OK to re-trigger 1-2 times if it's flaky, to prove it works).


Labels

  • ciflow/trunk
  • CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
  • partner: arm (For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm)
