Skip to content

Conversation

@mr-c
Copy link
Contributor

@mr-c mr-c commented Jan 9, 2026

Fixes: #26042

@sbc100 sbc100 changed the title arm_neon.h: enable additional native aliases arm_neon.h: Enable additional native aliases Jan 9, 2026
@sbc100 sbc100 requested a review from tlively January 9, 2026 16:59
@sbc100
Copy link
Collaborator

sbc100 commented Jan 9, 2026

Is there a test for this header that we can update ?

@mr-c
Copy link
Contributor Author

mr-c commented Jan 9, 2026

@sbc100 @tlively There is a test case in #26042 (comment) that could be added, but I've never done that for this repo before so I might take me longer than either of you

@mr-c
Copy link
Contributor Author

mr-c commented Jan 9, 2026

@sbc100 @tlively I adapted a test for vdotq_s32 from SIMDe

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but I'll leave the final review to @tlively since I'm not really familiar with this function of the test code.

@mr-c
Copy link
Contributor Author

mr-c commented Jan 10, 2026

@tlively @sbc100 Any advice about this error from the test I added?

/root/project/test/neon/test_neon_wasm_simd.cpp:354:5: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
  354 | int main() {
      |     ^

https://app.circleci.com/pipelines/github/emscripten-core/emscripten/48300/workflows/ce123da9-47f2-4dc0-a2a0-27ee22720ec3/jobs/1103937

@sbc100
Copy link
Collaborator

sbc100 commented Jan 10, 2026

No idea I'm afraid. Maybe if you add -fno-inline-functions the error would at least be more precise?

@mr-c mr-c force-pushed the arm_neon_compat_native_aliases branch from 1dfc420 to 6a5537c Compare January 10, 2026 19:34
@mr-c mr-c force-pushed the arm_neon_compat_native_aliases branch from 8ea042b to 144856c Compare January 11, 2026 12:43
@sbc100
Copy link
Collaborator

sbc100 commented Jan 11, 2026

Maybe we could disable the auto-vectorizer for this change, along with a TODO linked to a new LLVM bug regarding error: loop not vectorized?

@mr-c
Copy link
Contributor Author

mr-c commented Jan 12, 2026

Maybe we could disable the auto-vectorizer for this change, along with a TODO linked to a new LLVM bug regarding error: loop not vectorized?

A good plan, but I don't know how to do that; can you help?

@mr-c mr-c force-pushed the arm_neon_compat_native_aliases branch from 144856c to a19fc5a Compare January 12, 2026 09:45
@mr-c
Copy link
Contributor Author

mr-c commented Jan 12, 2026

@sbc100 I fixed the test by turning off SIMDe level function inlining for this test only.

For submitting the llvm optimization bug, the following command can be used as a reproducer:

em++ test/neon/test_neon_wasm_simd.cpp -o test_neon_wasm_simd.js -sNO_DEFAULT_TO_CXX -Wclosure -Werror -Wno-limited-postlink-optimizations -O1 -msimd128 -fno-lax-vector-conversions -Wno-c++11-narrowing -mfpu=neon -msimd128

Fails for -O1, -O2, and -O3 at pass 734 transform-warning on main:

error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]

From our side, we run the entire SIMDe test suite (with complete coverage of the NEON intrinsics) using emscripten tot at -O2, and we have no issues.

@mr-c mr-c force-pushed the arm_neon_compat_native_aliases branch from a19fc5a to d38b193 Compare January 24, 2026 12:34
@mr-c
Copy link
Contributor Author

mr-c commented Jan 24, 2026

I think this is ready for merging @sbc100 @tlively ; the test failures are unrelated. Thanks 🙏 !

@sbc100
Copy link
Collaborator

sbc100 commented Jan 24, 2026

Can you rebase one more them? I'd like the codesize tests at least to pass

@mr-c mr-c force-pushed the arm_neon_compat_native_aliases branch from d38b193 to c970b11 Compare January 24, 2026 17:24
@mr-c
Copy link
Contributor Author

mr-c commented Jan 24, 2026

@sbc100 Done

  1. 🔍 Codesize Checks almost passed (1 byte too few?)
  2. ✔️ test-bun passed when before it had a transient failure before
  3. ✔️ test-windows passes when it did not before.
  4. ✔️ everything else passes

@sbc100
Copy link
Collaborator

sbc100 commented Jan 24, 2026

Sadly we do want those codesize tests to pass before landing. We could update it in a followup but its better to update it as part of this PR.

Step to update:

$ emsdk install tot
$ ./test/runner codesize --rebase

(The above command should produce zero changes on main and the one-byte change on your branch. You can also try ./emcc --clear-cache between runs).

@sbc100
Copy link
Collaborator

sbc100 commented Jan 24, 2026

Its the codesize stuff is not working as expected for you let me know and I can push those change to this PR (maybe on monday).

@mr-c mr-c force-pushed the arm_neon_compat_native_aliases branch from c970b11 to 15aa503 Compare January 25, 2026 09:44
@mr-c
Copy link
Contributor Author

mr-c commented Jan 25, 2026

@sbc100 I pushed the one byte change, but the Codesize Checks failed for main before it even tests the codesizes for this PR: https://github.com/emscripten-core/emscripten/actions/runs/21330588322/job/61394711670?pr=26068#step:5:867 (same failure from before: https://github.com/emscripten-core/emscripten/actions/runs/21318802004/job/61365560492#step:5:867 )

I can open a PR with just the Codesize fixes, but the CI test will still fail as the target branch (main) is out of date.

Since I don't have permission to dispatch gh workflow run rebaseline-tests.yml, perhaps you would do so? Or you can force-merge this PR :-D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

emscripten arm_neon.h doesn't define vdotq_s32

2 participants