@Shuang-cnt
Collaborator

Description

This change builds on PR#2866 and PR#2926 by adding logical-axes printing to the sharding debug output (enabled with debug_sharding=true).
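For readers unfamiliar with the terminology: a logical axis name (such as embed or mlp) is resolved through a rule table to zero or more mesh axes, and the resolved tuple is what the physical PartitionSpec reports. A minimal pure-Python sketch of that resolution follows. The RULES table and the resolve_logical_axes helper are hypothetical, written only to illustrate the mapping; the rule entries are assumptions chosen to agree with the sample output in this PR, not MaxText's actual rules or API.

```python
# Hypothetical rule table (an assumption for illustration):
# logical axis name -> mesh axes it is sharded over.
RULES = {
    "embed": ("fsdp", "expert"),
    "mlp": ("tensor",),
    "norm": ("tensor",),
}


def resolve_logical_axes(logical_axes, rules):
    """Map each logical axis to mesh axes; None or unknown names stay unsharded."""
    physical = []
    used = set()
    for name in logical_axes:
        mesh_axes = rules.get(name, ()) if name is not None else ()
        # A mesh axis can shard at most one dimension of a given array,
        # so skip any mesh axis already consumed by an earlier dimension.
        free = tuple(axis for axis in mesh_axes if axis not in used)
        used.update(free)
        if not free:
            physical.append(None)
        elif len(free) == 1:
            physical.append(free[0])
        else:
            physical.append(free)
    return tuple(physical)


if __name__ == "__main__":
    # Logical axes of a dense MLP kernel, as in the sample output.
    print(resolve_logical_axes(("embed", None, "mlp"), RULES))
```

Printing both columns side by side makes it easy to see which rule produced an unexpected physical sharding.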

Tests

Command:
python -m MaxText.train_compile MaxText/configs/base.yml compile_topology=v5p-1024 compile_topology_num_slices=1 model_name=deepseek3-671b per_device_batch_size=1 ici_tensor_parallelism=8 ici_expert_parallelism=8 log_config=false debug_sharding=true

Output example:

I0121 16:06:59.015100 139280391421056 max_utils.py:1048] Mesh Axes: (data: 1, stage: 1, fsdp: 8, fsdp_transpose: 1, sequence: 1, context: 1, context_autoregressive: 1, tensor: 8, tensor_transpose: 1, tensor_sequence: 1, expert: 8, autoregressive: 1)
I0121 16:06:59.015291 139280391421056 maxtext_utils.py:1233] Parameter Path
I0121 16:06:59.015321 139280391421056 maxtext_utils.py:1234] Shape
I0121 16:06:59.015341 139280391421056 maxtext_utils.py:1235] Logical Axes
I0121 16:06:59.015359 139280391421056 maxtext_utils.py:1236] Physical PartitionSpec
I0121 16:06:59.015376 139280391421056 maxtext_utils.py:1237] ------------------------------------------------------------------------------------------------------------------------
I0121 16:06:59.015479 139280391421056 maxtext_utils.py:1281] params/decoder/decoder_norm/scale
I0121 16:06:59.015504 139280391421056 maxtext_utils.py:1282] float32[7168]
I0121 16:06:59.015521 139280391421056 maxtext_utils.py:1283] Partitionspec({activation_embed | activation_vocab | norm})
I0121 16:06:59.015552 139280391421056 maxtext_utils.py:1284] ('tensor',)
I0121 16:06:59.015567 139280391421056 maxtext_utils.py:1285] ------------------------------------------------------------------------------------------------------------------------

I0121 16:06:59.015658 139280391421056 maxtext_utils.py:1281] params/decoder/dense_layers/mlp/wi_0/kernel
I0121 16:06:59.015681 139280391421056 maxtext_utils.py:1282] float32[7168,3,18432]
I0121 16:06:59.015704 139280391421056 maxtext_utils.py:1283] Partitionspec({embed}, '(None,)', {mlp})
I0121 16:06:59.015722 139280391421056 maxtext_utils.py:1284] (('fsdp', 'expert'), None, 'tensor')
I0121 16:06:59.015738 139280391421056 maxtext_utils.py:1285] ------------------------------------------------------------------------------------------------------------------------

I0121 16:06:59.015788 139280391421056 maxtext_utils.py:1281] params/decoder/dense_layers/mlp/wi_1/kernel
I0121 16:06:59.015806 139280391421056 maxtext_utils.py:1282] float32[7168,3,18432]
I0121 16:06:59.015821 139280391421056 maxtext_utils.py:1283] Partitionspec({embed}, '(None,)', {mlp})
I0121 16:06:59.015835 139280391421056 maxtext_utils.py:1284] (('fsdp', 'expert'), None, 'tensor')
I0121 16:06:59.015850 139280391421056 maxtext_utils.py:1285] ------------------------------------------------------------------------------------------------------------------------

I0121 16:06:59.015919 139280391421056 maxtext_utils.py:1281] params/decoder/dense_layers/mlp/wo/kernel
I0121 16:06:59.015939 139280391421056 maxtext_utils.py:1282] float32[18432,3,7168]
I0121 16:06:59.015955 139280391421056 maxtext_utils.py:1283] Partitionspec({mlp}, '(None,)', {embed})
I0121 16:06:59.015970 139280391421056 maxtext_utils.py:1284] ('tensor', None, ('fsdp', 'expert'))
I0121 16:06:59.015985 139280391421056 maxtext_utils.py:1285] ------------------------------------------------------------------------------------------------------------------------
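
As a reading aid for the output above, the per-device shard shape implied by each entry can be worked out from the mesh axis sizes and the physical PartitionSpec. The sketch below does that arithmetic in plain Python, assuming every sharded dimension divides evenly. shard_shape is a hypothetical helper written for this illustration, not part of MaxText.

```python
import math

# Mesh axis sizes copied from the "Mesh Axes" log line above.
# Axes of size 1 are omitted; missing axes default to size 1 below.
MESH = {"fsdp": 8, "tensor": 8, "expert": 8}


def shard_shape(global_shape, spec, mesh):
    """Return the per-device shard shape for a global shape and a partition spec."""
    # Dimensions past the end of the spec are treated as unsharded
    # (an assumption of this sketch).
    padded = tuple(spec) + (None,) * (len(global_shape) - len(spec))
    result = []
    for dim, entry in zip(global_shape, padded):
        if entry is None:
            axes = ()
        elif isinstance(entry, str):
            axes = (entry,)
        else:
            axes = tuple(entry)
        # Number of pieces this dimension is split into.
        ways = math.prod(mesh.get(axis, 1) for axis in axes)
        if dim % ways:
            raise ValueError(f"dimension {dim} is not divisible by {ways}")
        result.append(dim // ways)
    return tuple(result)


if __name__ == "__main__":
    # params/decoder/dense_layers/mlp/wi_0/kernel from the output above.
    print(shard_shape((7168, 3, 18432), (("fsdp", "expert"), None, "tensor"), MESH))
```

For the wi_0 kernel, the first dimension is split 8 × 8 = 64 ways and the last 8 ways, so each device holds a (112, 3, 2304) slice of the float32[7168,3,18432] array.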

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@codecov

codecov bot commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 10.52632% with 68 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/MaxText/maxtext_utils.py | 4.91% | 58 Missing ⚠️ |
| src/MaxText/max_utils.py | 14.28% | 6 Missing ⚠️ |
| src/MaxText/train_compile.py | 50.00% | 3 Missing ⚠️ |
| src/MaxText/train_utils.py | 50.00% | 1 Missing ⚠️ |
