feat: update knowledge distillation tutorial for using vllm with Qwen model #2960
base: main
Conversation
Force-pushed from 14091ae to 2fb1059
export USERNAME_OR_ORG=<Owner of Hugging Face repository>
export RUN_NAME=<unique name for the run>
export HF_TOKEN=<your-hf-token> # e.g., hf_BA6...
export BASE_DIRECTORY=<your-base-directory> # e.g., knowledge-distillation
nit: # e.g., gs://
Thanks. We use a mounted Hyperdisk because writing large model files and many small I/O ops directly to gs:// is often much slower. The tutorial writes to /mnt/hyperdisk for performance and reproducibility, and I fixed the duplicated env export in the doc.
uv pip install -r dependencies/requirements/requirements.txt
To install MaxText and its dependencies for post-training (including vLLM for the teacher), run the following:

1. Follow the [MaxText installation instructions](https://maxtext.readthedocs.io/en/latest/install_maxtext.html#install-maxtext):
nit: maxtext).
We will use vLLM to generate the dataset from the teacher model.

Create a Python script named `generate_distillation_data_vllm.py` with the following content (this script writes a Parquet dataset compatible with MaxText SFT):
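The diff view elides the script body, so here is a minimal sketch of what such a generator could look like. The prompt list, tensor-parallel degree, output path, and column names below are illustrative assumptions, not the actual script added by this PR.

```python
# Hypothetical sketch of a vLLM-based distillation data generator.
# Model name matches the tutorial's teacher; everything else is assumed.
from vllm import LLM, SamplingParams
import pandas as pd  # writing Parquet also requires pyarrow

TEACHER_MODEL = "Qwen/Qwen3-32B"
OUTPUT_PATH = "/mnt/hyperdisk/distillation_data.parquet"  # assumed output location

# The real script would read prompts from a dataset; a short list keeps the sketch self-contained.
prompts = [
    "Explain the difference between knowledge distillation and fine-tuning.",
    "Summarize the trade-offs of using a larger teacher model.",
]

def main():
    llm = LLM(model=TEACHER_MODEL, tensor_parallel_size=8)  # TP degree depends on your accelerators
    sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

    # llm.chat applies the model's chat template before generation.
    conversations = [[{"role": "user", "content": p}] for p in prompts]
    outputs = llm.chat(conversations, sampling_params)

    records = [
        {"prompt": p, "completion": out.outputs[0].text}
        for p, out in zip(prompts, outputs)
    ]
    # The column names ("prompt", "completion") are an assumption about what
    # the MaxText SFT pipeline is configured to read.
    pd.DataFrame(records).to_parquet(OUTPUT_PATH)

if __name__ == "__main__":
    main()
```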
Can we make this script available in MaxText like this: https://github.com/AI-Hypercomputer/maxtext/blob/main/tools/data_generation/generate_distillation_data.py
Agreed. I’ve already created the script.
Force-pushed from 2fb1059 to 8005986
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Force-pushed from 8005986 to 84aa2ed
Force-pushed from 84aa2ed to eb215d2
Description
This pull request significantly updates and modernizes the knowledge distillation tutorial for MaxText, aligning it with current best practices and tooling. The guide now uses Qwen3-32B as the teacher model (via vLLM) and Llama-3.1-8B as the student, streamlines the setup with Hyperdisk storage, and provides new scripts and commands for dataset generation and fine-tuning. The instructions have been clarified, unnecessary conversion steps removed for the teacher, and the fine-tuning process updated for the latest MaxText and vLLM workflows.
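As a quick sanity check before launching fine-tuning, the generated Parquet dataset can be inspected directly. The path and column names here are assumptions carried over from the generation sketch above, not values defined by the PR.

```python
# Hypothetical inspection of the generated distillation dataset.
import pandas as pd

df = pd.read_parquet("/mnt/hyperdisk/distillation_data.parquet")
print(df.shape)                        # number of (prompt, completion) pairs
print(df.columns.tolist())             # expected columns for the SFT input
print(df.iloc[0]["completion"][:300])  # eyeball one teacher completion
```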
Tests
Manually triggered the distillation pipeline and monitored the execution flow step-by-step. Confirmed that the training loop finished and resources were released.

Checklist
Before submitting this PR, please make sure (put X in square brackets):
- [ ] Added the `gemini-review` label.