A small, learning-focused repository for understanding Transformer internals by implementing them from scratch.
- Currently a basic notebook-level implementation; probably better to stick with tiktoken (a usage sketch follows this list).
- Work on adding and handling special tokens for training and inference with LLMs.
- Implementation based on the Llama 3 reference code, src: https://github.com/meta-llama/llama3/blob/main/llama/model.py
- Trying to implement working code from the paper, including a GPT-2/Llama-style (decoder-only) LLM; a sketch of one building block follows this list.
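As a reference point for the tokenizer direction above, here is a minimal sketch of tokenizing with tiktoken, including an explicitly allowed special token. The encoding name and the sample strings are illustrative, not code from this repo.

```python
# Minimal sketch: tokenizing with tiktoken, including an explicit special token.
# The encoding name and sample strings are illustrative, not this repo's code.
import tiktoken

enc = tiktoken.get_encoding("gpt2")  # GPT-2 BPE vocabulary

text = "Hello, transformer!"
ids = enc.encode(text)
print(ids)              # list of token ids
print(enc.decode(ids))  # round-trips back to the original text

# Special tokens must be allowed explicitly, otherwise encode() raises an error
# when it finds their text in the input.
doc = "First document<|endoftext|>Second document"
ids = enc.encode(doc, allowed_special={"<|endoftext|>"})
print(enc.decode(ids))
```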
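For the model side, below is a small sketch of RMSNorm, one of the building blocks in the linked Llama model.py. It is slightly simplified relative to the reference (which casts to float32 before normalizing), and the shapes in the usage example are placeholders.

```python
# Minimal sketch of RMSNorm, a building block used in the Llama reference model.py.
# Simplified: the reference also casts to float32 for the normalization step.
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learned per-channel scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize by the root-mean-square of the features (no mean subtraction,
        # unlike LayerNorm), then apply the learned scale.
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight

# Placeholder shapes: (batch, seq_len, dim)
x = torch.randn(2, 8, 512)
print(RMSNorm(512)(x).shape)  # torch.Size([2, 8, 512])
```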
Current focus is on finishing the model, then experimenting with training, evaluation, and fine-tuning pipelines/scripts (a minimal training-loop sketch is shown below).
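Once the model is in place, a training loop can be as simple as next-token cross-entropy over shifted targets. The sketch below is a placeholder: `model`, `get_batch`, and all hyperparameters are assumptions, not existing code in this repo.

```python
# A minimal sketch of a next-token-prediction training step.
# `model`, `get_batch`, and the hyperparameters are placeholders, not this repo's code.
import torch
import torch.nn.functional as F

def train_step(model, optimizer, inputs, targets):
    """One optimization step of next-token (cross-entropy) training."""
    logits = model(inputs)                    # (batch, seq_len, vocab_size)
    loss = F.cross_entropy(
        logits.view(-1, logits.size(-1)),     # flatten to (batch*seq_len, vocab_size)
        targets.view(-1),                     # flatten to (batch*seq_len,)
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Typical usage (placeholder names):
# optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
# for step in range(num_steps):
#     inputs, targets = get_batch()           # targets are inputs shifted by one token
#     loss = train_step(model, optimizer, inputs, targets)
```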
This repo prioritizes clarity over optimization. It is meant for exploration, experimentation, and mapping theory → code — not for production use.