
Exploring-transformers

A small, learning-focused repository for understanding Transformer internals by implementing them from scratch, based on GPT-2 and using some newer features in places.

What’s inside

Byte Pair Encoding (BPE) tokenizer (train / encode / decode)

  • Currently a basic, notebook-level implementation; for anything serious, tiktoken is probably the better choice (a minimal sketch of the core merge loop follows below)
  • TODO: add and handle special tokens for training and using LLMs
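A minimal sketch of what the byte-level BPE training loop can look like, assuming merges are learned greedily over raw UTF-8 bytes. Function names here are illustrative, not the repo's actual API:

```python
from collections import Counter

def get_pair_counts(tokens):
    """Count adjacent symbol pairs in a token sequence."""
    return Counter(zip(tokens, tokens[1:]))

def merge_pair(tokens, pair, new_id):
    """Replace every occurrence of `pair` with a single new token id."""
    out, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn `num_merges` merge rules over raw bytes."""
    tokens = list(text.encode("utf-8"))
    merges = {}  # (id, id) -> new token id
    for n in range(num_merges):
        counts = get_pair_counts(tokens)
        if not counts:
            break
        pair = counts.most_common(1)[0][0]
        new_id = 256 + n  # ids 0..255 are reserved for raw bytes
        merges[pair] = new_id
        tokens = merge_pair(tokens, pair, new_id)
    return merges

if __name__ == "__main__":
    print(train_bpe("low lower lowest low low", num_merges=5))
```

Encoding applies the learned merges in order; decoding walks the merge table backwards to recover bytes.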

Rotary Positional Embeddings (RoPE) (complex-number implementation)
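A rough sketch of the complex-number RoPE approach, assuming PyTorch and a LLaMA-style layout where adjacent query/key dimensions are paired into complex numbers. Names such as `precompute_freqs_cis` are assumptions, not necessarily what the repo uses:

```python
import torch

def precompute_freqs_cis(head_dim, seq_len, base=10000.0):
    """Complex rotation factors e^{i * m * theta_k} for position m and pair k."""
    freqs = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(seq_len).float()
    angles = torch.outer(positions, freqs)                 # (seq_len, head_dim // 2)
    return torch.polar(torch.ones_like(angles), angles)    # complex64

def apply_rope(x, freqs_cis):
    """Rotate query/key pairs; x has shape (batch, seq, heads, head_dim)."""
    x_complex = torch.view_as_complex(x.float().reshape(*x.shape[:-1], -1, 2))
    freqs_cis = freqs_cis[: x.shape[1]].unsqueeze(0).unsqueeze(2)  # broadcast over batch/heads
    x_rotated = torch.view_as_real(x_complex * freqs_cis).flatten(-2)
    return x_rotated.type_as(x)
```

Multiplying by a unit complex number rotates each (even, odd) dimension pair by a position-dependent angle, which is what gives RoPE its relative-position property.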

Attention implementation

  • Working toward a faithful implementation of attention from the original paper, building up to a GPT-2/LLaMA-style decoder-only LLM (see the sketch below)
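One possible shape for the decoder-only attention block, assuming PyTorch and standard multi-head causal self-attention. The class and parameter names are illustrative:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Multi-head causal self-attention, as in a decoder-only transformer."""

    def __init__(self, d_model, n_heads):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=-1)
        # reshape to (B, n_heads, T, head_dim)
        q = q.view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        k = k.view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        v = v.view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        # scaled dot-product scores with a causal mask
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(out)
```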

Next steps: finish the model, then experiment with training, evaluation, and fine-tuning pipelines/scripts.
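As a rough idea of where that is headed, a next-token-prediction training step might look like the following, assuming PyTorch and a model that returns logits of shape (batch, seq, vocab); names here are hypothetical:

```python
import torch.nn.functional as F

def train_step(model, optimizer, batch):
    """One training step: predict tokens[1:] from tokens[:-1]."""
    inputs, targets = batch[:, :-1], batch[:, 1:]
    logits = model(inputs)  # (B, T, vocab_size)
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```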

Purpose

This repo prioritizes clarity over optimization. It is meant for exploration, experimentation, and mapping theory → code — not for production use.
