46 changes: 3 additions & 43 deletions src/tutorials/basics/01-first-pipeline.ipynb
@@ -3,22 +3,7 @@
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# A Simple Pipeline\n",
"\n",
"This tutorial introduces DataJoint by building a simple research lab database. You'll learn to:\n",
"\n",
"- Define tables with primary keys and dependencies\n",
"- Insert and query data\n",
"- Use the four core operations: restriction, projection, join, aggregation\n",
"- Understand the schema diagram\n",
"\n",
"We'll work with **Manual tables** only—tables where you enter data directly. Later tutorials introduce automated computation.\n",
"\n",
"For complete working examples, see:\n",
"- [University Database](../examples/university.ipynb) — Academic records with complex queries\n",
"- [Blob Detection](../examples/blob-detection.ipynb) — Image processing with computation"
]
"source": "# A Simple Pipeline\n\nThis tutorial introduces DataJoint by building a simple research lab database. You'll learn to:\n\n- Define tables with primary keys and dependencies\n- Insert and query data\n- Use the four core operations: restriction, projection, join, aggregation\n- Understand the schema diagram\n\nWe'll work with **Manual tables** only—tables where you enter data directly. Later tutorials introduce automated computation.\n\nFor complete working examples, see:\n- [University Database](../examples/university/) — Academic records with complex queries\n- [Blob Detection](../examples/blob-detection/) — Image processing with computation"
},
{
"cell_type": "markdown",
@@ -2698,32 +2683,7 @@
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"You've learned the fundamentals of DataJoint:\n",
"\n",
"| Concept | Description |\n",
"|---------|-------------|\n",
"| **Tables** | Python classes with a `definition` string |\n",
"| **Primary key** | Above `---`, uniquely identifies rows |\n",
"| **Dependencies** | `->` creates foreign keys |\n",
"| **Restriction** | `&` filters rows |\n",
"| **Projection** | `.proj()` selects/computes columns |\n",
"| **Join** | `*` combines tables |\n",
"| **Aggregation** | `.aggr()` summarizes groups |\n",
"\n",
"### Next Steps\n",
"\n",
"- [Schema Design](02-schema-design.ipynb) — Primary keys, relationships, table tiers\n",
"- [Queries](04-queries.ipynb) — Advanced query patterns\n",
"- [Computation](05-computation.ipynb) — Automated processing with Imported/Computed tables\n",
"\n",
"### Complete Examples\n",
"\n",
"- [University Database](../examples/university.ipynb) — Complex queries on academic records\n",
"- [Blob Detection](../examples/blob-detection.ipynb) — Image processing pipeline with computation"
]
"source": "## Summary\n\nYou've learned the fundamentals of DataJoint:\n\n| Concept | Description |\n|---------|-------------|\n| **Tables** | Python classes with a `definition` string |\n| **Primary key** | Above `---`, uniquely identifies rows |\n| **Dependencies** | `->` creates foreign keys |\n| **Restriction** | `&` filters rows |\n| **Projection** | `.proj()` selects/computes columns |\n| **Join** | `*` combines tables |\n| **Aggregation** | `.aggr()` summarizes groups |\n\n### Next Steps\n\n- [Schema Design](02-schema-design/) — Primary keys, relationships, table tiers\n- [Queries](04-queries/) — Advanced query patterns\n- [Computation](05-computation/) — Automated processing with Imported/Computed tables\n\n### Complete Examples\n\n- [University Database](../examples/university/) — Complex queries on academic records\n- [Blob Detection](../examples/blob-detection/) — Image processing pipeline with computation"
},
{
"cell_type": "code",
@@ -2764,4 +2724,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
}
}
111 changes: 3 additions & 108 deletions src/tutorials/basics/02-schema-design.ipynb
@@ -1299,39 +1299,7 @@
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Reading the Diagram\n",
"\n",
"DataJoint diagrams show tables as nodes and foreign keys as edges. The notation conveys relationship semantics at a glance.\n",
"\n",
"**Line Styles:**\n",
"\n",
"| Line | Style | Relationship | Meaning |\n",
"|------|-------|--------------|---------|\n",
"| ━━━ | Thick solid | Extension | FK **is** entire PK (one-to-one) |\n",
"| ─── | Thin solid | Containment | FK **in** PK with other fields (one-to-many) |\n",
"| ┄┄┄ | Dashed | Reference | FK in secondary attributes (one-to-many) |\n",
"\n",
"**Visual Indicators:**\n",
"\n",
"| Indicator | Meaning |\n",
"|-----------|---------|\n",
"| **Underlined name** | Introduces new dimension (new PK attributes) |\n",
"| Non-underlined name | Inherits all dimensions (PK entirely from FKs) |\n",
"| **Green** | Manual table |\n",
"| **Gray** | Lookup table |\n",
"| **Red** | Computed table |\n",
"| **Blue** | Imported table |\n",
"| **Orange dots** | Renamed foreign keys (via `.proj()`) |\n",
"\n",
"**Key principle:** Solid lines mean the parent's identity becomes part of the child's identity. Dashed lines mean the child maintains independent identity.\n",
"\n",
"**Note:** Diagrams do NOT show `[nullable]` or `[unique]` modifiers—check table definitions for these constraints.\n",
"\n",
"See [How to Read Diagrams](../../how-to/read-diagrams.ipynb) for diagram operations and comparison to ER notation.\n",
"\n",
"## Insert Test Data and Populate"
]
"source": "### Reading the Diagram\n\nDataJoint diagrams show tables as nodes and foreign keys as edges. The notation conveys relationship semantics at a glance.\n\n**Line Styles:**\n\n| Line | Style | Relationship | Meaning |\n|------|-------|--------------|---------|\n| ━━━ | Thick solid | Extension | FK **is** entire PK (one-to-one) |\n| ─── | Thin solid | Containment | FK **in** PK with other fields (one-to-many) |\n| ┄┄┄ | Dashed | Reference | FK in secondary attributes (one-to-many) |\n\n**Visual Indicators:**\n\n| Indicator | Meaning |\n|-----------|---------|\n| **Underlined name** | Introduces new dimension (new PK attributes) |\n| Non-underlined name | Inherits all dimensions (PK entirely from FKs) |\n| **Green** | Manual table |\n| **Gray** | Lookup table |\n| **Red** | Computed table |\n| **Blue** | Imported table |\n| **Orange dots** | Renamed foreign keys (via `.proj()`) |\n\n**Key principle:** Solid lines mean the parent's identity becomes part of the child's identity. Dashed lines mean the child maintains independent identity.\n\n**Note:** Diagrams do NOT show `[nullable]` or `[unique]` modifiers—check table definitions for these constraints.\n\nSee [How to Read Diagrams](../../how-to/read-diagrams/) for diagram operations and comparison to ER notation.\n\n## Insert Test Data and Populate"
},
{
"cell_type": "code",
@@ -1562,80 +1530,7 @@
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Best Practices\n",
"\n",
"### 1. Choose Meaningful Primary Keys\n",
"- Use natural identifiers when possible (`subject_id = 'M001'`)\n",
"- Keep keys minimal but sufficient for uniqueness\n",
"\n",
"### 2. Use Appropriate Table Tiers\n",
"- **Manual**: Data entered by operators or instruments\n",
"- **Lookup**: Configuration, parameters, reference data\n",
"- **Imported**: Data read from files (recordings, images)\n",
"- **Computed**: Derived analyses and summaries\n",
"\n",
"### 3. Normalize Your Data\n",
"- Don't repeat information across rows\n",
"- Create separate tables for distinct entities\n",
"- Use foreign keys to link related data\n",
"\n",
"### 4. Use Core DataJoint Types\n",
"\n",
"DataJoint has a three-layer type architecture (see [Type System Specification](../reference/specs/type-system.md)):\n",
"\n",
"1. **Native database types** (Layer 1): Backend-specific types like `INT`, `FLOAT`, `TINYINT UNSIGNED`. These are **discouraged** but allowed for backward compatibility.\n",
"\n",
"2. **Core DataJoint types** (Layer 2): Standardized, scientist-friendly types that work identically across MySQL and PostgreSQL. **Always prefer these.**\n",
"\n",
"3. **Codec types** (Layer 3): Types with `encode()`/`decode()` semantics like `<blob>`, `<attach>`, `<object@>`.\n",
"\n",
"**Core types used in this tutorial:**\n",
"\n",
"| Type | Description | Example |\n",
"|------|-------------|---------|\n",
"| `uint8`, `uint16`, `int32` | Sized integers | `session_idx : uint16` |\n",
"| `float32`, `float64` | Sized floats | `reaction_time : float32` |\n",
"| `varchar(n)` | Variable-length string | `name : varchar(100)` |\n",
"| `bool` | Boolean | `correct : bool` |\n",
"| `date` | Date only | `date_of_birth : date` |\n",
"| `datetime` | Date and time (UTC) | `created_at : datetime` |\n",
"| `enum(...)` | Enumeration | `sex : enum('M', 'F', 'U')` |\n",
"| `json` | JSON document | `task_params : json` |\n",
"| `uuid` | Universally unique ID | `experimenter_id : uuid` |\n",
"\n",
"**Why native types are allowed but discouraged:**\n",
"\n",
"Native types (like `int`, `float`, `tinyint`) are passed through to the database but generate a **warning at declaration time**. They are discouraged because:\n",
"- They lack explicit size information\n",
"- They are not portable across database backends\n",
"- They are not recorded in field metadata for reconstruction\n",
"\n",
"If you see a warning like `\"Native type 'int' used; consider 'int32' instead\"`, update your definition to use the corresponding core type.\n",
"\n",
"### 5. Document Your Tables\n",
"- Add comments after `#` in definitions\n",
"- Document units in attribute comments\n",
"\n",
"## Key Concepts Recap\n",
"\n",
"| Concept | Description |\n",
"|---------|-------------|\n",
"| **Primary Key** | Attributes above `---` that uniquely identify rows |\n",
"| **Secondary Attributes** | Attributes below `---` that store additional data |\n",
"| **Foreign Key** (`->`) | Reference to another table, imports its primary key |\n",
"| **One-to-Many** | FK in primary key: parent has many children |\n",
"| **One-to-One** | FK is entire primary key: exactly one child per parent |\n",
"| **Master-Part** | Compositional integrity: master and parts inserted/deleted atomically |\n",
"| **Nullable FK** | `[nullable]` makes the reference optional |\n",
"| **Lookup Table** | Pre-populated reference data |\n",
"\n",
"## Next Steps\n",
"\n",
"- [Data Entry](03-data-entry.ipynb) — Inserting, updating, and deleting data\n",
"- [Queries](04-queries.ipynb) — Filtering, joining, and projecting\n",
"- [Computation](05-computation.ipynb) — Building computational pipelines"
]
"source": "## Best Practices\n\n### 1. Choose Meaningful Primary Keys\n- Use natural identifiers when possible (`subject_id = 'M001'`)\n- Keep keys minimal but sufficient for uniqueness\n\n### 2. Use Appropriate Table Tiers\n- **Manual**: Data entered by operators or instruments\n- **Lookup**: Configuration, parameters, reference data\n- **Imported**: Data read from files (recordings, images)\n- **Computed**: Derived analyses and summaries\n\n### 3. Normalize Your Data\n- Don't repeat information across rows\n- Create separate tables for distinct entities\n- Use foreign keys to link related data\n\n### 4. Use Core DataJoint Types\n\nDataJoint has a three-layer type architecture (see [Type System Specification](../../reference/specs/type-system/)):\n\n1. **Native database types** (Layer 1): Backend-specific types like `INT`, `FLOAT`, `TINYINT UNSIGNED`. These are **discouraged** but allowed for backward compatibility.\n\n2. **Core DataJoint types** (Layer 2): Standardized, scientist-friendly types that work identically across MySQL and PostgreSQL. **Always prefer these.**\n\n3. **Codec types** (Layer 3): Types with `encode()`/`decode()` semantics like `<blob>`, `<attach>`, `<object@>`.\n\n**Core types used in this tutorial:**\n\n| Type | Description | Example |\n|------|-------------|---------|\n| `uint8`, `uint16`, `int32` | Sized integers | `session_idx : uint16` |\n| `float32`, `float64` | Sized floats | `reaction_time : float32` |\n| `varchar(n)` | Variable-length string | `name : varchar(100)` |\n| `bool` | Boolean | `correct : bool` |\n| `date` | Date only | `date_of_birth : date` |\n| `datetime` | Date and time (UTC) | `created_at : datetime` |\n| `enum(...)` | Enumeration | `sex : enum('M', 'F', 'U')` |\n| `json` | JSON document | `task_params : json` |\n| `uuid` | Universally unique ID | `experimenter_id : uuid` |\n\n**Why native types are allowed but discouraged:**\n\nNative types (like `int`, `float`, `tinyint`) are passed through to the database but generate a **warning at declaration time**. They are discouraged because:\n- They lack explicit size information\n- They are not portable across database backends\n- They are not recorded in field metadata for reconstruction\n\nIf you see a warning like `\"Native type 'int' used; consider 'int32' instead\"`, update your definition to use the corresponding core type.\n\n### 5. Document Your Tables\n- Add comments after `#` in definitions\n- Document units in attribute comments\n\n## Key Concepts Recap\n\n| Concept | Description |\n|---------|-------------|\n| **Primary Key** | Attributes above `---` that uniquely identify rows |\n| **Secondary Attributes** | Attributes below `---` that store additional data |\n| **Foreign Key** (`->`) | Reference to another table, imports its primary key |\n| **One-to-Many** | FK in primary key: parent has many children |\n| **One-to-One** | FK is entire primary key: exactly one child per parent |\n| **Master-Part** | Compositional integrity: master and parts inserted/deleted atomically |\n| **Nullable FK** | `[nullable]` makes the reference optional |\n| **Lookup Table** | Pre-populated reference data |\n\n## Next Steps\n\n- [Data Entry](03-data-entry/) — Inserting, updating, and deleting data\n- [Queries](04-queries/) — Filtering, joining, and projecting\n- [Computation](05-computation/) — Building computational pipelines"
},
{
"cell_type": "code",
@@ -1676,4 +1571,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
}
}
22 changes: 2 additions & 20 deletions src/tutorials/basics/03-data-entry.ipynb
@@ -1588,25 +1588,7 @@
"cell_type": "markdown",
"id": "cell-42",
"metadata": {},
"source": [
"## Quick Reference\n",
"\n",
"| Operation | Method | Use Case |\n",
"|-----------|--------|----------|\n",
"| Insert one | `insert1(row)` | Adding single entity |\n",
"| Insert many | `insert(rows)` | Bulk data loading |\n",
"| Update one | `update1(row)` | Surgical corrections only |\n",
"| Delete | `delete()` | Removing entities (cascades) |\n",
"| Delete quick | `delete_quick()` | Internal cleanup (no cascade) |\n",
"| Validate | `validate(rows)` | Pre-insert check |\n",
"\n",
"See the [Data Manipulation Specification](../reference/specs/data-manipulation.md) for complete details.\n",
"\n",
"## Next Steps\n",
"\n",
"- [Queries](04-queries.ipynb) — Filtering, joining, and projecting data\n",
"- [Computation](05-computation.ipynb) — Building computational pipelines"
]
"source": "## Quick Reference\n\n| Operation | Method | Use Case |\n|-----------|--------|----------|\n| Insert one | `insert1(row)` | Adding single entity |\n| Insert many | `insert(rows)` | Bulk data loading |\n| Update one | `update1(row)` | Surgical corrections only |\n| Delete | `delete()` | Removing entities (cascades) |\n| Delete quick | `delete_quick()` | Internal cleanup (no cascade) |\n| Validate | `validate(rows)` | Pre-insert check |\n\nSee the [Data Manipulation Specification](../../reference/specs/data-manipulation/) for complete details.\n\n## Next Steps\n\n- [Queries](04-queries/) — Filtering, joining, and projecting data\n- [Computation](05-computation/) — Building computational pipelines"
},
{
"cell_type": "code",
@@ -1648,4 +1630,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}