@dimitri-yatsenko

Overview

This PR updates all DataJoint documentation to remove unsigned integer types (uint8, uint16, uint32, uint64) from the core type system, aligning with DataJoint 2.0.0a22, which removed these types for better PostgreSQL compatibility.

Related: datajoint/datajoint-python#1335

Changes

1. Type System Updates (48 files total)

Type Replacements:

  • uint8 → int16 (promoted to larger signed type)
  • uint16 → int32 (promoted to larger signed type)
  • uint32 → int64 (promoted to larger signed type)
  • uint64 → int64

Updated Documentation:

  • ✅ All tutorials (basics, advanced, examples, domain)
  • ✅ All how-to guides
  • ✅ All explanations
  • ✅ All reference specs
  • ✅ Type system specification

2. Notebook Execution (19 notebooks)

Executed notebooks with updated type system to validate changes:

How-To (2):

  • model-relationships.ipynb
  • read-diagrams.ipynb

Basic Tutorials (5):

  • 01-first-pipeline.ipynb
  • 02-schema-design.ipynb
  • 03-data-entry.ipynb
  • 04-queries.ipynb
  • 05-computation.ipynb

Advanced Tutorials (4):

  • custom-codecs.ipynb
  • distributed.ipynb
  • json-type.ipynb
  • sql-comparison.ipynb

Domain Tutorials (3):

  • allen-ccf.ipynb
  • calcium-imaging.ipynb
  • electrophysiology.ipynb

Example Tutorials (5):

  • blob-detection.ipynb
  • fractal-pipeline.ipynb
  • hotel-reservations.ipynb
  • languages.ipynb
  • university.ipynb

Skipped (require object store setup):

  • 06-object-storage.ipynb
  • ephys-with-npy.ipynb

Rationale

Why Remove Unsigned Types?

  1. PostgreSQL Compatibility: PostgreSQL does not have native unsigned integer types
  2. Type System Simplification: Reduces core types from 11 to 7 numeric types
  3. Cross-Database Portability: Ensures identical behavior across MySQL and PostgreSQL
  4. Migration Path: Users moving from 0.14.6 → 2.0 need clear guidance

What About Native Unsigned Types?

Native MySQL types like int unsigned are still allowed as pass-through types but are:

  • Discouraged for new pipelines
  • MySQL-specific (not portable to PostgreSQL)
  • Documented with warnings about portability

Example:

# Discouraged (MySQL-only, not portable):
value : int unsigned  

# Recommended (portable, explicit size):
value : int64
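
When reviewing an existing pipeline, it may help to locate attributes that still rely on native unsigned types. The following is a minimal sketch, not part of DataJoint: `find_native_unsigned` is a hypothetical helper that scans a definition string for MySQL integer types carrying the `unsigned` modifier. It assumes the usual `name [= default] : type  # comment` attribute layout and does not handle defaults that themselves contain a colon.

```python
import re

# Hypothetical helper (not a DataJoint API): matches attribute lines whose
# type is a MySQL integer type followed by the UNSIGNED modifier.
_NATIVE_UNSIGNED = re.compile(
    r"^\s*(\w+)\s*(?:=[^:]*)?:\s*((?:tiny|small|medium|big)?int\s+unsigned)\b",
    re.IGNORECASE,
)


def find_native_unsigned(definition: str) -> list[tuple[str, str]]:
    """Return (attribute_name, native_type) pairs that use `unsigned`."""
    hits = []
    for line in definition.splitlines():
        code = line.split("#", 1)[0]  # ignore trailing comments
        match = _NATIVE_UNSIGNED.match(code)
        if match:
            # Normalize whitespace and case in the reported type.
            native_type = " ".join(match.group(2).lower().split())
            hits.append((match.group(1), native_type))
    return hits


definition = """
# Discouraged (MySQL-only, not portable):
value : int unsigned
count : int64
idx = 0 : smallint unsigned  # has a default
"""
print(find_native_unsigned(definition))
```

Attributes reported by such a scan are candidates for replacement with a portable signed type.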

Migration Guide

For Existing Pipelines

Option 1: Use larger signed types (recommended)

# Before (0.14.6):
session_idx : uint16
count : uint32

# After (2.0):
session_idx : int32  # promoted
count : int64        # promoted
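
For pipelines with many table definitions, Option 1 could be applied mechanically. The sketch below is an illustration only: `promote_unsigned` is a hypothetical helper, not something DataJoint provides, and it assumes the promotion table used in this PR. It rewrites only the part of each line after the first colon and leaves comments untouched.

```python
import re

# Promotion table taken from this PR's type replacements.
PROMOTION = {
    "uint8": "int16",
    "uint16": "int32",
    "uint32": "int64",
    "uint64": "int64",
}
_UINT = re.compile(r"\buint(?:8|16|32|64)\b")


def promote_unsigned(definition: str) -> str:
    """Replace removed unsigned core types with their signed replacements."""
    out = []
    for line in definition.splitlines():
        code, hash_sep, comment = line.partition("#")
        name, colon, type_part = code.partition(":")
        if colon:
            # Only touch the type side, never the attribute name.
            type_part = _UINT.sub(lambda m: PROMOTION[m.group(0)], type_part)
        out.append(name + colon + type_part + hash_sep + comment)
    return "\n".join(out)


before = "session_idx : uint16\ncount : uint32  # uint32 counter"
print(promote_unsigned(before))
```

Any rewritten definition should still be reviewed by hand, since the widened column types change the stored schema.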

Option 2: Use native MySQL types (not recommended)

# MySQL-specific (generates warning):
session_idx : smallint unsigned
count : int unsigned

Type Promotion Strategy

The documentation uses conservative promotion to avoid overflow:

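  • uint8 → int16 (covers the full uint8 range, 0 to 255)
  • uint16 → int32 (covers the full uint16 range, 0 to 65,535)
  • uint32 → int64 (covers the full uint32 range, 0 to 2^32 − 1)
  • uint64 → int64 (values above 2^63 − 1 cannot be represented, but such values are rare in practice)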
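
The range argument behind these promotions can be checked with plain integer arithmetic. The following sketch uses hypothetical helper names and depends only on the standard two's-complement ranges, not on any DataJoint API.

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """Inclusive range of an unsigned integer with the given bit width."""
    return 0, 2**bits - 1


def signed_range(bits: int) -> tuple[int, int]:
    """Inclusive range of a two's-complement signed integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def is_lossless(unsigned_bits: int, signed_bits: int) -> bool:
    """True if every unsigned value fits in the signed replacement."""
    u_lo, u_hi = unsigned_range(unsigned_bits)
    s_lo, s_hi = signed_range(signed_bits)
    return s_lo <= u_lo and u_hi <= s_hi


# Promotion table from this PR, expressed as bit widths.
PROMOTION_BITS = {8: 16, 16: 32, 32: 64, 64: 64}

for u_bits, s_bits in PROMOTION_BITS.items():
    print(f"uint{u_bits} -> int{s_bits}: lossless={is_lossless(u_bits, s_bits)}")
```

Only the uint64 → int64 case is lossy: the top 2^63 values of the unsigned range have no signed counterpart.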

Testing

  • ✅ All 19 executed notebooks run successfully
  • ✅ Table definitions parse correctly
  • ✅ Data insertion and retrieval work as expected
  • ✅ No regressions observed in the executed notebooks

Documentation Impact

Files Modified:

  • 29 content files (markdown + notebooks)
  • 19 executed notebooks with new outputs
  • Total: 48 files with 2,735 insertions, 2,354 deletions

Backward Compatibility

Breaking Change: Yes, for pipelines using uint8, uint16, uint32, uint64

Mitigation:

  1. Migration guide included in documentation
  2. Clear type replacement table provided
  3. Native unsigned types remain available (with warnings)
  4. All examples and tutorials updated to show correct patterns

Checklist

  • All unsigned types removed from table definitions
  • Type system spec updated
  • Migration guide included
  • Examples and tutorials updated
  • Notebooks executed and validated
  • Native unsigned types documented as MySQL-specific
  • Related PR in datajoint-python merged

Preview

Documentation preview will be available once the PR is merged. Key pages to review:

  • Type System Specification
  • Getting Started tutorials
  • Migration guide

Version: Targets DataJoint 2.0.0a22+
Branch: pre/v2.0 → main
Related Issues: datajoint/datajoint-python#1335

Remove uint8, uint16, uint32, and uint64 from the core type specification.
These types are MySQL-specific and not portable to PostgreSQL.

Changes:
- Remove unsigned types from core numeric types table
- Update native type conversion recommendations (use larger signed types)
- Update UNSIGNED modifier status to "Discouraged"
- Update architecture overview examples

Users can still use unsigned types as native types (with warnings),
but they are no longer part of DataJoint's portable core type system.

Matches implementation changes in datajoint-python pre/v2.0.

Replace all uint8, uint16, uint32, uint64 with signed equivalents:
- uint8 → int16 (promoted to larger signed type)
- uint16 → int32 (promoted to larger signed type)
- uint32 → int64 (promoted to larger signed type)
- uint64 → int64

Unsigned integers are no longer core types in DataJoint 2.0.
Native unsigned types (e.g., 'int unsigned') are still allowed
as MySQL-specific pass-through types but discouraged.

Updated files:
- All tutorials (basics, advanced, examples, domain)
- All how-to guides
- All explanations
- All reference specs
- Type system spec

Executed 19 notebooks and saved outputs after removing unsigned types:

How-To:
- model-relationships.ipynb
- read-diagrams.ipynb

Basic Tutorials:
- 01-first-pipeline.ipynb
- 02-schema-design.ipynb
- 03-data-entry.ipynb
- 04-queries.ipynb
- 05-computation.ipynb

Advanced Tutorials:
- custom-codecs.ipynb
- distributed.ipynb
- json-type.ipynb
- sql-comparison.ipynb

Domain Tutorials:
- allen-ccf.ipynb
- calcium-imaging.ipynb
- electrophysiology.ipynb

Example Tutorials:
- blob-detection.ipynb
- fractal-pipeline.ipynb
- hotel-reservations.ipynb
- languages.ipynb
- university.ipynb

Skipped (require object store config):
- 06-object-storage.ipynb
- ephys-with-npy.ipynb

All executed notebooks validate that uint → int type replacements
work correctly with current DataJoint 2.0 implementation.

@dimitri-yatsenko changed the title from "Remove unsigned integer types from documentation for DataJoint 2.0" to "Remove unsigned integer types from core types for DataJoint 2.0" on Jan 17, 2026