A unified metadata standard for drone-based wildlife datasets
The FAIR² Drones Data Standard provides a comprehensive framework for documenting drone-based wildlife datasets, ensuring they are Findable, Accessible, Interoperable, Reusable, and AI-Ready, and compliant with Darwin Core biodiversity data standards. This standard bridges the ecology, robotics, and computer vision communities by providing unified metadata specifications that enable cross-domain dataset reuse.
Field data collection using aerial and underwater drones represents a substantial investment of time, expertise, and resources. However, most datasets serve only a single research community, limiting their interdisciplinary potential. The FAIR² Drones standard addresses this by:
- Standardizing metadata across ecology, robotics, and computer vision domains
- Integrating Darwin Core biodiversity standards for ecological compliance
- Documenting platform specifications essential for robotics research
- Specifying annotation formats required for AI/ML applications
- Enabling multimodal linkages to complementary sensor data
Key features include:
- Modular template system supporting detection, tracking, behavior recognition, and robotics benchmarking
- Darwin Core compliance with Event and Occurrence records for GBIF integration
- Comprehensive platform metadata including telemetry, sensors, and mission parameters
- Multi-task annotation support for object detection, tracking, segmentation, and behavior analysis
- Validation tools for ensuring standard compliance
- Reference implementations demonstrating real-world applications
The repository provides the following resources:
- TEMPLATE.md: Full dataset card template with detailed field descriptions
- QUICKSTART_GUIDE.md: Checklist-based guide for rapid implementation
- examples/: Reference implementations on real-world datasets
- KABR Behavior Telemetry: Complete example with GPS extraction, Darwin Core events, and processing scripts
- Validation scripts: Tools for checking standard compliance (coming soon)
To document a dataset with the FAIR² Drones standard:
1. Review the Quick-Start Guide for a checklist-based approach
2. Select your template based on your primary use case (detection, tracking, behavior, robotics)
3. Complete the dataset card following the full template
4. Validate compliance using the provided tools
5. Publish your dataset with FAIR² Drones documentation
Estimated completion time: 2-4 hours depending on dataset complexity
The full dataset card template is organized into four groups of metadata.
Dataset overview:
- Dataset identification and attribution
- Licensing and citation information
- Data structure and file organization
- Dataset splits and statistics
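The normative field names live in TEMPLATE.md; as a rough, non-authoritative sketch (every key and value below is hypothetical), overview metadata for a dataset might be captured as:

```python
# Hypothetical sketch of dataset-overview metadata; keys are illustrative only.
# Consult TEMPLATE.md for the normative field names and descriptions.
dataset_overview = {
    "dataset_name": "example-savanna-uav-survey",
    "creators": ["J. Doe", "A. Researcher"],            # attribution
    "license": "CC BY 4.0",
    "citation": "Doe et al. (2025). Example Savanna UAV Survey.",
    "data_structure": {
        "videos/": "raw UAV video clips (MP4)",
        "annotations/": "per-frame labels (COCO JSON)",
        "telemetry/": "flight logs and extracted GPS tracks (CSV)",
    },
    "splits": {"train": 800, "val": 100, "test": 100},  # clip counts per split
}
```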
Darwin Core biodiversity records:
- Event records (survey locations, dates, protocols)
- Occurrence records (species observations, taxonomic hierarchy)
- Sampling effort and coverage metrics
- Geographic coordinates with uncertainty
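For orientation, a minimal Event/Occurrence pair might look like the sketch below. The term names (eventID, eventDate, decimalLatitude, scientificName, ...) are standard Darwin Core terms, but the values and identifiers are illustrative only; the template specifies which terms are required.

```python
# Minimal illustrative Darwin Core records (all values are made up).
# Term names follow the Darwin Core vocabulary used for GBIF publication.
event = {
    "eventID": "mission-2025-01-15-001",
    "eventDate": "2025-01-15",
    "samplingProtocol": "UAV video survey",
    "decimalLatitude": 0.2833,
    "decimalLongitude": 36.9000,
    "geodeticDatum": "WGS84",
    "coordinateUncertaintyInMeters": 30,
}

occurrence = {
    "occurrenceID": "mission-2025-01-15-001-occ-0001",
    "eventID": event["eventID"],          # links the observation back to its survey event
    "basisOfRecord": "MachineObservation",
    "scientificName": "Equus grevyi",
    "individualCount": 4,
}
```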
Platform and mission metadata:
- UAV/UUV hardware details
- Sensor specifications (camera, thermal, LiDAR, etc.)
- Flight parameters and telemetry
- Autonomy modes and mission planning
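A platform record might be captured along the lines below; the field names are hypothetical illustrations, not the template's normative vocabulary, and the hardware values are examples only.

```python
# Illustrative platform/mission metadata sketch; field names are hypothetical.
platform = {
    "platform_type": "UAV (multirotor quadcopter)",
    "model": "DJI Mavic 2 Zoom",             # example hardware; record your actual platform
    "sensors": [
        {"type": "RGB camera", "resolution": "3840x2160", "frame_rate_hz": 30},
    ],
    "flight_parameters": {
        "altitude_m_agl": 40,                # altitude above ground level, meters
        "max_speed_m_s": 10,
    },
    "autonomy_mode": "manual piloting with GPS position hold",
    "telemetry_logs": "telemetry/*.srt",     # raw logs shipped alongside the dataset
}
```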
Annotations:
- Task-specific formats (COCO, MOT, ethograms)
- Quality metrics and inter-annotator agreement
- Annotation difficulty and coverage statistics
- Label sets and class distributions
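As a reminder of what the referenced formats look like, here is a minimal COCO-style detection annotation of the kind a dataset card would point to (standard COCO keys; file names and values are made up):

```python
# Minimal COCO-style detection annotation: one image, one bounding box.
coco = {
    "images": [
        {"id": 1, "file_name": "frames/000001.jpg", "width": 3840, "height": 2160},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [1024.0, 512.0, 180.0, 120.0],  # [x, y, width, height] in pixels
            "area": 180.0 * 120.0,
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 1, "name": "zebra", "supercategory": "animal"}],
}
```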
Many datasets require processing raw telemetry and metadata before documentation:
- GPS Extraction: Extract coordinates from flight logs (SRT files, EXIF data, telemetry logs)
- Event Aggregation: Aggregate video-level data to mission/session-level Darwin Core events
- Occurrence Generation: Link species detections to biodiversity occurrence records
- Statistics Calculation: Compute coverage metrics, annotation counts, and class distributions
See the Kenyan Animal Behavior Recognition (KABR) Dataset with Telemetry for an example dataset that is FAIR² Drones compliant, and the KABR processing scripts for Python examples of GPS extraction, event aggregation, and Darwin Core generation. A minimal sketch of the GPS extraction step follows below.
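For orientation only, this sketch assumes DJI-style SRT captions that embed `[latitude: ...]` and `[longitude: ...]` fields; SRT telemetry layouts vary by platform and firmware, so treat the regex as a starting point and refer to the KABR scripts for worked examples.

```python
import re
from pathlib import Path

# Sketch of the GPS Extraction step: pull (latitude, longitude) fixes out of a
# DJI-style SRT telemetry file. Adjust the pattern to match your own logs.
COORD_PATTERN = re.compile(
    r"\[latitude\s*:\s*(-?\d+\.\d+)\].*?\[longitude\s*:\s*(-?\d+\.\d+)\]"
)

def extract_gps_track(srt_path: str) -> list[tuple[float, float]]:
    """Return the ordered list of (lat, lon) fixes found in an SRT telemetry file."""
    text = Path(srt_path).read_text(encoding="utf-8", errors="ignore")
    return [(float(lat), float(lon)) for lat, lon in COORD_PATTERN.findall(text)]

# Example usage (hypothetical file path): summarize a flight as its centroid,
# e.g. to fill the coordinates of a mission-level Darwin Core event.
track = extract_gps_track("telemetry/flight_001.srt")
if track:
    lat = sum(p[0] for p in track) / len(track)
    lon = sum(p[1] for p in track) / len(track)
    print(f"{len(track)} fixes, centroid: {lat:.6f}, {lon:.6f}")
```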
The standard is intended for:
- Ecologists: Documenting wildlife surveys for biodiversity databases and research publications
- Computer Vision Researchers: Creating benchmark datasets for algorithm development
- Robotics Engineers: Developing autonomous systems and testing perception pipelines
- Conservation Practitioners: Sharing monitoring data across organizations
- Data Scientists: Training and evaluating machine learning models
If you use this standard or template, please cite:

@misc{fair_drone_standard,
  title={FAIR² Drones Data Standard for Wildlife Datasets},
  author={Jenna Kline and Elizabeth Campolongo},
  year={2026},
  publisher={GitHub},
  howpublished={\url{https://github.com/Imageomics/fair_drones}}
}

We welcome contributions to improve and extend this standard:
- Report issues or unclear documentation via GitHub Issues
- Submit example dataset cards
- Propose extensions for additional domains or modalities
- Contribute validation tools and utilities
This standard and documentation are licensed under CC BY 4.0.
This work builds upon:
- FAIR Principles for scientific data management
- Darwin Core biodiversity data standards
- Hugging Face Dataset Cards for ML datasets
- Imageomics Dataset Card Template for biodiversity and computer vision dataset documentation
- UAV best practices from Barnas et al. 2020
For questions, comments, or concerns:
- Open an issue on GitHub
- Refer to the Quick-Start Guide for implementation guidance
- Review example implementations for reference
Project Status: Active development | Version 1.0 (2025)