HOT3D
A comprehensive egocentric dataset for 3D hand and object pose tracking, captured from Project Aria glasses. Features high-fidelity ground truth annotations and multi-modal sensor data.
A new benchmark dataset to better understand how humans use their hands
We use our hands to communicate with others, interact with objects, and handle tools. Yet reliable understanding of how people use their hands to manipulate objects remains a key challenge for computer vision research.
The HOT3D dataset and benchmark will unlock new opportunities within this research area, such as transferring manual skills from experts to less experienced users or robots, helping an AI assistant understand a user’s actions, or enabling new input capabilities for AR/VR users, such as turning any physical surface into a virtual keyboard or any pencil into a multi-functional magic wand.
Dataset Specifications
Total Sequences
Total Size
Video Resolution
Frame Rate
Recording Duration
Unique Objects
Accurate ground-truth 3D poses of hands and objects
High-fidelity 3D object models
Comprehensive tools to load and visualize data easily
We provide Python tools that let researchers work with egocentric 3D hand and object tracking data on multi-view image streams.
An API and code samples provide ways to easily access and visualize the image streams and high-quality ground-truth 3D poses and shapes of hands and objects.
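To make the idea of a ground-truth 3D object pose concrete, here is a minimal sketch of how such a pose could be applied to the vertices of a 3D object model. It is an illustration under stated assumptions, not the HOT3D toolkit's API: the names RigidPose and transform_vertices are hypothetical, and the pose is assumed to be a rotation matrix plus a translation that maps object coordinates to world coordinates.

```python
# Hypothetical illustration only: none of these names come from the HOT3D tools.
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class RigidPose:
    """Assumed pose format: rigid transform from object to world coordinates."""

    rotation: Tuple[Vec3, Vec3, Vec3]  # row-major 3x3 rotation matrix
    translation: Vec3

    def apply(self, point: Vec3) -> Vec3:
        # world_point = rotation @ point + translation, computed row by row
        return tuple(
            sum(r * p for r, p in zip(row, point)) + t
            for row, t in zip(self.rotation, self.translation)
        )


def transform_vertices(pose: RigidPose, vertices: Sequence[Vec3]) -> List[Vec3]:
    """Move object-model vertices into the world frame using one pose."""
    return [pose.apply(v) for v in vertices]


# Example pose: rotate 90 degrees about the z axis, then shift 1 unit along x.
pose = RigidPose(
    rotation=((0.0, -1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    translation=(1.0, 0.0, 0.0),
)
world_vertices = transform_vertices(pose, [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
print(world_vertices)  # [(1.0, 1.0, 0.0), (0.0, 0.0, 0.0)]
```

In practice the per-frame poses and object models would come from the dataset's own loaders; the sketch only shows the geometric step that turns a pose annotation into posed model vertices.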
Key Features
Access the HOT3D Dataset and Accompanying Tools
By submitting your email and accessing the HOT3D dataset, you agree to abide by the dataset license agreement and to receive emails in relation to the dataset.
Citation
@article{hot3d2024,
title={HOT3D: Hand-Object Tracking in 3D},
author={Dexterous Research Team},
journal={arXiv preprint},
year={2024}
}