Puzzle-STAMPS: A Multimodal Motion-Physiology-Speech Dataset for Studying Team Collaboration and Leadership in Puzzle Solving
Abstract
Research on team collaboration is often limited by a scarcity of datasets that jointly measure motion,
physiology, and speech during complex physical interactions. To address this gap, we introduce
Puzzle-STAMPS (Synchronized Team Analytics of Motion, Physiology, and Speech), a multimodal dataset
capturing N participants (N teams of four) engaged in a controlled
“puzzle-box” experiment. Teams collaborated to solve a portable escape-style game
comprising 13 physical puzzles across eight timed segments that integrate visual search, decoding,
and object manipulation. Puzzle dependencies are mediated by a physical toolbox and a timer-driven
three-level hint system, including controlled skipping after the final hint, to standardize
progression across teams while eliciting diverse leadership behaviors and coordination strategies
under time pressure. Puzzle-STAMPS offers N hours of time-synchronized multimodal recordings. We
captured physiology (ECG, respiration, SpO2) with L.I.F.E. Italia Healer R2 vests;
motion and head pose with 9-DOF IMUs and OptiTrack Prime 17W cameras (360 Hz); and speech with
Rode Lavalier II microphones. These streams are augmented by multi-angle room video, game-state
logs, and standardized psychometric assessments. Because the high-fidelity recordings retain realistic
noise artifacts (e.g., cross-talk), Puzzle-STAMPS supports research into leader emergence,
role specialization, interpersonal synchrony, and multimodal performance prediction. The dataset
and ethics documentation are released at [URL] under [LICENSE].
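
Because the streams are sampled at different rates, a common first step when working with such data is to resample them onto one reference timeline. The following is a minimal sketch of nearest-timestamp alignment, not the dataset's actual tooling: the function names, the 1 Hz rate for the slower stream, the sample values, and the 0.5 s gap tolerance are all hypothetical. Only the 360 Hz motion-capture rate comes from the text above.

```python
from bisect import bisect_left

def nearest_index(timestamps, t):
    """Return the index of the sample in sorted `timestamps` closest to time t."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbour is closer; ties go to the earlier sample.
    return i - 1 if t - timestamps[i - 1] <= timestamps[i] - t else i

def align_to_reference(ref_times, stream_times, stream_values, max_gap):
    """For each reference time, take the nearest stream value,
    or None when the nearest sample is more than max_gap seconds away.
    Assumes stream_times is sorted and non-empty."""
    aligned = []
    for t in ref_times:
        j = nearest_index(stream_times, t)
        close_enough = abs(stream_times[j] - t) <= max_gap
        aligned.append(stream_values[j] if close_enough else None)
    return aligned

# Reference timeline: 3 seconds of motion-capture frames at 360 Hz.
MOCAP_HZ = 360
mocap_times = [k / MOCAP_HZ for k in range(MOCAP_HZ * 3)]

# Hypothetical slower stream (assumed 1 Hz) with made-up values.
slow_times = [0.0, 1.0, 2.0]
slow_values = [98, 97, 98]

aligned = align_to_reference(mocap_times, slow_times, slow_values, max_gap=0.5)
```

Returning `None` for frames with no nearby sample keeps dropouts visible instead of silently extending the last value; whether to interpolate, hold, or mask such frames is an analysis choice that depends on the signal.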