
Motion data format

Understanding the motion capture data format is crucial for integrating Move API output into your applications. This guide explains the structure and content of the motion capture data returned by the API.

Overview

Motion capture data from the Move API contains 3D skeletal animation information that can be used in games, animations, analysis, and other applications. The data is structured to be both human-readable and machine-processable.

Data structure

Take metadata

Each take includes metadata about the capture:

{
  "id": "take_789012",
  "duration": 5.2,
  "frame_count": 156,
  "frame_rate": 30,
  "model_used": "s1",
  "created_at": "2024-01-15T10:35:00Z",
  "coordinate_system": {
    "origin": [0, 0, 0],
    "units": "meters",
    "up_axis": "Y"
  }
}
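As a quick sanity check, the frame count and frame rate should agree with the reported duration. A minimal sketch using the sample metadata above:

```python
# Sample take metadata, as in the example above
metadata = {
    "id": "take_789012",
    "duration": 5.2,
    "frame_count": 156,
    "frame_rate": 30,
}

# Duration should be close to frame_count / frame_rate
expected = metadata["frame_count"] / metadata["frame_rate"]
assert abs(expected - metadata["duration"]) < 1 / metadata["frame_rate"]
print(f"Take {metadata['id']}: {metadata['frame_count']} frames at {metadata['frame_rate']} fps")
```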

Frame data

Motion capture data is organized by frames, with each frame containing the 3D positions of all skeletal joints:

{
  "frames": [
    {
      "frame_number": 0,
      "timestamp": 0.0,
      "joints": {
        "Hips": {
          "position": [0.0, 1.0, 0.0],
          "rotation": [0.0, 0.0, 0.0, 1.0],
          "confidence": 0.95
        },
        "Spine": {
          "position": [0.0, 1.2, 0.0],
          "rotation": [0.0, 0.0, 0.0, 1.0],
          "confidence": 0.92
        }
      }
    }
  ]
}
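Per-frame joint positions can be used directly for simple measurements. A sketch computing the distance between two joints, using the sample frame above:

```python
import math

# The sample frame from the structure above
frame = {
    "frame_number": 0,
    "timestamp": 0.0,
    "joints": {
        "Hips":  {"position": [0.0, 1.0, 0.0], "rotation": [0.0, 0.0, 0.0, 1.0], "confidence": 0.95},
        "Spine": {"position": [0.0, 1.2, 0.0], "rotation": [0.0, 0.0, 0.0, 1.0], "confidence": 0.92},
    },
}

def joint_distance(frame, a, b):
    """Euclidean distance (meters) between two joints in a single frame."""
    return math.dist(frame["joints"][a]["position"], frame["joints"][b]["position"])

print(round(joint_distance(frame, "Hips", "Spine"), 3))  # 0.2
```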

Joint structure

Standard humanoid joints

The Move API uses a standard humanoid joint hierarchy:

Hips
├── Spine
│   └── Chest
│       ├── Neck
│       │   └── Head
│       ├── LeftShoulder
│       │   └── LeftArm
│       │       └── LeftForeArm
│       │           └── LeftHand
│       │               ├── LeftHandIndex1
│       │               └── LeftHandThumb1
│       └── RightShoulder
│           └── RightArm
│               └── RightForeArm
│                   └── RightHand
│                       ├── RightHandIndex1
│                       └── RightHandThumb1
├── LeftHip
│   └── LeftUpLeg
│       └── LeftLeg
│           └── LeftFoot
│               └── LeftToeBase
│                   └── LeftToeEnd
└── RightHip
    └── RightUpLeg
        └── RightLeg
            └── RightFoot
                └── RightToeBase
                    └── RightToeEnd
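Programmatically, a hierarchy like this is convenient to represent as a child-to-parent map. A minimal sketch covering the torso chain (illustrative subset, not the full skeleton; the limb chains follow the same pattern):

```python
# Child-to-parent map for the torso chain of the hierarchy
PARENT = {
    "Hips": None,
    "Spine": "Hips",
    "Chest": "Spine",
    "Neck": "Chest",
    "Head": "Neck",
}

def chain_to_root(joint):
    """Joints from `joint` up to the root, useful for forward kinematics."""
    chain = []
    while joint is not None:
        chain.append(joint)
        joint = PARENT[joint]
    return chain

print(chain_to_root("Head"))  # ['Head', 'Neck', 'Chest', 'Spine', 'Hips']
```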

Joint data format

Each joint contains:

  • Position: 3D coordinates [x, y, z] in meters
  • Rotation: Quaternion [x, y, z, w] representing orientation
  • Confidence: Confidence score (0.0 to 1.0) for tracking accuracy
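Applying a joint's rotation to a vector requires a bit of quaternion math. A dependency-free sketch, assuming the [x, y, z, w] component order listed above:

```python
import math

def is_unit_quaternion(q, tol=1e-6):
    """True if [x, y, z, w] has unit length, as a valid rotation should."""
    x, y, z, w = q
    return abs(math.sqrt(x*x + y*y + z*z + w*w) - 1.0) < tol

def rotate(q, v):
    """Rotate vector v by unit quaternion q = [x, y, z, w]."""
    x, y, z, w = q
    vx, vy, vz = v
    # t = 2 * (q_vec x v), then v' = v + w*t + q_vec x t
    tx = 2 * (y * vz - z * vy)
    ty = 2 * (z * vx - x * vz)
    tz = 2 * (x * vy - y * vx)
    return [
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    ]

# The identity rotation [0, 0, 0, 1] leaves a point unchanged
assert rotate([0, 0, 0, 1], [1.0, 2.0, 3.0]) == [1.0, 2.0, 3.0]
```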

Export formats

The Move API supports multiple output formats for different use cases:

FBX format

FBX (Filmbox) is a proprietary format widely used in 3D animation:

  • Applications: Maya, Blender, Unity, Unreal Engine
  • Content: Skeletal animation with mesh data
  • Advantages: Industry standard, rich metadata
  • File Size: Larger, since mesh data and rich metadata are stored alongside the animation

BVH format

BVH (Biovision Hierarchy) is a text-based motion capture format:

  • Applications: Motion analysis, research, some 3D software
  • Content: Hierarchical skeletal data
  • Advantages: Human-readable, compact
  • File Size: Smaller than FBX

USD formats

USD (Universal Scene Description) is an open-source format for 3D scene data:

  • USDC: Binary USD format for efficient storage and transmission
  • USDZ: Zip-packaged USD format optimized for iOS and AR applications
  • Applications: Maya, Blender, Houdini, Omniverse, custom pipelines
  • Content: Skeletal animation with scene composition capabilities
  • Advantages: Open standard, highly composable, efficient for large scenes
  • File Size: Optimized for complex scenes and pipelines

GLB format

GLB (GL Binary) is the binary format for glTF:

  • Applications: Web applications, mobile apps, AR/VR
  • Content: 3D models with animations
  • Advantages: Compact, web-optimized, widely supported
  • File Size: Efficient binary format

Blend format

Blend is the native format for Blender:

  • Applications: Blender 3D software
  • Content: Complete scene data with animations
  • Advantages: Native Blender format, preserves all data
  • File Size: Varies based on scene complexity

C3D format

C3D (Coordinate 3D) is a standard format for motion capture:

  • Applications: Biomechanics, sports analysis, research
  • Content: 3D coordinate data with analog data support
  • Advantages: Industry standard for motion analysis
  • File Size: Efficient for coordinate data

JSON format

JSON provides programmatic access to motion capture data:

  • Applications: Custom applications, analysis, web integration
  • Content: Raw motion capture data with metadata
  • Advantages: Easy to parse, flexible structure
  • File Size: Moderate, depends on frame count

Video outputs

  • Render Video: Preview video showing the motion capture data
  • Render Overlay Video: Preview video with motion capture data overlaid on the original video (single camera only)

Other outputs

  • Sync Data: Timing information about video offsets (.pkl format)
  • Motion Data: Raw motion capture data in JSON format

Coordinate system

World coordinates

The Move API uses a right-handed coordinate system:

  • X-axis: Left to right
  • Y-axis: Up (vertical)
  • Z-axis: Forward (depth)

Units

  • Distance: Meters
  • Rotation: Unit quaternions (components are dimensionless; derived angles are in radians)
  • Time: Seconds
  • Confidence: 0.0 to 1.0 (no units)
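When a target application uses different conventions, positions must be converted. A sketch of two common conversions; the target conventions here (Z-up axes, centimeter units) are illustrative examples, not API requirements:

```python
def y_up_to_z_up(position):
    """Rotate a [x, y, z] position from the API's Y-up frame into a
    Z-up right-handed frame (a +90 degree rotation about the X axis)."""
    x, y, z = position
    return [x, -z, y]

def meters_to_centimeters(position):
    """Scale a position for engines that measure in centimeters."""
    return [c * 100.0 for c in position]

# A point one meter above the origin ends up on the +Z axis
assert y_up_to_z_up([0.0, 1.0, 0.0]) == [0.0, 0.0, 1.0]
assert meters_to_centimeters([1.0, 2.0, 0.5]) == [100.0, 200.0, 50.0]
```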

Data quality

Confidence scores

Each joint includes a confidence score indicating tracking quality:

  • 0.9-1.0: Excellent tracking
  • 0.7-0.9: Good tracking
  • 0.5-0.7: Fair tracking
  • 0.0-0.5: Poor tracking or occluded
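These bands map naturally to labels in code. A minimal sketch (the boundary handling at 0.9, 0.7, and 0.5 is a choice here, since the bands above share endpoints):

```python
def tracking_quality(confidence):
    """Map a joint confidence score to a quality band."""
    if confidence >= 0.9:
        return "excellent"
    if confidence >= 0.7:
        return "good"
    if confidence >= 0.5:
        return "fair"
    return "poor"

print(tracking_quality(0.95))  # excellent
print(tracking_quality(0.45))  # poor
```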

Quality factors

Tracking quality depends on:

  • Model Used: s2 and m2 provide higher accuracy
  • Camera Setup: Multi-camera setups improve quality
  • Lighting: Good lighting improves tracking
  • Occlusion: Hidden body parts reduce confidence
  • Motion Speed: Very fast movements may reduce accuracy

Working with motion capture data

Python example

import json

# Load motion capture data
with open("motion_data.json", "r") as f:
    motion_data = json.load(f)

# Access frame data
for frame in motion_data["frames"]:
    frame_num = frame["frame_number"]
    timestamp = frame["timestamp"]

    # Access joint positions
    hips_pos = frame["joints"]["Hips"]["position"]
    spine_pos = frame["joints"]["Spine"]["position"]

    print(f"Frame {frame_num}: Hips at {hips_pos}")

JavaScript example

// Load motion capture data
fetch('motion_data.json')
  .then(response => response.json())
  .then(data => {
    // Process frames
    data.frames.forEach(frame => {
      const hips = frame.joints.Hips;
      const confidence = hips.confidence;

      if (confidence > 0.8) {
        console.log(`High confidence frame: ${frame.frame_number}`);
      }
    });
  });

Data processing

Filtering by confidence

def filter_high_confidence_frames(motion_data, threshold=0.8):
    filtered_frames = []

    for frame in motion_data["frames"]:
        # Check if all joints have high confidence
        all_high_confidence = all(
            joint["confidence"] > threshold
            for joint in frame["joints"].values()
        )

        if all_high_confidence:
            filtered_frames.append(frame)

    return filtered_frames

Converting to different formats

def convert_to_custom_format(motion_data):
    custom_data = {
        "animation": [],
        "metadata": {
            "duration": motion_data["duration"],
            "frame_rate": motion_data["frame_rate"]
        }
    }

    for frame in motion_data["frames"]:
        frame_data = {
            "time": frame["timestamp"],
            "positions": {},
            "rotations": {}
        }

        for joint_name, joint_data in frame["joints"].items():
            frame_data["positions"][joint_name] = joint_data["position"]
            frame_data["rotations"][joint_name] = joint_data["rotation"]

        custom_data["animation"].append(frame_data)

    return custom_data

Best practices

Data validation

  • Check Confidence: Filter out low-confidence frames
  • Validate Coordinates: Ensure positions are within expected ranges
  • Check Completeness: Verify all expected joints are present

Performance optimization

  • Frame Sampling: Use every Nth frame for real-time applications
  • Joint Filtering: Only process joints relevant to your use case
  • Caching: Cache processed motion capture data for repeated use
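Frame sampling from the list above can be as simple as a slice with a step:

```python
def sample_frames(frames, step):
    """Keep every `step`-th frame; step=2 turns a 30 fps take into 15 fps."""
    return frames[::step]

frames = [{"frame_number": i} for i in range(10)]
print([f["frame_number"] for f in sample_frames(frames, 3)])  # [0, 3, 6, 9]
```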

Integration

  • Coordinate System: Ensure your application uses the same coordinate system
  • Units: Convert units if your application uses different measurements
  • Frame Rate: Handle frame rate differences between source and target
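Handling frame rate differences usually means resampling. A sketch that linearly interpolates a joint position at an arbitrary time, assuming the frame structure shown earlier and a constant frame rate:

```python
def lerp(a, b, t):
    """Component-wise linear interpolation between two positions."""
    return [pa + (pb - pa) * t for pa, pb in zip(a, b)]

def position_at(frames, joint, time, frame_rate):
    """Interpolated position of `joint` at `time` seconds.

    Assumes `frames` is the ordered frame list from the JSON payload,
    sampled uniformly at `frame_rate` frames per second.
    """
    f = time * frame_rate
    i = min(int(f), len(frames) - 2)
    t = f - i
    a = frames[i]["joints"][joint]["position"]
    b = frames[i + 1]["joints"][joint]["position"]
    return lerp(a, b, t)

# Two frames at 2 fps: querying halfway between them (t = 0.25 s)
frames = [
    {"joints": {"Hips": {"position": [0.0, 0.0, 0.0]}}},
    {"joints": {"Hips": {"position": [0.0, 1.0, 0.0]}}},
]
print(position_at(frames, "Hips", 0.25, 2))  # [0.0, 0.5, 0.0]
```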

Next steps