FAST: Efficient Action Tokenization for Vision-Language-Action Models

Overview

FAST (Frequency-space Action Sequence Tokenization), introduced in "FAST: Efficient Action Tokenization for Vision-Language-Action Models", is a universal action tokenizer published on Hugging Face by Physical Intelligence. It maps sequences of robot actions (chunks of continuous control signals such as joint or end-effector commands) into compact, discrete token sequences suitable for training autoregressive vision-language-action (VLA) models. The repository provides a pretrained tokenizer along with tooling to train custom tokenizers on user-collected robot data for language-conditioned policy learning.

FAST targets a practical bottleneck: representing high-dimensional, continuous robot trajectories for large-scale sequence models. Once actions are expressed as discrete tokens, they can be combined with visual and language tokens in transformer-style autoregressive training. According to the Hugging Face repository, FAST ships with training utilities and encode/decode APIs, so teams can either use the provided pretrained tokenizer or fit a new tokenizer on their own action datasets (https://huggingface.co/physical-intelligence/fast).

Model Statistics

Likes: 161
Pipeline: robotics

License: apache-2.0

Model Details

FAST is a toolkit and pretrained tokenizer that converts robot action sequences into discrete tokens for autoregressive modeling. The repository focuses on: (1) tokenization of continuous action spaces, (2) compact vocabulary construction for long-horizon sequence modeling, and (3) pairing action tokens with vision and language tokens in multimodal transformer pipelines.

The package includes a training pipeline to build a tokenizer from action datasets (e.g., trajectory fragments, joint-angle sequences, end-effector poses). As described in the FAST paper, the tokenizer applies a discrete cosine transform (DCT) to each action chunk, quantizes the resulting coefficients, and compresses them with byte-pair encoding (BPE) into a fixed vocabulary. Encode and decode methods translate between raw actions and token sequences for training and deployment.

FAST is distributed under the Apache-2.0 license and is tagged with the "robotics" pipeline on Hugging Face, with a downloadable pretrained tokenizer for immediate use (https://huggingface.co/physical-intelligence/fast).
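
The transform-and-quantize idea behind FAST (a discrete cosine transform followed by coefficient rounding, per the FAST paper) can be sketched for a single action dimension as follows. This is a dependency-free illustration, not the repository's implementation: the function names and the `SCALE` value are hypothetical, and the byte-pair encoding step that follows quantization is omitted.

```python
import math


def dct_ortho(signal):
    """Orthonormal DCT-II of a 1-D sequence of floats."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(signal)
        )
        norm = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(norm * total)
    return coeffs


def idct_ortho(coeffs):
    """Inverse of dct_ortho (the transform matrix is orthogonal)."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = coeffs[0] * math.sqrt(1.0 / n)
        total += sum(
            coeffs[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        signal.append(total)
    return signal


SCALE = 10.0  # hypothetical quantization scale; larger means finer resolution


def encode_dim(signal, scale=SCALE):
    """Map one action dimension to integer symbols (lossy)."""
    return [round(c * scale) for c in dct_ortho(signal)]


def decode_dim(symbols, scale=SCALE):
    """Map integer symbols back to an approximate trajectory."""
    return idct_ortho([s / scale for s in symbols])


# A smooth 20-step trajectory for one action dimension, within [-1, 1].
trajectory = [0.8 * math.sin(2 * math.pi * t / 20) for t in range(20)]
symbols = encode_dim(trajectory)
reconstructed = decode_dim(symbols)
max_error = max(abs(a - b) for a, b in zip(trajectory, reconstructed))
print("integer symbols:", symbols)
print("max reconstruction error:", max_error)
```

Because rounding happens in the frequency domain, smooth trajectories tend to put most of their energy into a few low-frequency symbols, which is what makes a subsequent compression step effective. The rounding is also the reason decoding is approximate rather than exact.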

Key Features

Pretrained action tokenizer for immediate integration with VLA models
Tools to train a custom tokenizer on user-collected robot action datasets
Encode/decode APIs to map between raw actions and discrete tokens
DCT- and BPE-based vocabulary construction for continuous actions
Designed to combine action tokens with visual and language tokens
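
One common way to place action tokens in the same sequence as language tokens is to shift action ids into a reserved range of a shared vocabulary. The sketch below illustrates that idea only; the vocabulary sizes, function names, and id layout are assumptions for illustration, and the mapping used by any particular VLA model may differ.

```python
TEXT_VOCAB_SIZE = 32000   # hypothetical size of the language vocabulary
ACTION_VOCAB_SIZE = 1024  # hypothetical size of the action vocabulary


def action_to_shared(action_ids):
    """Shift action token ids into the range reserved after the text ids."""
    for a in action_ids:
        if not 0 <= a < ACTION_VOCAB_SIZE:
            raise ValueError(f"action id out of range: {a}")
    return [TEXT_VOCAB_SIZE + a for a in action_ids]


def shared_to_action(shared_ids):
    """Recover action token ids, ignoring text ids in the sequence."""
    return [t - TEXT_VOCAB_SIZE for t in shared_ids if t >= TEXT_VOCAB_SIZE]


def build_sequence(prompt_ids, action_ids):
    """Concatenate a tokenized prompt with the shifted action tokens."""
    for p in prompt_ids:
        if not 0 <= p < TEXT_VOCAB_SIZE:
            raise ValueError(f"text id out of range: {p}")
    return list(prompt_ids) + action_to_shared(action_ids)


prompt_ids = [101, 2054, 102]  # placeholder ids for a language instruction
action_ids = [5, 17, 1023]     # placeholder ids from an action tokenizer
sequence = build_sequence(prompt_ids, action_ids)
print(sequence)
```

Keeping the two id ranges disjoint lets a single autoregressive model predict either kind of token, and makes it trivial to pull the action tokens back out of a generated sequence before decoding them.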

Example Usage

Example (python):

import numpy as np
from transformers import AutoProcessor

# Load the pretrained tokenizer from the Hugging Face Hub.
# The repo ships its own processor code, so trust_remote_code=True is needed.
# Confirm the exact usage against the repo README before relying on it.
tokenizer = AutoProcessor.from_pretrained(
    'physical-intelligence/fast', trust_remote_code=True
)

# Dummy batch of action chunks with shape (batch, timesteps, action_dim).
# Replace with your real action arrays, normalized to roughly [-1, 1].
action_data = np.random.uniform(-1.0, 1.0, size=(4, 50, 14))

# Encode: one variable-length list of token ids per action chunk.
token_ids = tokenizer(action_data)
print('Tokens in first chunk:', len(token_ids[0]))

# Decode back to continuous actions (approximate, since encoding is lossy).
decoded_actions = tokenizer.decode(token_ids)
print('Decoded shape:', np.asarray(decoded_actions).shape)
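
Real action arrays usually need to be scaled to a bounded range before tokenization. A minimal sketch of quantile-based normalization for one action dimension follows, assuming the tokenizer expects values in roughly [-1, 1]; the helper names and the 1st/99th percentile choice are illustrative assumptions, so check the repo README for the recommended preprocessing.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an ascending list, with q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def fit_bounds(values, low_q=0.01, high_q=0.99):
    """Estimate robust lower and upper bounds from training data."""
    ordered = sorted(values)
    return quantile(ordered, low_q), quantile(ordered, high_q)


def normalize(values, low, high):
    """Map [low, high] to [-1, 1], clipping outliers to the range ends."""
    span = high - low
    if span <= 0:
        raise ValueError("bounds must satisfy low < high")
    return [max(-1.0, min(1.0, 2.0 * (v - low) / span - 1.0)) for v in values]


def denormalize(values, low, high):
    """Map values in [-1, 1] back to the original units."""
    return [(v + 1.0) / 2.0 * (high - low) + low for v in values]


# One action dimension, e.g. a joint angle sweeping from 0.0 to 1.0 rad.
joint_angles = [0.01 * i for i in range(101)]
low, high = fit_bounds(joint_angles)
normalized = normalize(joint_angles, low, high)
restored = denormalize(normalized, low, high)
print("bounds:", low, high)
```

Using quantiles instead of the raw minimum and maximum keeps a few outlier samples from stretching the range. In practice the bounds would be fitted per action dimension on the training set and stored, so that decoded actions can be mapped back to physical units with `denormalize`.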