What is EatAble?
EatAble is a voice-controlled robotic assistant designed to empower people with upper limb disabilities to eat independently, restoring dignity, freedom, and equality through accessible AI and robotics.
At its core, EatAble allows a person to simply say what they want to eat, and the robot will find that item on the table and gently feed them.
"We don't just build robots. We build freedom β one meal at a time."
The Mission
Millions of people live with physical disabilities that make daily tasks, even eating, a challenge. For many, this means:
- Relying on others for basic needs
- Losing a sense of independence
- Feeling isolated in everyday life
EatAble is designed to help restore independence and dignity using affordable, real-time technology that can work at home, in hospitals, or in care centers.
Voice-Controlled
Simply say what you want to eat. Natural language commands make it intuitive and accessible.
AI-Powered Vision
Multi-view camera system with SmolVLA model for robust object detection and manipulation.
Restore Independence
Empower people with upper limb disabilities to eat independently and regain dignity.
Versatile Deployment
Works at home, in hospitals, rehabilitation centers, and care homes.
How It Works
Example Interaction:
User: "I want to eat beef."
→ Robot detects the beef
→ Picks it up with the robotic arm
→ Brings it to the user's mouth
The Complete Pipeline
- Voice Input: User speaks natural language command (e.g., "I want to eat carrots")
- Speech Recognition: Google Speech Recognition API converts audio to text
- Intent Understanding: OpenAI GPT-4o-mini with structured output parsing determines action intent
- Voice Feedback: ElevenLabs voice model provides natural voice responses
- Task Execution: Robot receives the task instruction and:
  - Captures multi-view observations from the cameras
  - Selects an action with the SmolVLA model based on the visual observations and the task
  - Executes the manipulation with custom action threshold detection
  - Continues until the task is complete or times out (45 seconds)
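The control flow above can be sketched as a single loop. Everything below is a hypothetical stand-in: the real system uses the Google Speech Recognition API, GPT-4o-mini, ElevenLabs, and a SmolVLA policy, while these stub functions only mirror the shape of the pipeline (roughly what the project's dummy mode enables).

```python
import time

# Hypothetical stand-ins for the real components (speech recognition,
# LLM intent parsing, SmolVLA inference, robot driver).
def recognize_speech():
    return "I want to eat carrots"                    # audio -> text

def parse_intent(text):
    return {"action": "feed", "item": "carrots"}      # text -> structured intent

def capture_observations():
    return {"cam_top": None, "cam_side": None, "cam_wrist": None}

def select_action(observations, task):
    return [0.0] * 6                                  # policy output: joint targets

def execute(action):
    pass                                              # send action to the arm

def task_complete(action, threshold=1e-3):
    return max(abs(a) for a in action) < threshold    # action-threshold check

def feeding_loop(timeout_s=45.0):
    """Run one voice-to-feeding cycle, mirroring the pipeline above."""
    text = recognize_speech()
    intent = parse_intent(text)
    if intent["action"] != "feed":
        return "no feeding requested"
    task = f"pick up the {intent['item']} and bring it to the mouth"
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        obs = capture_observations()
        action = select_action(obs, task)
        execute(action)
        if task_complete(action):
            return "done"
    return "timeout"
```

With the stubs above the loop terminates on the first step; on real hardware each iteration would stream fresh camera frames into the policy until the 45-second deadline.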
Technology Stack
AI & Machine Learning
- SmolVLA (Small Vision-Language-Action) Policy: Pre-trained model from LeRobot on the Hugging Face Hub
- Base Model: lerobot/smolvla_base, trained on teleoperated demonstrations
- Training Infrastructure: AMD Instinct™ MI300X GPU on AMD Developer Cloud
- Training: 40,000 steps with batch size 64 (approximately 8 hours)
Voice & Language
- Speech Recognition: Google Speech Recognition API
- Intent Understanding: OpenAI GPT-4o-mini with structured output parsing
- Voice Synthesis: ElevenLabs multilingual voice model
Robotics Framework
- LeRobot: Hugging Face framework for robot learning
- Multi-view Camera System: 3 cameras for robust visual perception
- Custom Action Threshold Detection: Automatic task completion detection
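One plausible reading of "action threshold detection" is to declare the task complete once the policy's commanded actions stop changing for several consecutive steps. The class below sketches that idea; the class name, threshold, and patience values are illustrative, not the project's actual implementation.

```python
class ActionThresholdDetector:
    """Flag task completion when successive actions barely change.

    A sketch of one way to implement automatic completion detection;
    the threshold and patience defaults are illustrative.
    """

    def __init__(self, threshold=0.01, patience=3):
        self.threshold = threshold  # max per-joint change considered "still"
        self.patience = patience    # consecutive still steps required
        self._last = None
        self._still_steps = 0

    def update(self, action):
        """Feed one action vector; return True once the arm has settled."""
        if self._last is not None:
            delta = max(abs(a - b) for a, b in zip(action, self._last))
            self._still_steps = self._still_steps + 1 if delta < self.threshold else 0
        self._last = list(action)
        return self._still_steps >= self.patience
```

Calling `update()` on each policy step keeps the check cheap and model-agnostic: it watches only the action stream, not the camera images.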
SmolVLA Model
Vision-language-action model combining visual understanding with robotic control.
Natural Language
Voice commands with flexible expression - no memorization needed.
Multi-View Vision
Three-camera system for robust object detection across different angles.
AMD GPU Powered
Trained on AMD Instinct™ MI300X GPU for high-performance inference.
Use Cases
- Supporting people with upper limb disabilities
- Assisting elderly individuals who struggle with mobility
- Deploying in hospitals, rehabilitation centers, and care homes
- Enabling independent living at home with affordable robotics
Voice Commands
EatAble understands natural language in various forms:
Feeding Requests:
- "I want to eat carrots"
- "Feed me"
- "I'm hungry"
- "Can you feed me?"
- "I want to eat beef"
- "I feel like eating vegetables"
General Conversation:
- Questions about menu
- Casual chat
- Inquiries about capabilities
Exit Requests:
- "exit", "quit", "goodbye"
- "I'm done", "I'm full", "that's enough"
Architecture
Training Pipeline
- Dataset Collection: Teleoperated demonstrations captured with multi-view cameras
- Model Training: Fine-tuned SmolVLA on AMD Instinct™ MI300X GPU
- Model Deployment: Trained model pushed to Hugging Face Hub
Inference Pipeline
- Modular Architecture: Separated voice assistant, robot control, and model inference
- Dummy Mode: Supports testing without physical robot hardware
- Configurable Parameters: Camera indices, robot ports, model parameters
- Real-time Processing: Continuous action monitoring with automatic completion detection
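The configurable parameters listed above might be grouped into a single config object like the following. All field names and defaults are assumptions inferred from this section (three cameras, a 45-second timeout, a dummy mode for hardware-free testing), not the project's actual configuration.

```python
from dataclasses import dataclass, field

@dataclass
class EatAbleConfig:
    """Hypothetical configuration object; names and defaults are illustrative."""
    camera_indices: list = field(default_factory=lambda: [0, 1, 2])  # 3 views
    robot_port: str = "/dev/ttyUSB0"       # serial port for the arm (example)
    model_id: str = "lerobot/smolvla_base" # base policy from the Hub
    task_timeout_s: float = 45.0           # per the pipeline above
    dummy_mode: bool = False               # run without physical hardware

def make_test_config() -> EatAbleConfig:
    """Dummy-mode config for testing without a robot attached."""
    return EatAbleConfig(dummy_mode=True)
```

Keeping the voice assistant, robot control, and model inference behind one config like this is what makes the dummy mode cheap: swapping `dummy_mode=True` replaces the hardware-facing modules without touching the rest of the pipeline.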
Generalization & Flexibility
Task Generalization:
- SmolVLA model can be extended to various manipulation tasks beyond feeding
- Multi-view camera system provides robust perception across different setups
- Natural language interface allows flexible expression
Hardware Portability:
- Standard LeRobot framework
- Adaptable to different robot platforms
- Configurable for various camera and robot setups
Explore EatAble
EatAble is open source. Explore our implementation and contribute to building accessible robotics technology.
Our Hackathon Journey
EatAble won 3rd Prize at the AMD Robotics Hackathon 2025! Read about our journey building this voice-controlled robotic assistant and how we're helping restore independence through accessible AI and robotics.
The Team
This project was built by the amazing Tihado team for the AMD Robotics Hackathon 2025, winning 3rd Prize.
From a meal... to a life of independence.
Winner: 3rd Prize at AMD Robotics Hackathon 2025