
Capstone Project: Autonomous Simulated Humanoid

Build an end-to-end autonomous humanoid robot system integrating all course concepts.

Project Overview

Create a humanoid robot that can:

  • ✅ Receive and understand voice commands
  • ✅ Generate task plans using LLM reasoning
  • ✅ Navigate around obstacles autonomously
  • ✅ Identify objects using computer vision
  • ✅ Manipulate objects in simulation

System Architecture

     User Voice → Whisper → Text Command
                     ↓
             GPT-4 Task Planner
                     ↓
       [navigate, detect, grasp, ...]
                     ↓
         ┌───────────┴───────────┐
         ↓                       ↓
  Navigation Module      Manipulation Module
   (Nav2 + LiDAR)      (Vision + IK + Gripper)
         ↓                       ↓
          Isaac Sim Humanoid Robot

Requirements

1. Voice Command Reception

Use OpenAI Whisper to transcribe user speech into text commands.
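
A minimal sketch of this step, assuming the open-source `openai-whisper` Python package and a pre-recorded audio file. The model size `"base"` and the `normalize_command` helper are illustrative choices, not course requirements.

```python
import string


def normalize_command(text: str) -> str:
    """Lowercase the transcript and strip surrounding whitespace/punctuation."""
    return text.strip().strip(string.punctuation + " ").lower()


def transcribe_command(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file with Whisper and return a normalized command."""
    # Imported lazily so normalize_command works without the package installed.
    import whisper  # assumption: installed via `pip install openai-whisper`

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)  # dict that includes a "text" field
    return normalize_command(result["text"])
```

In a ROS 2 node, the returned string would typically be published on a topic that the task planner subscribes to, and the model would be loaded once at startup rather than on every call.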

2. LLM-Based Task Planning

Process commands with GPT-4 to generate structured action sequences.

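from openai import OpenAI

# Assumes openai>=1.0; the client reads OPENAI_API_KEY from the environment.
client = OpenAI()

def generate_task_plan(user_command):
    """Ask GPT-4 for a plan and return it as a structured action sequence."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a robot task planner."},
            {"role": "user", "content": user_command},
        ],
    )
    # parse_task_plan is a project-specific helper (not defined in this section)
    # that turns the model's text reply into a list of actions.
    return parse_task_plan(response.choices[0].message.content)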
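
The `parse_task_plan` helper called above is not defined in this section. One possible sketch follows, assuming the system prompt instructs the model to reply with a JSON list of steps such as `[{"action": "navigate", "target": "living room"}]`. That reply format and the allowed action names are assumptions made for illustration.

```python
import json

# Hypothetical action vocabulary, drawn from the diagram and example scenario.
ALLOWED_ACTIONS = {"navigate", "detect", "grasp", "handover"}


def parse_task_plan(reply: str) -> list:
    """Parse an LLM reply into a validated list of action steps."""
    # Models often wrap JSON in prose, so keep only the outermost [...] span
    # before decoding.
    start, end = reply.find("["), reply.rfind("]")
    if start == -1 or end <= start:
        raise ValueError("no JSON list found in LLM reply")
    steps = json.loads(reply[start:end + 1])

    for step in steps:
        if not isinstance(step, dict) or step.get("action") not in ALLOWED_ACTIONS:
            raise ValueError(f"unsupported step: {step!r}")
    return steps
```

Rejecting unknown actions at parse time keeps a malformed or hallucinated plan from ever reaching the navigation and manipulation modules.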

3. Obstacle-Aware Navigation

Configure Nav2 for costmap-based path planning so the robot reaches its goal while avoiding obstacles.
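
Nav2 performs planning internally, so the project does not need its own planner. To illustrate what "costmap-based" means, here is a self-contained Dijkstra search over a small grid in which each cell carries a traversal cost. This is not Nav2's implementation; the 4-connected grid, the cost formula, and the `plan_path` name are assumptions for this sketch. The lethal threshold of 254 mirrors the value Nav2 costmaps use for obstacles.

```python
import heapq

LETHAL = 254  # cells at or above this cost are treated as obstacles


def plan_path(costmap, start, goal):
    """Return the lowest-cost 4-connected path from start to goal, or None."""
    rows, cols = len(costmap), len(costmap[0])
    best = {start: 0}
    parent = {}
    frontier = [(0, start)]

    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            # Walk parent links back to the start to rebuild the path.
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return path[::-1]
        if cost > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if costmap[nr][nc] >= LETHAL:
                continue
            # Each move costs 1 plus the cell's costmap value, so the search
            # prefers routes that stay away from high-cost regions.
            new_cost = cost + 1 + costmap[nr][nc]
            if new_cost < best.get((nr, nc), float("inf")):
                best[(nr, nc)] = new_cost
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (new_cost, (nr, nc)))
    return None
```

Cells are `(row, column)` tuples. A `None` result means every route to the goal is blocked by lethal cells.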

4. Computer Vision Object Identification

Use YOLOv8 for real-time object detection.
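
A minimal sketch of the detection step, assuming the `ultralytics` package and its pretrained `yolov8n.pt` weights, whose COCO classes include `bottle`. The detector reports object classes but not colors, so picking out the blue bottle would need a separate color check that is omitted here. The `pick_target` and `detect_objects` names are hypothetical.

```python
def pick_target(detections, label, min_conf=0.5):
    """Return the highest-confidence detection with the given label, or None.

    Each detection is a dict: {"label": str, "conf": float, "box": (x1, y1, x2, y2)}.
    """
    matches = [d for d in detections if d["label"] == label and d["conf"] >= min_conf]
    return max(matches, key=lambda d: d["conf"], default=None)


def detect_objects(image, weights="yolov8n.pt"):
    """Run YOLOv8 on one image and return detections as plain dicts."""
    # Imported lazily; assumption: installed via `pip install ultralytics`.
    from ultralytics import YOLO

    model = YOLO(weights)
    result = model(image)[0]  # one Results object per input image
    detections = []
    for box in result.boxes:
        detections.append({
            "label": result.names[int(box.cls[0])],
            "conf": float(box.conf[0]),
            "box": tuple(box.xyxy[0].tolist()),
        })
    return detections
```

For real-time use, the model should be loaded once and reused across frames rather than constructed inside `detect_objects` on every call.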

5. Object Manipulation

Grasp and manipulate objects using inverse kinematics (IK) and motion planning.
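
To make the IK idea concrete, here is a worked sketch for the simplest case: a two-link planar arm solved analytically with the law of cosines. A humanoid arm has more joints and would normally rely on a numerical solver from the motion-planning stack, so treat this as an illustration only. The function names and link lengths are assumptions.

```python
import math


def two_link_ik(x, y, l1, l2):
    """Return (shoulder, elbow) angles in radians reaching (x, y), or None.

    Uses the law of cosines and returns one of the two mirror-image solutions.
    """
    cos_elbow = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(cos_elbow) > 1:
        return None  # target is outside the reachable workspace
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow)
    )
    return shoulder, elbow


def forward_kinematics(shoulder, elbow, l1, l2):
    """Compute the end-effector position for the given joint angles."""
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y
```

Feeding the IK result back through `forward_kinematics` is a quick sanity check that the solved angles actually place the gripper at the requested position.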

Example Scenario

Command: "Go to the living room and bring me the blue bottle"

Execution Steps:

  1. Speech Recognition: Whisper converts audio → text
  2. Task Planning: GPT-4 generates the action sequence
  3. Navigation: Nav2 plans path, avoids obstacles
  4. Perception: YOLOv8 detects blue bottle
  5. Manipulation: IK solution, grasp execution
  6. Return: Navigate back, handover to user
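
The execution steps above can be sketched as a small dispatcher that hands each planned action to the module responsible for it. The step format, the handler names, and the stop-on-first-failure policy are assumptions for illustration; the stub handlers stand in for the real Nav2, YOLOv8, and IK modules.

```python
def execute_plan(steps, handlers):
    """Run each step through its handler and stop at the first failure.

    `steps` is a list of {"action": ..., "target": ...} dicts, and `handlers`
    maps action names to callables that return True on success.
    """
    log = []
    for step in steps:
        handler = handlers.get(step["action"])
        ok = handler is not None and handler(step.get("target"))
        log.append((step["action"], ok))
        if not ok:
            break  # later steps depend on earlier ones, so do not continue
    return log


# Stub handlers standing in for the navigation, perception, and manipulation modules.
handlers = {
    "navigate": lambda target: True,
    "detect": lambda target: target == "blue bottle",
    "grasp": lambda target: True,
    "handover": lambda target: True,
}

plan = [
    {"action": "navigate", "target": "living room"},
    {"action": "detect", "target": "blue bottle"},
    {"action": "grasp", "target": "blue bottle"},
    {"action": "navigate", "target": "user"},
    {"action": "handover", "target": "user"},
]
print(execute_plan(plan, handlers))
```

Returning a log of `(action, succeeded)` pairs makes it easy to report which step failed, for example when the requested object is not detected.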

Implementation Checklist

  • Set up Isaac Sim with humanoid robot model
  • Integrate Whisper ASR node in ROS 2
  • Create GPT-4 task planning service
  • Configure Nav2 with costmaps
  • Train/deploy YOLOv8 for object detection
  • Implement IK-based manipulation pipeline
  • Test end-to-end system

Deliverables

  1. Working simulated humanoid in Isaac Sim
  2. ROS 2 package with all modules
  3. Video demonstration of voice-commanded tasks
  4. Technical report documenting architecture
  5. Code repository with documentation

Grading Criteria

  • Functionality (40%): System works as specified
  • Integration (25%): All components work together
  • Code Quality (15%): Clean, documented code
  • Documentation (10%): Clear technical report
  • Innovation (10%): Creative problem-solving

Good luck! This capstone demonstrates everything you've learned throughout the 13-week course.