Thesis · Jan – Apr 2025

Robotic Telemanipulation

An NLP-controlled robotic system for physical mobile-device interaction, combining natural language understanding, computer vision, and robotic actuation. MSc thesis project at NUS.

  • Pipeline: 3-stage
  • Interactions: tap, swipe, type, read
  • Thesis: MSc @ NUS
  • Foundation: 2 yrs robotics

Architecture

  • Three-stage pipeline: NLP intent parsing → computer vision UI detection → robotic motion execution
  • ROS for robot communication, OpenCV for vision processing
  • Natural language → structured action → G-code motor commands
  • Camera-based screen understanding for dynamic UI element identification

Key Decisions

Structured intermediate representation between NLP and robot

Why: Going directly from text to motor commands is too fragile

Tradeoff: Additional parsing step adds latency, but reliability is non-negotiable for physical systems
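As a sketch of what such an intermediate representation might look like (the `Action` fields and G-code values here are illustrative assumptions, not the thesis's actual format):

```python
from dataclasses import dataclass

# Hypothetical structured action sitting between the NLP and robot stages.
@dataclass
class Action:
    kind: str           # "tap", "swipe", "type", or "read"
    x: float            # target coordinates in robot-frame mm,
    y: float            # as resolved by the vision stage
    x2: float = 0.0     # swipe end point (swipe only)
    y2: float = 0.0

def to_gcode(action: Action) -> list[str]:
    """Translate one validated action into G-code motor commands."""
    if action.kind == "tap":
        return [f"G0 X{action.x:.1f} Y{action.y:.1f}",  # move above target
                "G1 Z0 F600",                           # press
                "G1 Z5 F600"]                           # release
    if action.kind == "swipe":
        return [f"G0 X{action.x:.1f} Y{action.y:.1f}",
                "G1 Z0 F600",
                f"G1 X{action.x2:.1f} Y{action.y2:.1f} F1200",  # drag
                "G1 Z5 F600"]
    raise ValueError(f"unsupported action: {action.kind}")

print(to_gcode(Action("tap", 42.0, 87.5)))
```

Because every action is validated before `to_gcode` runs, a malformed NLP parse fails loudly in software instead of producing an unsafe physical motion.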

Proof-of-concept scope (4-month thesis timeline)

Why: MSc timeline limited scope — prioritized working demo over production robustness

Tradeoff: Not production-grade, but demonstrates the full pipeline end-to-end

Technologies

Python · Computer Vision · NLP · ROS · Robotics

What I Learned

  • The gap between 'works in the lab' and 'works reliably' is enormous for physical systems — lighting, camera angles, and reflections break vision pipelines.
  • NLP-to-robot translation needs a structured intermediate representation; going directly from text to motor commands is too fragile.
  • Two years of robotics experience at Mozark (delta robots, CV pipelines) was the foundation that made a 4-month thesis feasible.