Real-Time Sign Language Translation with Wearable Sensors
Client
Academic Research Team
Industry
Education
Timeline
6 months
Type
MVP & Prototyping
Overview
A university research team partnered with us to build a gesture-to-speech system for real-world sign language accessibility. Wearable sensor data streamed through an Amazon API Gateway WebSocket API into AWS Lambda, where JSON payloads were transformed into model-ready features and predictions were sent back to end devices. The final user experience was a simple Android app that displayed recognized text and performed local text-to-speech. During model validation, the training workflow reached 97.48% test accuracy on the labeled dataset.
Challenge
The team faced three linked constraints:
- Tooling gap: Existing communication tools were not designed for real-time sign language translation in low-resource setups.
- Signal complexity: Sensor data from hand motion and finger flexion needed to be captured, synchronized, and interpreted quickly enough to be useful.
- Cost constraints: The solution had to remain lightweight and affordable for iterative academic testing.
Without a reliable pipeline, the project risked becoming a disconnected demo instead of a credible research artifact. The core challenge was to turn raw sensor streams into stable predictions and usable output through a system that engineers and researchers could inspect, improve, and deploy incrementally.
Solution
We implemented a modular architecture that separated sensor ingestion, streaming transport, inference services, and application delivery.
- On-device capture used microcontrollers with flex sensors and IMUs to collect gesture-relevant signals.
- Streaming transport used Amazon API Gateway WebSocket API to route JSON payloads in real time.
- Dataset engineering converted timestamped, device-level nested JSON payloads into a clean, labeled training dataset for repeatable model development (a minimal flattening sketch appears below).
- Inference service ran on AWS Lambda and handled async prediction requests.
- Application layer used FastAPI for async APIs and native WebSocket handling, which simplified concurrent device communication.
- Device delivery pushed predicted gestures to the Android client for on-device text display and local TTS playback.
This design supported rapid experimentation while keeping the stack understandable for software engineers and research collaborators. We deliberately favored a transparent, debuggable architecture over a black-box approach so the team could validate each stage independently.
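To make the payload-to-dataset step concrete, here is a minimal sketch of flattening one nested device payload into a feature row. The field names (`imu`, `flex`, `label`) and the example values are illustrative assumptions, not the project's actual schema.

```python
# Minimal sketch: flatten one nested sensor payload into a feature row.
# Field names and values below are illustrative, not the real schema.

def flatten_payload(payload: dict) -> dict:
    """Convert a nested device JSON payload into one flat, labeled row."""
    row = {"timestamp": payload["timestamp"]}

    # IMU channels: accelerometer and gyroscope axes.
    for axis, value in payload["imu"]["accel"].items():
        row[f"accel_{axis}"] = value
    for axis, value in payload["imu"]["gyro"].items():
        row[f"gyro_{axis}"] = value

    # One bend reading per flex-sensor channel (one per finger).
    for i, bend in enumerate(payload["flex"]):
        row[f"flex_{i}"] = bend

    # Gesture label attached during supervised capture sessions.
    row["label"] = payload.get("label")
    return row


example = {
    "timestamp": 1718000000123,
    "imu": {
        "accel": {"x": 0.02, "y": -0.98, "z": 0.11},
        "gyro": {"x": 1.4, "y": -0.2, "z": 0.7},
    },
    "flex": [512, 498, 610, 455, 530],
    "label": "hello",
}
print(flatten_payload(example))
```

Rows produced this way accumulate into the labeled training dataset described above.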
Key Features
- Real-time gesture recognition pipeline from wearable sensors to text and speech output.
- Sign-language-to-speech conversion workflow for accessibility testing.
- WebSocket-based streaming architecture for low-latency sensor transport.
- Structured data capture and dataset creation pipeline for repeatable model training and retraining.
- Prototype-ready integration path for edge AI and microcontroller deployment.
Technical Implementation
Backend & Infrastructure
The backend was designed for low-cost cloud deployment and async message handling:
- FastAPI services ingested WebSocket sensor streams and managed async flow.
- Amazon API Gateway WebSocket API handled bidirectional messaging between devices and cloud services.
- The ML model was hosted on a lightweight AWS instance for cost-effective, always-on inference.
- Incoming JSON payloads were normalized into model-ready feature vectors before prediction.
This structure made the system easier to scale from prototype traffic to multi-device testing without rewriting the core inference path.
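As a rough illustration of this ingestion pattern, a minimal FastAPI WebSocket handler might look like the sketch below. The route path, feature ordering, and `predict_gesture` stub are assumptions for illustration, not the production handlers.

```python
# Sketch of async WebSocket ingestion with FastAPI.
# Route path and helper functions are illustrative assumptions.
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

def extract_features(payload: dict) -> list[float]:
    """Normalize a nested JSON payload into the model's feature order."""
    imu = payload["imu"]
    return [*imu["accel"].values(), *imu["gyro"].values(), *payload["flex"]]

def predict_gesture(features: list[float]) -> str:
    """Stand-in for the real classifier call."""
    return "hello"

@app.websocket("/stream")
async def stream(ws: WebSocket):
    await ws.accept()
    try:
        while True:
            # Each message is one JSON sensor payload from a device.
            payload = await ws.receive_json()
            prediction = predict_gesture(extract_features(payload))
            # Push the predicted gesture back over the same socket.
            await ws.send_json({"gesture": prediction})
    except WebSocketDisconnect:
        pass  # device went away; the loop simply ends
```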
Data & AI Components
The ML pipeline was designed for repeatable experimentation:
- Sensor features from hand metadata, IMU readings, and flex channels were assembled into a labeled dataset.
- Random Forest served as the baseline classifier for its strong performance on tabular features and quick retraining cycles.
- In the research prototype flow, server-side preprocessing converted richer nested payloads into the compact feature schema expected by the model.
Dataset creation was an explicit deliverable, not a side task. We defined a consistent payload-to-dataset schema, cleaned session noise, and maintained class-balanced labeling so the training set could be reused across experiments and future retraining cycles.
This approach prioritized short feedback loops and interpretability over early deep-learning complexity, which is often the right tradeoff in applied research prototypes.
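For reference, a condensed sketch of what the baseline training loop could look like with scikit-learn, assuming the flattened dataset is exported as a labeled CSV (the file name and column names are placeholders):

```python
# Baseline Random Forest training sketch with scikit-learn.
# The CSV path and column names are placeholders, not project artifacts.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("gesture_dataset.csv")      # one flattened row per sample
X = df.drop(columns=["label", "timestamp"])  # sensor feature columns
y = df["label"]                              # gesture class labels

# A stratified split preserves the class balance maintained during labeling.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)

print(f"test accuracy: {accuracy_score(y_test, clf.predict(X_test)):.4f}")
```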
Frontend & User Experience
The end application was a lightweight Android client designed for clarity and low friction. It received prediction updates from the AWS-hosted service, rendered recognized words as text, and played local text-to-speech output in real time.
This gave users immediate visual and audio feedback while keeping voice synthesis close to the device for responsiveness.
Security & Reliability
Given the research context, the implementation prioritized functional reliability and iteration speed first:
- Connection handling and reconnection logic were implemented in the WebSocket flow; a simplified reconnection pattern is sketched after this list.
- Data was persisted incrementally to avoid losing capture sessions.
- The architecture leaves clear extension points for future hardening controls such as authentication, encryption, and observability.
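As an illustration of the reconnection behavior, a simplified client-side pattern with exponential backoff might look like the following; the `websockets` library, the endpoint URL, and the `handle` hook are assumptions, not the deployed code.

```python
# Sketch: streaming client reconnection with exponential backoff.
# The websockets library, URL, and handle() hook are illustrative assumptions.
import asyncio

import websockets

def handle(message: str) -> None:
    """Placeholder for persisting or forwarding one incoming message."""
    print(message)

async def stream_with_reconnect(url: str) -> None:
    delay = 1  # seconds; doubles after each failure, capped at 30
    while True:
        try:
            async with websockets.connect(url) as ws:
                delay = 1  # reset backoff once a connection succeeds
                async for message in ws:
                    handle(message)
        except (OSError, websockets.ConnectionClosed):
            await asyncio.sleep(delay)
            delay = min(delay * 2, 30)

# Example (hypothetical endpoint):
# asyncio.run(stream_with_reconnect("wss://example.execute-api.us-east-1.amazonaws.com/dev"))
```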
Results
- Built a cloud-connected sign language translation flow from sensor input to text and speech output.
- Built a reusable, labeled multimodal gesture dataset from live device streams for model training and iteration.
- Validated ML feasibility with 97.48% test accuracy on the labeled dataset.
- Implemented cost-effective inference with async WebSocket ingestion and model-ready transformation of nested payloads.
- Delivered a practical assistive technology foundation with Android-based text display and local TTS for real-world testing.
Client Testimonial
Arc Systems stayed highly professional and deeply research-oriented throughout. They showed a strong ability to translate complex papers into working real-world implementations. They took a problem statement, broke it into practical artifacts, and brought it to life effectively.
— Research Supervisor
Technology Stack
- AI/ML: scikit-learn, NumPy, pandas
- Backend: Python, FastAPI, Amazon API Gateway WebSocket API, AWS Lambda
- Frontend: Kotlin
- Infrastructure: ESP32 firmware, MPU6050 IMU and flex sensor integration
Interested in Similar Results?
Let's discuss how we can craft a custom solution for your business challenges.