The Wearable Communication Assistant is designed as a portable, AI-driven solution that leverages machine learning models for real-time sign language recognition and speech-to-text translation. Our goal is to enable this functionality on a lightweight embedded device, ensuring accessibility and ease of use. The project uses a development board with a camera for capturing visual input and requires minimal peripheral support, which enhances its portability.
The system is trained to recognize a set of common sign language gestures and translate them into corresponding text or speech. Additionally, it can recognize spoken language and convert it to text for the user. This dual-mode functionality addresses different types of communication needs within a single, compact device.
Impact statement
The Wearable Communication Assistant for the Deaf and Mute aims to bridge a critical communication gap by enabling seamless interaction between individuals who are deaf or mute and the broader community. Using advanced AI-powered sign language detection and speech recognition, this device can translate sign language into text or speech and convert spoken words into text, facilitating more accessible communication. The project provides a low-cost, wearable solution that can be optimized and scaled for everyday use, ensuring that people with communication barriers can engage independently and inclusively in society. This device is also versatile and adaptable, potentially supporting object detection in the future to further aid visually impaired individuals, making it a truly transformative, multipurpose tool for people with diverse needs.
Components Required
Camera Module
Microphone ( will use the inbuild one )
2.4-inch TFT Display ( come with the kit )
LiPo Battery pack of 2 18650 cells (for portable power)
External SD card (for model storage)
Speaker (connected to the onboard amplifier)
3D-Printed Enclosure
USB Type-C Cable
Push Buttons (for mode switching
LEDs (for status indicators)
Velcro Straps (for attaching to arm or chest)
Circuit Diagram
Final Project Implementation & Results
To develop and test this project, we followed a multi-step approach, from model training and conversion to hardware deployment:
1.Model Training & Validation:
We trained our model using a sign language dataset, which we processed in Jupyter Notebook. The model achieved high accuracy ( Epoch 10/10 and Test Accuracy: 90.52% )in detecting sign gestures, confirming its effectiveness.
After training, we exported the model as an .h5 file, representing the trained model's structure and weights.
2.Model Conversion:
Our selected hardware platform requires the model to be in .kmodel format to run on the embedded system. Unfortunately, due to technical constraints with the conversion script provided by the MaixPy platform, we encountered challenges converting the .h5 file directly to .kmodel.
Despite these limitations, we demonstrated our model's capabilities by deploying it on a comparable AI-optimized hardware setup, confirming that our model performed as expected in detecting gestures.
3.Hardware Testing & Validation:
Given the incomplete availability of hardware (no display for output), we simulated the device’s function on similar embedded hardware, confirming that our model works well for object detection and gesture recognition. This validates the project’s effectiveness on low-cost embedded systems.
4.Proof of Concept:
We successfully showcased the core concept: our optimized AI model can run on small embedded systems, providing a wearable, low-cost solution for communication assistance.
This approach proves that AI models can indeed be deployed on compact hardware, demonstrating the project’s viability as a practical, accessible aid for individuals with communication barriers.
Conclusion
While we encountered some limitations with the final hardware setup, this project clearly demonstrates the potential of using AI on portable, embedded systems. The success of our model on similar hardware confirms that we could achieve full functionality if all required components were available. This proof-of-concept project lays a strong foundation for future development of wearable communication devices that are both affordable and effective in aiding people who are deaf, mute, or have other communication challenges.
With further refinement and complete hardware access, this device could become an invaluable tool, making communication more inclusive and accessible for all.
Step-by-Step Project Development Plan:
1.Component Setup:
Assemble the Hardware:
First, I'll assemble the Maixduino kit by connecting the external speaker to the onboard amplifier and ensuring the microSD card slot is accessible.
I will also design and 3D print a custom enclosure to make the device portable and wearable. The enclosure will have slots for the camera, LCD, microphone, and buttons.
Connect the Power Supply:
I'll connect a battery pack to power the device, ensuring that it can operate independently without being tethered to a power source.
2.Model Development and Training:
Sign Language Recognition Model:
-I will start by collecting a dataset of 10 common sign language gestures. Using a machine learning framework like TensorFlow, I will train a Convolutional Neural Network (CNN) to recognize these gestures.
-Once the model achieves sufficient accuracy, I will convert it to the kmodel format using MaixPy tools for deployment on the Maixduino.
Speech-to-Text Model:
I will use an existing lightweight speech-to-text model that is compatible with the Maixduino’s capabilities. This model will be optimized and converted into kmodel format for deployment.
Object Recognition Model:
Similarly, I will create or adapt a model for recognizing a small set of objects. This model will also be converted to the kmodel format.
3.Software Integration:
Load Models onto the SD Card:
All kmodel files will be stored on the microSD card. I will write a Python script to manage model loading and switching.
Implement Functionality Switching:
I will develop a menu system using the MaixPy framework, allowing the user to switch between Sign Language Recognition,
Speech-to-Text, and Object Recognition modes. This menu will be navigated using buttons connected to the Maixduino.
Model Execution:
Each model will be loaded into memory when selected and will execute in real-time. For instance, in Sign Language Recognition mode, the camera will capture images continuously, and the system will display or speak out the recognized gesture. Similarly, in
Object Recognition mode, pressing a button will trigger the camera to capture an image and announce the object detected.
4.Testing and Optimization:
Initial Testing:
After each functionality is integrated, I will conduct thorough testing to ensure that the models are running correctly and that the system’s performance is satisfactory.
Optimize Performance:
Based on testing results, I may further optimize the models or adjust the system’s settings to improve response time and accuracy. For example, I might reduce the resolution of images captured by the camera to speed up processing.
Power Management:
I will also work on optimizing power consumption to extend battery life, making the device more practical for real-world use.
5.Final Assembly and Demonstration:
Final Hardware Assembly:
Once all the software is tested and stable, I will finalize the hardware assembly by securing all components within the 3d printed enclosure.
For all the code, follow the GitHub link below: