Amogh Khaparde
Class of 2025
About
Projects
- "To what extent can deep neural networks help hearing-impaired individuals communicate better online with hearing individuals by translating fingerspelling into realistic human voice, without a human sign language interpreter?" with mentor Clark (Nov. 18, 2023)
Amogh's Symposium Presentation
Project Portfolio
To what extent can deep neural networks help hearing-impaired individuals communicate better online with hearing individuals by translating fingerspelling into realistic human voice, without a human sign language interpreter?
Started Mar. 28, 2023
Abstract or project description
Nearly 20% of the world’s population is hearing impaired. Many of these people rely on a sign language interpreter to communicate with hearing individuals. However, a human interpreter is expensive to hire over time and may not always be available when needed. As AI and deep neural networks become more efficient and powerful, they offer an alternative: a machine learning model can be carried on a person's devices at all times to make communication much easier. To create such a system, I built a large custom image dataset, consisting of approximately 50,000 images of each ASL character, and trained a raw convolutional neural network (CNN) model that reached 97% accuracy. Later, I also trained a background-cropping CNN model on a different dataset of equal size, which reached 98.5% accuracy. The background-cropping model therefore did slightly better than the raw CNN model, a gain of 1.5 percentage points. To further help hard-of-hearing individuals communicate with hearing individuals, I built an application around this machine learning model. The application includes a ChatGPT API system to correct any misclassifications the model makes and a text-to-speech system using the Play.HT API, which eases communication by letting the user listen rather than read text. The user interface is designed to be intuitive: it shows the user's webcam input, the predicted letter, the predicted probability, and the classified characters assembled into a sentence. Lastly, the application includes a manual spacebar and delete key so that individuals can correct any errors they make while signing. In conclusion, I met my goal of creating a high-performance, practical AI application that eases communication between hard-of-hearing and hearing individuals.
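The way classified characters are assembled into a sentence, with a manual space and delete, can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the names `predict_letter` and `SentenceBuffer`, the 26-letter A-Z label set, and the `min_confidence` threshold of 0.9 are all assumptions made for the example, and the `logits` list stands in for the output of a trained CNN.

```python
import math
import string

# Assumption: the classifier outputs one score per letter, A through Z.
LETTERS = string.ascii_uppercase


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_letter(logits):
    """Return the most likely letter and its predicted probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LETTERS[best], probs[best]


class SentenceBuffer:
    """Hypothetical helper that collects classified characters into a sentence."""

    def __init__(self, min_confidence=0.9):
        # Assumption: low-confidence predictions are ignored rather than typed.
        self.min_confidence = min_confidence
        self.chars = []

    def add_prediction(self, letter, probability):
        """Append the letter if the model is confident enough; report whether it was kept."""
        if probability >= self.min_confidence:
            self.chars.append(letter)
            return True
        return False

    def space(self):
        """Manual spacebar: end the current word."""
        self.chars.append(" ")

    def delete(self):
        """Manual delete key: remove the last character, if any."""
        if self.chars:
            self.chars.pop()

    def text(self):
        return "".join(self.chars)


if __name__ == "__main__":
    scores = [0.0] * 26
    scores[7] = 10.0  # stand-in for a frame the model reads as "H"
    buffer = SentenceBuffer()
    buffer.add_prediction(*predict_letter(scores))
    buffer.add_prediction("I", 0.95)
    print(buffer.text())
```

In a full application, the text returned by `text()` would be the input to the correction and text-to-speech steps; those API calls are left out here because their exact usage is not described in the abstract.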