Polygence Scholar 2024

Amogh Khaparde

Class of 2025

About

Projects

"Encoder-only Transformer model to Predict Sentiment of a Product Using Social Media Posts" with mentor Joe (Sept. 9, 2024)

"To what extent can deep neural networks help hearing-impaired individuals communicate better online with hearing individuals by translating fingerspelling into realistic human voice, without a human sign language interpreter?" with mentor Clark (Nov. 18, 2023)

Amogh's Symposium Presentation

Project Portfolio

Encoder-only Transformer model to Predict Sentiment of a Product Using Social Media Posts

Started May 7, 2024

Abstract or project description

Companies often use data mining techniques to gather information about the products they release. This allows them to make educated decisions about how to market their products and informs them of public sentiment toward a product after release. With the advancing field of AI and machine learning, there has been a significant shift toward more complex architectures that can better understand and process text data. With machine learning models such as encoder-only, encoder-decoder, and decoder-only transformers, the power of natural language processing is rapidly advancing. This research paper explores the use of encoder-only transformers, built from scratch, for product sentiment prediction, since this can yield easily quantifiable data for companies to use. To train the model, data was mined from Amazon reviews, with the review text as the input and the 1-5 star rating given to each product as the output. The model was evaluated using metrics such as F1-score, recall, precision, accuracy, and loss. The hyperparameters were also tuned to improve performance. This research represents a step toward building a better tool for companies to assess trends in their products and understand public sentiment. This way, companies can change their products in a way that benefits both the company’s growth and the customer’s satisfaction. This project can be improved by training the model on a more specialized dataset that includes specific terminology and emojis, to better capture human emotion in text, and by augmenting the data to increase the number of samples available for training.
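As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch, not the project's actual code. It assumes the five star ratings are treated as five classes and that precision, recall, and F1-score are macro-averaged across them; the abstract does not state the averaging method, and the function name `macro_metrics` is hypothetical.

```python
from collections import Counter


def macro_metrics(y_true, y_pred, labels=(1, 2, 3, 4, 5)):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: each star rating (1-5) is its own class and every class
    counts equally in the average. This is an illustrative choice only.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example: six reviews, one 5-star review misclassified as 4 stars.
metrics = macro_metrics([1, 2, 3, 4, 5, 5], [1, 2, 3, 4, 5, 4])
```

Macro averaging is used here because star ratings in review data are often unevenly distributed, and a per-class average keeps rare ratings from being hidden by common ones. Other averaging schemes would be equally valid.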

Project Portfolio

To what extent can deep neural networks help hearing-impaired individuals communicate better online with hearing individuals by translating fingerspelling into realistic human voice, without a human sign language interpreter?

Started Mar. 28, 2023

Abstract or project description

Nearly 20% of the world’s population is hearing impaired. Many of these people require a sign language interpreter to help them communicate with hearing individuals. Hiring a human interpreter can cost a substantial amount of money over time, and an interpreter may not always be available when needed. However, as AI and deep neural networks become more efficient and powerful, these tools can be used to help hard-of-hearing people communicate: a person can carry a machine learning model on their devices at all times. To create such a system, I made a large custom image dataset, consisting of approximately 50,000 images of each ASL character, and trained a raw convolutional neural network (CNN) model that reached 97% accuracy. Later, I also trained a background-cropping CNN model on a different dataset of equal size, which reached 98.5% accuracy, a 1.5 percentage point improvement over the raw CNN model. To further help hard-of-hearing individuals communicate with hearing individuals, I created an application around this machine learning model. The application combines a ChatGPT API system, which corrects any misclassifications the model makes, with a text-to-speech system built on the Play.HT API, which eases communication by letting the user listen rather than read text. The user interface is intuitive, showing the user's webcam input, the predicted letter, its predicted probability, and the classified characters assembled into a sentence. Lastly, the application includes a manual spacebar and delete key so that individuals can correct any errors they make while signing. In conclusion, I met my goals of creating a high-performance and feasible AI application to help ease communication between hard-of-hearing and hearing individuals.
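The sentence-building part of the application described above can be sketched roughly as follows. This is a hypothetical illustration, not my actual application code: the class name `SentenceBuffer`, its methods, and the 0.9 probability threshold are all assumptions introduced here, and the CNN, the ChatGPT correction step, and the Play.HT speech step are left out.

```python
class SentenceBuffer:
    """Collects classified letters into a sentence.

    Assumption: a letter is accepted only when the classifier's predicted
    probability meets a threshold. The 0.9 default is illustrative.
    """

    def __init__(self, min_probability=0.9):
        self.min_probability = min_probability
        self.chars = []

    def add_prediction(self, letter, probability):
        """Append the letter if the prediction is confident enough."""
        if probability >= self.min_probability:
            self.chars.append(letter)
            return True
        return False

    def space(self):
        """Manual spacebar: separate words."""
        self.chars.append(" ")

    def delete(self):
        """Manual delete key: remove the last character, if any."""
        if self.chars:
            self.chars.pop()

    @property
    def text(self):
        return "".join(self.chars)


# Toy usage with made-up predictions standing in for CNN output.
buffer = SentenceBuffer()
buffer.add_prediction("H", 0.99)
buffer.add_prediction("I", 0.95)
buffer.add_prediction("X", 0.40)  # rejected: below the threshold
buffer.space()
buffer.add_prediction("Q", 0.97)  # a signing mistake
buffer.delete()                   # corrected with the delete key
buffer.add_prediction("A", 0.93)
sentence = buffer.text
```

In a full pipeline, the assembled text would then be passed on for correction and speech synthesis.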