Polygence Scholar 2025

Aryan Dhir

Class of 2026

Bangalore, Karnataka

About

Projects

"How can AI-based speech processing systems automatically detect and remove stutters and speech impediments from recorded audio to generate fluent and natural-sounding speech?" with mentor Husni (Aug. 10, 2025)

Project Portfolio

How can AI-based speech processing systems automatically detect and remove stutters and speech impediments from recorded audio to generate fluent and natural-sounding speech?

Started Mar. 19, 2025

Abstract or project description

This project explores how an AI-powered system can detect and remove speech impediments such as stuttering, filler words, and repetitions from spoken audio recordings. The system first transcribes the audio with an automatic speech recognition (ASR) model such as Whisper Turbo. A second model, BERT, then combs through the transcription, identifying the speech impediments and removing them to generate a clean transcript. The clean transcript is then converted back into speech using a text-to-speech model. The accuracy of the system will be measured by word error rate and other statistics. The system aims to produce fluent, natural-sounding speech while preserving the speaker's intent and emotional tone, with applications in video editing and podcasting.
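To make the evaluation step concrete, here is a minimal sketch of how word error rate could be computed for a cleaned transcript. It is an illustration under stated assumptions, not the project's implementation: `remove_disfluencies` is a hypothetical rule-based stand-in for the BERT model, the `FILLERS` list is invented for the example, and `word_error_rate` uses the standard word-level edit distance divided by the number of reference words.

```python
import re

# Hypothetical filler list, for illustration only. The project itself
# uses a BERT model to find disfluencies, not a fixed word list.
FILLERS = {"um", "uh", "er", "erm"}


def remove_disfluencies(text: str) -> str:
    """Toy stand-in for the cleaning step: drop filler words and
    immediate word repetitions. It would also drop legitimate
    repeats such as "that that", which a learned model could keep."""
    words = re.findall(r"[a-z']+", text.lower())
    cleaned = []
    for word in words:
        if word in FILLERS:
            continue  # skip filler words
        if cleaned and cleaned[-1] == word:
            continue  # skip an immediate repetition
        cleaned.append(word)
    return " ".join(cleaned)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of words in the reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


raw = "I um I want want to uh go"   # made-up ASR output with disfluencies
fluent = "i want to go"             # made-up fluent reference transcript
cleaned = remove_disfluencies(raw)
print(cleaned)
print(word_error_rate(fluent, raw), word_error_rate(fluent, cleaned))
```

Comparing the score before and after cleaning, against a hand-written fluent reference, gives one way to quantify how much the cleaning step helps. A lower score after cleaning means the transcript is closer to the fluent version.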