Class of 2023
Casablanca, Morocco
- "GPT-4: The Parrot Paradox Decoding the Intricacies of Statistical Vocabulary Patterns and Pioneering a Solution" with mentor Efthimios (Dec. 5, 2023)
GPT-4: The Parrot Paradox Decoding the Intricacies of Statistical Vocabulary Patterns and Pioneering a Solution
Started Sept. 13, 2023
Abstract or project description
This paper investigates the linguistic characteristics of GPT-4, focusing on how closely it reproduces the statistical vocabulary traits of human writing and on its biases toward certain words. It examines GPT-4's ability to adapt its writing style to the prompt it is given and its responsiveness to different linguistic contexts. Notably, GPT-4 was tasked with generating both Shakespearean-style and Dickensian-style texts, and its outputs were systematically compared with authentic texts by Shakespeare and Dickens. The comparison assessed the vocabulary traits and statistical characteristics of the generated text against the established patterns of each literary style. Using mathematical vocabulary laws such as Zipf's law, the study uncovers patterns and trends in GPT-4's language generation. In the course of the research, we also developed our own methods for detecting AI-generated content. By combining theoretical exploration with practical application, this paper contributes to an understanding of GPT-4's language capabilities and offers insights relevant to natural language processing. Chatbot APIs, text datasets from Kaggle, and CopyLeaks AI were used.
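As an illustration of the kind of measurement Zipf's law allows, the sketch below counts word frequencies in a text, ranks them, and estimates the Zipf exponent as the negated slope of a least-squares line through the log-rank versus log-frequency points. This is a minimal sketch and not the paper's actual method: the function names `rank_frequency` and `zipf_exponent`, the simple regular-expression tokenizer, and the choice of an ordinary least-squares fit are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word frequencies sorted from most to least common.

    Tokenization is a simplifying assumption: lowercase alphabetic
    words, optionally with one internal apostrophe (e.g. "don't").
    """
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def zipf_exponent(freqs):
    """Estimate s in f(r) ~ C / r**s by least squares on log-log data.

    `freqs` must be positive and sorted in descending order, so that
    position i corresponds to rank i + 1.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two distinct ranks to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Zipf's law predicts a falling line, so negate the fitted slope.
    return -sxy / sxx


if __name__ == "__main__":
    # Hypothetical usage: in practice each string would be a full corpus.
    sample = "the cat saw the dog and the dog saw the cat"
    print(rank_frequency(sample))
    print(zipf_exponent(rank_frequency(sample)))
```

Running the same two functions on a GPT-4 output and on an authentic text of similar length gives two exponents that can be compared directly. An exponent near 1 is the classic Zipfian value for natural language, though short samples give noisy estimates.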