
Clayton G

- Research Program Mentor

PhD at Universität des Saarlandes

Expertise

data science, computer science, computational linguistics, natural language processing, web scraping (Twitter) and analytics, probability theory, statistics, calculus

Bio

Hi, I'm Clayton Greenberg and I was a professor in the Computer and Information Science department at the University of Pennsylvania. Students are sometimes quite surprised to learn that there is much more to computer science than programming, but this is really an opportunity in disguise. It means that you are capable of doing cool computational projects, and welcome to do them, even if you didn't learn how to program during kindergarten :) My favorite classes to teach provide students with the mathematical background that they need to be successful in computer science and its young cousin, data science. In my spare time, I like to troll Siri. (My Ph.D. dissertation, Evaluating Humanness in Language Models, focuses on this.) And I like to sing. I did a lot of singing in college, leading to YouTube videos that my students "discover" every semester. Perhaps it was a little too much singing and not enough studying, but this leads to my best advice for college: pick the program that fits you, rather than changing yourself to fit the program.

Project ideas

Project ideas are meant to help inspire students' thinking about their own projects. Students are in the driver's seat of their research and are free to use any or none of the ideas shared by their mentors.

Detecting bots on Twitter

Now that computers are good enough to generate very convincing text completely on their own, people have become quite concerned about "fake news". In this project, we will investigate how easy it is to detect Tweets that have been written by computers. The investigation has four steps: 1) Collect some data, some possibly labelled already as "fake". 2) Look at the statistical properties of "real" Tweets versus "fake" Tweets. 3) Write a computer program, for example a Naive Bayes classifier, for labelling new Tweets as "real" or "fake". 4) Evaluate how good the program is using a sensible metric.
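To give a feel for steps 3 and 4, here is a minimal sketch of a Naive Bayes classifier in plain Python. It is only one way a student might start, not a prescribed solution. The class and function names are hypothetical, the handful of example Tweets are made up for illustration, and F1 score is just one possible "sensible metric".

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a real project would need more care)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f1_score(gold, predicted, positive="fake"):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up toy data, purely for illustration (not real Tweets).
train_texts = [
    "click here to win a free prize now",
    "free followers click this link now",
    "had a great lunch with my sister today",
    "the train was late again this morning",
]
train_labels = ["fake", "fake", "real", "real"]

model = NaiveBayes().fit(train_texts, train_labels)

test_texts = ["win free followers now", "lunch on the train today"]
test_labels = ["fake", "real"]
predictions = [model.predict(t) for t in test_texts]
print(predictions, f1_score(test_labels, predictions))
```

With real data, the training and test sets would be much larger and kept strictly separate, and it would be worth comparing several metrics (accuracy, precision, recall, F1), since "fake" Tweets may be rare.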

Coding skills

Python (and Jupyter notebook), MATLAB, R, UNIX scripting, Java, C

Languages I know

German, intermediate; Spanish, intermediate

Teaching experience

I have been on the teaching faculty at the University of Pennsylvania for 4 years. My favorite courses to teach are Mathematical Foundations of Computer Science, Computational Data Exploration, and Computational Linguistics. Previously, I taught Syntactic Theory (Linguistics) and some other courses at Saarland University in Germany. I have also taught test prep for the SAT, ACT, and GRE.

Credentials

Work experience

University of Pennsylvania (2018 - Current)
Lecturer and Projects Director for Masters in Data Science
Randstad Staffing on loan to Vanguard (2019 - 2021)
Research Data Science Contractor
Saarland University (2014 - 2018)
Research Associate

Education

Princeton University
BA Bachelor of Arts (2013)
Linguistics and Computation
Universität des Saarlandes
MS Master of Science (2015)
Language Science and Technology
Universität des Saarlandes
PhD Doctor of Philosophy
Computer Science
