
Alex C

Research Program Mentor

PhD candidate at University of California Santa Cruz (UCSC)


machine learning, data science, NLP, language generation, data journalism, deep learning, web scraping, creative coding, narrative generation, AI


Hi! I'm Alex. I specialized in natural language processing during my Master's degree and am now pursuing a PhD in Computational Media. I'm fascinated by creating systems that can understand human language for new use cases in storytelling, art, journalism, and social computing. I am eager to engage with projects that use bleeding-edge NLP for humanistic purposes, often to understand culture or to produce it. I am experienced in full-stack ML engineering, as well as data science techniques including data collection and model training. I am an expert in web scraping and highly experienced with large deep learning models, including modern language models such as GPT-3 for content generation and fine-tuned transformers for named entity recognition, semantic analysis, and other data structuring tasks. Otherwise, I love to surf, travel, and play games.

Project ideas

Project ideas are meant to help inspire students' thinking about their own projects. Students are in the driver's seat of their research and are free to use any or none of the ideas shared by their mentors.

Build an AI clone

Let's train an AI to mirror a specific style, such as the work of Mahatma Gandhi or your own chat history. We will gain experience collecting training data (which may be stored in hard-to-access forms such as chat logs or PDFs), training a machine learning model on this data, and potentially even deploying it somewhere. Along the way, we will learn how to use language models to interrogate the data we trained them on. An example is this project I worked on, where we trained a language model to write like an Instagram influencer: https://www.instagram.com/myfriendsylvia/?hl=en

Create a graph database from unstructured data

Journalists and social scientists are turning to NLP to understand complex relationships among documents and the entities they reference. An example is a story I worked on for WIRED, for which we built a graph database out of Twitter posts and used it to uncover mistakes in Twitter's disinformation policy: https://www.wired.com/story/how-americans-wound-up-on-twitters-list-of-russian-bots/

Coding skills

Python, scikit-learn, PyTorch, Spark, Selenium, Flask, p5, JavaScript, web design, APIs, others!

Teaching experience

I have extensive experience tutoring undergraduate and graduate computer science and math courses. I've also TA'd for classes such as AI and UI Design at Columbia University, and Game AI and Creative Coding at UCSC.


Work experience

Bloomberg LLC (2020 - 2021)
Senior News Automation Engineer
Columbia University (2019 - 2020)
Research Scholar
ProductionPro Technologies (2018 - 2020)
Data Engineering Consultant


Education

Montana State University - Bozeman
BSE Bachelor of Science in Engineering (2017)
Computer Science (minors in math and English)
Columbia University
MS Master of Science (2019)
Natural Language Processing, Journalism
University of California Santa Cruz (UCSC)
PhD Doctor of Philosophy candidate
Computational Media

