Tony W

- Research Program Mentor

PhD candidate at Stanford University

Expertise

Big data analytics, crypto (Ethereum)

Bio

I am interested in creating next-generation big data analytics systems, focusing on two areas: 1) performance in new compute settings, e.g. the cloud, and 2) usability in new application domains, e.g. machine learning and blockchain. I am building an open-source data analytics framework called Quokka. I am also interested in crypto, especially exploring DeFi applications on Ethereum. In particular, I am interested in looking at these from a traditional finance perspective and thinking about topics like miner extractable value (MEV), security flaws, and smart contract construction.

Project ideas

Project ideas are meant to help inspire student thinking about their own project. Students are in the driver's seat of their research and are free to use any or none of the ideas shared by their mentors.

Fast blockchain data analytics

Make something like this: https://dune.com/rchen8/opensea that can update in real time. This is definitely not cutting-edge research, but it will be a stepping stone to more useful things, for example answering the question "which accounts did account X interact with?" as quickly as possible when X is not known ahead of time. This is useful for money laundering forensics, for example. Hopefully, after this project you will have learned how to interact with Ethereum data through a web3 API, along with some basics of streaming data analytics.
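One way to picture the "which accounts did X interact with" lookup is an adjacency index that is updated as each transfer arrives, so a query for any account becomes a dictionary lookup rather than a scan over history. The sketch below is a minimal illustration in plain Python, not a finished design: the `Transfer` record, the `InteractionIndex` class, and the sample addresses are all hypothetical, and a real version would be fed from Ethereum data through a web3 client instead of a hard-coded list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Hypothetical, simplified transfer record (not a real web3 type)."""
    sender: str
    receiver: str
    value: int


class InteractionIndex:
    """Incrementally maintained map: account -> set of counterparties."""

    def __init__(self):
        self._neighbors = defaultdict(set)

    def ingest(self, transfer):
        # Record the edge in both directions so a lookup works for either side.
        self._neighbors[transfer.sender].add(transfer.receiver)
        self._neighbors[transfer.receiver].add(transfer.sender)

    def counterparties(self, account):
        # .get avoids creating empty entries for accounts never seen.
        return set(self._neighbors.get(account, set()))


# Made-up sample stream standing in for live chain data.
index = InteractionIndex()
stream = [
    Transfer("0xA", "0xB", 5),
    Transfer("0xB", "0xC", 2),
    Transfer("0xD", "0xA", 7),
]
for t in stream:
    index.ingest(t)

print(sorted(index.counterparties("0xA")))
```

Because the index is updated per event, X does not need to be known ahead of time; the cost is memory proportional to the number of distinct account pairs.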

Optimizing SQL queries in a Python dataflow engine

I am building a Python-based dataflow query execution engine that can get near-peak performance on SQL queries if you use it in the right way: https://github.com/marsupialtail/quokka. There is a big list of queries that people want to run very quickly: https://github.com/Agirish/tpcds. Indeed, if you can build a database that runs these queries very quickly, you can raise $1bn: https://databricks.com/company/newsroom/press-releases/databricks-raises-1-billion-series-g-investment-at-28-billion-valuation. In this project you will try to implement some of these SQL queries in this execution engine, and hopefully get performance 2x better than many commercial offerings. You can also think about how to do this in a more general way, i.e. how you would build a compiler to automatically implement these SQL queries. I am actively working on this. Hopefully, in the end you will come away with some knowledge of what makes code fast and scalable. This knowledge is ever more important as hardware fails to become faster on its own.
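To give a feel for what "implementing a SQL query by hand" can mean, here is a minimal sketch of a join plus aggregation written as a streaming dataflow in plain Python. It does not use Quokka's API; the table names, column names, sample rows, and the helpers `batches` and `hash_join_aggregate` are all made up for illustration. The query it mimics is `SELECT c.region, SUM(o.amount) FROM orders o JOIN customers c ON o.customer_id = c.id GROUP BY c.region`.

```python
from collections import defaultdict


def batches(rows, size):
    """Yield rows in fixed-size batches, mimicking a streaming source."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def hash_join_aggregate(customers, order_batches):
    # Build phase: hash the smaller table on the join key.
    region_by_id = {c["id"]: c["region"] for c in customers}
    totals = defaultdict(int)
    # Probe phase: stream order batches through the join and aggregate
    # on the fly, so the joined rows are never materialized.
    for batch in order_batches:
        for order in batch:
            region = region_by_id.get(order["customer_id"])
            if region is not None:  # inner join drops unmatched orders
                totals[region] += order["amount"]
    return dict(totals)


# Made-up sample data.
customers = [{"id": 1, "region": "EU"}, {"id": 2, "region": "US"}]
orders = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 2, "amount": 20},
    {"customer_id": 1, "amount": 5},
    {"customer_id": 3, "amount": 99},  # no matching customer
]

result = hash_join_aggregate(customers, batches(orders, size=2))
print(result)
```

Fusing the probe and the aggregation into one pass is the kind of choice a query compiler would have to make automatically; a real engine would also operate on columnar batches rather than Python dicts.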

Coding skills

Python (wouldn't want to mentor in any other language)

Languages I know

French intermediate

Teaching experience

Mentored students in high school and college. Currently working with undergraduates at Stanford as a PhD candidate.

Credentials

Work experience

Hedge Fund (2021 - Current)
Consultant

Education

Massachusetts Institute of Technology
BS Bachelor of Science (2019)
Computer Science
Massachusetts Institute of Technology
MEng Master of Engineering (2020)
Computer Science
Stanford University
PhD Doctor of Philosophy candidate
Computer Science
