High school research student Ayaan proposes a novel solution to a common AI/ML classification problem
Ayaan is a junior in high school from Saratoga, California, who worked with his mentor, a PhD candidate at Princeton, to conduct high-level research on Artificial Intelligence and Machine Learning. After reading several papers, Ayaan formulated his own research idea on image generation and classification. While working on this project, he was able to create an algorithm to help generate new images for classification. Read more to learn about Ayaan’s experience and the advice he has for future students.
What was your experience with research before Polygence and why did you decide to apply to Polygence?
Before Polygence, I didn't really have any experience in research. I did my stuff around AI and machine learning, and I had some experience with that, but I never did anything like research or running experiments and writing papers. When I found out about Polygence, I thought it was a good way to get introduced to research and a good opportunity.
Can you tell us a little bit about your project?
Basically, I did some work with this one type of machine learning or AI algorithm that when you give it a bunch of images, it learns how to create new images that look just like the originals. The data set that we used had images of house numbers. I’d feed these images to the algorithm to teach it how to generate completely new images that look just like the original images.
In machine learning, a classifier takes a data set and tries to classify the individual images in it. But you need a lot of data to train a good classifier. So we thought that if you can generate new images that look real, that should, in theory, help a classifier. The use case is a situation where you have a specific problem with very little data, but you still want to use a classifier. If we can generate new images that look almost real, the classifier now has more data to train on.
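The idea Ayaan describes, adding generated images to a small training set before training a classifier, can be sketched in a few lines of Python. This is only an illustration and not the code from his project. The names `augment_with_generated` and `toy_generator` are hypothetical, the toy generator stands in for a trained GAN, and the sketch assumes the generator can be asked for an image of a given digit, which the interview does not specify.

```python
import random
from typing import Callable, List, Tuple

Image = List[float]              # a flattened image, pixel values in [0, 1]
Example = Tuple[Image, int]      # (image, digit label)
Generator = Callable[[int, random.Random], Image]


def toy_generator(label: int, rng: random.Random, size: int = 4) -> Image:
    """Hypothetical stand-in for a trained GAN generator (assumption).

    It returns a tiny fake "image" whose brightness depends on the label,
    so the example runs without any machine learning library.
    """
    base = label / 9.0
    return [min(1.0, max(0.0, base + rng.uniform(-0.05, 0.05))) for _ in range(size)]


def augment_with_generated(
    real: List[Example],
    generate: Generator,
    n_per_class: int,
    seed: int = 0,
) -> List[Example]:
    """Return the real examples plus n_per_class generated examples per label."""
    rng = random.Random(seed)
    labels = sorted({label for _, label in real})
    synthetic = [
        (generate(label, rng), label)
        for label in labels
        for _ in range(n_per_class)
    ]
    combined = list(real) + synthetic
    rng.shuffle(combined)  # mix real and generated examples before training
    return combined


# A very small "real" data set: one image each for the digits 0 and 9.
real_examples = [([0.0] * 4, 0), ([1.0] * 4, 9)]
training_set = augment_with_generated(real_examples, toy_generator, n_per_class=3)
print(len(training_set))  # 2 real + 2 classes * 3 generated = 8
```

A classifier would then be trained on `training_set` instead of `real_examples` alone. Whether that helps in practice depends on how realistic the generated images are.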
Is this a project that you knew you wanted to work on before Polygence?
Actually, no. I didn't know what I wanted to do at all. In the first week, my mentor gave me some blogs and papers to look at so I could see what I wanted to do. And I saw that there's this model that generates images. I read a little bit about it, and the first thing that came to my mind was that if you can make new images, it should help the classification task. When I researched it, it turned out there wasn't actually much work on it. From my research, there wasn't a lot about how you could take generated images and input them into a classifier. So what we used is called a GAN, a generative adversarial network. There's been some work using GANs for classification, but those approaches are different from our final algorithm, as they don’t actually take the generated images and use them for classification. So it turned out to be a pretty interesting and novel idea.
Can you tell us about a typical session? What would you and your mentor do?
In the beginning, it was basically just getting introduced to this. My mentor knew a lot about GANs, so he was able to help me a lot. He taught me a lot about the general intuition of how they work. I'd say the first three to four, maybe even five sessions, it was a lot of him helping me out where I would learn from him, and he would guide me through specific material. And then once I got the hang of it, I thought of new ideas of how to make the entire algorithm better. Whenever I had an idea, I'd usually be able to do most of it on my own, but there were a few times when I had the idea and I wasn’t sure how to do it. So in our later sessions before I started writing the paper, we’d either spend time cleaning up the code I wrote or walking through how to implement my ideas.
How is your mentor different from other teachers or tutors you've had in the past?
Most other teachers can't really help in a one-on-one situation, and they usually just tell you what to do. But my mentor let me pick the idea and come up with new ideas. Instead of him doing the project, it was really my own, and he was there to help me with it. My mentor's expertise was really helpful; it aligned with a lot of stuff I'm interested in. So he was a really good mentor for me because he was able to help me with pretty much anything that I wanted to do.
What skills did you develop during this project?
Definitely, how to do high-level AI research. Before Polygence, I didn't really know what computer science research really meant. But now I know what it means and actually how to do it at a high level. And then obviously I learned a lot about AI and ML, the basic concepts and more advanced concepts as well. And then coding; I learned a lot about PyTorch, which is the Python package that we used. I was good with Python before, but I learned a lot about this package, and I actually turned out to like it a lot.
What advice would you give someone who's just starting Polygence?
One of the main things is that you don’t get this kind of opportunity very often. With Polygence, you get a specific advisor to help you and just you. Your mentor is smart and accomplished, so definitely take advantage of that.
Another thing: at the beginning I was trying to do a meeting every week, but then I realized it's probably better to spread it out. Because I think if you try to do a session every week or go faster, you rush yourself. And your mentor ends up helping you with things that you could have done on your own.
You can find Ayaan's paper published here on arXiv.