Michael C
- Research Program Mentor
PhD candidate at University of California San Diego (UCSD)
Expertise
genetics, cellular/molecular biology, computer science, neuroscience
Bio
I am a PhD Candidate in the Bioinformatics and Systems Biology Graduate Program at UC San Diego and a National Science Foundation Graduate Research Fellow. I am passionate about understanding human disease, and I strongly believe that biomedical research with DNA sequencing and related technologies can enable this. Collaboration and interdisciplinary research are central to my approach, and I am dedicated to using reproducible, rigorous, and data-driven methods in these settings to make scientific discoveries. Outside the lab, I enjoy exploring San Diego County through hiking, swimming, and playing tennis, embracing the area’s natural beauty with an active lifestyle. I find joy in cooking and experimenting with new recipes, often drawing inspiration from cuisines around the world, especially Indian and Italian. These activities provide a balance to my academic pursuits and keep me energized and creative.
Project ideas
Diagnosing Breast Cancer Subtypes Using Cancer Genome Sequencing Data
I specialize in disease biology, genomics, bioinformatics, and data analysis. I can help students explore topics such as cancer genomics, bioinformatics techniques, and the application of machine learning in healthcare.
Knowledge and Skills to be Learned
• Basics of cancer biology, particularly breast cancer subtypes
• Principles of genome sequencing and data interpretation
• Bioinformatics tools and software (e.g., R, Python)
• Machine learning algorithms for DNA variant calling
• Data visualization and scientific reporting
Students will:
1. Learn about Breast Cancer Subtypes: Gain foundational knowledge about the biology of cancer and breast cancer subtypes (e.g., HER2-positive, triple-negative, hormone receptor-positive).
2. Explore Genome Sequencing: Understand the basics of genome sequencing technologies and the types of data generated.
3. Data Acquisition: Obtain publicly available cancer genome sequencing datasets from resources such as The Cancer Genome Atlas (TCGA).
4. Data Processing: Use bioinformatics tools to preprocess and clean the raw sequencing data.
5. Variant Calling: Apply machine learning algorithms to classify breast cancer subtypes based on the genetic data.
6. Visualization and Reporting: Create visualizations to represent findings and compile the results into a scientific research paper.
Potential Student Outcomes
• Scientific Research Paper: A detailed report outlining the methodology, analysis, and findings of the project.
• Data Visualizations: Graphs and charts that illustrate key aspects of the data and the results of the machine learning models.
• Presentation: A PowerPoint or poster presentation summarizing the project for academic or science fair settings.
• Code Repository: A well-documented codebase for data processing and analysis, potentially shared on platforms like GitHub for community use.
This project will provide students with hands-on experience in cancer genomics research, enhance their computational skills, and give them a taste of real-world applications of bioinformatics and machine learning in healthcare.
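To give a flavor of the classification step, here is a minimal Python sketch under stated assumptions. It does not use TCGA data: the expression values are synthetic, the per-subtype mean profiles for the marker genes ESR1, PGR, and ERBB2 are made-up numbers chosen only for illustration, and the nearest-centroid classifier is a simple stand-in for whatever machine learning method a student ends up choosing. All function names are hypothetical.

```python
import random
from statistics import mean

# Marker genes: ESR1 (estrogen receptor), PGR (progesterone receptor), ERBB2 (HER2).
GENES = ["ESR1", "PGR", "ERBB2"]

# ASSUMPTION: made-up mean expression levels (arbitrary units), not real measurements.
SUBTYPE_PROFILES = {
    "hormone receptor-positive": [8.0, 7.0, 2.0],
    "HER2-positive": [2.0, 2.0, 9.0],
    "triple-negative": [1.5, 1.5, 1.5],
}


def make_synthetic_samples(n_per_subtype, noise=1.0, seed=0):
    """Draw noisy synthetic expression vectors around each subtype profile."""
    rng = random.Random(seed)
    samples = []
    for subtype, profile in SUBTYPE_PROFILES.items():
        for _ in range(n_per_subtype):
            features = [rng.gauss(mu, noise) for mu in profile]
            samples.append((features, subtype))
    rng.shuffle(samples)
    return samples


def fit_centroids(samples):
    """Compute the mean expression vector (centroid) of each subtype."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Assign the subtype whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((x - c) ** 2 for x, c in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted subtype matches the true label."""
    correct = sum(predict(centroids, f) == label for f, label in samples)
    return correct / len(samples)


if __name__ == "__main__":
    data = make_synthetic_samples(50)
    split = int(0.8 * len(data))  # simple 80/20 train/test split
    train, test = data[:split], data[split:]
    centroids = fit_centroids(train)
    print(f"held-out accuracy: {accuracy(centroids, test):.2f}")
```

In an actual project, the synthetic samples would be replaced by features derived from real sequencing data, and the held-out accuracy would be only one of several ways to evaluate the model.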