Class of 2025
- "Investigating Embeddings Produced by Multimodal models" with mentor Efthimios (Feb. 12, 2024)
Investigating Embeddings Produced by Multimodal models
Started Nov. 22, 2023
Abstract or project description
The advent of LLMs and multimodal models in the past year has made it possible for machine learning systems to treat language and vision data interchangeably. We propose a case study investigating the properties of embeddings produced by multimodal models. In particular, we plan to compare the performance of models trained on raw inputs (e.g., pixels) with that of models trained on embeddings produced by models like CLIP. We also plan to investigate the embeddings of language descriptions of the images, and to evaluate how they can be incorporated into classification to improve performance.
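
One way the planned comparison could be organized is sketched below. This is a minimal illustration under stated assumptions, not the project's actual method: it assumes each representation (flattened pixels, image embeddings, image plus text embeddings) is already available as a NumPy array, uses a closed-form linear probe as a stand-in classifier, and fuses image and text embeddings by simple concatenation. All function names are hypothetical, no real encoder such as CLIP is called, and `synthetic_demo` runs on random data only to exercise the harness, so its accuracies say nothing about real models.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length before combining embeddings."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def add_bias(X):
    """Append a constant column so the probe can learn an offset."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_linear_probe(X, y, num_classes, ridge=1e-3):
    """Closed-form ridge regression onto one-hot labels (assumed classifier)."""
    Xb = add_bias(X)
    Y = np.eye(num_classes)[y]
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)


def probe_accuracy(W, X, y):
    """Fraction of rows whose highest-scoring class matches the label."""
    return float(np.mean(np.argmax(add_bias(X) @ W, axis=1) == y))


def fuse(image_emb, text_emb):
    """Concatenate normalized image and text embeddings (one simple fusion choice)."""
    return np.hstack([l2_normalize(image_emb), l2_normalize(text_emb)])


def compare_representations(features, y_train, y_test, num_classes):
    """features maps a name to a (train_matrix, test_matrix) pair."""
    results = {}
    for name, (X_train, X_test) in features.items():
        W = fit_linear_probe(X_train, y_train, num_classes)
        results[name] = probe_accuracy(W, X_test, y_test)
    return results


def synthetic_demo(seed=0, num_classes=3, n_train=150, n_test=60):
    """Exercise the harness on random data; the numbers are not real results."""
    rng = np.random.default_rng(seed)
    y_train = rng.integers(0, num_classes, size=n_train)
    y_test = rng.integers(0, num_classes, size=n_test)

    def fake_features(dim):
        # Placeholder for real features: one Gaussian cluster per class.
        centers = 5.0 * rng.normal(size=(num_classes, dim))
        train = centers[y_train] + rng.normal(size=(n_train, dim))
        test = centers[y_test] + rng.normal(size=(n_test, dim))
        return train, test

    pixels = fake_features(64)   # stands in for flattened pixels
    img_emb = fake_features(16)  # stands in for image embeddings
    txt_emb = fake_features(16)  # stands in for caption embeddings
    features = {
        "raw_pixels": pixels,
        "image_embedding": img_emb,
        "image_plus_text": (fuse(img_emb[0], txt_emb[0]),
                            fuse(img_emb[1], txt_emb[1])),
    }
    return compare_representations(features, y_train, y_test, num_classes)
```

In an actual study, the arrays passed to `compare_representations` would come from the dataset's pixels and from a multimodal encoder, and the linear probe could be swapped for whatever classifier the project settles on.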