I'm interested in compression, interpretability, and embedded applications, and I currently work on LLMs. I have been working with Elias Stengel-Eskin and Yi-Lin Sung on LLM quantization and interpretability: we use interpretability-informed, task-specific saliency scores to localize important weights and preserve them during model compression, yielding improvements in both general and task-specific quantization.
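As a rough illustration of the idea (not our actual method), one can score each weight with a first-order Taylor saliency computed from task gradients, keep the top fraction in full precision, and quantize the rest. The function below is a hypothetical sketch: `saliency_quantize`, the `|w * grad|` score, and the uniform symmetric quantizer are all illustrative choices, not the implementation.

```python
import numpy as np

def saliency_quantize(weight, grad, keep_frac=0.01, n_bits=4):
    """Hypothetical sketch: quantize a weight matrix to n_bits,
    but keep the most task-salient entries in full precision."""
    # First-order Taylor saliency: |w * dL/dw| approximates the loss
    # change incurred by zeroing each weight on the task data.
    saliency = np.abs(weight * grad)
    k = max(1, int(keep_frac * weight.size))
    thresh = np.sort(saliency, axis=None)[-k]
    keep_mask = saliency >= thresh

    # Uniform symmetric quantization for the non-salient weights.
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(weight).max() / qmax
    q = np.clip(np.round(weight / scale), -qmax - 1, qmax)
    dequant = q * scale

    # Salient weights pass through unchanged; the rest are quantized.
    return np.where(keep_mask, weight, dequant)
```

Applied per layer, this leaves the small set of task-critical weights intact while the bulk of the model is stored at low precision.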