Tom Hartvigsen
Assistant Professor
Data Science
University of Virginia
Visiting Assistant Professor
CSAIL
MIT
★ News ★
Two papers accepted to NeurIPS'23
I am the General Co-Chair of ML4H 2023
Upcoming talk at Cornell
Talks at Oxford and UVA CS
EAAMO'23 paper drawing lessons from aviation safety for health AI
Best Paper at IMLH Workshop @ ICML 2023
Preprint on unified ethical in-context learning
Co-chairing ICML 2023 Workshop
SERC paper on algorithmic fairness in chest x-ray diagnosis
Preprint on classifying irregularly-sampled time series
AAAI'23 paper on multi-label knowledge amalgamation
Invited talk at AJCAI'22 Workshop on Toxic Language Detection
Co-chairing NeurIPS '22 workshop
Invited talks at WPI CS Colloquium and Northeastern
Four workshop papers accepted at NeurIPS'22
ML4H'22 paper on detecting stress from wearable devices
ICDM'22 paper on explainability for deep time series classifiers
Hi! I'm an Assistant Professor of Data Science at the University of Virginia. I am spending the 2023-2024 academic year in Cambridge, MA, where I am a Visiting Assistant Professor at MIT. Previously, I was a postdoc at MIT working with Marzyeh Ghassemi. Before that, I did my PhD in Data Science at WPI, where I was advised by Elke Rundensteiner and Xiangnan Kong.
I am recruiting highly-motivated students to join my group at the University of Virginia. Please email me with your CV if you feel you're a good fit for my group!
Research
I work on making machine learning responsible and trustworthy enough for deployment in shifting environments, especially those in healthcare. To achieve this, I work on building:
Robust methods for learning from data and labels that are biased, missing, or noisy
Models that generalize and adapt to ever-shifting distributions and requirements
Tools to make models meet strong human requirements like early warnings, explainability, updatability, and safety/fairness
Currently, I am most excited to work on:
post-training interventions for large [language] models (especially model editing)
pre-training for time series and multi-modal models
automatically detecting implicit bias in natural and AI-generated language at scale
Keywords: Updatable Machine Learning, Time Series, Large Language Models, Robustness, Interpretability, Fairness.
Recent highlights
GRACE: Continually editing the behavior of large language models during deployment (NeurIPS'23 + code)
ToxiGen: Constructing large, diverse hate speech datasets with large language models to train better hate speech classifiers (ACL'22 + dataset)
Impact: ToxiGen has been used while training Llama2, Code Llama, phi-1.5, and more, and to detect toxicity in Econ Forums and Laws.
Reinforcement Learning for early warning systems on time series (see KDD'19; KDD'20; CIKM'22)
Robustness to uncertain/incomplete data/labels (see preprint'23; AAAI'23; AAAI'22; SDM'22; CIKM'22; AAAI'21)
Explainability for time series and NLP models (see NeurIPS'23; FAccT'22; ICDM'22; ACL'20; CIKM'21)
In the News
Our work on ToxiGen was covered by TechCrunch and Microsoft Research
Our work on Fair Explainability was covered by MIT News
Misc
Outside of research, I enjoy bouldering, biking, books (science fiction/science fact), birding, juggling, vegan cooking, and guitar. I also spent a summer living at Biosphere 2 in Arizona.