Tom Hartvigsen
Assistant Professor
Data Science
University of Virginia
(working from Cambridge, MA until July'24)
★ News ★
[Apr'24] Nature Medicine paper on bias in computational pathology
[Apr'24] New preprints on time series reasoning with language models, categorical knowledge editing for LLMs, and time series foundation models
Invited talks at UMass Amherst, IBM Research, and Arizona State University on model editing
[Feb'24] Paper accepted to Knowledge and Information Systems on explaining multi-class time series classifiers
[Feb'24] New preprints on label noise in time series, black-box NLP robustness, and generating math word problems
[Jan'24] Talks at Dartmouth CS, UCSF/UC Berkeley, and the University of Alabama at Birmingham
MIT News covered our perspective drawing lessons from aviation safety for AI in health
Workshop accepted to ICLR'24 on Time Series for Health
[Dec'23] Two NeurIPS'23 papers
PI on a $200k grant to work on editing and debiasing LLMs (recruiting a postdoc)
General Co-Chair of ML4H 2023
Paper accepted to npj Digital Medicine
Hi! I'm an Assistant Professor of Data Science at the University of Virginia. I am spending the 2023-2024 academic year in Cambridge, MA, where I am a Visiting Assistant Professor at MIT. Previously, I was a postdoc at MIT working with Marzyeh Ghassemi. Before that, I did my PhD in Data Science at WPI, where I was advised by Elke Rundensteiner and Xiangnan Kong.
Research
I'm broadly interested in machine learning and natural language processing. I work to enable responsible model deployment in ever-changing environments, especially for health.
These days, I mostly focus on:
Continually editing and adapting large language models
Time series foundation models
Pre-training multi-modal models
Detecting and mitigating harmful social biases in natural language
Healthcare applications: NLP for scientific medical literature, learning from ICU time series, fair computational pathology, understanding mental health via wearable devices
Recent highlights
Model Editing:
Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adapters (NeurIPS'23 + code + blog post)
TAXI: Evaluating categorical knowledge editing for language models (arXiv'24 + data)
ToxiGen: Using LLMs to detect and mitigate implicit social biases (ACL'22 + dataset)
Impact: ToxiGen has been used while training Llama 2, Code Llama, phi-1.5, phi-2, and other LLMs, and to detect toxicity in econ forums and laws.
Can language models reason about time series data? (arXiv'24)
Robustness to uncertain/incomplete data/labels (see arXiv'24; arXiv'23; AAAI'23; AAAI'22; SDM'22; CIKM'22; AAAI'21)
Explainability for time series and NLP models (see NeurIPS'23; FAccT'22; ICDM'22; CIKM'21; ACL'20)
In the News
Our work drawing lessons from aviation safety for health AI was covered by MIT News and Innovate Healthcare
GRACE was featured in the Microsoft Research blog
ToxiGen was covered by TechCrunch and Microsoft Research
Our work on Fair Explainability was covered by MIT News
Misc
Outside of research, I enjoy bouldering, biking, books (science fiction/science fact), birding, juggling, vegan cooking, and playing guitar. I also spent a summer living at Biosphere 2 in Arizona.