Tom Hartvigsen

Assistant Professor

Data Science

University of Virginia

Contact: hartvigsen@virginia.edu

(Office: 1919 Ivy Rd., Rm. 339)

Department Website

★ News ★

July'25:

Paper accepted to COLM'25!
- PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
Organizing NeurIPS'25 Workshop on Learning from Time Series for Healthcare

June'25:

Paper accepted to Rehabilitation Psychology on biased portrayals of disability by VLMs
New preprints:
- ModelCitizens: Representing Community Voices in Online Safety
- KScope: A Framework for Characterizing the Knowledge Status of Language Models
Featured in faculty research spotlight

May'25:

4 papers accepted to ACL'25!
3 papers accepted to ICML'25!
- WikiBigEdit: Understanding the Limits of Lifelong Knowledge Editing
- BalancEdit: Dynamically Balancing the Generality-Locality Trade-off in Multi-modal Model Editing
- Medical Large Language Model Benchmarks Should Prioritize Construct Validity (Oral)

March'25:

Our AAAI'25 KnowFM Workshop paper was awarded the Outstanding Paper Award.
New preprints:

Feb'25:

New preprints on empirical investigations of sparse autoencoders and lifelong model editing
Paper accepted to CPAL'25 on Sparse MoE

Jan'25:

I have been awarded a Capital One Faculty Fellowship to work on Time Series Reasoning!
2 papers accepted to ICLR'25!
- Composable Interventions for Language Models - Congrats, Arinbjorn!
- Learning under Temporal Label Noise
Paper on sequential knowledge editing accepted to the Workshop on Knowledgeable Foundation Models at AAAI'25
New preprint on lifelong model editing

Dec'24:

Invited talks at UVA's Genome Sciences Seminar Series and UVA's Darden Business School
New preprint on foundation models for protein phenotypes

Nov'24:

Lab member Xu Ouyang was awarded an iPRIME PhD Fellowship. Congrats, Xu!
New preprint on scaling laws for LLM quantization

Oct'24:

New preprints:
Paper accepted to IEEE BigData on spike train classification

Sep'24:

3 papers accepted to NeurIPS'24!
- Are Language Models Actually Useful for Time Series Forecasting? (Spotlight!) - Congrats, Mingtian!
- Test-Time Debiasing of Vision-Language Embeddings
- UniTS: A Unified Multi-Task Time Series Model
3 papers accepted to EMNLP'24!
- MATHWELL: Generating Educational Math Word Problems with Teacher Annotations - Congrats, Bryan!
- Language Models Still Struggle to Zero-shot Reason about Time Series
- Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks

Aug'24:

Paper accepted to TMLR on using LLMs for robust text classification

July'24:

Paper accepted to COLM'24 on multilingual toxicity in LLMs.
Paper accepted to AIES'24 on detecting implicit social biases in VL models

July'24:

New preprints:
- composable interventions for LLMs
- extracting social determinants of health with LLMs
Paper accepted to MICCAI'24 on federated learning for medical imaging

May'24:

Paper accepted to ACL'24 on categorical knowledge editing for LLMs

Apr'24:

Nature Medicine paper on bias in computational pathology

Spring'24: Invited talks at Dartmouth, IBM Research, UCSF/UC Berkeley, and the University of Alabama, Birmingham

Hi! I'm a tenure-track Assistant Professor of Data Science and, by courtesy, Computer Science at the University of Virginia. I also have appointments in UVA's Comprehensive Cancer Center and National Security & Data Policy Institute. Before joining UVA in Fall 2023, I was a postdoc at MIT CSAIL working with Marzyeh Ghassemi. I received my PhD in Data Science from WPI where I was advised by Elke Rundensteiner and Xiangnan Kong.

Research

My research group works on machine learning and natural language processing. We work to enable responsible model deployment in ever-changing environments, especially for healthcare.

Active directions and highlights:

Continually monitoring and updating big models
- Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adapters (NeurIPS'23) (blog post)
- TAXI: Evaluating Categorical Knowledge Editing for Language Models (ACL'24)
- Composable Interventions for Language Models (ICLR'25)
- Understanding the Limits of Lifelong Knowledge Editing (ICML'25) (Available on EasyEdit!)
- BalancEdit: Dynamically Balancing the Generality-Locality Trade-off in Multi-modal Model Editing (ICML'25)
- Model Editing with External Graph-Based Memory (ACL'25)
- Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes (ACL'25)
Time series and multi-modality
- Are Language Models Actually Useful for Time Series Forecasting? (NeurIPS'24 🌟Spotlight🌟)
- Language Models Still Struggle to Zero-shot Reason about Time Series (EMNLP'24)
- UniTS: A Unified Multi-Task Time Series Model (NeurIPS'24)
- Learning under Temporal Label Noise (ICLR'25)
- Inferring Events from Time Series using Language Models
Detecting and mitigating harmful biases in language and language models
- PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages (COLM'25)
- ModelCitizens: Representing Community Voices in Online Safety (preprint)
- PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models (COLM'24) // Leaderboard // blog post
- ToxiGen: Using LLMs to detect and mitigate implicit social biases (ACL'22). ToxiGen was used to train Llama2, Code Llama, phi-1.5, phi-2, and other LLMs, and to detect toxicity in Econ Forums and Laws.
Healthcare & Biomedical Data Science
- Medical Large Language Model Benchmarks Should Prioritize Construct Validity (ICML'25 🌟Oral Presentation🌟)
- Demographic Bias in Misdiagnosis by Computational Pathology Models (Nature Medicine)
- Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks (EMNLP'24)
- Dissecting the Heterogeneity of "In-the-Wild Stress" from Multimodal Sensor Data (npj Digital Medicine)
- MedBrowseComp: Benchmarking Medical Deep Research and Computer Use
