Tom Hartvigsen, Ph.D.
Postdoc
Massachusetts Institute of Technology
CSAIL
Pronouns: He/Him
★ News ★
I'll be co-chairing a workshop at ICML 2023 on Generative AI
I'll be serving as the General Chair of ML4H 2023
Paper published in SERC on algorithmic fairness in chest x-ray diagnosis
New preprint on classifying irregularly-sampled time series
Paper accepted at AAAI'23 on multi-label knowledge amalgamation
Invited talk at AJCAI'22 Workshop on Toxic Language Detection
Chairing NeurIPS'22 workshop
Invited talk at WPI CS Colloquium
Invited talk at Northeastern
Four workshop papers accepted at NeurIPS'22
Paper accepted at ML4H'22 on detecting stress from wearable devices
Paper accepted at ICDM'22 on explainability for deep time series classifiers
Paper accepted at CIKM'22 on early classification of irregular time series
Paper accepted at CIKM'22 on recurrent classifier chains with missing labels
Invited talks at Google Research, Tufts, and the University of Rochester
Paper accepted at ACL'22 on constructing datasets with language models
Hi! I'm a postdoc at MIT working with Prof. Marzyeh Ghassemi. My research spans machine learning, data mining, and applications to improve healthcare.
I received my PhD in Data Science from WPI in 2021, where I was a GAANN PhD Fellow advised by Professors Elke Rundensteiner and Xiangnan Kong.
Research
I'm making machine learning and data mining methods deployable for healthcare. I focus on core problems in time series and natural language data when:
Data and labels are missing and noisy
Models must adapt to quickly shifting distributions and requirements
We have strong human-facing requirements like early predictions, interpretability, and safety/fairness
The projects that excite me the most (1) robustly model dynamic environments through time series, (2) prevent the perpetuation of bias via machine learning, and/or (3) have impact through real-world deployment.
Recent highlights:
GRACE: Continually editing pre-trained models thousands of times in a row during deployment (preprint)
ToxiGen: Constructing large, diverse hate speech datasets with large language models to train better hate speech classifiers (ACL'22 + dataset)
Timely and Actionable Time Series Models (see KDD'19; KDD'20; CIKM'22)
Robustness to uncertain/incomplete labels (see AAAI'23; AAAI'22; SDM'22; CIKM'22; AAAI'21; preprint)
Explainability for time series and NLP models (see ACL'20; CIKM'21; ICDM'22)
Service
Organizer:
General Chair, ML4H 2023
Co-Chair, ICML 2023 Workshop on Challenges in Deploying Generative AI
Co-Chair, NeurIPS 2022 Workshop on Learning from Time Series for Health
CHIL 2023
Led tutorial on Deep Learning with PyTorch for undergrads
Program Committee: AAAI, WSDM, CVPR, ICCV, ACL, EMNLP, NAACL, KDD, CHIL, NeurIPS Datasets & Benchmarks Track
In the News
Our work on ToxiGen was covered by TechCrunch and Microsoft Research
Our work on Fair Explainability was covered by MIT News
Misc
Outside of research, I enjoy bouldering, biking, books (science fiction/science fact), birding, juggling, vegan cooking, and guitar. I also spent a summer living at Biosphere 2 in rural Arizona.