Tom Hartvigsen
Postdoc
Massachusetts Institute of Technology
Pronouns: He/Him
★ News ★
AAAI'23 paper on multi-label knowledge amalgamation
Invited talk at AJCAI'22 Workshop on Toxic Language Detection
Chairing NeurIPS'22 workshop
Invited talk at WPI CS Colloquium
Invited talk at Northeastern
Four workshop papers accepted at NeurIPS'22
ML4H'22 paper on detecting stress from wearable devices
ICDM'22 paper on explainability for deep time series classifiers
CIKM'22 paper on early classification of irregular time series
CIKM'22 paper on recurrent classifier chains with missing labels
Invited talks at Google Research, Tufts, and the University of Rochester
ACL'22 paper on constructing datasets with language models
Invited talk at MIT Horizons
AAAI'22 paper on positive unlabeled learning
SDM'22 paper on positive unlabeled learning
Hi! I'm a postdoc at the Massachusetts Institute of Technology working with Prof. Marzyeh Ghassemi. My research spans machine learning, data mining, and applications to improve healthcare.
I received my PhD in Data Science from WPI in 2021, where I was a GAANN Fellow advised by Professors Elke Rundensteiner and Xiangnan Kong.
I am on the academic job market for tenure-track positions in CS/ECE/Data Science.
Research
I'm making machine learning and data mining methods deployable in complex, time-varying environments. I focus on core problems in time series and natural language data that arise in healthcare, where:
Data and labels are missing and noisy
Models must adapt to quickly shifting distributions and requirements
We have strong human-facing requirements like early predictions, interpretability, and safety/fairness
The projects that excite me most (1) robustly model dynamic environments through time series, (2) avoid perpetuating bias through machine learning, and/or (3) have impact through real-world deployment.
Recent highlights:
GRACE: Continually editing pre-trained models thousands of times in a row during deployment (preprint)
ToxiGen: Constructing large, diverse hate speech datasets with large language models to train better hate speech classifiers (ACL'22 + dataset)
Timely and Actionable Time Series Models (see KDD'19; KDD'20; CIKM'22)
Robustness to uncertain/incomplete labels (see AAAI'23; AAAI'22; SDM'22; CIKM'22; AAAI'21)
Explainability for time series and NLP models (see ACL'20; CIKM'21; ICDM'22)
Service
Organizer:
NeurIPS 2022 Workshop on Learning from Time Series for Health
CHIL 2023
Led tutorial on Deep Learning with PyTorch for undergrads
Program Committee: AAAI, WSDM, CVPR, ICCV, ACL, EMNLP, NAACL, KDD, NeurIPS Datasets & Benchmarks Track
In the News
Our work on ToxiGen was covered by TechCrunch and Microsoft Research
Our work on Fair Explainability was covered by MIT News
Misc
Outside of research, I enjoy bouldering, biking, books (science fiction/science fact), birding, juggling, vegan cooking, and guitar. I also spent a summer living at Biosphere 2 in rural Arizona.