Improving Model Delivery: My model's accurate, why won't you use it?

Useful models are...


Timely

In time-sensitive, time-varying systems, machine learning models need to give people enough time to react. We're building models that learn to use as little data as possible to make accurate predictions in various contexts:

  • How few timesteps can we collect to accurately classify an ongoing time series? (KDD'19)

  • How can class predictions reinforce each other to make early classifications with multiple labels? (KDD'20)

  • How little time can we use to classify ongoing irregularly-sampled time series? (CIKM'22)
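The core idea behind early classification can be sketched with a toy halting rule: observe one timestep at a time and stop as soon as the prediction is confident enough. This is a minimal illustration only; the nearest-centroid classifier, the margin-based confidence, and the threshold `tau` are assumptions for the sketch, not the methods from the cited papers.

```python
# Toy early classification: classify from a growing prefix, halt early
# once the margin between the two closest class centroids exceeds tau.
# (Illustrative sketch only -- `centroids` and `tau` are assumptions.)

def prefix_distance(series, centroid, t):
    """Euclidean distance between the first t points of a series and a centroid."""
    return sum((series[i] - centroid[i]) ** 2 for i in range(t)) ** 0.5

def early_classify(series, centroids, tau=2.0):
    """Observe one timestep at a time; halt when confident enough."""
    for t in range(1, len(series) + 1):
        dists = sorted((prefix_distance(series, c, t), label)
                       for label, c in centroids.items())
        (d1, best), (d2, _) = dists[0], dists[1]
        if d2 - d1 > tau:            # confident enough: stop early
            return best, t
    return dists[0][1], len(series)  # had to observe everything

centroids = {"flat": [0.0] * 8, "rising": [float(i) for i in range(8)]}
label, halted_at = early_classify(
    [0.1, 0.0, 0.2, 0.1, 0.0, 0.1, 0.2, 0.0], centroids)
# halts after only a few of the 8 timesteps
```

The trade-off the bullets above study lives in `tau`: a higher threshold waits longer for more evidence, a lower one reacts sooner at the risk of mistakes.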


Fair

Machine-learned models perpetuate societal biases baked into their training data, often harming historically disadvantaged people. We build methods that detect and mitigate such bias in many settings:

  • How can we detect implicit hate towards disadvantaged groups? (ACL'22)

  • How fair are explainability models? (FAccT'22)

  • Can good label acquisition guarantee fair downstream classifiers? (coming soon!)
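One common way to quantify the kind of unfairness described above is demographic parity: compare a classifier's positive-prediction rates across groups. The sketch below is a generic illustration of that metric, not the detection methods from the cited papers.

```python
# Toy group-fairness check: the demographic parity gap is the difference
# between the highest and lowest positive-prediction rate across groups.
# (Illustrative sketch only; data below is made up.)

def positive_rate(preds, groups, g):
    """Fraction of positive predictions within group g."""
    sel = [p for p, grp in zip(preds, groups) if grp == g]
    return sum(sel) / len(sel)

def demographic_parity_gap(preds, groups):
    rates = {g: positive_rate(preds, groups, g) for g in set(groups)}
    return max(rates.values()) - min(rates.values())

preds  = [1, 1, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(preds, groups)  # 0.75 vs 0.25 -> gap of 0.5
```

A gap near zero means the classifier selects positives at similar rates for every group; a large gap is a red flag worth investigating (though parity alone does not guarantee fairness).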

and Interpretable

It's hard to trust a machine's prediction without knowing its reasoning. We've been building explainability methods and running human-computer interaction studies to figure out how to help people trust trustworthy machine learning models:

  • Which portions of a time series does your classifier use most? (CIKM'21, ICDM'22)

  • Do people and machines agree on which words are most important for classification? (ACL'20)

  • How can we train interpretable models with very limited supervision? (BigData'21)
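A simple way to ask "which portions of a time series does your classifier use most?" is occlusion: mask each window of the input and measure how much the model's score drops. The scoring function below is a toy stand-in, and the windowing scheme is an assumption for the sketch, not the cited methods.

```python
# Toy occlusion-based importance for a time-series classifier: mask each
# window with a baseline value and record the drop in the model's score.
# (Illustrative sketch; `score` is a toy stand-in for a real classifier.)

def score(series):
    """Toy classifier score: mean of the second half of the series."""
    half = len(series) // 2
    return sum(series[half:]) / (len(series) - half)

def occlusion_importance(series, window=2, baseline=0.0):
    base = score(series)
    importances = []
    for start in range(0, len(series), window):
        masked = list(series)
        for i in range(start, min(start + window, len(series))):
            masked[i] = baseline
        importances.append(base - score(masked))  # bigger drop = more important
    return importances

series = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
imps = occlusion_importance(series)
# only the second-half windows matter to this toy classifier
```

Perturbation methods like this are model-agnostic, which is why they are a common baseline when studying whether an explainer's attributions match what the model actually relies on.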

Continuous Health: Modeling day-to-day health outside the doctor's office

Most of us only go to the doctor when we're already sick, so little is known about in-the-wild health at scale. We are building machine learning methods for learning from wearable device time series and self-reported measures to better understand what it means to be healthy.

Modeling Data- and Label-Collection Behavior

Machine learning for human-subject data is hard: labels are often noisy, data are often missing, and signals in the data can be unique. To solve these problems, we've been working on the following directions:

  • Modeling self-reported labeling mechanisms for time series (AAAI'22, SDM'22)

  • Finding sparse signals in long, irregularly-sampled time series (coming soon!)

  • Modeling multi-label relationships between data and labels (NeurIPS'21)

  • Monitoring the spread of foodborne illnesses via Twitter (LREC'22)
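One common preprocessing step when data are irregularly sampled is to bin the observations onto a regular grid and keep a missingness mask alongside the values, so a downstream model can tell "zero" apart from "unobserved". This is a generic illustration of that idea, not the methods from the papers above.

```python
# Toy gridding of an irregularly-sampled series: place each (time, value)
# observation into a fixed-step bin and record which bins were observed.
# (Illustrative sketch; binning scheme is an assumption for the example.)

def to_grid(times, values, start, stop, step):
    n = int((stop - start) / step)
    grid = [0.0] * n   # binned values (0.0 where nothing was observed)
    mask = [0] * n     # 1 = observed in this bin, 0 = missing
    for t, v in zip(times, values):
        idx = int((t - start) / step)
        if 0 <= idx < n:
            grid[idx] = v   # last observation in a bin wins
            mask[idx] = 1
    return grid, mask

times  = [0.2, 1.7, 3.1]
values = [5.0, 6.0, 7.0]
grid, mask = to_grid(times, values, start=0.0, stop=4.0, step=1.0)
# grid=[5.0, 6.0, 0.0, 7.0], mask=[1, 1, 0, 1]
```

The mask is what makes the sparse-signal problems above tractable: it lets a model condition on the sampling pattern itself rather than treating imputed values as real measurements.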

Learning from Wearable Device Data

We've all experienced stress and other subjective, adverse health events, but what do these events mean physiologically? What are the consequences? How do they differ by person? Can we predict these events ahead of time or categorize them afterwards? These questions are largely unanswered; subjective health problems are pervasive yet poorly understood. Using in-the-wild wearable sensor data, we are quantifying and characterizing health in people's daily lives:

  • Identifying stress periods from wearable devices

  • Change point detection methods for digital health
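A classic building block for change point detection on a 1-D physiological signal is a CUSUM-style detector: accumulate deviations from a reference level and flag a change once the cumulative sum crosses a threshold. The sketch below is a generic illustration of that idea, not our methods; `reference`, `drift`, and `threshold` are assumed parameters.

```python
# Toy CUSUM change point detector: accumulate positive deviations from a
# reference mean; flag the first index where the running sum crosses a
# threshold. (Illustrative sketch; parameters are assumptions.)

def cusum(signal, reference, drift=0.0, threshold=3.0):
    s = 0.0
    for i, x in enumerate(signal):
        s = max(0.0, s + (x - reference - drift))  # reset when below zero
        if s > threshold:
            return i          # index where the change is flagged
    return None               # no change detected

signal = [0.0, 0.1, -0.1, 0.0, 2.0, 2.1, 1.9, 2.0]
cp = cusum(signal, reference=0.0, threshold=3.0)
# the level shift starting at index 4 is flagged shortly after it begins
```

The detection delay between the true shift and the flagged index is the key tension for digital health: a lower threshold catches changes (like a stress onset) sooner, but triggers more false alarms on noisy wearable data.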