Unlocking Patterns in Exposure Data with Shapelets

Date:

Poster Preview

This project develops the first large-scale, open-source EPA shapelet library to identify recurring temporal patterns (“shapelets”) in environmental exposure data. The goal is to create an interpretable, reproducible resource that enables researchers to study how air-quality fluctuations influence health outcomes.

Year 1 accomplishments include:

  • Processed 20 years (2004–2024) of EPA air-quality data across 600 U.S. counties and 20 pollutants.
  • Extracted over 48 million shapelets, capturing both short-term (acute) and long-term (chronic) exposure patterns.
  • Developed scalable tools for motif discovery, imputation of missing data, and spatiotemporal visualization.
  • Published the interactive library at https://ehie-shapelets.ctsi.utah.edu
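
To make the idea of a shapelet concrete, the following is a minimal sketch of how a candidate pattern can be matched against a pollutant time series. It is an illustration, not the project's implementation: it assumes the common sliding-window, z-normalized Euclidean distance used in the shapelet literature, and the function names (z_normalize, shapelet_distance) and sample readings are hypothetical.

```python
import math


def z_normalize(window):
    """Rescale a window to zero mean and unit variance (flat windows map to zeros)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def shapelet_distance(series, shapelet):
    """Return (distance, start_index) of the window in `series` closest to `shapelet`.

    Assumption: closeness is the Euclidean distance between z-normalized
    windows, so the match depends on shape rather than absolute level.
    """
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    target = z_normalize(shapelet)
    best_dist, best_start = math.inf, -1
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, target)))
        if dist < best_dist:
            best_dist, best_start = dist, start
    return best_dist, best_start


# Made-up daily readings with a short spike; values are illustrative only.
readings = [8.0, 9.0, 8.5, 12.0, 30.0, 14.0, 9.0, 8.0]
spike = [1.0, 5.0, 2.0]
distance, start = shapelet_distance(readings, spike)
print(f"closest match starts at index {start} (distance {distance:.3f})")
```

A small distance means the series contains a stretch shaped like the candidate pattern, regardless of its absolute concentration level. Short candidates would correspond to acute patterns and long ones to chronic patterns.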

Next Steps:

  • Link exposure motifs to health outcomes such as sports performance, stillbirth risk, pneumonitis, and suicide.
  • Extend the framework to multi-exposure modeling and AI-driven interpretability studies.
  • Conduct usability testing with the System Usability Scale (SUS) to evaluate accessibility for environmental health researchers.
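
For readers unfamiliar with SUS (the System Usability Scale), the sketch below shows the standard scoring rule for its ten-item questionnaire: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name sus_score is hypothetical, and how the project will administer or analyze the survey is not stated in the source.

```python
def sus_score(responses):
    """Compute a System Usability Scale score from ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item, response in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5
```

Scores from individual participants are typically averaged to summarize a study.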
