Schedule: Monday, June 23rd, 2025, 13:30 – 17:00
Topic | Duration | Time |
---|---|---|
Welcome and General Introduction | 10 minutes | 13:30 |
Temporal Phenotyping in (Post-) COVID | 20 minutes | 13:40 |
Detailed Presentation of the PASC algorithm | 30 minutes | 14:00 |
Questions/Discussion about the algorithm | 15 minutes | 14:30 |
Set up/start Docker containers with the algorithm and synthetic data | 15 minutes | 14:45 |
Break & troubleshooting of Docker container startup | 30 minutes | 15:00 |
Guided application of the algorithm on data | 60 minutes | 15:30 |
Questions/Discussion about the application | 15 minutes | 16:30 |
How to apply the algorithm to identify other unexplained chronic conditions not directly linked to PASC | 10 minutes | 16:45 |
Feedback and Wrap-Up | 5 minutes | 16:55 |
END |
Preparation
We ask participants to have Docker installed (Link to official Docker Installation Guide: ) and to download the Docker image (Link to Docker Image: TBA!), which includes the RStudio instance with all required packages and the synthetic data, before the workshop. Please ensure that you can start containers by running the hello-world example from the above-mentioned guide.
Topics
Longitudinal EHR Data
Longitudinal Electronic Health Record (EHR) data offers an opportunity to track patient health trajectories over time, enabling the analysis of disease progression, treatment outcomes, and patient responses to interventions. Longitudinal data can be used to identify patterns, trends, and correlations in diseases and patient journeys. Researchers can leverage longitudinal data to address complex research questions and to develop predictive models that enhance patient care and outcomes.
Precision phenotyping
Precision phenotyping refers to the detailed characterization of diseases at the patient level. These approaches rely on large datasets to give researchers and clinicians a better understanding of diseases and their heterogeneity. Precision phenotyping holds the potential to enhance personalized medicine by providing nuanced insights and contributing to more precise, effective, and individualized healthcare.
Synthetic Data
In this tutorial we will use synthetic data based on the Synthea data sets, which resemble the population of Massachusetts. Using synthetic data allows us to present an algorithm that normally requires a large amount of sensitive data, to share the data with the participants, and to let participants apply the algorithm themselves during the tutorial.
Attention Mechanism
The algorithm presented in this tutorial uses an attention mechanism to identify patient-specific PASC symptoms. For each possible PASC symptom, the attention mechanism checks the patient's history for another entry that is associated with the symptom, based on the temporal distance between the two entries. If such an association is identified, the symptom might not be a PASC symptom for this patient but might instead be related to another of the patient's conditions.
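The check described above can be illustrated with a minimal sketch. This is not the algorithm presented in the tutorial; it only assumes a hypothetical table of known associations, each pairing an earlier condition with a symptom and a maximum temporal distance in days. All names (`Entry`, `KNOWN_ASSOCIATIONS`, `explaining_entries`, `unexplained_symptoms`), codes, and thresholds are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Entry:
    code: str   # hypothetical concept label, e.g. "fatigue"
    day: date   # date the entry was recorded


# Hypothetical association table (an assumption, not from the tutorial):
# (earlier condition, symptom) -> maximum temporal distance in days
KNOWN_ASSOCIATIONS = {
    ("chemotherapy", "fatigue"): 90,
    ("asthma", "shortness_of_breath"): 365,
}


def explaining_entries(symptom, history):
    """Return earlier history entries that may explain the symptom."""
    found = []
    for other in history:
        max_days = KNOWN_ASSOCIATIONS.get((other.code, symptom.code))
        if max_days is None:
            continue  # no known association between the two codes
        distance = (symptom.day - other.day).days
        if 0 <= distance <= max_days:
            found.append(other)
    return found


def unexplained_symptoms(candidates, history):
    """Keep only candidate symptoms with no explaining entry in the history."""
    return [s for s in candidates if not explaining_entries(s, history)]


history = [
    Entry("chemotherapy", date(2021, 1, 10)),
    Entry("covid19", date(2021, 2, 1)),
]
candidates = [
    Entry("fatigue", date(2021, 3, 1)),
    Entry("shortness_of_breath", date(2021, 5, 15)),
]
remaining = unexplained_symptoms(candidates, history)
print([s.code for s in remaining])  # ['shortness_of_breath']
```

In this toy example the fatigue entry falls 50 days after the chemotherapy entry, so it is treated as explained, while shortness of breath has no associated earlier entry and remains a possible PASC symptom.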
Post-acute sequelae of COVID-19 (PASC)
PASC, also called Long COVID or Post COVID, is a complex new disease that describes chronic conditions after a COVID-19 infection. The World Health Organization defines PASC as: “Post COVID-19 condition occurs in individuals with a history of probable or confirmed SARS CoV-2 infection, usually 3 months from the onset of COVID-19 with symptoms and that last for at least 2 months and cannot be explained by an alternative diagnosis.[...]” [2]. This definition by exclusion is challenging to implement. Nevertheless, automatic approaches are needed to identify patient-specific PASC symptoms and PASC patients in large real-world data warehouses, in order to build symptom-specific cohorts and run retrospective studies.
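To show why the definition is hard to operationalize, the following sketch encodes one possible reading of its timing criteria. It is an illustration only, not the tutorial's algorithm: it assumes "3 months" means 90 days and "2 months" means 60 days, it treats a symptom as persistent if its post-infection records span at least 60 days, and it takes the set of symptoms with an alternative diagnosis as given. The function names and example data are hypothetical.

```python
from datetime import date

ONSET_DAYS = 90     # assumption: "3 months" approximated as 90 days
DURATION_DAYS = 60  # assumption: "2 months" approximated as 60 days


def meets_who_timing(infection_day, symptom_days):
    """Check one possible reading of the WHO timing criteria for a symptom."""
    after = sorted(d for d in symptom_days if d > infection_day)
    if not after:
        return False
    # Records after the infection must span at least two months ...
    lasts_long_enough = (after[-1] - after[0]).days >= DURATION_DAYS
    # ... and the symptom must still be recorded three months after onset.
    present_at_three_months = (after[-1] - infection_day).days >= ONSET_DAYS
    return lasts_long_enough and present_at_three_months


def candidate_pasc_symptoms(infection_day, records, explained):
    """records: symptom -> list of dates; explained: symptoms that already
    have an alternative diagnosis (taken as given in this sketch)."""
    return sorted(
        symptom
        for symptom, days in records.items()
        if symptom not in explained and meets_who_timing(infection_day, days)
    )


infection = date(2021, 2, 1)
records = {
    "fatigue": [date(2021, 3, 1), date(2021, 6, 1)],
    "cough": [date(2021, 2, 5), date(2021, 2, 20)],
    "headache": [date(2021, 3, 10), date(2021, 7, 1)],
}
explained = {"headache"}
result = candidate_pasc_symptoms(infection, records, explained)
print(result)  # ['fatigue']
```

The hard part in practice is the `explained` set: deciding whether an alternative diagnosis accounts for a symptom is exactly the exclusion step that the definition leaves open.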
References
1. Azhir A, Hügel J, et al. Precision phenotyping for curating research cohorts of patients with unexplained post-acute sequelae of COVID-19. Med (N Y). 2024 Nov 2. DOI: 10.1016/j.medj.2024.10.009
2. Soriano JB, Murthy S, et al. A clinical case definition of post-COVID-19 condition by a Delphi consensus. Lancet Infect Dis. 2022 Apr;22(4):e102–7. DOI: 10.1016/S1473-3099(21)00703-9
Motivation
Our proposed tutorial will showcase a novel algorithm designed to enhance the diagnosis of Long COVID by leveraging artificial intelligence and large-scale medical record analysis. Traditional diagnostic methods rely on a process of elimination, requiring clinicians to systematically rule out all other conditions before identifying Long COVID—a time-consuming and often biased approach that disproportionately affects marginalized communities. Our computational tool streamlines this process by analyzing extensive patient data, identifying subtle temporal patterns that link COVID-19 infections to lingering symptoms, and systematically excluding other potential causes. By utilizing AI-driven temporal association mining, the algorithm detects complex, often overlooked connections between symptoms and prior infections, enabling a more precise and individualized diagnosis while reducing biases inherent in conventional diagnostic coding systems.
The tutorial will enable participants to build their own precision cohorts from their own EHR data and to analyze them. Furthermore, implementing the algorithm can create synergies with other ongoing PASC-related projects. By demonstrating at the end how the algorithm can be modified to identify unexplained chronic conditions in general, we provide participants with a tool to build their own precision cohorts of patients with unexplained chronic conditions. These cohorts can then be used as a basis for further analysis.
The tutorial covers aspects from the following areas, which are also within the scope of the conference: machine learning and big data analytics, clinical decision support systems, and precision medicine.
Chairs
Jonas Hügel1,2, Arianna Dagliati3, Spiros Denaxas4,5, Ulrich Sax1,2, Shawn Murphy6,7, Hossein Estiri6,7
1 University Medical Center Göttingen, Department of Medical Informatics, Göttingen, Germany
2 University of Göttingen, Campus Institute Data Science, Section of Medical Data Science, Göttingen, Germany
3 University of Pavia, Department of Electrical, Computer and Biomedical Engineering, Pavia, Italy
4 Institute of Health Informatics, University College London, London, UK
5 British Heart Foundation Data Science Centre, HDR UK
6 Harvard Medical School, Boston, US
7 Massachusetts General Hospital, Boston, US
Registration
Please register for the workshop during the conference registration. Registration is already open. We are looking forward to seeing you in Pavia.