Public Health Informatics Educational Resources
These resources consist of exercises, code, and course materials that instructors in informatics training programs at educational institutions can use to enhance or supplement their existing curricula. The exercises are supported by two de-identified datasets created by the Regenstrief Institute, which contain large-scale data from two real-world population health data sources: the Indiana Network for Patient Care and the Indiana Public Health Emergency Surveillance System. The materials are available at no cost to help train the public and population health informatics workforce. Read on to learn what is available for use.
Indiana Network for Patient Care: Type 2 Diabetes Population
The Type 2 Diabetes Population dataset consists of a sample of approximately 78,000 patient medical records drawn from the Indiana Network for Patient Care. When used in combination with the exercises, students will discover methods for describing, exploring, managing, and analyzing population health data. The dataset contains information on patient demographics, laboratory test results, medications dispensed, encounters, and diagnostic codes including comorbidities.
Data Files
To access the data files, please fill out this survey to begin the data agreement process.
Note: Educational institutions must complete a data agreement before downloading the data. Once the agreement is signed, all faculty at the institution can use the datasets for the educational purposes outlined in it.
Teaching materials
Teaching materials, such as student exercises and a data dictionary, are available in the Regenstrief GitHub Repository.
Public Health Emergency Surveillance System (PHESS)
PHESS is a syndromic surveillance system for the State of Indiana designed to detect potential disease outbreaks using data gathered from emergency department information systems. This dataset consists of a de-identified, random sample of 100,000 messages drawn from the operational PHESS system. When used in combination with the exercises, students can explore and describe real-world syndromic data as an epidemiologist would, with exercises such as investigating temporal trends in diagnosis codes and mapping diseases to geographic regions.
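As a rough illustration of what a temporal-trend exercise might involve, the sketch below counts messages per ISO week and diagnosis code using only the Python standard library. The record layout (a visit date paired with a diagnosis code), the name `weekly_counts`, and the sample rows are all assumptions made for this example; they are not taken from the PHESS dataset or from the exercises in the repository.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (visit_date, diagnosis_code). These rows are invented
# for illustration and are NOT drawn from the PHESS dataset.
SAMPLE_MESSAGES = [
    (date(2020, 1, 6), "J10.1"),
    (date(2020, 1, 7), "J10.1"),
    (date(2020, 1, 8), "R50.9"),
    (date(2020, 1, 13), "J10.1"),
    (date(2020, 1, 15), "R50.9"),
    (date(2020, 1, 16), "R50.9"),
]


def weekly_counts(messages):
    """Count messages per (ISO year, ISO week, diagnosis code)."""
    counts = Counter()
    for visit_date, code in messages:
        iso_year, iso_week, _ = visit_date.isocalendar()
        counts[(iso_year, iso_week, code)] += 1
    return counts


if __name__ == "__main__":
    # Print one line per week and code, in chronological order.
    for (year, week, code), n in sorted(weekly_counts(SAMPLE_MESSAGES).items()):
        print(f"{year}-W{week:02d} {code}: {n}")
```

Grouping by ISO week is only one possible choice; daily or monthly bins may suit a given exercise better, and the real field names should be taken from the materials supplied with the dataset.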
Data Files
To access the data files, please fill out this survey to begin the data agreement process.
Note: Educational institutions must complete a data agreement before downloading the data. Once the agreement is signed, all faculty at the institution can use the datasets for the educational purposes outlined in it.
Teaching materials
Teaching materials, such as student exercises, are available in the Regenstrief GitHub Repository.
General Resources
- Integrating Data Science into T32 Training Programs at IU Indianapolis
- Data Science Competencies List
- Send us feedback on these resources.
The Indiana Training Program in Public & Population Health Informatics prepares research scientists to design, implement, and evaluate the impact of information technologies on population health. The program directors and the Regenstrief Institute, with the support of the Indiana University Richard M. Fairbanks School of Public Health and the National Library of Medicine T15 training program, developed these educational resources. This project was supported by the National Library of Medicine of the National Institutes of Health under Award Number T15LM012502. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors on this project are as follows: Dr. Saurabh Rahurkar, Dr. Spencer Lourens, Dr. Uzay Kirbiyik, Ashley Wiensch, and Dr. Brian E. Dixon.
*Important note: The datasets and materials are designed to be used for educational purposes only and should not be used to conduct or publish research. Sampling methods and perturbation of the data to enable release of the datasets likely introduced bias and errors into the datasets that would violate most statistical assumptions. Individuals interested in conducting research using the data sources described can contact the Regenstrief Institute for further discussion.
If you have additional questions or need assistance with the data agreement process, please contact Ashley Wiensch at (317) 274-9025 or awiensch@regenstrief.org.