Previous Seminars This Semester
Speaker: Alexander Hsieh, PhD student
Title: Detection of mosaic single nucleotide variants in exome sequencing data and implications for congenital heart disease
Abstract: The contribution of somatic mosaicism, or genetic mutations arising after oocyte fertilization, to congenital heart disease (CHD) is not well understood. Further, the relationship between mosaicism in blood and cardiovascular tissue has not been determined. We developed a computational method, Expectation-Maximization-based detection of Mosaicism (EM-mosaic), to analyze mosaicism in exome sequences of 2530 CHD proband-parent trios. EM-mosaic detected 326 mosaic mutations in blood and/or cardiac tissue DNA. Of the 309 detected in blood DNA, 85/94 (90%) tested were independently confirmed. Twenty-five mosaic variants altered CHD-risk genes, affecting 1% of our cohort. Of these 25, 22/22 candidates tested were confirmed. Variants predicted as damaging had higher variant allele fraction than benign variants, suggesting a role in CHD. The frequency of mosaic variants above 10% mosaicism was 0.13/person in blood and 0.14/person in cardiac tissue. Analysis of 66 individuals with matched cardiac tissue available revealed both tissue-specific and shared mosaicism, with shared mosaics generally having higher allele fraction. We estimate that ~1% of CHD probands have a mosaic variant detectable in blood that could contribute to cardiac malformations, particularly those damaging variants expressed at higher allele fraction compared to benign variants. Although blood is a readily-available DNA source, cardiac tissues analyzed contributed ~5% of somatic mosaic variants identified, indicating the value of tissue mosaicism analyses.
Speaker: Michelle Chau, PhD student
Title: Developing a user-centered, machine learning approach to identify preferences for inspirational social media health-related images for young populations
Abstract: Nutrition interventions for adolescents and young adults (AYAs) increasingly rely on mobile platforms and social media. Most assume nutritional decisions are rational, targeting intentions such as goal setting and self-monitoring. However, in the absence of motivation and time, nutrition choices are often automatic and based on heuristics. The use of images is a simple way to deliver heuristic messaging. My preliminary research, showing AYAs' frequent use of social media for inspiration, further suggests health-related images may be suitable for nutrition interventions with these groups. Previous studies have explored inspirational social media content using qualitative and manual methods. However, there is an active area of research in computational visual analysis that explores preferences and prediction for image retrieval and recommendation tasks. The application of these techniques within health, and specifically how to translate human preferences into the technical requirements needed to identify inspirational images for nutrition and young populations, is underexplored. In this talk, I will discuss a study to identify image features that are relevant for inspiring healthy eating in health-related social media content. Further, I will discuss future directions for exploring how these features may be incorporated into machine learning models.
Video: Watch the presentation here
Title: Machine Learning in Healthcare
Abstract: In March of 2016, the AlphaGo computer program beat world champion (and human) Lee Sedol at the board game Go. The program’s success reflected the significant progress that machine learning research has made in recent years. However, AlphaGo was just one example of what can be achieved with machine learning. This talk will provide an overview of some of the techniques that are being used in machine learning today, as well as some recent and ongoing work by Google’s research teams to advance the applications of machine learning, particularly its role in biomedical research. The talk will also discuss some of the unique challenges around applications in healthcare.
Bio: Ming Jack Po, MD, PhD, is a product manager in Google Health, leading a number of its machine learning research projects as well as health care product teams. Prior to joining Google, Jack spent a decade working in different capacities in areas related to medical devices and healthcare delivery. Jack is currently a trustee of the Austen Riggs Center, a board member of El Camino Health Systems, a member of the National Library of Medicine Lister Hill’s Board of Scientific Counselors and a member of the ONC’s Interoperability Standards Priorities Task Force. Jack received his MD and PhD from Columbia University, and his bachelor’s degree in Biomedical Engineering and master’s degree in Mathematics from Johns Hopkins University.
Video: Watch the presentation here
Title: Integrative Analysis of Multi-view Data for Dimension Reduction and Prediction
Abstract: Multi-view data are data collected on the same set of samples but from different views/sources. They have become increasingly common in modern biomedical studies. In this talk, I’ll introduce some recent developments in the integrative analysis of multi-view data, and present a new multivariate predictive model with application to a longitudinal study of aging.
Bio: Dr. Gen Li is devoted to developing new statistical learning methods for analyzing high dimensional biomedical data. He focuses on analyzing complex data with heterogeneous types that are collected from multiple sources. His methodological research interests include dimension reduction, predictive modeling, association analysis, and functional data analysis. He is also interested in genetics and bioinformatics. He is a consortium member of the NIH Common Fund program Genotype-Tissue Expression (GTEx) project, and contributes to the development of statistical methods for expression quantitative trait loci analysis in multiple tissues. He also has research interests in scientific domains including melanoma, microbiome, and urology research.
Video: Watch the presentation here
Title: Applications of Data Science and Machine Learning in Radiology and Cardiology
Abstract: The overall goal of our group is to leverage data-driven approaches to help improve patient outcomes. This talk will demonstrate examples of how we are working toward this goal by leveraging large clinical datasets, data science and machine learning. Specific examples include: 1) using 46,583 clinically-acquired 3D computed tomography images of the brain to develop and implement a deep learning model to efficiently reprioritize radiology worklists for quicker diagnosis of intracranial hemorrhage; 2) using deep learning to analyze 723,754 echocardiographic videos of the heart to accurately predict patient mortality; 3) analyzing 2 million 12-lead electrocardiographic tracings from the heart to predict clinically relevant future events; and 4) optimizing evidence-based care delivery for a population of >10,000 patients with heart failure using machine learning.
Bio: Dr. Fornwalt attended the University of South Carolina as an undergraduate in mathematics and marine science. He then worked in a free medical clinic for a year before starting an MD/PhD program at Emory and Georgia Tech. After finishing his degrees in 2010, he completed an internship in pediatrics at Boston Children’s Hospital before becoming an Assistant Professor at the University of Kentucky.
After four years on faculty in Kentucky, Dr. Fornwalt moved to Geisinger where he completed his diagnostic radiology residency and founded Geisinger’s Department of Imaging Science and Innovation, which focuses on data-driven approaches to improving patient outcomes. Dr. Fornwalt is also a practicing thoraco-abdominal radiologist and an active member of Geisinger’s Heart Institute.
Speaker: Alex Kitaygorodsky, PhD Student, Dr. Yufeng Shen’s Lab
Title: Identification of disease-causing genetic mutations based on machine learning and large genomic data sets
Abstract: More than 3% of young children are born with developmental disorders such as congenital heart disease (CHD), congenital diaphragmatic hernia (CDH), and autism spectrum disorder (ASD). Understanding the genetic causes of these conditions is critical to improve health care for these children and to push forward human developmental biology and neuroscience. Recently, high-throughput sequencing technologies have enabled generation of large-scale genomic data in genetic studies of these conditions. However, translating human data to knowledge is challenging due to an incomplete understanding of biology and a lack of sufficiently powerful analytical methods. My work aims to develop new computational methods based on powerful machine learning techniques to interpret genome sequencing data and identify disease-causing genetic variations. In this talk, I will focus specifically on the role of regulatory non-protein coding mutations in CHD, where we have found a substantial role of variants disrupting RNA binding protein (RBP) binding sites. RBPs oversee normal regulation of gene expression, at both the transcriptional and especially post-transcriptional stages, and so their disruption via mutation represents an important but under-studied noncoding action mechanism. To better understand the observed enrichment in these sites, we first modeled RNA binding protein processes with a robust convolutional neural network. Then, we designed a gradient boosting super-model to integrate predicted RBP binding scores with multimodal genomic data, allowing us to predict pathogenic RBP and gene regulation disruption caused by individual mutations. Finally, we applied our model back to Whole Genome Sequencing data of autism and CHD to find new disease risk genes and improve genetic diagnosis. 
In summary, we leveraged large genomic datasets with a sophisticated machine learning approach to better analyze sequencing data, advance genomic medicine, and aid our understanding of developmental disorder genetics.
Speaker: Sylvia Cho, PhD Candidate, Dr. Karthik Natarajan’s Lab
Title: Identifying data quality dimensions for wearable device data
Abstract: Patient-generated health data (PGHD) is an emerging type of biomedical data that is captured and recorded by patients outside clinical encounters. One of the major factors facilitating the documentation of PGHD is the proliferation of health tracking technologies. Among the different health tracking technologies, wearable devices are unique in that individuals can continuously and objectively self-track their health in free-living conditions. As a byproduct of using wearable devices for self-tracking, the large volume of accumulated data and diverse data types have led to interest in reusing these data for research purposes. However, there are concerns about the quality of device-generated data due to various reasons such as technical and human limitations. Therefore, assessing the quality of wearable data is essential before reusing the data for research. Data quality dimensions are an important feature of data quality assessment, as they provide guidance on what aspects of data quality should be assessed for the research task. While there are abundant studies on data quality dimensions for traditional clinical data such as electronic health record data, there is a lack of understanding of the important data quality dimensions for wearable device data. In this study, we aim to identify the data quality dimensions considered to be important by researchers when analyzing wearable data, and to verify whether an existing data quality framework can be applied to this type of data or needs to be modified. In this talk, I will discuss the methods we used to identify the dimensions and present preliminary results of the study.
This is a DBMI Student Town Hall.
Due to the Election Day holiday on Tuesday, there is no Seminar today.
Title: Oops! I’m on the wrong patient: Evaluating System-Level Interventions for Preventing Wrong-Patient Electronic Orders
Bio: Dr. Adelman’s Patient Safety Research Program began with the development of the Wrong-Patient Retract-and-Reorder (RAR) Measure—a valid and reliable method of quantifying the frequency of wrong-patient orders placed in electronic ordering systems. The Wrong-Patient RAR measure was the first automated measure of medical errors and the first Health IT Safety Measure endorsed by the National Quality Forum. The RAR method identifies thousands of near-miss, wrong-patient errors per year in large health systems, enabling researchers to test interventions to prevent this type of error.
The Wrong-Patient RAR measure has been used to evaluate the effectiveness of patient safety interventions in several studies conducted in different electronic health record systems and clinical settings, including in the neonatal intensive care unit (NICU). The measure is the primary outcome measure for studies supported by the Agency for Healthcare Research and Quality (R21HS023704, R01HS024945) and the National Institute for Child Health and Human Development (R01HD094793). Additional research is underway to extend the RAR methodology to other types of errors, such as wrong-drug errors, and to develop new health IT safety measures (R01HS024538).
Results of Dr. Adelman’s research led to national patient safety guidance, including a recommendation issued by the Office of the National Coordinator for Health Information Technology that healthcare organizations use the Wrong-Patient RAR measure to monitor the frequency of wrong-patient orders. Effective 2019, The Joint Commission will require that hospitals adopt a distinct newborn naming convention that incorporates the mother’s first name, based on studies by Adelman and colleagues.