Spring 2020 Seminars
Fall 2019 Seminars
Video: Watch the presentation here
Title: Machine Learning in Healthcare
Abstract: In March of 2016, the AlphaGo computer program beat world champion (and human) Lee Sedol at the board game Go. The program’s success reflected the significant progress that machine learning research has made in recent years. However, AlphaGo was just one example of what can be achieved with machine learning. This talk will provide an overview of some of the techniques that are being used in machine learning today, as well as some recent and ongoing work by Google’s research teams to advance the applications of machine learning, particularly its role in biomedical research. The talk will also discuss some of the unique challenges around applications in healthcare.
Bio: Ming Jack Po, MD, PhD, is a product manager in Google Health, leading a number of its machine learning research projects as well as health care product teams. Prior to joining Google, Jack spent a decade working in different capacities in areas related to medical devices and healthcare delivery. Jack is currently a trustee of the Austen Riggs Center, a board member of El Camino Health Systems, a member of the National Library of Medicine Lister Hill’s Board of Scientific Counselors and a member of the ONC’s Interoperability Standards Priorities Task Force. Jack received his MD and PhD from Columbia University, and his bachelor’s degree in Biomedical Engineering and master’s degree in Mathematics from Johns Hopkins University.
Video: Watch the presentation here
Title: Integrative Analysis of Multi-view Data for Dimension Reduction and Prediction
Abstract: Multi-view data are data collected on the same set of samples but from different views/sources. They have become increasingly common in modern biomedical studies. In this talk, I’ll introduce some recent developments in the integrative analysis of multi-view data, and present a new multivariate predictive model with application to a longitudinal study of aging.
Bio: Dr. Gen Li is devoted to developing new statistical learning methods for analyzing high dimensional biomedical data. He focuses on analyzing complex data with heterogeneous types that are collected from multiple sources. His methodological research interests include dimension reduction, predictive modeling, association analysis, and functional data analysis. He is also interested in genetics and bioinformatics. He is a consortium member of the NIH Common Fund program Genotype-Tissue Expression (GTEx) project, and contributes to the development of statistical methods for expression quantitative trait loci analysis in multiple tissues. He also has research interests in scientific domains including melanoma, microbiome, and urology research.
Video: Watch the presentation here
Title: Applications of Data Science and Machine Learning in Radiology and Cardiology
Abstract: The overall goal of our group is to leverage data-driven approaches to help improve patient outcomes. This talk will demonstrate examples of how we are working toward this goal by leveraging large clinical datasets, data science and machine learning. Specific examples include: 1) using 46,583 clinically-acquired 3D computed tomography images of the brain to develop and implement a deep learning model to efficiently reprioritize radiology worklists for quicker diagnosis of intracranial hemorrhage; 2) using deep learning to analyze 723,754 echocardiographic videos of the heart to accurately predict patient mortality; 3) analyzing 2 million 12-lead electrocardiographic tracings from the heart to predict clinically relevant future events; and 4) optimizing evidence-based care delivery for a population of >10,000 patients with heart failure using machine learning.
Bio: Dr. Fornwalt attended the University of South Carolina as an undergraduate in mathematics and marine science. He then worked in a free medical clinic for a year before starting an MD/PhD program at Emory and Georgia Tech. After finishing his degrees in 2010, he completed an internship in pediatrics at Boston Children’s Hospital before becoming an Assistant Professor at the University of Kentucky.
After four years on faculty in Kentucky, Dr. Fornwalt moved to Geisinger where he completed his diagnostic radiology residency and founded Geisinger’s Department of Imaging Science and Innovation, which focuses on data-driven approaches to improving patient outcomes. Dr. Fornwalt is also a practicing thoraco-abdominal radiologist and an active member of Geisinger’s Heart Institute.
Speaker: Alex Kitaygorodsky, PhD Student, Dr. Yufeng Shen’s Lab
Title: Identification of disease-causing genetic mutations based on machine learning and large genomic data sets
Abstract: More than 3% of young children are born with developmental disorders such as congenital heart disease (CHD), congenital diaphragmatic hernia (CDH), and autism spectrum disorder (ASD). Understanding the genetic causes of these conditions is critical to improve health care for these children and to push forward human developmental biology and neuroscience. Recently, high-throughput sequencing technologies have enabled generation of large-scale genomic data in genetic studies of these conditions. However, translating human data to knowledge is challenging due to an incomplete understanding of biology and a lack of sufficiently powerful analytical methods. My work aims to develop new computational methods based on powerful machine learning techniques to interpret genome sequencing data and identify disease-causing genetic variations. In this talk, I will focus specifically on the role of regulatory non-protein coding mutations in CHD, where we have found a substantial role of variants disrupting RNA binding protein (RBP) binding sites. RBPs oversee normal regulation of gene expression, at both the transcriptional and especially post-transcriptional stages, and so their disruption via mutation represents an important but under-studied noncoding action mechanism. To better understand the observed enrichment in these sites, we first modeled RNA binding protein processes with a robust convolutional neural network. Then, we designed a gradient boosting super-model to integrate predicted RBP binding scores with multimodal genomic data, allowing us to predict pathogenic RBP and gene regulation disruption caused by individual mutations. Finally, we applied our model back to Whole Genome Sequencing data of autism and CHD to find new disease risk genes and improve genetic diagnosis. 
In summary, we leveraged large genomic datasets with a sophisticated machine learning approach to better analyze sequencing data, advance genomic medicine, and aid our understanding of developmental disorder genetics.
Speaker: Sylvia Cho, PhD Candidate, Dr. Karthik Natarajan’s Lab
Title: Identifying data quality dimensions for wearable device data
Abstract: Patient-generated health data (PGHD) are an emerging type of biomedical data, captured and recorded by patients outside clinical encounters. One of the major factors facilitating the documentation of PGHD is the growing use of health tracking technologies. Among the different health tracking technologies, wearable devices are unique in that individuals can continuously and objectively self-track their health in free-living conditions. As a byproduct of using wearable devices for self-tracking, the large volume of accumulated data and the diversity of data types have led to interest in reusing these data for research purposes. However, there are concerns about the quality of device-generated data for various reasons, such as technical and human limitations. Therefore, assessing the quality of wearable data is essential before reusing the data for research. Data quality dimensions are an important feature of data quality assessment, as they provide guidance on which aspects of data quality should be assessed for the research task. While there are abundant studies on data quality dimensions for traditional clinical data such as electronic health record data, there is a lack of understanding of the important data quality dimensions for wearable device data. In this study, we aim to identify the data quality dimensions considered important by researchers when analyzing wearable data, and to verify whether an existing data quality framework can be applied to this type of data or needs to be modified. In this talk, I will discuss the methods we used to identify the dimensions and present preliminary results of the study.
This is a DBMI Student Town Hall.
Due to the Election Day holiday on Tuesday, there is no Seminar today.
Title: Oops! I’m on the wrong patient: Evaluating System-Level Interventions for Preventing Wrong-Patient Electronic Orders
Bio: Dr. Adelman’s Patient Safety Research Program began with the development of the Wrong-Patient Retract-and-Reorder (RAR) Measure—a valid and reliable method of quantifying the frequency of wrong-patient orders placed in electronic ordering systems. The Wrong-Patient RAR measure was the first automated measure of medical errors and the first Health IT Safety Measure endorsed by the National Quality Forum. The RAR method identifies thousands of near-miss, wrong-patient errors per year in large health systems, enabling researchers to test interventions to prevent this type of error.
The Wrong-Patient RAR measure has been used to evaluate the effectiveness of patient safety interventions in several studies conducted in different electronic health record systems and clinical settings, including in the neonatal intensive care unit (NICU). The measure is the primary outcome measure for studies supported by the Agency for Healthcare Research and Quality (R21HS023704, R01HS024945) and the National Institute for Child Health and Human Development (R01HD094793). Additional research is underway to extend the RAR methodology to other types of errors, such as wrong-drug errors, and to develop new health IT safety measures (R01HS024538).
Results of Dr. Adelman’s research led to national patient safety guidance, including a recommendation issued by the Office of the National Coordinator for Health Information Technology that healthcare organizations use the Wrong-Patient RAR measure to monitor the frequency of wrong-patient orders. Effective 2019, The Joint Commission will require that hospitals adopt a distinct newborn naming convention that incorporates the mother’s first name, based on studies by Adelman and colleagues.
No seminar due to the AMIA Symposium.
There was no seminar on Nov. 25.
Speaker: Jonathan Elias, MD, Clinical Informatics Fellow
Title: A Day in the Life of a Clinical Informatics Fellow: CI Fellowship, Epic Together’s Mobile Messaging and Provider Team Project and the Epic Together Pre- & Post-Implementation Study
Abstract: Per AMIA, Clinical Informatics (CI) is the application of informatics and information technology to deliver healthcare services. The CI Fellowship is a two-year ACGME accredited fellowship now being offered to one candidate a year through NYP CUMC, after completion of a medical residency. During this seminar, the fellowship structure and goals with example projects and research will be discussed.
A large area of focus of the fellowship is operational CI projects and academic research. Currently, Columbia University Medical Center (CUMC), NewYork-Presbyterian (NYP) and Weill Cornell Medical Center (WCM) are preparing to implement an enterprise-wide clinical information system, the EpicCare© Electronic Health Record (EHR). With the implementation of the EpicCare© EHR, there is an opportunity to improve, streamline and standardize role delineation, clinical communications and patient assignment across the EHR and secure mobile messaging platforms. The goals and processes associated with this project will be discussed.
Finally, a brief overview and update of the Epic Pre- & Post-Implementation Study will be presented. The overall purpose of this study is to evaluate clinical workflows, process efficiencies, EHR utilization, data quality and overall perceived system usability after the implementation of Epic at NYP/CUMC/WCM compared to the systems in place prior to Epic implementation. This project comprises three specific aims, outlined below, with associated high-level approach and metrics. Aim 1: Conduct a pre-post time motion study focused on the inpatient and outpatient settings (including the emergency department) to identify documentation workflow and time changes after Epic EHR implementation. Aim 2: Conduct log-file analyses to measure process efficiencies, EHR utilization (e.g., documentation time), and EHR data quality metrics. Aim 3: Administer a survey to measure and compare health professionals’ perceived usability and satisfaction pre- and post-Epic implementation in the context of functionality to enhance the delivery of continuity of care and adaptation to new health information technology (HIT).
Speaker: Jiayao Wang, PhD Student, Dr. Dennis Vitkup’s Lab
Title: Contribution of recessive genotypes and common variants to autism spectrum disorder
Abstract: Autism spectrum disorder (ASD) is a genetically heterogeneous condition, caused by a combination of rare de novo and inherited variants as well as common variants in at least several hundred genes. However, significantly larger sample sizes are needed to identify the complete set of genetic risk factors. Also, the contribution from inherited variants needs to be further investigated. Here we present results for SPARK (SPARKForAutism.org), a cohort of ~9K families with ASD, all consented online. Whole exome sequencing (WES) and genotyping data were generated for each family using DNA from saliva. With exome sequencing data and a simple statistical framework, we show a weak contribution from recessive genotypes, as well as several significant recessive genes that lead to autism, such as EIF3F and RELN. With genotype array data, we performed GWAS with the transmission disequilibrium test and calculated polygenic risk scores for SPARK families. We show that autism probands have a significantly higher polygenic risk than their siblings, and that the risk is spread across the genome rather than coming only from significant loci. The contribution from recessive genotypes and common variants, together with rare inherited variants and de novo mutations from the SPARK project, will complete our understanding of the genetics of autism.
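For readers unfamiliar with the transmission disequilibrium test (TDT) mentioned in the abstract, the following is a minimal sketch of the classic single-variant TDT statistic. It is not the speaker's pipeline; the function name and the example counts are hypothetical and chosen only for illustration. The test counts how often heterozygous parents transmit a given allele to an affected child versus how often they do not, and compares the two counts with a McNemar-style chi-square statistic on one degree of freedom.

```python
import math
from typing import Tuple


def tdt_statistic(transmitted: int, untransmitted: int) -> Tuple[float, float]:
    """Classic TDT for one variant (illustrative sketch, hypothetical helper).

    transmitted:   times heterozygous parents passed the allele to an affected child
    untransmitted: times heterozygous parents passed the other allele instead
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    # Under the null of no linkage/association, each allele is transmitted half the time.
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    chi2, p = tdt_statistic(60, 40)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Because the test uses only within-family transmissions, it is robust to population stratification, which is one reason it suits family-based cohorts.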
Title: Using Genetics to Address the Challenges of 21st Century Drug Development
Bio: Michael N. Cantor, MD, MA is Executive Director, Clinical Informatics, at the Regeneron Genetics Center. Currently his work focuses on developing and optimizing phenotypes from EHR and cohort data and linking them with genetic data to help discover new drug targets. Prior to Regeneron, he was Director of Clinical Research Informatics at New York University School of Medicine. As Director of Clinical Research Informatics, he was also the clinical director for NYULH’s DataCore, where his work focused on data management for clinical trials, using data from clinical systems for research, and advanced analytics. His research interests include integrating and standardizing social determinants of health-related data into the EHR, optimizing informatics tools for frontline clinicians, and providing self-service data access tools for researchers. During his previous tenure at NYU, Dr. Cantor was the Chief Medical Information Officer for the South Manhattan Healthcare Network of the New York City Health and Hospitals Corporation, based at Bellevue, and saw patients and precepted at the medical clinic there. Dr. Cantor completed his residency in internal medicine and informatics training at Columbia, has an M.D. from Emory University and an A.B. from Princeton, and is an Associate Professor in the Department of Medicine at NYU School of Medicine. He currently sees patients weekly at Bellevue’s medicine clinic.