Biomedical Informatics Seminar Series

The DBMI seminar series is a 1-credit course in which DBMI students hear about new research methods from speakers in both academia and industry. Enrollment is restricted to DBMI students, but anybody may attend the seminars. It is currently offered virtually, though hybrid sessions will also be held in PH20-200. DBMI also hosts a Special Seminar Series: Toward Diversity, Equity, and Inclusion in Informatics, Health Care, and Society. Both upcoming presentations and past recordings will be shared on our Special Seminar Series homepage.

2024 Fall Seminars

Seminars are held Mondays from 1 to 2 pm unless otherwise noted.

Title: Leveraging Clinical Informatics to Advance Health Equity 

Presenter: Aaron Tierney, Staff Scientist and Health Equity & Diversity Scholar, Kaiser Permanente

Join This Seminar

Abstract: During this talk, Dr. Tierney will introduce himself and provide an overview of his professional work and research, particularly as it relates to health equity and algorithms. The introduction will include a brief overview of his work in the American Medical Informatics Association (AMIA) and his role on the DEI committee. His research overview will focus on equity in Large Language Models (LLMs) and in predictive analytics. Topics covered will include: 1) work leading the evaluation of ambient AI scribe implementation at Kaiser Permanente Northern California (published in NEJM Catalyst) and next steps studying their impact on the equitable delivery of diabetes care; 2) a framework to guide leaders in the development and/or acquisition of LLM-based tools for care delivery; and 3) current work assessing the impact on algorithmic fairness of overlaying two algorithms designed to predict appropriate perioperative appointment lengths for patients seeking elective surgeries. He will conclude with a brief discussion of future research plans to further his work on the impact of ambient AI scribes on equitable diabetes care delivery.

Bio: Dr. Aaron Tierney, PhD is a Staff Scientist and Health Equity & Diversity Scholar at Kaiser Permanente Northern California, Division of Research. His expertise lies in the equitable design and implementation of both patient-facing and clinician-facing health information technologies. Dr. Tierney’s research agenda focuses on health information technology, aiming to generate evidence to inform how to make these technologies more equitable, accessible, and usable for vulnerable populations. Currently, Dr. Tierney’s work falls into three main categories: (1) the assessment of the impact of ambient artificial intelligence (AI) scribes on health care delivery; (2) bias and fairness assessments of algorithms; and (3) telemedicine implementation. He also currently serves on the Diversity, Equity, and Inclusion (DEI) committee, the Nominations committee, and the Advancement committee at the American Medical Informatics Association (AMIA).

Previous 2024 Fall Seminars

Title: Safeguarding Medically Underserved Populations in EHR-Based Research  

Presenter: Rebecca Hubbard, Carl Kawaja and Wendy Holcombe Professor of Public Health and Professor of Biostatistics and Data Science at Brown University

Watch This Presentation

Abstract: Data are captured in electronic health records (EHRs) as a direct result of patient interactions with the healthcare system. Consequently, EHR data for patients with more healthcare utilization tend to be captured more frequently and provide more detail about the patient’s health. This connection between patterns of healthcare utilization and data quantity and quality, termed informed presence, violates the common statistical assumption of independence between observation and outcome processes. This is particularly problematic for historically marginalized populations and other groups experiencing barriers to healthcare. Limited data availability has the potential to increase bias, imprecision and algorithmic unfairness in EHR-based research results for these medically underserved groups. In this presentation, I will discuss the roots of informed presence bias in EHR data and illustrate examples of informed presence bias using real-world EHR data on childhood mortality and breast cancer outcomes. I will quantify the magnitude of bias resulting from alternative patterns of dependence between outcome and exposure data capture and healthcare utilization intensity and demonstrate several solutions to this problem. While EHR data can be used to accelerate precision medicine, achieving this goal while also safeguarding equity for underserved populations requires careful attention to data provenance and analytic methods.
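The informed presence mechanism described in the abstract can be illustrated with a small simulation. This is a hedged, made-up sketch (the latent-severity model, visit counts, and capture probabilities are all assumptions for illustration, not from Dr. Hubbard's work): when sicker patients both experience more outcomes and generate more EHR data, the recorded outcome rate systematically understates the true rate, and undercounting is worst for low-utilization patients.

```python
import random

random.seed(0)

def simulate_patient():
    # Latent illness severity drives both the true outcome and how often
    # the patient visits (and thus how much EHR data exists about them).
    severity = random.random()
    n_visits = 1 + int(severity * 10)                  # sicker -> more visits
    has_outcome = random.random() < 0.1 + 0.5 * severity
    # The outcome only appears in the EHR if a visit happens to capture it;
    # capture probability grows with utilization (informed presence).
    outcome_recorded = has_outcome and random.random() < n_visits / 10
    return n_visits, has_outcome, outcome_recorded

patients = [simulate_patient() for _ in range(50_000)]

true_rate = sum(p[1] for p in patients) / len(patients)
observed_rate = sum(p[2] for p in patients) / len(patients)

print(f"true outcome rate:  {true_rate:.3f}")
print(f"EHR-observed rate:  {observed_rate:.3f}")      # biased downward

# Capture is worst for low-utilization patients (e.g. groups facing
# barriers to care), which is the equity concern the abstract raises.
low = [p for p in patients if p[0] <= 3]
captured = sum(p[2] for p in low) / sum(p[1] for p in low)
print(f"fraction of true outcomes captured, low-utilization: {captured:.2f}")
```

The naive observed rate treats absence of a recorded outcome as absence of the outcome itself, which is exactly the independence assumption the abstract says informed presence violates.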

Bio: Dr. Hubbard is the Carl Kawaja and Wendy Holcombe Professor of Public Health and Professor of Biostatistics and Data Science at Brown University. Her research focuses on development and application of statistical methods for studies using data from electronic health records (EHR) and medical claims, including issues of data availability and data quality, and has been applied to studies across a range of application areas including oncology and pharmacoepidemiology. She is a Fellow of the ASA and Co-Editor of the journal Biostatistics.

Title #1: How Often: Characterizing Heterogeneity in Drug-Outcome Incidence Rate Estimates Attributed to Drug Indication

Presenter #1: Hsin Yi “Cindy” Chen, DBMI PhD Student

Abstract #1: Incidence rate calculation is one of the most common analyses in pharmacoepidemiology, whether it is used to provide comparative background rates for drug adverse events, to assess the potential public health impact of adverse events, or to design clinical trials. Despite their importance, the accuracy of incidence rates listed on drug package inserts is often unclear. Additionally, the applicability of these incidence rates to a specific patient is complicated by the variability in the populations from which the rates were calculated. While it may be a fairly common practice to collapse available incidence rates in the literature into a single estimate, this is only appropriate when the studies from which the estimates are generated are sufficiently similar to combine with regard to important factors such as age distribution, gender, and underlying disease state. Factors that further complicate and increase uncertainty in meta-analysis include differing operational definitions of incidence rates, differing definitions of the adverse event, and the diversity of study designs. Our previously published study, which systematically examined the influence of patient demographics on incidence rates, showed that factors such as age, sex, and indexing events can affect incidence rate estimates by up to 1000-fold. In this talk, I will focus on drug indication, an obvious but usually ignored source of heterogeneity that may be important when interpreting incidence rates.
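The pooling problem the abstract describes can be made concrete with a toy calculation. All numbers below are invented for illustration (they are not from the study): when per-indication incidence rates differ widely, a naive pooled rate describes none of the underlying populations well.

```python
# Hypothetical per-indication data for the same drug adverse event:
# (cases, person_years). Numbers are made up for illustration only.
studies = {
    "indication A": (12, 4_000),
    "indication B": (90, 5_000),
    "indication C": (3, 6_000),
}

# Incidence rate per 1,000 person-years within each indication stratum.
rates = {k: 1_000 * cases / py for k, (cases, py) in studies.items()}

# Naively collapsing all studies into a single estimate...
total_cases = sum(c for c, _ in studies.values())
total_py = sum(py for _, py in studies.values())
pooled = 1_000 * total_cases / total_py

print(rates)                              # per-indication rates span 36-fold
print(f"pooled: {pooled:.1f} per 1,000 person-years")
# ...yields a number that matches none of the individual indications,
# which is why pooling is only appropriate for sufficiently similar studies.
```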

Bio #1: Hsin Yi “Cindy” Chen is a 4th year MD-PhD student (2nd year PhD) in the Department of Biomedical Informatics at Columbia University. She is advised by George Hripcsak, and her research interests include causal inference and large-scale observational data research using the EHR. Prior to starting her MD-PhD at Columbia, she received her B.S. in Biometry and Statistics from Cornell University and spent two years as a research associate at Yale University’s Department of Neurology leveraging machine learning methods for predicting secondary brain injuries after stroke. 


Title #2: Towards a universal in silico model of in vivo protein-DNA binding 

Presenter #2: Vinay Swamy, DBMI PhD Student

Abstract #2: Many machine learning models of in vivo protein-DNA binding have been developed over the past 10 years. These models typically formulate the problem as multi-task learning, where regions of the genome are assigned a vector of labels, each corresponding to experimental data on a specific protein in a specific biological system. This formulation locks a model into a fixed set of tasks, so it generalizes only to unseen regions of the genome, not to unseen proteins or biological systems. In this talk I will share our initial effort to build a “universal” model of in vivo protein-DNA binding that is capable of making predictions for unseen proteins and unseen biological systems, trained using a large public repository of human ChIP-seq data.
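The multi-task formulation the abstract contrasts against can be sketched in a few lines. This is a toy illustration with invented protein/cell-type names and random labels, not the talk's actual data or model:

```python
import random

# Each pre-registered task is a (protein, cell type) experiment pair;
# names here are common ChIP-seq targets used purely as examples.
tasks = [("CTCF", "K562"), ("CTCF", "HepG2"), ("GATA1", "K562")]
regions = ["chr1:100-200", "chr1:300-400", "chr2:50-150"]

random.seed(0)
# Label matrix: one row per genomic region, one column per task
# (bound = 1 / not bound = 0), filled randomly for illustration.
labels = [[random.randint(0, 1) for _ in tasks] for _ in regions]

# The fixed task axis is what "locks in" a classic multi-task model:
# adding a new protein or cell type changes the output dimension, so the
# trained model cannot be queried for unseen (protein, system) pairs.
# A "universal" model instead takes the protein/system as an input,
# e.g. predict(sequence, protein, cell_type), so new pairs need no retraining.
print(len(labels), "regions x", len(labels[0]), "tasks")
```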

Bio #2: Vinay Swamy is a 4th year PhD student advised by Mohammed AlQuraishi and Raul Rabadan. His research focuses on the use of machine learning to model biological processes such as drug response in cancer cell lines, protein-DNA binding, and protein structure.

Title: The role of genetic evidence to improve productivity in drug discovery and development 

Presenter: Matt Nelson, Chief Executive Officer of Genscience

Watch This Presentation

Abstract: The cost of drug discovery and development is driven primarily by failure, with just ~10% of clinical programs eventually receiving approval. The most important step in a successful drug discovery and development program is selecting the drug mechanism, usually in the form of a target, for the intended patient population. We previously estimated that human genetic evidence doubles the success rate from clinical development to approval. We have expanded on this work leveraging the growth in genetic evidence over the past decade to better understand the characteristics that distinguish clinical success and failure. We estimate the probability of success for drug mechanisms with genetic support is 2.6 times greater than those without. This relative success varies among therapy areas and development phases, and improves with increasing confidence in the causal gene, but is largely unaffected by genetic effect size, minor allele frequency, or year of discovery. We further demonstrate the value genetics can play in anticipating potential on-target side effects to predict and mitigate those risks early in the development process. These results suggest we are far from reaching peak genetic insights to aid the discovery of targets for more effective drugs.
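The abstract's headline numbers (~10% overall approval, 2.6x relative success with genetic support) imply concrete per-group success probabilities once you assume a share of programs with genetic support. The fraction below is an assumption for illustration only, not a figure from the talk:

```python
# From the abstract: ~10% of clinical programs are eventually approved,
# and genetically supported mechanisms succeed 2.6x more often.
p_overall = 0.10
relative_success = 2.6

# Assumed (illustrative) fraction of programs with genetic support.
f = 0.3

# The overall rate is the support-weighted mixture:
#   p_overall = f * p_genetic + (1 - f) * p_no_genetic,
# with p_genetic = 2.6 * p_no_genetic. Solve for the two rates:
p_no_genetic = p_overall / (f * relative_success + (1 - f))
p_genetic = relative_success * p_no_genetic

print(f"without genetic support: {p_no_genetic:.1%}")
print(f"with genetic support:    {p_genetic:.1%}")
```

Under this assumed mix, genetic support roughly separates a ~7% success rate from a ~18% one, which illustrates why target selection is framed as the most important step.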

Bio: Matthew Nelson, Ph.D., is Chief Executive Officer of Genscience, a tech-focused company working to improve the integration of genetic evidence into drug discovery. Genscience is an affiliate of Deerfield, which Dr. Nelson joined as VP, Genetics & Genomics in 2019. Prior to Deerfield, Dr. Nelson spent almost 15 years at GlaxoSmithKline and was most recently the Head of Genetics. Prior to GlaxoSmithKline, Dr. Nelson was the Director of Biostatistics at Sequenom and Director of Genomics at Esperion Therapeutics. He is co-author on >80 publications. Dr. Nelson was an Adjunct Associate Professor of Biostatistics at the University of North Carolina from 2010 to 2016. He holds a Ph.D. in Human Genetics and an M.A. in Statistics from the University of Michigan.

Title: Navigating AI in Medicine: Opportunities and Risks of Large Language Models in Real-World Tasks

Presenter: Zhiyong Lu, Senior Investigator, NIH/NLM

Watch This Seminar

Abstract: The explosion of biomedical big data and information in the past decade or so has created new opportunities for discoveries to improve the treatment and prevention of human diseases. As such, the field of medicine is undergoing a paradigm shift driven by AI-powered analytical solutions. This talk explores the benefits and risks of AI and ChatGPT, highlighting their pivotal roles in revolutionizing biomedical discovery, patient care, diagnosis, treatment, and medical research. By demonstrating their uses in some real-world applications such as improving PubMed searches (Best Match, Nature Biotechnology 2018), supporting precision medicine (LitVar, Nature Genetics 2023), and accelerating patient trial matching (TrialGPT, Nature Communications 2024), we underscore the necessities and challenges of implementing and evaluating AI and ChatGPT in enhancing clinical decision-making, personalizing patient experiences, and accelerating knowledge discovery. 

Bio: Dr. Zhiyong Lu is a tenured Senior Investigator at the NIH/NLM IPR, leading research in biomedical text and image processing, information retrieval, and AI/machine learning. In his role as Deputy Director for Literature Search at NCBI, Dr. Lu oversees the overall R&D efforts to improve literature search and information access in resources like PubMed and LitCovid, which are used by millions worldwide each day. Additionally, Dr. Lu is Adjunct Professor of Computer Science at the University of Illinois Urbana-Champaign (UIUC). With over 400 peer-reviewed publications, Dr. Lu is a highly cited author, and a Fellow of the American College of Medical Informatics (ACMI) and the International Academy of Health Sciences Informatics (IAHSI).

Title: Artificial Intelligence in Medical Imaging

Presenter: Curtis Langlotz, Professor of Radiology, Medicine, and Biomedical Data Science and Senior Associate Vice Provost for Research, Stanford University

At the speaker’s request, the presentation was not recorded.

Abstract: Artificial intelligence (AI) is an incredibly powerful tool for building computer vision systems that support the work of radiologists. Over the last decade, artificial intelligence methods have revolutionized the analysis of digital images, leading to high interest and explosive growth in the use of AI and machine learning methods to analyze clinical images and text. These promising techniques create systems that perform some image interpretation tasks at the level of expert radiologists. Deep learning methods are now being developed for image reconstruction, imaging quality assurance, imaging triage, computer-aided detection, computer-aided classification, and radiology report drafting. The systems have the potential to provide real-time assistance to radiologists and other imaging professionals, thereby reducing diagnostic errors, improving patient outcomes, and reducing costs. We will review the origins of AI and its applications to medical imaging and associated text, define key terms, and show examples of real-world applications that suggest how AI may change the practice of medicine. We will also review key shortcomings and challenges that may limit the application of these new methods.

Bio: Dr. Langlotz is Professor of Radiology, Medicine, and Biomedical Data Science, and Senior Associate Vice Provost for Research at Stanford University. He also serves as Director of the Center for Artificial Intelligence in Medicine and Imaging (AIMI Center), which comprises over 150 faculty at Stanford who conduct interdisciplinary machine learning research to improve clinical care. Dr. Langlotz’s NIH-funded laboratory develops machine learning methods to detect disease and eliminate diagnostic errors. He has led many national and international efforts to improve medical imaging, including the RadLex terminology standard and the Medical Imaging and Data Resource Center (MIDRC), a U.S. national imaging research resource.

Title: Patients and Clinicians at the heart of health innovation: OpenNotes Lab and Cornell Tech Health Tech Hub

Presenter: Chethan Sarabu, Clinical Assistant Professor, Pediatrics, Stanford University

Watch This Presentation

Bio: Chethan Sarabu MD, FAAP, FAMIA, trained in landscape architecture, pediatrics, and clinical informatics, builds bridges across these fields to design healthier environments and systems. He is the inaugural Director of Clinical Innovation for the Health Tech Hub at Cornell Tech’s Jacobs Institute. Over the past six years, Sarabu has been a Clinical Assistant Professor of Pediatrics at Stanford Medicine and has worked in the health tech industry as Head of Product, Director of Clinical Informatics, and Medical Director at doc.ai and later Sharecare. He collaborates with the OpenNotes Lab as an AI and Informatics Strategist and serves as a board member of The Light Collective. 

Title: GWAS in the age of AI  

Presenter: Degui Zhi, Professor and Chair, Department of Bioinformatics and Systems Medicine, UTHealth Houston 

Abstract: While genome-wide association studies (GWAS) have fueled remarkable genetic discovery over the past 15 years, most existing studies have relied on traditional phenotypes. With deep learning-based AI, it is possible to generate many new phenotypes. Powered by big data in biobanks, many new loci can be discovered, and as a result, the landscape of GWAS might look quite different. In this talk, I will discuss a possible future with large-scale AI-driven GWAS.

Bio: Degui Zhi is the Glassell Family Professor of Biomedical Informatics and founding chair of the Department of Bioinformatics and Systems Medicine at the McWilliams School of Biomedical Informatics at the University of Texas Health Science Center at Houston (UTHealth Houston). Dr. Zhi is also the founding director of the Center for AI and Genome Informatics. He received his PhD in bioinformatics at UC San Diego. Before joining UTHealth, he was a tenured associate professor of statistical genetics at the University of Alabama at Birmingham. Dr. Zhi is interested in developing AI deep learning and informatics methods for biomedical big data. His team has developed multiple generalist deep learning frameworks for modeling biomedical data, including Med-BERT, a clinical foundation model for structured clinical data; gene2vec, a distributed representation embedding model for genes based on their co-expression patterns; and unsupervised deep learning models for deriving endophenotypes for genetic discovery. His team has also developed advanced PBWT-based data structures and algorithms for population genetics informatics.

Title: Reflections on AI in (NYC) government 

Presenter: Neal Parikh, Director of AI for New York City

Abstract: AI and machine learning have emerged as increasingly ubiquitous technologies in a wide range of areas in both the private sector and in government. In the past several years, ethical and other policy and governance questions around how and whether to use AI for various tasks have become much more prominent, partly due to its widespread use and partly due to publicly documented failures or shortcomings of a number of systems that can negatively impact people in sometimes serious ways. 

Bio: Neal Parikh is a computer scientist who most recently served as Director of AI for New York City. He is currently Adjunct Associate Professor at Columbia University’s School of International & Public Affairs, teaching a new class on AI for policymakers. Previously, he co-founded a technology startup, which was acquired after 10 years in operation, was Inaugural Fellow at the Aspen Tech Policy Hub at the Aspen Institute, and worked as a senior quant at Goldman Sachs. He received his Ph.D. in computer science from Stanford University, focusing on large-scale machine learning and convex optimization; his research has received over 25,000 citations in the literature and is widely used in industry.