BINF G4006 Translational Bioinformatics

Course Description: Methods in translational bioinformatics (i.e. biomedical data sciences) for graduate students as well as juniors and seniors. Students study the statistical and computational algorithms used to evaluate large biomedical data, with a special focus on the integration of molecular and clinical data for the advancement of medicine. Methods may include sequence analysis, supervised and unsupervised machine learning, graph theoretic models and network analysis, information theory, deep neural networks, density estimation, and others. Students will study how to practically apply these methods to biomedical domains in non-human and human genetics, pharmacology, and public health. Successful completion of the course readies the student for graduate level research in translational bioinformatics.

Instructor

Nicholas Tatonetti, PhD

Teaching Assistant

Jiayao Wang

Class Schedule

Classes are held Mondays and Wednesdays from 2:40 pm to 3:55 pm in MUDD 644.

Format
This course will be a hybrid of didactic lectures, flipped classrooms, and peer instruction. Learning materials will center around a practical, hands-on approach to biomedical informatics applications in the translational sciences (e.g. population health, pharmaceutical sciences, molecular disease etiology, and human genetics).

The first third of the course will be lectures on data science applications to molecular and cellular biology, drug discovery and safety, protein function and pathway analysis, and human genetic variation. The remaining two-thirds of the course will be dedicated to a deep dive on an advanced class of computational/statistical methods applied to a particular biological domain. At this point students will take control of the direction of the course by choosing the methods that will be studied, designing lectures, and delivering teaching materials to their peers.

Students will be graded on their engagement with the course materials, their participation in the peer-instruction, their performance on assigned tasks, and an end-of-term research study.

Textbooks
The textbooks for this course are both freely available.

PLOS Computational Biology: Translational Bioinformatics edited by Maricel Kann, Guest Editor, and Fran Lewitter.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Second Edition) by Trevor Hastie, Robert Tibshirani and Jerome Friedman (2009) http://web.stanford.edu/~hastie/pub.htm

Syllabus

# | Date | Topic | Reading Assignments | Homework | Lecturer
1 | Sept. 9 | Introduction to Translational Bioinformatics | [1] | HW1(A) | Tatonetti
2 | Sept. 11 | Data-driven Disease Biology | [2,3] | HW1(D); HW2(A) | Tatonetti
3 | Sept. 16 | Data-driven Disease Biology | [4,5] | HW2(D) | Tatonetti
4 | Sept. 18 | Lab: Data-driven Disease Biology | [5,6] | HW3(A) | TA
5 | Sept. 23 | Application Spotlight: Drug Safety and Drug-Drug Interactions | [22,23,24,25] | - | Tatonetti
6 | Sept. 25 | Unsupervised machine learning methods | [7,8] | - | Tatonetti
7 | Sept. 30 | Lab: Drug safety and drug-drug interactions | [26] | HW3(D); HW3.R(A) | TA
8 | Oct. 2 | Student Lecture 1 and Discussion; How to write an abstract | - | - | Student; Tatonetti
9 | Oct. 7 | Student Lecture 2 and Discussion | - | HW3.R(D); HW4(A) | Student
10 | Oct. 9 | Student Lecture 3 and Discussion | - | - | Student
11 | Oct. 14 | Student Lecture 4 and Discussion | - | - | Student
12 | Oct. 16 | Student Lecture 5 and Discussion | - | - | Student
13 | Oct. 21 | Systems Pharmacology I | [9,10] | - | Tatonetti
14 | Oct. 23 | Systems Pharmacology II | [11-15] | - | Tatonetti
15 | Oct. 28 | How to prepare a scientific presentation | - | - | Tatonetti
16 | Oct. 30 | Chemical Informatics I | [16,17] | - | Tatonetti
- | Nov. 4 | Election Day | - | - | -
17 | Nov. 6 | Student Lecture 7 and Discussion | - | - | Student
18 | Nov. 11 | Chemical Informatics II | [18,19] | - | Tatonetti
19 | Nov. 13 | Network Biology and Graph Theory | [20,21] | - | TA
- | Nov. 18 | Cancelled due to AMIA | - | - | -
20 | Nov. 20 | Student Lecture 8 and Discussion | - | HW5(D) | Student
- | Nov. 25 | Thanksgiving Week | - | - | -
- | Nov. 27 | Thanksgiving Week | - | - | -
21 | Dec. 2 | Application Spotlight: Deep neural networks in medicine | TBA | - | Sassan Ostvar
22 | Dec. 7 | Final Presentations I | - | - | -
23 | Dec. 9 | Final Presentations II | - | - | -

Other Possible Lecture Topics
These are additional lectures that can be slotted in depending on the number of days that the student lectures consume.

  • Biomedical Blockchains: The Don’ts and Don’ts
  • Zero-knowledge proofs and why they’re cool
  • Differential privacy and its application to healthcare
  • The genetic architecture of disease, heritability and recurrence
  • Application Spotlight: Celiac disease, its etiology, and estimation of risk (Guest Lecturer: Ben Lebwohl)

Homework
All homework must be completed independently. Homework will be graded on uniqueness and creativity in content. “Google search results” will receive low grades.

HW1. Review syllabus of Computational Methods course
HW2. Comprehensive outline of a chosen methods topic (2-5 pages)
HW3. Build a 35-minute lecture with references on the assigned topic
HW3.R. Revise lecture according to instructor feedback
HW4. Abstract proposal for short research project (400-600 words; 1 figure; 1 table)
HW5. Presentation of research findings (10 minutes; 10 slides)

Assigned Readings
All assigned readings should be completed before the lecture, unless otherwise noted.

[1] “Introduction to Translational Bioinformatics Collection” in PLOS Computational Biology: Translational Bioinformatics (http://tatonettilab.org/courses/BINFG4006/materials/TranslationalBioinformatics.pdf)
[2] http://stm.sciencemag.org/content/3/96/96ra77
[3] Chapter 4 “Protein interactions and disease” of PLOS Computational Biology: Translational Bioinformatics
[4] Chapter 9 “Analysis using disease ontologies” of PLOS Computational Biology: Translational Bioinformatics
[5] Chapter 2 “Data-Driven View of Disease Biology” of PLOS Computational Biology: Translational Bioinformatics
[6] https://en.wikipedia.org/wiki/Receiver_operating_characteristic
[7] Chapter 14, sections 14.1 and 14.3, of The Elements of Statistical Learning
[8] http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004122
[9] http://stm.sciencemag.org/content/5/205/205rv1.full
[10] Chapter 3 “Small molecules and disease” of PLOS Computational Biology: Translational Bioinformatics
[11] http://stm.sciencemag.org/content/4/125/125ra31
[12] http://www.tatonetti.com/papers/JAMIA_2011_Tatonetti.pdf
[13] http://stke.sciencemag.org/content/3/118/ra30.full
[14] Chapter 6 “Structural variation and medical genomics” of PLOS Computational Biology: Translational Bioinformatics
[15] Chapter 7 “Pharmacogenomics” of PLOS Computational Biology: Translational Bioinformatics
[16] http://www.sagepub.com/sites/default/files/upmbinaries/40007_Chapter8.pdf
[17] http://www.daylight.com/dayhtml/doc/theory/index.html
[18] http://www.nature.com/nbt/journal/v25/n2/full/nbt1284.html
[19] http://opentutorials.cgl.ucsf.edu/index.php/Tutorial:Introduction_to_Cytoscape#Visualizing_Data_on_Networks
[20] http://onlinelibrary.wiley.com/doi/10.1038/clpt.2013.168/full
[21] Chapter 5 “Network biology approach to complex disease” of PLOS Computational Biology: Translational Bioinformatics
[22] https://ascpt.onlinelibrary.wiley.com/doi/full/10.1038/clpt.2011.83
[23] https://stm.sciencemag.org/content/4/125/125ra31.long
[24] https://www.sciencedirect.com/science/article/pii/S0735109716349397?via%3Dihub
[25] https://link.springer.com/article/10.1007%2Fs40264-016-0393-1
[26] https://www.cell.com/cell/pdf/S0092-8674(18)30525-7.pdf

Midterm and Final
This class is a flipped, peer-instructed, project-based class. Each student needs to come up with a research question, find appropriate datasets, and use statistical methods to answer the question. The midterm assignment (aka HW4) will be a 1-page research proposal that describes the research methods and materials you’ll be using for your project (a template will be uploaded on coursework). Each student will then present the proposal to the class.

For the final, each student will carry out their proposed research plan and present the results in a 10-minute scientific presentation.

Grading
• 15% class participation
• 40% homework assignments 1-4
• 25% midterm proposal (i.e. HW4)
• 20% final project presentation
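
As an illustration of how these weights combine, here is a minimal sketch in Python. It assumes each component is scored on a 0-100 scale, which this syllabus does not specify; the names WEIGHTS and course_score are hypothetical and are not part of any official grading tool.

```python
# Weights taken from the grading breakdown above.
# Assumption: every component is scored on a 0-100 scale.
WEIGHTS = {
    "participation": 0.15,
    "homework": 0.40,
    "midterm_proposal": 0.25,
    "final_presentation": 0.20,
}


def course_score(scores):
    """Return the weighted course score for a dict of component scores."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


example = {
    "participation": 90,
    "homework": 85,
    "midterm_proposal": 80,
    "final_presentation": 95,
}
print(round(course_score(example), 1))
```

With the hypothetical scores above, the components contribute 13.5, 34, 20, and 19 points respectively. How a numeric score maps to a letter grade is not stated here and is left out of the sketch.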

Absence Policy
The course will employ a “no excuse necessary” absentee policy. After the first two absences, your grade will be reduced by one third of a letter grade for each additional absence. For example, if you earned an A in the class and missed three of the classes in total, your final grade would be an A-. If you earned an A and missed four classes total, then your grade would be a B+, and so on. There is no need to notify the TA or Instructor when you will be absent. Attendance will be taken at the start of each class.
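
The penalty arithmetic above can be sketched in a few lines of Python. This is only an illustration: the grade ladder below A- (including the D and F steps and stopping at F) is an assumption, since the policy only gives examples down to B+, and the names GRADE_SCALE, FREE_ABSENCES, and adjusted_grade are hypothetical.

```python
# Letter grades from highest to lowest, one third of a letter per step.
# Assumption: the ladder below B+ and the floor at F are illustrative only.
GRADE_SCALE = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D", "F"]

# The first two absences carry no penalty.
FREE_ABSENCES = 2


def adjusted_grade(earned, absences):
    """Lower the earned grade one step for each absence beyond the first two."""
    steps = max(0, absences - FREE_ABSENCES)
    index = GRADE_SCALE.index(earned) + steps
    # Never step past the lowest grade on the ladder.
    return GRADE_SCALE[min(index, len(GRADE_SCALE) - 1)]


print(adjusted_grade("A", 3))  # A-
print(adjusted_grade("A", 4))  # B+
```

The two printed cases reproduce the examples given in the policy: an A with three absences becomes an A-, and an A with four absences becomes a B+.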

Academic Integrity Statement
(taken from http://gsas.columbia.edu/content/sample-statement-academic-integrity)

Columbia’s intellectual community relies on academic integrity and responsibility as the cornerstone of its work. Graduate students are expected to exhibit the highest level of personal and academic honesty as they engage in scholarly discourse and research. In practical terms, you must be responsible for the full and accurate attribution of the ideas of others in all of your research papers and projects; you must be honest when taking your examinations; you must always submit your own work and not that of another student, scholar, or internet source. Graduate students are responsible for knowing and correctly utilizing referencing and bibliographical guidelines. When in doubt, consult your professor. Citation and plagiarism-prevention resources can be found at the GSAS page on Academic Integrity and Responsible Conduct of Research (http://gsas.columbia.edu/academic-integrity).

Failure to observe these rules of conduct will have serious academic consequences, up to and including dismissal from the university. If a faculty member suspects a breach of academic honesty, appropriate investigative and disciplinary action will be taken following Dean’s Discipline procedures (http://gsas.columbia.edu/content/disciplinary-procedures).