DBMI Resources
Computing infrastructure:
The Department of Biomedical Informatics (DBMI) Information Technology group manages a central cluster with 392 CPU cores, 5.76 TB of RAM, and 450 TB of disk space. In addition, faculty research is conducted on 8 servers with a total capacity of over 172 CPU cores (and 11,264 CUDA GPU cores), 2 TB of RAM, and 83 TB of disk space. Through our link to the Center for Computational Biology and Bioinformatics, we have access to a cluster with over 6,000 CPU cores and over 70,000 CUDA cores, with a maximum performance of over 200 TFLOPS. A number of software computing environments are available, including the Linux, Apache, MySQL, and PHP platform (computing, access, and databases); Windows, Active Directory, and MS SQL (desktops, file and print services, databases); Java, C, and Python (development environments); R, SPSS, and MATLAB (statistical packages); and APL. At the health care software level, the MedLEE natural language processing system is available, as are the Observational Health Data Sciences and Informatics (OHDSI) and i2b2 clinical research database environments.
Rack-mounted servers are housed in 2 separate locations: locally at the department in an isolated and protected server room, and at a remote hosting facility in New Jersey. Virtual machines provide the main back-office support for DBMI, including account management, file services, web hosting, databases, and backup. The remaining servers, both physical and virtual, provide research, education, and project application development environments for academic efforts and research support in the areas of data mining, text mining, natural language processing, clinical research systems, user interface research, and clinical systems development.
In addition, researchers have access to the clinical data in the Clinical Data Warehouse and the CROWN outpatient system, which contain over 20 years of clinical data on more than 5.1 million ambulatory patients and inpatients at New York-Presbyterian Hospital and the Columbia University Medical Center.
All servers are secured in accordance with institutional policies and strict system risk assessment procedures. Servers are normally firewalled from Internet access, and applications that must be open to the public Internet for research purposes follow a strict security process that ensures separation of data and servers. Servers are accessed using secure, encrypted protocols such as ssh/sftp, and network access is governed by a Virtual Private Network (VPN) connection. Web applications are required to use TLS/SSL to encrypt their traffic. The institution encourages dual-factor authentication for all data access from public networks.