Michael Eliot Bales, MPH, M.Phil




 

Questions or Comments? Please e-mail me at michael.bales@dbmi[dot]columbia[dot]edu


Knowledge. Networks. Innovation.

Michael Bales

Visualization of keywords used in academic articles related to semantic network modeling

I am a PhD student in the Department of Biomedical Informatics at Columbia University in the City of New York.

Research goals

Researchers need ways to identify and understand material in the rapidly expanding scientific literature. Many rely on bibliographic databases to identify published articles. However, existing indexing systems cannot rapidly integrate new concepts and often do not match the way users conceptualize their domain. As a result, users have difficulty doing exploratory research, understanding the current state of a domain, or understanding how a specific topic finds its place among related topics in a field. This problem is particularly acute in new or emerging interdisciplinary domains. In response to this problem we have developed a collection of tools and approaches to visualize topics as a network of related entities, rather than as a strict hierarchy. Our research is intended to show that network-based approaches can elicit the structure of emerging domains, thereby facilitating access to the information in these domains. Ideally, these approaches can promote interdisciplinary collaboration and accelerate scientific progress and discovery.

Publications

•    Bales ME, Johnson SB, Lussier YA. Topological analysis of large-scale biomedical terminology structures. Journal of the American Medical Informatics Association 2007;14:788-797. Available from URL: http://dx.doi.org/10.1197/jamia.M2080

•    Kukafka R, Bales ME, Burkhardt A, Friedman C. Human and automated coding of rehabilitation discharge summaries according to the International Classification of Functioning, Disability, and Health. Journal of the American Medical Informatics Association 2006;13:508-515. Available from URL: http://dx.doi.org/10.1197/jamia.M2107

•    Bales ME, Johnson SB. Graph theoretic modeling of large-scale semantic networks. Journal of Biomedical Informatics 2006;39(4):451-64. Available from URL: http://dx.doi.org/10.1016/j.jbi.2005.10.007

•    Bales ME, Kukafka R, Burkhardt A, Friedman C. Qualitative assessment of the International Classification of Functioning, Disability, and Health with respect to the desiderata for controlled medical vocabularies.  International Journal of Medical Informatics 2006;75(5):384-95. Available from URL: http://dx.doi.org/10.1016/j.ijmedinf.2005.07.026

•    Ashford DA, Kaiser RM, Bales ME, Shutt K, Patrawalla A, McShan A, et al. Planning against biological terrorism: lessons from outbreak investigations. Emerg Infect Dis [serial online] 2003 May;9.  Available from URL: http://www.cdc.gov/ncidod/EID/vol9no5/02-0388.htm

•    Bales ME, Dannenberg AL, Brachman PS, Kaufmann AF, Klatsky PC, Ashford DA.  Epidemiologic response to anthrax outbreaks: field investigations, 1950-2001.  Emerg Infect Dis [serial online] 2002 Oct;8.  Available from URL: http://www.cdc.gov/ncidod/eid/vol8no10/02-0223.htm


Other publications

•    Bales ME, Johnson SB, Weng C. Social network analysis of interdisciplinarity in obesity research. American Medical Informatics Association Annual Symposium, Nov. 8-12, 2008, Washington DC.

•    Weng C, Gallagher D, Bales ME, Bakken S, Ginsberg HN. Understanding interdisciplinary health sciences collaborations: A campus wide survey of obesity experts. American Medical Informatics Association Annual Symposium, Nov. 8-12, 2008, Washington DC.

•    Bales ME, Kukafka R, Burkhardt A, Friedman C. Human and automated coding of rehabilitation discharge summaries according to the International Classification of Functioning, Disability, and Health. 11th Annual North American Collaborating Center (NACC) Conference on the International Classification of Functioning, Disability and Health (ICF). Rochester MN, June 21-24, 2005.

•    Bales ME, Kukafka R, Burkhardt A, Friedman C. Extending a medical language processing system to the functional status domain. American Medical Informatics Association Annual Symposium, Oct. 22-26, 2005, Washington DC.

•    Savova G, Friedman C, Kukafka R, Bales M, Burkhardt A, Harris M, Chute C. Autocoding against five ICF codes. 10th North American Collaborating Center Conference on International Classification of Functioning, Disability and Health (ICF). Halifax, Nova Scotia, Canada. June 1-4, 2004.

•    Dannenberg AL, Bales ME, Brachman PS, Kaufmann AF, Ashford DA. Epidemiologic responses to anthrax outbreaks: A review of field investigations conducted by the Centers for Disease Control and Prevention, 1950 to August 2001. American Public Health Association Annual Meeting, November 8-14, 2002.

•    Shepard CW, Soriano-Gabarro M, Zell ER, Hayslett J, Lukacs S, Goldstein S, Factor S,  Jones J, Ridzon R, Williams I, Rosenstein N, and the CDC Adverse Events Working Group.  Antimicrobial Postexposure Prophylaxis for Anthrax: Adverse Events and Adherence.  Emerg Infect Dis [serial online] 2002 Oct;8.  Available from URL: http://www.cdc.gov/ncidod/EID/vol8no10/02-0349.htm

•    Bales ME, Dannenberg AL.  The use of geographic information systems in epidemiologic field investigations. Am J Epidemiol. June 2001;153(suppl):262S.

Bio


Michael Bales is in his fifth year of a Ph.D. program in Biomedical Informatics. An epidemiologist by training, Mr. Bales spent the previous three years as a fellow in Public Health Informatics at the Centers for Disease Control and Prevention (CDC) in Atlanta, Georgia. At CDC, he served as lead author on “Epidemiologic response to anthrax outbreaks: field investigations, 1950-2001.” One of his main projects at CDC was to plan and create an internal, searchable electronic database, making fifty years of unpublished reports and published manuscripts available electronically. It included many unpublished CDC reports on early anthrax investigations, which served as a main data source for the research.

Mr. Bales’ CDC experience also includes creating an archival database of field investigation reports by officers of the Epidemic Intelligence Service, teaching a class on geographic information systems (GIS) in public health, and participating in the public health response to the anthrax bioterrorism attack in October 2001. While at CDC, he also conducted a descriptive study of the use of GIS in CDC epidemiologic field investigations. Later, he served in West and Central Africa as a short-term consultant with the World Health Organization (WHO), where he helped the Togo Ministry of Health and the WHO offices in Kinshasa, Democratic Republic of Congo, to build national capacity in data management and analysis.

During the first and second years of his doctoral training, Mr. Bales worked with an interdisciplinary research team to investigate human and automated coding of functional status information (FSI) using the International Classification of Functioning, Disability, and Health (ICF) framework. The ICF, a classification system published in 2001 by the World Health Organization, provides a common language and framework for describing FSI in health records. In the first phase of research the team worked to identify redundancies in the codes, to determine coding issues pertaining to ICF qualifiers, and to assess the level of domain knowledge required to perform coding. This work culminated in a qualitative assessment of the ICF classification with respect to the desiderata for controlled medical vocabularies. In the second phase of research the team modified an existing medical language processing system for use in the ICF domain. They also trained rehabilitation experts and non-expert coders who, along with the medical language processing system, assigned selected ICF codes to rehabilitation discharge summaries. They conducted a formal evaluation of code assignments; a manuscript is in preparation.

In the second and third years of his training, Mr. Bales has focused on a theme of language networks in biomedicine. In a methodological review on complex semantic networks, he worked with Dr. Stephen Johnson to summarize recent research (1998-2005) on large-scale semantic networks. They used a tailored search strategy to retrieve relevant articles and then coded them according to a structured coding form. They highlighted several themes that emerged. First, real-world complex semantic networks commonly have scale-free and small-world topological features. In networks with small-world properties, it is possible to move from one node to another in a relatively small number of steps (often just two or three, on average). Scale-free networks, in which a few highly connected hubs coexist with many sparsely connected nodes, tend to have a similar appearance when examined at varying scales. In light of these findings, they demonstrated how large-scale network analysis methods can be applied to a variety of areas of informatics, including terminology development and summarization of electronic health records.
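The two properties named above can be illustrated with a minimal sketch. It assumes a tiny, invented network of terms (the node names and links are hypothetical and are not taken from the reviewed studies). The average shortest-path length is the quantity behind the "small number of steps" observation, and the degree histogram shows the hub-and-periphery pattern that, at much larger scale, is characteristic of scale-free networks. Six nodes cannot demonstrate a power law; the example only shows how the measurements are made.

```python
from collections import deque


def average_path_length(graph):
    """Mean shortest-path length over all ordered pairs of distinct, reachable nodes."""
    total = 0
    pairs = 0
    for source in graph:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


def degree_histogram(graph):
    """Map each degree (number of neighbors) to the count of nodes having it."""
    hist = {}
    for neighbors in graph.values():
        degree = len(neighbors)
        hist[degree] = hist.get(degree, 0) + 1
    return hist


# Hypothetical toy network: one hub term linked to five others, plus one extra link.
toy_network = {
    "disease": {"infection", "syndrome", "disorder", "injury", "neoplasm"},
    "infection": {"disease", "syndrome"},
    "syndrome": {"disease", "infection"},
    "disorder": {"disease"},
    "injury": {"disease"},
    "neoplasm": {"disease"},
}

print(average_path_length(toy_network))
print(degree_histogram(toy_network))
```

In this toy case every pair of terms is at most two steps apart because all paths can route through the hub, which is the intuition behind efficient navigation in hub-rich networks.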

Mr. Bales worked with Dr. Johnson and Dr. Yves Lussier to compare the large-scale structure of 16 biomedical terminologies. They showed that multiple link types and unrestricted node connectivity are associated with small-world and scale-free features, respectively. These features, common in comprehensive medical terminologies, promote efficient navigation and organic growth, whereas synthetic constraints on node connectivity, which are common in statistical classifications, localize the effects of changes and deletions. The results indicated that, given its utility in portraying the effects of design constraints on structure, network modeling is a useful adjunct to ontological approaches.

Mr. Bales and Dr. Johnson have recently been examining medical language modeled according to a linguistic formalism known as syntactic dependency. In seeking a more thorough understanding of the structure of medical sublanguage they hope to contribute to existing theories that underlie medical language processing.

Interests

Organizing the literature in new or emerging domains. I have been evaluating network modeling and analysis as a way to identify groups of researchers based on how they cite one another and the terms they use in their publications. The goal is to improve access to information in the scientific literature.

Modeling and analysis of large lexical networks. I have been using graph theory to model medical language and controlled biomedical terminologies.

Public health informatics is the systematic application of information and computer science and technology to public health practice, research and learning.  Source: DPHSI, Centers for Disease Control and Prevention.

I am interested in doing research in these areas:

  • Statistics-based methods for NLP

  • Building models of language based on word or phoneme associations

  • Text data mining and information aggregation

    • Automated information gathering tools

    • Representation of gathered information in an intermediate structure

    • Information retrieval and presentation tools

My personal interests include family and friends, adventure, and music.

Ideas

Some of my older ideas are based on the belief that concepts can be organized into a rigid framework.  In early 2004 I began to believe that building frameworks based on discrete elements and relations is an important aspect of human cognition, but that these frameworks are short-lived and are quickly replaced by other rigid frameworks.  I removed these ideas from this site, and I intend to restructure them according to this new viewpoint. As of August 2007, they were available in the Internet Archive.

Michael Bales
meb2108 [at] columbia [dot] edu


All original material copyright Michael Bales unless otherwise noted.

Knowledge representation
Knowledge representation is a central problem in artificial intelligence. The question is how to store and manipulate knowledge in an information system in a formal way so that it may be used by mechanisms to accomplish a given task. Examples of applications are expert systems, machine translation systems, computer-aided maintenance systems and information retrieval systems (including database front-ends).

Source: Wikipedia

Graph theory
In mathematics and computer science, graph theory studies the properties of graphs. Informally, a graph is a set of objects called vertices (or nodes) connected by links called edges. Typically, a graph is depicted as a set of dots (the vertices) connected by lines (the edges).

Source: Wikipedia
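As a minimal sketch of the graph definition above, the following builds an undirected graph from a set of vertices and a list of edges and stores it as an adjacency mapping. The vertex labels and the `build_graph` helper are hypothetical, chosen only for illustration.

```python
def build_graph(vertices, edges):
    """Return an undirected graph as a dict mapping each vertex to its set of neighbors."""
    adjacency = {vertex: set() for vertex in vertices}
    for a, b in edges:
        # An undirected edge links both endpoints to each other.
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


# Hypothetical example: four vertices (the "dots") and four edges (the "lines").
vertices = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
graph = build_graph(vertices, edges)

for vertex in vertices:
    print(vertex, sorted(graph[vertex]))
```

An adjacency mapping is one common representation; it makes looking up a vertex's neighbors direct, which suits traversal-based measures such as path length.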

Biomedical informatics
Biomedical Informatics is the scientific field that deals with the storage, retrieval, sharing, and optimal use of biomedical information, data, and knowledge for problem solving and decision making. It touches on all basic and applied fields in biomedical science and is closely tied to modern information technologies, notably in the areas of computing and communication.

Source: Columbia University Biomedical Informatics

