I am a PhD student in the Department of Biomedical
Informatics at Columbia
University in the City of New York.
Research goals
Researchers
need ways to identify and understand material in the rapidly expanding
scientific literature. Many rely on bibliographic databases to identify
published articles. However, existing indexing systems cannot rapidly
integrate new concepts and often do not match the way users
conceptualize their domain. As a result, users have difficulty doing
exploratory research, understanding the current state of a domain, or
understanding how a specific topic finds its place among related topics
in a field. This problem is particularly acute in new or emerging
interdisciplinary domains. In response to this problem, we have
developed a collection of tools and approaches to visualize topics as a
network of related entities, rather than as a strict hierarchy. Our
research is intended to show that network-based approaches can elicit
the structure of emerging domains, thereby facilitating access to the
information in these domains. Ideally, these approaches can promote
interdisciplinary collaboration and accelerate scientific progress and
discovery.
Publications
• Bales ME, Johnson SB,
Lussier YA. Topological analysis of large-scale biomedical terminology
structures. Journal of the American Medical Informatics
Association 2007;14:788-797. Available from URL: http://dx.doi.org/10.1197/jamia.M2080
• Kukafka R, Bales ME,
Burkhardt A, Friedman C. Human and automated coding of rehabilitation
discharge summaries according to the International Classification of
Functioning, Disability, and Health. Journal of the American Medical
Informatics Association 2006;13:508-515. Available from URL: http://dx.doi.org/10.1197/jamia.M2107
• Bales ME, Johnson SB. Graph theoretic modeling of large-scale
semantic networks. Journal of Biomedical Informatics 2006;39(4):451-64. Available from URL: http://dx.doi.org/10.1016/j.jbi.2005.10.007
• Bales ME, Kukafka R, Burkhardt
A, Friedman C. Qualitative assessment of the International
Classification of Functioning, Disability, and Health with respect to
the desiderata for controlled medical vocabularies. International
Journal of Medical Informatics 2006;75(5):384-95. Available from URL: http://dx.doi.org/10.1016/j.ijmedinf.2005.07.026
• Ashford DA, Kaiser RM, Bales
ME, Shutt K, Patrawalla A, McShan A, et al. Planning against biological
terrorism: lessons from outbreak investigations. Emerg Infect Dis
[serial online] 2003 May;9. Available from URL: http://www.cdc.gov/ncidod/EID/vol9no5/02-0388.htm
• Bales ME, Dannenberg AL, Brachman PS, Kaufmann AF,
Klatsky PC,
Ashford DA. Epidemiologic response to anthrax outbreaks: field
investigations, 1950-2001. Emerg Infect Dis [serial online] 2002
Oct;8. Available from URL: http://www.cdc.gov/ncidod/eid/vol8no10/02-0223.htm
Other publications
• Bales ME, Johnson SB, Weng C. Social network analysis of interdisciplinarity in obesity research. American Medical Informatics Association Annual Symposium, Nov. 8-12, 2008, Washington DC.
• Weng C, Gallagher D, Bales ME, Bakken S, Ginsberg HN. Understanding interdisciplinary health sciences collaborations: A campus wide survey of obesity experts. American Medical Informatics Association Annual Symposium, Nov. 8-12, 2008, Washington DC.
• Bales ME, Kukafka R, Burkhardt
A, Friedman C. Human and automated coding of rehabilitation discharge
summaries according to the International Classification of Functioning,
Disability, and Health. 11th Annual North American Collaborating Center
(NACC) Conference on the International Classification of Functioning,
Disability and Health (ICF). Rochester MN, June 21-24, 2005.
• Bales ME, Kukafka R, Burkhardt
A, Friedman C. Extending a medical language processing system to the
functional status domain. American Medical Informatics Association
Annual Symposium, Oct. 22-26, 2005, Washington DC.
• Savova G, Friedman C, Kukafka
R, Bales M, Burkhardt A, Harris M, Chute C. Autocoding against five ICF
codes. 10th North American Collaborating Center Conference on
International Classification of Functioning, Disability and Health
(ICF). Halifax, Nova Scotia, Canada. June 1-4, 2004.
• Dannenberg
AL, Bales ME, Brachman PS, Kaufmann AF, Ashford DA. Epidemiologic
responses to anthrax outbreaks: A review of field investigations conducted
by the Centers for Disease Control and Prevention, 1950 to August 2001. American
Public Health Association Annual Meeting, November 8-14, 2002.
• Shepard CW, Soriano-Gabarro M,
Zell ER, Hayslett J, Lukacs S, Goldstein S, Factor S, Jones J,
Ridzon R, Williams I, Rosenstein N, and the CDC Adverse Events Working
Group. Antimicrobial Postexposure Prophylaxis for Anthrax:
Adverse Events and Adherence. Emerg Infect Dis [serial online]
2002 Oct;8. Available from URL: http://www.cdc.gov/ncidod/EID/vol8no10/02-0349.htm
• Bales ME,
Dannenberg AL. The use of geographic information systems in
epidemiologic field investigations. Am J Epidemiol. June
2001;153(suppl):262S.
Bio
Michael Bales is in his fifth year
of a Ph.D. program in Biomedical Informatics. An epidemiologist by
training, Mr. Bales previously spent three years as a fellow in public
health informatics at the Centers for Disease Control and Prevention (CDC)
in Atlanta, Georgia. At CDC, he served as lead author on “Epidemiologic response to anthrax
outbreaks: field investigations, 1950-2001.” One of his
main projects at CDC was to plan and create an internal, searchable
electronic
database, making fifty years of unpublished reports and published
manuscripts
available electronically. It included many unpublished CDC reports on
early anthrax investigations, which served as a main data source for the
research.
Mr. Bales’
CDC experience also includes creating an archival database of field
investigation reports by officers of the Epidemic Intelligence Service,
teaching a class on geographic information systems (GIS) in public
health, and
participating in the public health response to the anthrax bioterrorism
attack
in October 2001. While at CDC, he also conducted a descriptive
study of
the use of GIS in CDC epidemiologic field investigations. Later,
he
served in West and Central Africa as a short-term consultant with the
World
Health Organization (WHO), where he helped the Togo Ministry of Health
and the
WHO offices in Kinshasa,
Democratic Republic of Congo, to build national capacity in data
management and
analysis.
During the first and second years of his doctoral training, Mr. Bales
worked with an interdisciplinary research team to investigate human and
automated coding of functional status information (FSI) using the
International Classification of Functioning, Disability, and Health
(ICF) framework. The ICF, a classification system published in 2001 by
the World Health Organization, provides a common language and framework
for describing FSI in health records. In the first phase of research
the team worked to identify redundancies in the codes; to determine
coding issues pertaining to ICF qualifiers; and to assess the level of
domain knowledge required to perform coding. This work culminated in a
qualitative assessment of the ICF classification with respect to the
desiderata for controlled medical vocabularies. In the second phase of
research the team modified an existing medical language processing
system for use in the ICF domain. They also trained rehabilitation
experts and non-expert coders who, along with the medical language
processing system, assigned selected ICF codes to rehabilitation
discharge summaries. They conducted a formal evaluation of code
assignments; a manuscript is in preparation.
In the second and third years of his training, Mr. Bales has focused on a theme of language networks in biomedicine.
In a methodological review on complex semantic networks, he worked with
Dr. Stephen Johnson to summarize recent research (1998-2005) on
large-scale semantic networks. They used a tailored search strategy to
retrieve relevant articles and then coded them according to a
structured coding form. They highlighted several themes that emerged.
First, real-world complex semantic networks commonly have scale-free
and small-world topological features. In networks with small-world
properties, it is possible to move from one node to another in a
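relatively small number of steps (often just two or three, on average).
In scale-free networks, the number of links per node follows a power-law
distribution: a few hubs have many links while most nodes have few, so the
network tends to have a similar appearance when examined at varying
scales. In light of these findings, they demonstrated how methods for
analyzing large networks can be applied to a variety of areas of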
informatics, including terminology development and summarization of
electronic health records.
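The two topological properties described above can be illustrated with a minimal Python sketch. It is not the method used in the review: the toy network, its terms, and the function names are assumptions chosen for illustration. The average shortest-path length relates to the small-world property, and the degree distribution is what one would inspect for scale-free structure (a network this small cannot demonstrate a power law).

```python
from collections import Counter, deque


def build_graph(edges):
    """Build an undirected graph as a dict mapping each node to its neighbors."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_lengths(graph, source):
    """Breadth-first search: number of steps from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_path_length(graph):
    """Mean shortest-path length over all ordered pairs of distinct, connected nodes."""
    total = 0
    pairs = 0
    for source in graph:
        for target, steps in shortest_path_lengths(graph, source).items():
            if target != source:
                total += steps
                pairs += 1
    return total / pairs


def degree_distribution(graph):
    """Count how many nodes have each degree (number of links)."""
    return Counter(len(neighbors) for neighbors in graph.values())


# Hypothetical toy semantic network: one hub term linked to several others.
edges = [
    ("disease", "infection"),
    ("disease", "anthrax"),
    ("disease", "obesity"),
    ("disease", "diabetes"),
    ("disease", "symptom"),
    ("anthrax", "infection"),
]
graph = build_graph(edges)
avg_steps = average_path_length(graph)
degrees = degree_distribution(graph)
```

On this toy network the average path length is 1.6 steps, and a single hub ("disease") has five links while every other term has one or two.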
Mr. Bales worked with Dr. Johnson and Dr. Yves Lussier to compare
the large-scale structure of 16 biomedical terminologies. They showed
that multiple link types and unrestricted node connectivity are
associated with small-world and scale-free features, respectively.
These features, common in comprehensive medical terminologies, promote
efficient navigation and organic growth, whereas synthetic constraints
on node connectivity, which are common in statistical classifications,
localize the effects of changes and deletions. The results indicated
that, given its utility in portraying the effects of design constraints
on structure, network modeling is a useful adjunct to ontological
approaches.
Mr. Bales and Dr. Johnson have recently been examining medical language
modeled according to a linguistic formalism known as syntactic
dependency. In seeking a more thorough understanding of the structure
of medical sublanguage they hope to contribute to existing theories
that underlie medical language processing.
Interests
Organizing the literature in new or emerging domains.
I have been evaluating network modeling and analysis
as a way to identify groups of researchers based on how they cite
one another and the terms they use in their publications. The goal is
to improve access to information in the scientific literature.
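As a rough sketch of the citation-based part of this idea, the example below groups hypothetical authors into connected components of a citation network. Connected components are a deliberately simple stand-in and are not claimed to be the approach under evaluation. The author names and the `citation_groups` function are assumptions, and the sketch ignores the terms used in publications.

```python
from collections import deque


def citation_groups(citations):
    """Group authors who are linked, directly or indirectly, by citations.

    Each citation is a (citing_author, cited_author) pair. Direction is
    ignored, so a group is a connected component of the undirected graph.
    """
    neighbors = {}
    for citing, cited in citations:
        neighbors.setdefault(citing, set()).add(cited)
        neighbors.setdefault(cited, set()).add(citing)

    seen = set()
    groups = []
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Breadth-first search collects everyone reachable from this author.
        seen.add(start)
        group = []
        queue = deque([start])
        while queue:
            author = queue.popleft()
            group.append(author)
            for other in neighbors[author]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        groups.append(sorted(group))
    return groups


# Hypothetical citation pairs: (citing author, cited author).
citations = [
    ("author_a", "author_b"),
    ("author_b", "author_c"),
    ("author_d", "author_e"),
]
groups = citation_groups(citations)
```

Here the first three authors form one group and the remaining two form another. A real analysis would likely weight links by citation counts or shared terminology rather than treating every link equally.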
Modeling and analysis of large lexical networks.
I have been using graph
theory to model medical language and controlled biomedical
terminologies.
Public health informatics is
the systematic application of information and computer science and
technology to public health practice, research and learning.
Source: DPHSI,
Centers for Disease Control and Prevention.
I am interested in doing research in these areas:
- Statistics-based methods for NLP
- Building models of language based on word or phoneme associations
- Text data mining and information aggregation
- Automated information gathering tools
- Representation of gathered information in an intermediate structure
- Information retrieval and presentation tools
My personal
interests include family and friends, adventure, and music.
Ideas
Some of my older ideas are
based on the belief that concepts can be organized into a rigid
framework. In early 2004 I began to believe that building
frameworks based on discrete elements and relations is an important
aspect of human cognition, but that these frameworks are short-lived
and are quickly replaced by other rigid frameworks. I removed these ideas from this site, and I intend to
restructure them according to this new viewpoint. As of August
2007, they were available in the Internet
Archive.
Michael Bales
meb2108 [at] columbia [dot] edu
All
original material copyright Michael Bales unless otherwise
noted.
Knowledge representation
Knowledge representation
is a central problem in artificial intelligence. The question is how to
store and manipulate knowledge in an information system in a formal way
so that it may be used by mechanisms to accomplish a given task.
Examples of applications are expert systems, machine translation
systems, computer-aided maintenance systems and information retrieval
systems (including database front-ends).
Source: Wikipedia
Graph theory
In mathematics and computer
science, graph theory studies the properties of graphs.
Informally, a graph is a set of objects called vertices (or nodes)
connected by links called edges. Typically, a graph is depicted as a
set of dots (the vertices) connected by lines (the edges).
Source: Wikipedia
Biomedical informatics
Biomedical Informatics
is the scientific field that deals with the storage, retrieval,
sharing, and optimal use of biomedical information, data, and knowledge
for problem solving and decision making. It touches on all basic and
applied fields in biomedical science and is closely tied to modern
information technologies, notably in the areas of computing and
communication.
Source: Columbia
University Biomedical Informatics