Previous BioInformatics Research
The Artificial Intelligence Lab has been involved in medical text mining since the early 90s. Several text mining techniques have been designed, developed and tested, including:
- Arizona Noun Phraser: originally a general English noun phraser, it was adjusted to extract only relevant medical terms from
- Automatic Indexing: stop-wording and algorithmic index phrase formation.
- Concept Space: index phrase co-occurrence information used to generate an automatic thesaurus for search term suggestion.
- Keyword Suggester: a keyword suggester for the medical domain; it dynamically suggests additional keywords for user queries based on a mapping algorithm that combines Concept Space with the UMLS.
- Java-based Graphical Thesaurus: graphical display of concept space terms.
- Java-based Visual Browser: dynamic and scalable self-organizing maps for visualization and categorization of large retrieval sets. [demo]
HelpfulMed provides a powerful search vehicle focused on improving the availability of, and access to, medical information on the Internet and in medical databases for professional and advanced users. HelpfulMed offers users three technologies for searching and browsing reliable medical information. Its search technology allows users to locate essential cancer information by extracting precise noun phrases and determining their relationships with other fine-grained medical terminology through HelpfulMed's proprietary concept-based search support. HelpfulMed's three proprietary technologies are summarized below.
A Medical Spider based on the Hopfield Net spreading activation algorithm was designed specifically to retrieve medical web pages. It is equipped with a medical vocabulary knowledge base. An inlink analysis algorithm, which compares the UMLS knowledge base to the text of a web page, was developed so that the spider can assess whether the page in question is indeed a medical web page.
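The idea of spreading activation over a link graph can be illustrated with a minimal sketch. This is not the lab's implementation: the function name `spread_activation`, the `tanh` transfer function, the example pages, and the link weights are all assumptions chosen for illustration. In a real spider the weights would come from content and link analysis rather than being set by hand.

```python
import math


def spread_activation(links, seeds, max_iters=20, epsilon=1e-4):
    """Propagate activation from seed pages along weighted links.

    links: {page: {linked_page: weight in [0, 1]}} (hypothetical relevance weights)
    seeds: pages known to be relevant; they stay clamped at 1.0.
    Returns {page: activation in [0, 1]}.
    """
    nodes = set(links) | {dst for out in links.values() for dst in out} | set(seeds)
    activation = {n: (1.0 if n in seeds else 0.0) for n in nodes}
    for _ in range(max_iters):
        # Sum the weighted activation flowing into each page.
        incoming = {n: 0.0 for n in nodes}
        for src, out in links.items():
            for dst, weight in out.items():
                incoming[dst] += weight * activation[src]
        # Squash the input with tanh (an assumed transfer function).
        new = {n: 1.0 if n in seeds else math.tanh(incoming[n]) for n in nodes}
        change = sum(abs(new[n] - activation[n]) for n in nodes)
        activation = new
        if change < epsilon:  # stop once the network has settled
            break
    return activation


# Toy link graph: strongly weighted links lead to medical pages.
links = {
    "seed": {"oncology": 0.9, "recipes": 0.1},
    "oncology": {"chemo": 0.8},
    "recipes": {"desserts": 0.9},
}
activation = spread_activation(links, seeds={"seed"})
# Pages whose activation clears an (assumed) threshold would be fetched.
selected = sorted(p for p, a in activation.items() if a >= 0.3)
```

Pages reached only through weak links end up with low activation, so the crawl stays focused on the neighborhood of the seed pages.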
The Medical Concept Space is an automatically generated thesaurus designed to facilitate concept-based, cross-domain information retrieval. The concept space contains over 48.5 million unique terms and over 1.7 billion relationships. The system suggests highly relevant search terms exactly as they appear in the documents, enhancing information recall. Terms are presented in categories according to their source: author, noun phrase terms, MeSH terms. The top 40 terms are displayed in order based on a weighting/ranking algorithm. The Medical Concept Space was generated on a Silicon Graphics Origin 2000 8-node processor, using 2 weeks of CPU time.
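A co-occurrence thesaurus of this kind can be sketched in a few lines. The weighting below (documents containing both terms divided by documents containing the source term) is an assumption made for illustration and is not necessarily the weighting/ranking algorithm used by the Medical Concept Space; the function names and sample documents are likewise hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_concept_space(docs):
    """Return {term: {other_term: weight}} from per-document term lists.

    weight(a -> b) = (# docs containing both a and b) / (# docs containing a).
    The score is asymmetric and is an illustrative choice only.
    """
    doc_freq = Counter()
    pair_freq = defaultdict(Counter)
    for terms in docs:
        unique = set(terms)
        doc_freq.update(unique)
        for a, b in permutations(unique, 2):
            pair_freq[a][b] += 1
    return {
        a: {b: n / doc_freq[a] for b, n in related.items()}
        for a, related in pair_freq.items()
    }


def suggest(space, term, top_n=40):
    """Rank co-occurring terms by weight, breaking ties alphabetically."""
    related = space.get(term, {})
    ranked = sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:top_n]]


# Each document is represented by its extracted index phrases.
docs = [
    ["breast cancer", "tamoxifen", "chemotherapy"],
    ["breast cancer", "tamoxifen"],
    ["breast cancer", "mammography"],
    ["lung cancer", "chemotherapy"],
]
space = build_concept_space(docs)
suggestions = suggest(space, "breast cancer", top_n=2)
# suggestions == ["tamoxifen", "chemotherapy"]
```

Because the weight is normalized by the source term's document frequency, the relationship is directional: "tamoxifen" points to "breast cancer" with weight 1.0, while the reverse weight is only 2/3.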
MEDMap is a 2-D multi-layered graphical display of important medical concepts and a document server that supports guided browsing of concepts and documents. MEDMap is based on the Kohonen self-organizing map algorithm. Using as input the 48.5 million unique terms and 1.7 billion relationships from the Medical Concept Space, the system partitions MEDLINE documents into 132,700 categories on 4,586 maps. Clicking on a map region will take you down a layer in the multi-layered map, or show you the documents associated with that category. Textual concept labels and colors are used to demarcate regions in the SOM; color has no specific meaning. HelpfulMed was generated on a Silicon Graphics Origin 2000 8-node processor, using 14 days of CPU time. [demo]
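The Kohonen self-organizing map step can be sketched as follows. This is a single-layer toy version under stated assumptions: the function names, the linear decay schedules, the Gaussian neighborhood, and the four-dimensional document vectors are all illustrative, and nothing here reflects MEDMap's actual multi-layered implementation or scale.

```python
import math
import random


def best_matching_unit(weights, vec):
    """Grid cell whose weight vector is closest to vec (squared Euclidean)."""
    return min(
        weights,
        key=lambda cell: sum((a - b) ** 2 for a, b in zip(weights[cell], vec)),
    )


def train_som(vectors, rows, cols, epochs=50, seed=0):
    """Train a rows x cols map; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)  # learning rate decays linearly
        radius = max(rows, cols) / 2 * (1.0 - frac) + 0.5  # neighborhood shrinks
        for vec in vectors:
            br, bc = best_matching_unit(weights, vec)
            for (r, c), w in weights.items():
                # Cells near the winner on the grid are pulled toward vec too.
                grid_dist_sq = (r - br) ** 2 + (c - bc) ** 2
                influence = math.exp(-grid_dist_sq / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += rate * influence * (vec[i] - w[i])
    return weights


# Hypothetical term-weight vectors for four documents.
docs = {
    "doc_a": [1.0, 0.9, 0.0, 0.0],
    "doc_b": [0.9, 1.0, 0.1, 0.0],
    "doc_c": [0.0, 0.1, 1.0, 0.9],
    "doc_d": [0.0, 0.0, 0.9, 1.0],
}
som = train_som(list(docs.values()), rows=2, cols=2)
# Each document is assigned to the map region of its best-matching cell.
regions = {name: best_matching_unit(som, vec) for name, vec in docs.items()}
```

Documents with similar vectors tend to land in the same or adjacent cells, which is what lets a map region stand for a category of documents.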
MedTextus is an online search system designed to facilitate efficient and precise information retrieval for medical professionals and the general public. The system builds on information retrieval, document characterization, and visualization technologies. Its core components are meta search, noun phrasing, a concept mapper, and a self-organizing map (SOM). MedTextus starts by querying the selected medical literature databases with the given keywords. The spider then fetches the documents returned by those databases. Once the required number of web pages has been collected, the system analyzes them further: noun phrases are extracted from the pages, showing the user which key concepts relate to a given document. The concepts can also be visualized in a 2-D map, which categorizes the web pages by grouping them into regions, each of which represents a concept. Together, these features allow the user to collect information more effectively and view it in a more meaningful form. [demo]
BioMedical Text Mining Publications
1. H. Chen, A. Lally, B. Zhu, and M. Chau, “HelpfulMed: Intelligent Searching for Medical Information over the Internet,” Journal of the American Society for Information Science and Technology, 54 (7), 683-694, 2003.
2. G. Leroy and H. Chen, “Meeting Medical Terminology Needs-The Ontology-Enhanced Medical Concept Mapper,” IEEE Transactions on Information Technology in Biomedicine, Volume 5(4), 261-270, 2001.
3. K. Tolle and H. Chen, “Comparing Noun Phrasing Techniques for Use with Medical Digital Library Tools,”
Journal of the American Society for Information Science, Special Issue on Digital Libraries, Volume 51(4), 352-370, 2000.
4. A. L. Houston, H. Chen, B. R. Schatz, R. R. Sewell, K. M. Tolle, T. E. Doszkocs, S. M. Hubbard and T. D. Ng, “Exploring the Use of Concept Spaces to Improve Medical
Decision Support Systems, Volume 30 (2), 171-186, 2000.
5. H. Chen, J. Martinez, T. D. Ng and B. R. Schatz, “A Concept Space Approach to Addressing the Vocabulary Problem in Scientific Information Retrieval: An Experiment on the Worm Community System,”
Journal of the American Society for Information Science, Volume 48 (1), 17-31, 1997.
BioMedical Data Mining Publications
1. K. M. Tolle, H. Chen and H. Chow, “Estimating drug/plasma concentration levels by applying neural networks to pharmacokinetic data sets,”
Decision Support Systems, Volume 30 (2), 139-152, 2000.
2. A. L. Houston, H. Chen, S. M. Hubbard, B. R. Schatz, T. D. Ng, R. R. Sewell and K. M. Tolle, “Medical Data Mining on the Internet: Research on a Cancer
Artificial Intelligence Review, Volume 13, 437-466, 1999.
3. H. Chow, K. Tolle, D. Roe, V. Elsberry and H. Chen, “Application of Neural Networks to Population Pharmacokinetics
Journal of Pharmaceutical Sciences, Volume 86 (7), 840-845, 1997.
4. H. Chow, H. Chen, T. Ng, P. Myrdal and S. H. Yalkowsky, Backpropagation Networks for the Estimation of Aqueous Activity Coefficients of Aromatic
Journal of Chemical Information and Computer Sciences, American Chemical Society, Volume 3 (4), 723-728, 1995.
For additional information, please contact us.