Open Access Research

Enhancing access to the Bibliome: the TREC 2004 Genomics Track

William R Hersh1*, Ravi Teja Bhupatiraju1, Laura Ross1, Phoebe Roberts2, Aaron M Cohen1 and Dale F Kraemer1

Author Affiliations

1 Oregon Health & Science University, Portland, OR, USA

2 Biogen Idec Corp., Cambridge, MA, USA

Journal of Biomedical Discovery and Collaboration 2006, 1:3  doi:10.1186/1747-5333-1-3

Published: 13 March 2006

Abstract

Background

This paper describes the Genomics Track of the Text Retrieval Conference (TREC) 2004. TREC is a forum for evaluating information retrieval (IR) research systems, and retrieval in the genomics domain has recently begun to be assessed there. The goal of the TREC Genomics Track is to improve information retrieval in genomics by creating test collections that allow researchers to improve their systems and better understand where those systems fail. The 2004 track included an ad hoc retrieval task, simulating use of a search engine to obtain documents about biomedical topics.

Results

A total of 27 research groups submitted 47 different runs. The most effective runs, as measured by the primary evaluation measure of mean average precision (MAP), used a combination of domain-specific and general techniques. The best MAP obtained by any run was 0.4075. Techniques that expanded queries with gene name lists as well as words from related articles were the most effective. However, many runs performed worse than a simple baseline run, indicating that careful selection of system features is essential.
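For readers unfamiliar with the primary measure, the following is a minimal sketch of how MAP is commonly computed, not the track's official scoring code. The function names, the dictionary layout, and the document and topic identifiers are all hypothetical and chosen for illustration; the sketch also assumes binary relevance judgments and a ranked list without duplicate documents.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision for one topic.

    Precision is taken at the rank of each relevant document retrieved,
    and the sum is divided by the total number of relevant documents,
    so relevant documents that are never retrieved contribute zero.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(run: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Mean of the per-topic average precision over all judged topics.

    A topic with no ranked list in the run is scored as 0.
    """
    if not qrels:
        return 0.0
    total = sum(average_precision(run.get(topic, []), rel)
                for topic, rel in qrels.items())
    return total / len(qrels)


# Hypothetical toy data: two topics with invented document identifiers.
example_run = {
    "topic1": ["d1", "d2", "d3", "d4"],
    "topic2": ["d5", "d6"],
}
example_qrels = {
    "topic1": {"d1", "d3"},
    "topic2": {"d6"},
}
example_map = mean_average_precision(example_run, example_qrels)
```

In this toy example, topic1 has relevant documents at ranks 1 and 3, giving an average precision of (1/1 + 2/3) / 2, and topic2 has its single relevant document at rank 2, giving 1/2; the MAP is the mean of those two values.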

Conclusion

Approaches to ad hoc retrieval vary widely in efficacy. The TREC Genomics Track and its test collection resources provide tools for improving information retrieval systems.