Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation.

Lord PW, Stevens RD, Brass A, Goble CA.

Department of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK. p.lord@russet.org.uk

MOTIVATION: Many bioinformatics data resources hold data not only in the form of sequences, but also as annotation. In the majority of cases, annotation is written as scientific natural language: this is suitable for humans, but not particularly useful for machine processing. Ontologies offer a mechanism by which knowledge can be represented in a form capable of such processing. In this paper we investigate the use of ontological annotation to measure the similarity in knowledge content, or 'semantic similarity', between entries in a data resource. Such a measure allows a bioinformatician to compare annotation in a manner analogous to the comparisons performed over sequences. A measure of semantic similarity for the knowledge component of bioinformatics resources should afford a biologist a new tool in their repertoire of analyses.

RESULTS: We present the results from experiments that investigate the validity of using semantic similarity by comparison with sequence similarity. We show a simple extension that enables a semantic search of the knowledge held within sequence databases.

AVAILABILITY: Software available from http://www.russet.org.uk.
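
As a rough illustration of what a semantic similarity measure over ontological annotation can look like, the Python sketch below scores two entries by the information content of the most specific ontology term their annotations share. This is one common information-content approach and is an assumption here: the abstract does not state which measure the authors used. The toy ontology, the term names, the entry identifiers (P1 to P4) and the averaging over term pairs are all hypothetical.

```python
import math

# Hypothetical toy ontology: each term maps to its parent terms (is-a links).
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "rna_binding": {"binding"},
    "kinase": {"catalysis"},
}

# Hypothetical annotations: database entry -> ontology terms assigned to it.
ANNOTATIONS = {
    "P1": {"dna_binding"},
    "P2": {"rna_binding"},
    "P3": {"kinase"},
    "P4": {"dna_binding", "kinase"},
}


def ancestors(term):
    """Return the term itself plus every term reachable via parent links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log p(t), where p(t) is the fraction of annotations
    that use t or any of its descendants."""
    counts = {term: 0 for term in PARENTS}
    total = 0
    for terms in ANNOTATIONS.values():
        for term in terms:
            total += 1
            for anc in ancestors(term):
                counts[anc] += 1
    return {t: -math.log(c / total) for t, c in counts.items() if c > 0}


IC = information_content()


def term_similarity(t1, t2):
    """Information content of the most informative shared ancestor."""
    shared = ancestors(t1) & ancestors(t2)
    return max(IC.get(t, 0.0) for t in shared)


def entry_similarity(e1, e2):
    """Average term similarity over all pairs of annotated terms
    (an assumed way of combining multiple annotations)."""
    pairs = [(a, b) for a in ANNOTATIONS[e1] for b in ANNOTATIONS[e2]]
    return sum(term_similarity(a, b) for a, b in pairs) / len(pairs)
```

In this toy data, `entry_similarity("P1", "P2")` is higher than `entry_similarity("P1", "P3")`, because the first pair shares the specific term `binding` while the second shares only the root, which carries no information.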

PMID: 12835272 [PubMed - indexed for MEDLINE]