Evolutionary Biology & Ontologies Workshop report

Our first educational and outreach event, the “Evolutionary Biology & Ontologies Workshop”, was held at the Evolution meetings in Minneapolis, Minnesota (June 20, 2008), and we felt it was a big success. We had lots of enthusiasm and over 50 attendees for this all-day workshop, which was organized by the Phenoscape PIs (Paula Mabee, Todd Vision, Monte Westerfield), NESCent, and Barry Smith from the National Center for Biomedical Ontology (NCBO).

We especially thank all our speakers for their excellent presentations. They not only gave the audience a varied introduction to ontologies, but also a set of examples of the integrative questions that can be answered using them. The slideshows for all of these talks are on our wiki, but I thought I would provide a brief overview of each one below as a summary of the workshop. The use of ontologies is just emerging in evolutionary biology, and it is an exciting time to be involved in this field. As we move forward to use ontologies in evolutionary biology, we discover new requirements and challenges. One example is the challenge of creating ontologies that are interoperable, so that we can ask big questions that span not only taxonomic groups (such as bees, fishes, mice, and flies) but also different knowledge domains (such as phenotype, evolution, genetics, genomics, and medicine).

So here is an overview of each talk.

Barry Smith (NCBO and Univ. Buffalo) kicked off the workshop with “An Introduction to Ontology for Evolutionary Biology”. He talked about the Gene Ontology (GO) in relation to “doing biology across the genome”, and he described the domain coverage of the ontologies in the OBO Foundry. He introduced basic ontological concepts and relations, including several new ones that must be defined when creating ontology resources for evolutionary biology. He is currently working with a number of groups, including EoL, to develop an Environment Ontology (EnvO).

Chris Mungall (Lawrence Berkeley Lab) presented “PATO: The Phenotype and Trait Ontology”. He focused on why we need ontologies (for example, to solve the data integration problem), introduced the concept of “phenotype”, and described the newly proposed Homologous_to relation.

Monte Westerfield (Univ. Oregon, Zebrafish Information Network) talked about the use of ontologies in “Linking Animal Models and Human Diseases”. Just as BLAST can be used to connect animal genes to human genes, shared ontologies and syntax can connect mutant phenotypes to candidate human disease genes. He talked about the importance of curation consistency and how it can be measured and increased. His proof-of-principle example, comparing ontology-annotated phenotypes in human and zebrafish, demonstrated that genes associated with human diseases can be identified.

It was then my turn to introduce Phenoscape in my talk “An Introduction to the Use of Ontologies in Linking Evolutionary Phenotypes to Genetics”. Just as animal model phenotypes can be compared to human phenotypes using ontologies, so can animal model phenotypes be compared to phenotypes of multiple species – thus linking comparative evolutionary data to model organism genetics. I presented several use cases for fishes and described the rapid progress that we have made in developing new ontologies and extending existing ones, as well as some of the challenges involved in multispecies ontologies.

Jim Balhoff (NESCent, Phenoscape) presented the software that he has developed for curation of evolutionary phenotypes in his talk “Phenote: Curation Software for Evolutionary Phenotypes”. Phenote was originally developed for ontology-based annotation of genetic mutant phenotypes and is widely used by model organism curators. Jim described how he has extended and reconfigured it to handle comparative evolutionary data, and he also spoke about his plans to integrate ontology-based phenotype descriptions into the NeXML file format.

Melissa Haendel (Zebrafish Information Network) presented “The Common Anatomy Reference Ontology (CARO) and queries across species”. She described current anatomy ontologies and why we build them, and the standardization and integration of anatomy ontologies using CARO. She explained how the issues created by homology can be addressed, including the draft definition of the Homologous_to relationship, and she described how homology data can be used with ontologies to query across species.

After lunch we had about a half hour of “Lightning Talks”, during which workshop attendees had an opportunity to introduce themselves and give brief (2-3 minute) descriptions of their interest in ontologies. Several attendees seized the chance and made the audience aware of their ontology-building efforts.

After that, Ann Maglia (Missouri Univ. of Science and Technology, AmphibAnat) resumed the regular talks and spoke about her group’s work in the “Development of the Amphibian Anatomical Ontology”. This ontology was driven by the need of the amphibian community to standardize their terminology and thus integrate their comparative work. She is exploring automated techniques to reduce manual effort and enhance existing, manually created ontologies. Ann also described the relational database (RDBOM), the modularization, and the web-based community curation methods built by her group and community.

Martin Ramirez (Museo Argentino de Ciencias Naturales) in his talk “Ontologies, Image Databases, and Evolutionary Biology” described the use of the spider ontology to link images from the spider Tree of Life project to a phylogenetic dataset and a body of image annotation.  He discussed a series of challenges, such as issues involving the alignment of ontologies using homology relations and the “combinability” of ontologies.

Todd Vision (Univ. North Carolina, NESCent) presented “Ontologies and the Identification of Candidate Genes for Complex Traits”. He considered the qualities of “good” candidate genes and how they are typically discovered and ranked, and then described a test case where ontology terms were used in picking candidate genes computationally (using an approach named CAESAR) for type 2 diabetes. This approach has a promising application for evolutionary traits in non-model organisms.

Peter Midford (Univ. Kansas, Phenoscape) introduced taxonomy ontologies with his talk “Names, Ranks, Clades, and Taxonomy Ontologies”, covering what is in them, how they are built, and why one would use one. The issue of what species are ontologically, and more specifically whether species are classes, instances, or both, affects not only the way an ontology is built but also what kinds of assertions can or cannot be inferred.

Suzanna Lewis (Lawrence Berkeley Lab, BBOP) and I wrapped up by listing some of the challenges for the evolutionary biology community that became apparent over the day. 

Especially prominent was the need for communication among evolutionary biologists developing ontologies so that they can be developed from the outset to be interoperable.  This and many related comments from attendees suggested that an NSF-RCN (Research Coordination Network) proposal might be the best next step and an effective mechanism to fund and promote communication among and between evolutionary biologists, ontologists, and model organism biologists.

As a result, we are now beginning to gather ideas, objectives, and possible participants for an RCN proposal. If you have thoughts on this, or if you are interested in participating, we would love to hear from you – leave a comment here, or email me (pmabee{at}usd{dot}edu).

