Biomedical Domain Benchmark

Biomedical ontologies bring unique challenges to the ontology alignment problem. Moreover, there is an explicit interest in ontologies and ontology alignment in the domain of biomedicine. Consequently, we present a new biomedical ontology alignment testbed, which provides an important application context to the alignment research community. Because biomedical ontologies are large, the testbed could also serve as a comprehensive large-ontology benchmark. Existing correspondences submitted to the National Center for Biomedical Ontology (NCBO) may serve as the reference alignments for the pairs, although our analysis reveals that these maps represent just a small fraction of the total alignment that is possible between two ontologies. Consequently, new correspondences that are discovered during benchmarking may be submitted to NCBO for curation and publication.

To create the testbed, we combed through more than 300 ontologies hosted at NCBO and OBO Foundry, and isolated a benchmark of 50 different biomedical ontology pairs. The ontology pairs are listed in the table below. Our primary criterion for including a pair in the benchmark was the presence of a sufficient number of correspondences between the ontologies in the pair, as determined from NCBO’s BioPortal. We briefly describe the steps in creating the testbed:

  1. We selected ontologies that exist in either OWL or RDF models.
  2. Next, we paired the ontologies and ordered the pairs by the percentage of available correspondences. This is calculated as the ratio of the number of correspondences that exist in BioPortal for the pair of ontologies under consideration to the product of the numbers of entities in the two ontologies.
  3. We selected the top 100 ontology pairs and then ordered them by their joint sizes.
  4. We divided the ordered pairs into 5 bins of equal size and randomly sampled each bin with a uniform distribution to obtain the final 50 pairs.
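
The selection steps above can be sketched in Python as follows. This is a minimal illustration, not the code used to build the testbed. The `Pair` record, the function names, and the seed are hypothetical. The sketch also makes two assumptions that the description does not state: the joint size of a pair is taken to be the sum of the entity counts of its two ontologies, and the same number of pairs (10) is drawn from each of the 5 bins.

```python
import random
from typing import List, NamedTuple


class Pair(NamedTuple):
    """Hypothetical record for one candidate ontology pair."""
    onto1: str
    onto2: str
    n_correspondences: int  # correspondences listed for the pair
    size1: int              # number of entities in the first ontology
    size2: int              # number of entities in the second ontology


def coverage(p: Pair) -> float:
    """Step 2: correspondences divided by the product of entity counts."""
    return p.n_correspondences / (p.size1 * p.size2)


def joint_size(p: Pair) -> int:
    """Assumption: joint size is the sum of the two entity counts."""
    return p.size1 + p.size2


def select_benchmark(pairs: List[Pair], top_k: int = 100, n_bins: int = 5,
                     per_bin: int = 10, seed: int = 0) -> List[Pair]:
    rng = random.Random(seed)
    # Steps 2 and 3: keep the top_k pairs by coverage, then order by joint size.
    top = sorted(pairs, key=coverage, reverse=True)[:top_k]
    by_size = sorted(top, key=joint_size)
    # Step 4: split into n_bins contiguous bins and sample each one
    # uniformly at random without replacement.
    bin_len = len(by_size) // n_bins
    selected: List[Pair] = []
    for b in range(n_bins):
        start = b * bin_len
        # The last bin absorbs any remainder when top_k is not divisible.
        end = start + bin_len if b < n_bins - 1 else len(by_size)
        bin_pairs = by_size[start:end]
        selected.extend(rng.sample(bin_pairs, min(per_bin, len(bin_pairs))))
    return selected
```

Binning by joint size before sampling keeps small, medium, and large pairs all represented in the final set, which a plain random draw from the top 100 would not guarantee.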

Biomedical Ontology Alignment Benchmark

Test ID Ontology 1 Ontology 2 Reference Alignment
1 1114 1021 reference alignment
2 1063 1022 reference alignment
3 1038 1587 reference alignment
4 1114 1022 reference alignment
5 1017 1001 reference alignment
6 1078 1022 reference alignment
7 1017 1587 reference alignment
8 1090 1095 reference alignment
9 1568 1022 reference alignment
10 1001 1587 reference alignment
11 1108 1587 reference alignment
12 1068 1402 reference alignment
13 1095 1013 reference alignment
14 1574 1013 reference alignment
15 1095 1051 reference alignment
16 1095 1110 reference alignment
17 1574 1000 reference alignment
18 1362 1030 reference alignment
19 1110 1574 reference alignment
20 1090 1013 reference alignment
21 1090 1051 reference alignment
22 1090 1110 reference alignment
23 1065 1022 reference alignment
24 1108 1005 reference alignment
25 1095 1404 reference alignment
26 1051 1110 reference alignment
27 1574 1404 reference alignment
28 1095 1022 reference alignment
29 1574 1022 reference alignment
30 1027 1013 reference alignment
31 1005 1013 reference alignment
32 1090 1404 reference alignment
33 1090 1022 reference alignment
34 1362 1404 reference alignment
35 1362 1015 reference alignment
36 1362 1022 reference alignment
37 1013 1404 reference alignment
38 1015 1013 reference alignment
39 1013 1022 reference alignment
40 1107 1022 reference alignment
41 1051 1404 reference alignment
42 1404 1000 reference alignment
43 1051 1022 reference alignment
44 1064 1123 reference alignment
45 1110 1022 reference alignment
46 1027 1022 reference alignment
47 1005 1404 reference alignment
48 1015 1005 reference alignment
49 1041 1007 reference alignment
50 1005 1022 reference alignment


Ontologies

NCBO ID Ontology Total Classes
1404 Uber anatomy ontology 7294
1041 Protein modification 1338
1017 FlyBase Controlled Vocabulary 821
1038 Plant Growth and Development Stage 282
1007 Chemical entities of biological interest 31470
1107 Phenotypic quality 2281
1013 eVOC (Expressed Sequence Annotation for Humans) 2274
1090 Amphibian gross anatomy 1603
1021 Human developmental anatomy 2314
1065 Tick gross anatomy 628
1078 Spatial Ontology 129
1051 Zebrafish anatomy and development 2788
1000 Mouse adult gross anatomy 2982
1068 Subcellular Anatomy Ontology (SAO) 821
1114 Bilateria anatomy 114
1123 Ontology for Biomedical Investigations 3537
1587 Plant Ontology 1585
1063 Common Anatomy Reference Ontology 50
1568 Anatomical Entity Ontology 238
1064 Fly taxonomy 6599
1001 Cereal plant gross anatomy 1270
1574 vertebrate Homologous Organ Groups 1184
1402 NIF Cell 2703
1110 Teleost Anatomy Ontology 3039
1005 BRENDA tissue / enzyme source 5139
1108 Plant Anatomy 1270
1022 Human developmental anatomy 8340
1027 Medaka fish anatomy and development 4358
1362 Hymenoptera Anatomy Ontology 1930
1015 Drosophila gross anatomy 7797
1095 Xenopus anatomy and development 1041
1030 Mosquito gross anatomy 1864