- Information Processing Society of Japan
- IPSJ Transactions on Bioinformatics (ISSN: 1882-6679)
- vol.1, pp.35-41, 2008 (Released:2008-11-28)
The Barcode of Life (BOL) project aims to identify species using no information other than DNA sequence. We assume that BOL also carries information on higher taxa. In the present study, we compute a nonmetric distance from BOL barcodes, using the rank order of pairwise distances, for three distinct examples: Ant Diversity in Northern Madagascar, Survey of Chelicerates, and Birds of North America. This makes higher taxa, i.e., genus, family, and order, easier to recognize. For example, the ratio of the mean intrataxa nonmetric distance to the mean intertaxa nonmetric distance is smaller than the corresponding ratio for the raw (metric) distance. Furthermore, for most pairs of higher taxa, the mean intertaxa distance is more than twice as large as the intrataxa distances. When each species is embedded into a space of more than 40 dimensions, the nonmetric multidimensional scaling method discriminates higher taxa better than tree construction by the neighbor-joining method or the maximum parsimony method with the raw distance measure, reaching an accuracy of 90% even after leave-one-out cross-validation.
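
The rank-order idea and the intrataxa-to-intertaxa ratio mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact transform, so the code below assumes one plausible reading: each pairwise distance is replaced by its rank among all pairwise distances (ties averaged). The function names `rank_transform` and `intra_inter_ratio`, the toy matrix `d`, and the taxon labels are hypothetical and are not taken from the paper or its datasets.

```python
import numpy as np

def rank_transform(dist):
    """Replace each off-diagonal distance by its rank (1 = smallest).

    Assumption: ranks are taken over all pairs at once, with tied
    distances sharing their average rank.
    """
    dist = np.asarray(dist, dtype=float)
    iu = np.triu_indices(dist.shape[0], k=1)   # each unordered pair once
    values = dist[iu]
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):                # average the ranks of ties
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    out = np.zeros_like(dist)
    out[iu] = ranks
    return out + out.T                         # symmetric, zero diagonal

def intra_inter_ratio(dist, labels):
    """Mean within-taxon distance divided by mean between-taxon distance."""
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    iu = np.triu_indices(len(labels), k=1)
    same = labels[iu[0]] == labels[iu[1]]
    pair_dist = dist[iu]
    return pair_dist[same].mean() / pair_dist[~same].mean()

# Hypothetical toy data: four barcodes from two higher taxa, A and B,
# that are only weakly separated in the raw distance.
labels = ["A", "A", "B", "B"]
d = np.array([
    [0.00, 0.08, 0.10, 0.12],
    [0.08, 0.00, 0.11, 0.13],
    [0.10, 0.11, 0.00, 0.09],
    [0.12, 0.13, 0.09, 0.00],
])

r = rank_transform(d)
print("raw ratio: ", intra_inter_ratio(d, labels))
print("rank ratio:", intra_inter_ratio(r, labels))
```

In this toy matrix the ratio drops after the rank transform, which matches the direction of the effect the abstract reports. That is a property of the chosen numbers, not a general guarantee. A rank matrix of this kind is also the sort of input a nonmetric multidimensional scaling routine works from, since that method depends only on the ordering of the dissimilarities.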