DNA Barcoding Workflow
Species identification through barcoding is usually achieved by the retrieval of a short DNA sequence – the ‘barcode’ – from a standard part of the genome (i.e. a specific gene region) from the specimen under investigation. The barcode sequence from each unknown specimen is then compared with a library of reference barcode sequences derived from individuals of known identity. A specimen is identified if its sequence closely matches one in the barcode library. Otherwise, the new record can lead to a novel barcode sequence for a given species (i.e. a new haplotype or geographical variant), or it can suggest the existence of a newly encountered species.
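The comparison step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used by any particular barcoding system: it assumes the sequences are already aligned to the same length, it measures similarity as the proportion of differing sites (p-distance), and the 2% cutoff for a ‘close match’ is a hypothetical value chosen only for the example. The function names p_distance and identify are likewise illustrative.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def identify(query, library, threshold=0.02):
    """Find the closest reference barcode for a query sequence.

    library is a list of (species_name, aligned_sequence) pairs.
    threshold is an ASSUMED cutoff for a 'close match' (hypothetical value).
    Returns (species_name, distance) if the best match is within the
    threshold, otherwise (None, distance) to flag a possible new
    haplotype or species.
    """
    if not library:
        raise ValueError("reference library is empty")
    best_species, best_dist = min(
        ((species, p_distance(query, ref)) for species, ref in library),
        key=lambda pair: pair[1],
    )
    if best_dist <= threshold:
        return best_species, best_dist
    return None, best_dist
```

A query that returns None is not identified by the library; as the text notes, such a record may represent a new variant of a known species or a newly encountered species, and would need further taxonomic study.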
Various gene regions have been employed for species-level biosystematics; however, DNA barcoding advocates the adoption of a ‘global standard’, and a 650-base fragment of the 5′ end of the mitochondrial gene cytochrome c oxidase I (COI, cox1) has gained designation as the barcode region for animals. This fragment size has been selected so that a reliable sequence read can be obtained by a single sequence pass in conventional cycle sequencing platforms. Where a 650-base sequence is not easily obtainable, however, shorter fragments of COI have also been shown to be effective for the identification of specimens with degraded DNA. In addition, the usability and robustness of COI in a standard high-throughput barcoding analysis have been extensively assessed.
Other researchers have suggested that alternate loci might also serve as a basis for species identification. For example, 18S rDNA has been used for the identification of soil nematodes and other small organisms in an approach known as ‘DNA taxonomy’. This approach differs from DNA barcoding in that it does not aim to link the genetic entities recognised through sequence analysis with Linnaean species. As such, it is most useful for groups of organisms that lack detailed taxonomic systems. Alternate markers have also been used where COI sequences have not been produced robustly or are shown to be divergent within species, or as further molecular evidence in the discovery of cryptic species. Moreover, in some groups such as plants, COI (and mitochondrial genomes at large) do not evolve rapidly enough to provide species-level resolution, and alternative markers are being pursued.
Several studies have demonstrated the effectiveness of DNA barcoding in different animal groups. These projects have shown that >95% of species possess unique COI barcode sequences; thus species-level identifications are regularly attained. The earliest barcode studies received some criticism, mainly owing to their limited taxonomic and geographical breadth; however, more recent studies have addressed these issues by targeting species-rich groups (i.e. those containing many closely related species) in tropical settings, and by comprehensive analyses of all the species in a given taxonomic assemblage. Momentum has further been aided by the establishment of the Consortium for the Barcode of Life (CBOL, http://barcoding.si.edu) – an international alliance of research organizations that support the development of DNA barcoding as an international standard for species identification – and by the development of the Barcode of Life Data Systems (http://www.barcodinglife.org) – a global online data management system for DNA barcodes.
Barcoding projects typically involve gathering specimens of a given taxonomic group (identified by conventional taxonomic methods such as morphology; see below), cataloging them together with collateral data such as photographs and locality information, and assembling the barcode library (i.e. sequences of a 650-base segment of the COI gene). The analysis of DNA barcoding data is usually performed by a clustering method, such as distance-based neighbor-joining (NJ), and by evaluating genetic distances within and between species. More complex methodologies for data analysis are under development, including statistical tests for species assignment, and character-based clustering methods.
Hajibabaei et al., 2007
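The evaluation of genetic distances within and between species, mentioned above, can be illustrated with a short Python sketch. It is not taken from the source: it assumes aligned sequences of equal length, uses the simple p-distance rather than a substitution model, and summarises the result as the largest within-species distance and the smallest between-species distance, which is only one of several possible summaries. Neighbor-joining clustering itself is not implemented here, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def distance_summary(records):
    """Summarise pairwise distances within and between species.

    records is a list of (species_name, aligned_sequence) pairs.
    Returns (max_within, min_between); either value is None when no
    pair of that kind exists in the data.
    """
    within, between = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(records, 2):
        d = p_distance(seq1, seq2)
        if sp1 == sp2:
            within.append(d)
        else:
            between.append(d)
    max_within = max(within) if within else None
    min_between = min(between) if between else None
    return max_within, min_between
```

If the smallest between-species distance clearly exceeds the largest within-species distance, the barcodes separate the species in the data set; overlap between the two suggests that a distance cutoff alone would not separate them reliably.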