They occur almost once in every 1, nucleotides on average, which means there are roughly 4 to 5 million SNPs in a person's genome. These variations may be unique or occur in many individuals; scientists have found more than million SNPs in populations around the world. Most commonly, these variations are found in the DNA between genes. They can act as biological markers, helping scientists locate genes that are associated with disease.
Furthermore, targeted R genes represent a very important class of plant genes, with important roles in creating profitable breeding programs and studying plant biodiversity and evolution. Sequence information is not required prior to analysis, as R gene profiling employs locus-specific degenerate primers targeting highly conserved R gene domains. Since primers for RGMs are conserved they can be easily transferred to virtually any plant taxon, facilitating cross-species amplifications.
Furthermore, resistance-gene derived fragments can be further analyzed and converted to cleaved amplified polymorphic sequence (CAPS) and sequence-characterized amplified region (SCAR) markers [ , , ]. The study of Valkonen et al. This can be a major advantage over other gene-targeting markers such as retroelement-based markers, where prior sequence information is required to position the primers.
This advantage has increased the popularity of AAD markers in a diverse range of plant groups. As alternatives to other techniques, RGMs can be used where no prior genomic knowledge or even no sequence information is available. Such markers can be used to assess genetic diversity among resistance loci, or to characterize germplasm collections based on these traits (Table 3).
The effective characterization of the gene pools of wild relatives of crop species using RGM methods could be highly beneficial. It could facilitate the management of genetic resources, as plant breeding programs are mostly concerned with finding and introgressing traits, mostly resistance genes, found in wild relatives. An important requirement for phylogenetic studies is that inferences should be based on homologous characters that share common ancestry.
Strictly homologous molecular characters or orthologous sequences are often assumed to map to the same genomic location, while paralogs map to different positions. However, orthologous sequences could also map to different positions due to extensive genomic rearrangements [ ]. Therefore, it is better to view homology as a relationship based on common origin between any entities without further distinction [ ], while orthology is descent from a single ancestral sequence, with the relationship viewed in terms of speciation (vertical descent).
Paralogy, by contrast, can be viewed as a relationship via duplication [ ]. Many multi-locus methods fail to fulfill the requirement for homology as they produce non-homologous bands that are mistakenly inferred to be homologs after phylogenetic analysis. In this case the scored bands are apparently similar but phylogenetically independent. In DNA fingerprinting, apparent homology may arise from non-identical bands that co-migrate simply by chance or because they share similar sequences, but these can be orthologs, pseudogenes, transposable elements or even repetitive elements with unknown functions [ ].
False scoring of just slightly different size fragments in two separate profiles can also lead to false homology [ ]. In this respect, the problem of correct homology assessment may not be restricted to phylogenetics but may be a factor in all genome scanning studies. In the case of resistance-gene based banding patterns it can be difficult to define characters as either orthologous or paralogous. Genetically linked gene families have higher probabilities for recombination than single genes.
Genetic recombinations between alleles of R genes of the same cluster can re-assort the genetic variation created by mutation to create new alleles [ ]. The importance of this in R genes is illustrated by the fact that most novel alleles are associated with recombination events [ ]. In the reciprocal arms race of host parasite evolution a number of factors affect the degree to which the members of an R gene cluster recombine with each other to create new variants.
Although the resistance-gene families are regarded as stable complexes, unequal recombination occurs, albeit only at low frequencies. In the case of some unexplained scenarios, unequal recombinations can be implicated as sources of homoplasy. However, homoplasy becomes a greater problem when distantly related species are involved and is less likely to be a problem for studies of very closely related species with a similar genomic organization [ 39 , 40 ]. The targeting of more conserved regions of resistance-genes makes RGMs more appropriate for many applications, since the chance of homoplasy is reduced.
This may be due to several factors such as constraints on allele size range, high mutation rates, size homoplasy and low levels of conservation of SSRs among Zingiberaceae, which hampered the use of microsatellites in this study. In these studies, the patterns generated by NBS-profiling complemented the results obtained from the other marker systems.
Similar comparisons for RGAP have not yet been made. This indicates that resistance-gene based markers can be at least as useful as AADs or SSRs for phylogeny reconstruction, and they may even perform better when more diverse material is used due to a reduction in the levels of homoplasy.
As paralogy depends on the mutation rate of the RGAs it may be possible that bands are non-homologous. If co-migrating non-homologous bands do exist in resistance-gene based fingerprints their frequency must be low due to the specificity of amplification as discussed in the previous section. However, the drawbacks of using degenerate primers may yet remain, as these specific primers may nonetheless be biased towards known R genes.
However, it has been shown that although R gene profiling yields genes that are already known, plenty of new RGAs are also targeted [ 68 ]. Functional sequences are assumed to be under selection. Plants do not have circulatory system based immunity as is seen in animals.
Therefore, they are very dependent on individual cellular defense mechanisms, which are often based on single R-genes with specific structures.
These genes are likely to be under selection, which might influence the outcome of any phylogenetic analysis. Results indicate that different regions of these genes evolve with different rates according to a birth-and-death process [ ].
Some regions are hypervariable and incorporate many non-synonymous and synonymous mutations, while other parts evolve at a steadier rate. Resistance-gene based fingerprints are preferentially generated from plant resistance genes; therefore they better reflect the evolution of these genes within a species, or among certain taxa. In the case of tuber-bearing Solanum species, poor resolution was obtained at the basal nodes of the reconstructed phylogenetic trees based on NBS-profiling [ ].
The authors explained this by extensive hybridization among species that evolved within a relatively short period of time, coupled with rapid radiation with no clear sequential branching.
This observation may indicate that R gene evolution and species evolution could be linked, and banding patterns may reflect true phylogenies. R genes with different selection mechanisms may occur in a specific profile at a relatively low frequency, but these few bands will not significantly affect the overall phylogeny. These single resistance genes in some cases could be crucial for the survival of a species at a particular moment of speciation.
On an evolutionary time scale this would equate to a short period as plant pathogens spread rather fast, requiring that the resistance genes necessary for survival should also spread rapidly.
According to Wang et al. Biological responses of plant cells to certain stress factors are important phenomena, as these processes depend on the regulation of gene expression. Many methods have been developed in an attempt to gain an insight into these processes, and this has led to the generation of PCR-based markers.
Fingerprinting markers are based on the specific amplification of a subset of fragments, which can be derived from RNA as well as DNA. The techniques summarized here are based on transcribed regions of the genome that are most likely functional. The methods described here may utilize the RNA pool directly, or after further processing, using cDNA or ESTs coupled with bioinformatic tools to generate random or specifically designed primers. Endogenous non-coding small RNAs consisting of 20 to 24 nucleotides are ubiquitous in eukaryotic genomes, where they play important regulatory roles [ 69 ], and they provide an excellent source for molecular marker development.
The flanking sequences of small RNAs are conserved, allowing the design of primers for use in PCR reactions and fingerprinting (Figure 9). The technique developed by Gui et al. The basic principle is to use primer pairs flanking small RNAs to initiate a PCR reaction and detect length polymorphisms that are due to indels present in the small RNA pool [ ].
According to the authors the technique is reproducible, representing a high-throughput, non-coding, sequence-based marker system. It can be used for genome mapping and for genotyping.
Outline of iSNAP. Differently oriented small RNAs (grey arrows) are present in the genome. These primers can be used to generate fingerprints either solely, as presented in the figure, or in combination. Successful amplifications depend on the orientation of small RNAs in the genome. PCR products are depicted as brown bars.
This method was developed by Bachem et al. Further steps are similar to the protocols for AFLP and include restriction digestion with one or two restriction enzymes, with the cDNA used as a primary template.
The digestion is followed by the ligation of adapters and anchors. After this a preamplification is carried out with primers corresponding to the anchors. In the final step a selective amplification is implemented, with extended primers having one or even more selective nucleotides. The resulting fingerprints are visualized by silver-staining of polyacrylamide, or else fluorescently labeled primers can be used to detect peaks.
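The effect of the selective nucleotides in the final amplification step can be illustrated with a small sketch. It assumes, purely for illustration, that each template is represented only by the bases immediately adjacent to the primer site, and that a primer extended by n selective nucleotides amplifies only templates whose first n bases match the extension. The function name `selective_subset` and the toy template pool are hypothetical and are not part of any published protocol.

```python
from itertools import product


def selective_subset(templates, selective_bases):
    """Return the templates whose first bases match the primer's selective extension."""
    ext = selective_bases.upper()
    return [t for t in templates if t.upper().startswith(ext)]


# Toy pool (assumption): every possible 3-base start next to the adapter site.
pool = ["".join(p) for p in product("ACGT", repeat=3)]

for ext in ["", "A", "AC", "ACG"]:
    kept = selective_subset(pool, ext)
    label = ext if ext else "none"
    print(f"{len(ext)} selective base(s) ({label}): {len(kept)} of {len(pool)} templates")
```

In this idealized pool each additional selective nucleotide keeps about one quarter of the remaining templates, which is why extended primers reduce a complex cDNA population to a scorable number of bands.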
This technique is efficient for the identification of common and rare transcripts and for studying genome-wide gene expression [ ]. It can also be used to identify differences in the expression of different genes under various stress conditions [ ]. Since the initial description of the basic techniques, many modifications have been published that have increased the efficiency of the method [ , , ].
Using cDNA-AFLP a genome-wide transcriptome map has been constructed for Arabidopsis [ ], and it has also successfully been used to detect gene expression alterations in Triticum aestivum [ ] and to develop polymorphic transcript-derived fragments (TDFs) in Manihot esculenta Crantz [ ].
The study of Bryan et al. These markers can be converted to specific PCR markers, and these genome-specific amplicons used in gene tagging or diagnostics. Subsequent studies have modified the basic technique by altering the probes, or the way that the probes are generated for the analysis.
Probes can be designed in such a way that permits applications across species or even across genera [ ] within a particular plant family [ ].
This method has been used effectively in several plant species such as sunflower Helianthus annuus L. Sequencing of cDNA produces a large amount of information, now available in public databases. Expressed sequence tags (ESTs) are short transcribed sequences that are usually read in a single direction and provide a good basis for gene expression analyses and detecting genetic diversity.
Many available bioinformatics tools, e. The recent increase in the availability of expressed sequence tag (EST) data has facilitated the development of microsatellite or simple sequence repeat (SSR) markers in a number of plant species groups [ ].
The major difference is in primer development and the locations of the primers, as EST-SSRs are generated from the transcribed region of the genome. They are harvested directly from sequence data using in silico techniques. Data mining can be carried out in many alternative databases specifically designed for particular plant groups, e.
There are many software tools specifically designed for database mining, e. Further examples can be found in the review by Varshney et al. Expressed sequence tag derived genic SSRs are most likely to be found within functional sequences, and thus provide abundant information compared to genomic SSR markers. Their most important feature is easier transferability among distantly related species compared with gSSRs.
Such markers can be used for the same purposes as gSSRs and have proved to be useful in the analysis of alpine lady-fern (Athyrium distentifolium Tausch ex Opiz) [ ], rice [ ], and the genus Medicago L.
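The in silico harvesting of SSRs from EST sequence data can be sketched as a simple pattern search. The example below is a minimal illustration, not a reimplementation of any of the published mining tools: the function name `find_ssrs`, the toy EST sequence, and the minimum repeat thresholds per motif length are all assumptions chosen for demonstration, and only perfect (uninterrupted) repeats are reported.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Find perfect microsatellites with 2-6 bp motifs in a DNA sequence.

    Returns a sorted list of (start, motif, repeat_count) tuples, with 0-based start.
    """
    if min_repeats is None:
        # Assumed thresholds: motif length -> minimum number of repeat units.
        min_repeats = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AGAG" is reported once, as the dinucleotide "AG").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical EST fragment containing an (AG)8 and a (CTT)5 repeat.
est = "GGCTA" + "AG" * 8 + "TTCGA" + "CTT" * 5 + "GCA"
for start, motif, count in find_ssrs(est):
    print(f"({motif}){count} at position {start}")
```

Primers would then be designed in the sequence flanking each reported repeat; that step depends on the primer design software used and is not shown here.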
Plant genetic programs aiming to characterize the transcribed region of the genome yield a large amount of ESTs, genes and cDNA clones directly accessible from different databases developed for these purposes.
In most cases the major aim of these studies is not the generation of new marker sets, or the development of primers based on novel sequence information, but rather analysis of for example plant stress responses. However, marker development can benefit from such approaches as new primers from the expressed region of the genome can be developed with bioinformatics tools and algorithms. In this regard cDNA or EST derived markers are no more than byproducts of large sequencing projects that can be sorted by bio-data mining.
Such processes can be carried out relatively easily and without significant costs if free software is used for data processing. Once ESTs are generated and used for different purposes new primers can be developed cheaply.
The same applies to iSNAP markers, as these were also developed based on the results of large scale next generation sequencing of small RNAs. The greatest advantage of RBMs is that they are derived from the expressed region of the genome.
The generated fragments can easily be associated with phenotypic traits, this being extremely important for genetic mapping studies. On the other hand, in studies aiming to explore genetic variation in natural populations these markers should be used with caution, because they may be under selection. RNA-based markers are also expected to be transferable between related species and genera as the primers are designed from conserved coding regions of the genome.
As iSNAP is a recent technique, information is still sparse. Easy transferability of the EST-derived markers has been demonstrated in several studies [ – ]. The consensus finding of these studies is that EST-derived markers can be applied without any redundancy in related plant genera, even in cases where detailed sequence or EST information is lacking.
However, in cross-species applications the recurring problem of orthology assessment can arise. Studies suggest that primers designed for a given species will most probably amplify the same fragment in related genera [ ]. The amplification success rate seems to vary among different plant groups. Due to their robustness, the development of EST-derived markers is especially popular in crop breeding programs, especially in cereals, where large genomic libraries exist and ESTs are more frequently used compared with other crop species [ ].
Genetic diversity research programs exploring the wild relatives of economically important crop species have a particular opportunity to benefit from these developments.
Unfortunately, the same cannot be said for other plant genera that lack economic importance despite having ecological or evolutionary significance.
A summary of various aspects of RBMs can be found in Table 3. Taking advantage of the increasing knowledge of genomic elements, a novel family of markers has been developed, here termed targeted fingerprinting markers (TFMs).
These are by definition multi-locus markers, generated in a semi-random and targeted manner from various regions of the genome, and presumably corresponding to polymorphic sites of any gene or gene-related region irrespective of their function. This means that the marker systems grouped here are gene-targeted markers which do not necessarily yield fingerprints involved in phenotypic trait variation.
TFM markers tend to combine advantageous features of several basic techniques, while also incorporating methodological modifications to increase sensitivity and resolution in order to detect genetic discontinuity and distinctiveness.
They incorporate modifications of the primers and benefit from a priori genomic information available for the organism. Anchoring elements e. Fingerprints are generated in a semi-random manner: because common features of the plant genome are incorporated, banding patterns are produced from anonymous but targeted sites.
This enables whole genome distribution and better reproducibility than can be achieved with specific primer design or even with modified PCR protocols. Exploiting common genomic features makes TFM techniques easily transferable between many organisms and provides alternatives to previous AAD markers. They differ from each other with respect to important features such as genomic abundance, level of polymorphism detected, locus specificity, reproducibility, technical requirements, and cost.
The major TFM techniques will be summarized here according to their requirements and the modifications that characterize them. This technique, developed by Desmarais et al. It also has the advantages of high-resolution fingerprinting in that it offers the possibility of directly sequencing each new marker locus [ ].
It was designed to obtain nucleotide sequence information for DNA fragments from any genome with no a priori sequence data (Figure). This technique is an explicit extension of RAPD with longer primers (19–21 bp). The main advantage of the method is that banding patterns (Additional file 10) are obtained with a minimum number of primers by simple combinations and by changing only one primer between different experiments.
Studies utilizing DALPs report that results can be reliably and rapidly obtained for a wide variety of purposes, including investigation of population diversity [ , ], genetic mapping [ ] and defining new monolocus co-dominant markers [ ].
Outline of the DALP technique.
Promoter regions facilitate gene transcription and are located close to a particular gene [ ]; therefore they can be used to specifically profile the genome of the analyzed organism.
Promoter elements determine the point of transcription initiation and alter the rate and specificity of transcription [ ]. The gene specific architecture of promoter sequences shows high diversity, consisting of many short motifs that serve as recognition sites for proteins involved in transcription initiation [ , ]. This feature of promoters makes them suitable for tagging with degenerate primers to generate length polymorphisms, easily detectable by electrophoresis.
Pang et al. It is relatively difficult to characterize promoter regions in different organisms, but numerous databases e. The authors imply that the technique might be useful for developing molecular markers to search for polymorphism associated phenotypic traits amplified from the regulatory regions of plant genomes. A large number of polymorphisms can be revealed using primers targeting short recognition sites in the plant genome, since almost any primer can initiate PCR amplification.
Region amplified polymorphic RAP techniques also use arbitrary primers, but differ significantly from the widely used RAPD technique [ 14 ]. Based on the modifications incorporated in the primers, three main techniques have been developed.
The primers used in this technique are longer (17–21 nt) than the 10 nt ones used in RAPD. With the inclusion of this motif in the core of the forward primer, exon regions containing this element are preferentially amplified. Since these regions are more variable between different individuals, the intrinsic dissimilarity incorporated in the primer sets makes it feasible to generate polymorphic bands based on introns and exons [ 75 ].
The PCR profile is also modified to ensure specificity and high stringency and consists of two parts, the early and late cycles (Figure). Primer-template annealing depends on the degree of match between the two sequences, which determines the amplification efficiency. The low initial annealing temperature ensures the binding of both primers to sites with partial matches in the target DNA, creating a population of amplicons that contains the priming sites.
This is similar to in vitro mutagenesis using PCR primers [ ]. SRAP has rapidly gained in popularity based on the following advantages: (i) a large number of polymorphic fragments are amplified in each reaction, (ii) there is no a priori need for information about sequences, (iii) primers can be applied to any species, (iv) it is cost effective and easy to perform, (v) reproducibility is high, and (vi) PCR products can be directly sequenced using the original primers without cloning.
The method has now been widely used in plant genetics (see Additional file 3).
Outline of SRAP.
The fixed primer is designed from available partial sequences of candidate genes, such as expressed sequence tags (ESTs). The generation of fixed primers limits the use of this technique to species where ESTs are known, or requires the generation of new sequence information for primer development (Additional file). Despite this limitation it has been widely used for several purposes in different plant species, e.
CoRAP [ 77 ] is also based on the use of a fixed and an arbitrary primer. The only difference is in the arbitrary primer, which contains a different core sequence motif (CACGC), commonly found in plant gene introns. This core sequence ensures the utilization of conserved intron sequences in plant genotyping while the fixed conserved primers target coding sequences, together generating highly reproducible and reliable fingerprints.
If the distribution of these gene elements allows successful PCR, banding patterns resulting from a specific fingerprint will be amplified. Indels in these regions will certainly generate different distributions of amplified products.
The closer the genetic relationship between the two individuals, the more similar the corresponding band patterns of the amplified PCR products will be [ 77 ]. Molecular markers from the transcribed region of the genome have potential for various applications in plant genotyping as they reveal polymorphism that might be directly related to gene function. This method is based on the observation that the ATG translation start codon of plant genes is flanked by short conserved regions [ ].
The technique uses single primers designed to anneal to the flanking regions of the ATG initiation codon on both DNA strands. The generated amplicons (Additional file 13) are possibly distributed within gene regions that contain genes on both plus and minus DNA strands. The utility of primer pairs in SCoTs was advocated by Gorji et al.
SCoT markers are usually reproducible, although primer length and annealing temperature are not the sole factors determining reproducibility [ , ]. They are mostly dominant markers, although a number of co-dominant markers are also generated during amplification and could thus be used for genetic diversity analysis.
SCoTs can be used either in isolation or in combination with other techniques to assess genetic diversity and to obtain reliable information about population processes and structure across different plant families [ ]. In some cases reproducibility can be a problem with techniques that detect large amounts of polymorphism or more complex banding patterns Table 4.
However, with careful PCR optimization reproducibility need not be a severe problem. It is well known that polyploidization can promote rapid essential rearrangements in the genome such as genome restructuring, intergenomic recombination, or even a rapid loss of DNA [ ].
The application of such multi-locus markers in the same way as AADs can produce incorrect genetic distances depending on the degree of genomic rearrangement. Unfortunately, studies including a detailed investigation of the effects of polyploidy on banding patterns are very rare for TFMs. For very complex banding patterns bands should be separated on polyacrylamide gels rather than agarose, as suggested in the descriptions of the TFM methods.
The independence of scored markers is limited by linkage, as they should be derived from separate loci if they are not to be regarded as dependent (here meaning that the locus is counted more than once). Dependence is important for some studies, since loci scored in such a way could be easily overlooked. For genetic mapping the behavior of the markers is an important feature; for example, AADs, although randomly generated, tend to cluster in the pericentromeric regions as the constructed genetic map becomes denser [ , ].
The behavior of some TFMs in mapping studies is well documented, while for some other markers this information is still lacking. SRAP markers showed more consistent distribution in other studies, which may indicate that they are better markers than AFLPs for map construction [ , ].
This must be due to the fact that AFLP is affected by DNA methylation, resulting in pseudo-polymorphism and uneven marker distribution in some species [ ]. Lin et al. For genetic diversity assessment of germplasm collections, SRAP markers are also considered to be superior to AADs as they seem to be more congruent with morphological variation and evolutionary history [ ]. Moreover, in sunflower, Hu [ ] was able to define linkage groups in telomeric regions. A marker can become dependent based on overlooked co-dominancy or nested priming.
The latter is easier to detect, while undetected co-dominancy may lead to an overestimate of the number of polymorphic loci and an underestimate of allelic diversity [ 27 ]. Any co-dominant bands discovered should be coded in a multi-allelic system, and analyzed in a different manner from binary dominant data.
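For binary dominant data, band profiles are commonly compared with a similarity coefficient computed from presence/absence scores. The sketch below uses the Jaccard coefficient, which ignores shared absences; this is one common choice rather than a method prescribed by the studies cited here, and the accession names and 0/1 profiles are invented for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two 0/1 band profiles scored at the same positions."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same band positions")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Two profiles without any bands carry no information; report 0.0 by convention.
    return shared / either if either else 0.0


# Hypothetical presence (1) / absence (0) scores for six band positions.
profiles = {
    "acc1": [1, 1, 0, 1, 0, 1],
    "acc2": [1, 1, 0, 0, 0, 1],
    "acc3": [0, 1, 1, 0, 1, 0],
}

names = sorted(profiles)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        value = jaccard(profiles[first], profiles[second])
        print(f"{first} vs {second}: {value:.2f}")
```

Co-dominant bands would not be scored this way; as noted above, they need a multi-allelic coding and a different analysis.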
It has been reported that SRAP yields dominant and co-dominant markers together in the same reaction. The frequency of co-dominant bands seems to vary among taxa. They emphasize this finding as being an important advantage of this technique over AADs. SNPs are mostly formed when errors occur (substitution, insertion, and deletion).
SNPs are prominent sources of variation in the human genome and serve as excellent genetic markers. Some regions of the genome are richer in SNPs than others. SNPs may occur within gene sequences or in intergenic sequences. Most SNPs are located in noncoding regions of the genome and have no known direct impact on the phenotype of an individual, although their role remains elusive; depending on where a SNP occurs, it might have different consequences at the phenotypic level [ 3 ].
It is a type of DNA variation in which a specific nucleotide sequence of various lengths ranging from one to several base pairs is inserted or deleted. Indels are widely spread across the genome. DNA repeats can be classified as interspersed repeats or tandem repeats. This can comprise over two-thirds of the human genome [ 15 ]. Interspersed repeats are dispersed across the genome within gene sequences or intergenic regions and include retropseudogenes and transposons.
Centromeres and telomeres largely comprise tandem repeats. Despite increasing evidence on the functionality of DNA repeats, their biologic role is still elusive and under frequent debate [ 11 ]. Tandem repeats are organized in a head-to-tail orientation; based on the size of each repeat unit, satellite repeats can be further divided into macrosatellites, minisatellites, and microsatellites [ 17 ].
Some of these repeats are described as follows: macrosatellites, with sequence repeats longer than bp, are the largest of the tandem DNA repeats, located on one or multiple chromosomes [ 11 ], minisatellites, stretches of DNA, are characterized by moderate length patterns, 10— bp usually less than 50 bp [ 9 , 18 ], and microsatellites also known as short tandem repeats STRs repeat units of less than 10 bp, [ 3 ]. Structural and copy number variations CNVs are another frequent source of genome variability [ 6 , 19 , 20 ].
Some currently used terms are structural variations; a genomic alteration e. The development and use of molecular methods for the detection of DNA molecular markers is one of the most significant advances in the field of molecular genetics. Mapping the human genome requires a set of genetic markers to which we can relate the position of genes. Molecular markers can be used to mark positions in genomes for various purposes such as mapping human diseases, pharmacogenetics, and human identification.
Single base pair change leads to single nucleotide variant, probably accounting for many genetic conditions caused by single gene or multiple genes. SNPs represent the major source of human genomic variability. Due to the lack of knowledge on exact SNP number, it is difficult to give a direct estimate of the number of the SNPs in the human genome but in different public and private data bases, more than 5 million have been recorded and about 4 million validated [ 23 ].
Over 60, however are within genes and some of them associated with diseases [ 2 ]. Single nucleotide polymorphisms within protein-coding regions are either synonymous or nonsynonymous. Synonymous polymorphisms do not have any effect on the organism and are said to be selectively silent, as the substitution causes no amino acid change in the protein produced (silent mutation). Nonsynonymous substitutions change the encoded amino acid, either as a missense mutation, which changes the protein through codon alteration, or as a nonsense mutation, which results in a chain termination codon [ 3 ].
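The distinction between synonymous, missense, and nonsense substitutions can be made concrete with a short sketch based on the standard genetic code. The function name `classify_snp` and its interface (a reference codon, a 0-based position within the codon, and the alternative base) are assumptions made for illustration; stop-codon losses are not treated as a separate class here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(codon, pos, alt):
    """Classify a single-base substitution at 0-based position `pos` of a DNA codon."""
    codon = codon.upper()
    mutated = codon[:pos] + alt.upper() + codon[pos + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutated]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    # Any other amino acid change (including loss of a stop codon) lands here.
    return "missense"


# GAG (Glu) -> GTG (Val) is the codon change underlying sickle cell anemia.
print(classify_snp("GAG", 1, "T"))
print(classify_snp("GAG", 2, "A"))
print(classify_snp("GAG", 0, "T"))
```

The three calls illustrate one substitution of each class in the same glutamate codon: a missense change at the second position, a silent change at the third, and a premature stop at the first.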
Single nucleotide polymorphisms within a coding sequence can cause genetic diseases, including sickle cell anemia. SNPs responsible for a disease can also occur in any genetic region that can eventually affect the expression activity of genes, for example, in promoter regions. The effect of SNPs in the noncoding regions of genes is still debated. Much of the genome consists of regulatory elements that control gene expression, but these regions have remained largely unexplored in clinical diagnostics due to the high cost of whole genome sequencing and interpretive challenges.
Clinical diagnostic sequencing currently focuses on identifying causal mutations in the exome, where most disease-causing mutations are known to occur. Another important group of SNPs is the one that alters the primary structure of a protein involved in drug metabolism; these SNPs are targets for pharmacogenetics studies.
However, some SNPs are not causative, some SNPs are in close association with, and therefore segregate with, a disease-causing sequence so, the presence of SNP correlates with the presence or an increased risk of developing the disease; these SNPs are useful in diagnostics, disease prediction, and other applications [ 3 ].
Single nucleotide polymorphisms can be used as genetic markers for constructing high genetic maps and to carry out association studies related to diseases because of their abundance and the availability of high throughput analysis technologies. SNPs have become an important application in the development and research of genetic markers [ 14 ].
There are numerous strategies that can be implemented to new single nucleotide variant SNVs discoveries; the most common and well-known method is by direct sequencing and in comparison to a puplic or other sequence date base [ 25 , 26 ] or locus specific amplification of target genomic region followed by sequence comparison [ 27 , 28 ]; prescreening prior to sequence determination is needed.
SNV detection encompasses two broad areas: (1) scanning DNA sequences for previously unknown polymorphisms and (2) screening (genotyping) individuals for known polymorphisms. Scanning for new SNVs can be further classified into two types of approach: the global, or random, approach and the regional, or targeted, approach [ 14 ].
Haplotypes are groups of SNPs that are generally inherited together. Haplotypes can have stronger correlations with diseases or other phenotypic effects compared with individual SNPs and may therefore provide increased diagnostic accuracy in some cases [ 32 ].
Microsatellites, or short tandem repeats (STRs), are tandem repeats of units (motifs) of less than 10 bp. Because of their high variability, microsatellite loci are often used in forensics, population genetics, and genetic genealogy. Significant associations have been demonstrated between microsatellite variants and many diseases [ 15 ]. The number of microsatellite loci with motifs of 2 to 6 bp found in the human reference genome depends on the search algorithm [ 33 , 34 ].
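As the text notes, the number of loci found depends on the search algorithm. The sketch below is one deliberately simple search, not the method of the cited studies: it finds perfect tandem repeats with a regular expression. The function name `find_microsatellites` and the thresholds (motifs of 2 to 6 bp, at least 4 copies) are assumptions chosen for illustration.

```python
import re


def find_microsatellites(sequence, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, copies) for perfect tandem repeats.

    Illustrative only: imperfect repeats are not detected, and the
    default thresholds are assumptions, not values from the literature.
    """
    # Group 1 is the motif; the lazy quantifier prefers the shortest
    # motif at each start position (e.g. "AC" rather than "ACAC").
    # The backreference requires at least min_repeats - 1 further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip runs of a single base such as "AAAAAAAA"
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


for start, motif, copies in find_microsatellites("TTACACACACACGGATGATGATGATGCC"):
    print(start, motif, copies)
```

Changing `min_repeats` or the motif length bounds changes how many loci are reported, which is one reason published counts differ between tools.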
Within genes, STRs are nonrandomly distributed across protein-coding sequences, untranslated regions (UTRs), and introns.
Currently, SNP markers are one of the preferred genotyping approaches, because they are abundant in the genome, genetically stable, and amenable to high-throughput automated analysis [ 42 ].
SNPs are bi-allelic markers, indicating a specific polymorphism in only two alleles of a population [ 44 ]. SNPs are distributed in both coding and non-coding regions of genomes, and they are vital players in population genetic variation and species evolution [ 45 ]. SNPs are a third-generation molecular marker technology, coming after RFLPs and SSRs [ 46 ], and have been used successfully to investigate genetic variation among different species and breeds [ 47–49 ].
Compared with previous markers, SNPs have several advantages, notably that they are numerous and widely distributed throughout the entire genome [ 50 ]. Because of their extensive distribution and abundant variation, SNPs play an important role in research on the population structure, genetic differentiation, origin, and evolution of farm animals.
For example, linkage disequilibrium (LD) among different SNPs can be utilized for association analysis. Furthermore, SNP haplotypes can be compared among different populations to gain information on animal population diversity and on population evolution (origins, differentiation, and migrations). One disadvantage of SNP markers is the low level of information obtained from each marker compared with a highly polymorphic microsatellite, but this can be compensated for by employing a higher number of markers (SNP chips and whole-genome sequencing) [ 52 , 53 ].
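Linkage disequilibrium between two bi-allelic SNPs is commonly summarized by the coefficient D and the squared correlation r². The sketch below computes both from phased haplotypes using the standard definitions D = p_AB - p_A p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). The function name, the 0/1 allele coding, and the toy data are assumptions for illustration.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two bi-allelic SNPs.

    `haplotypes` is a list of (allele_at_snp1, allele_at_snp2) pairs,
    one per phased chromosome, with alleles coded 0 or 1. This coding
    is an assumption of the sketch.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")

    p_a = sum(a for a, _ in haplotypes) / n           # freq. of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n           # freq. of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d, d * d / denominator


# Toy data: alleles always travel together (complete LD) ...
linked = [(1, 1)] * 5 + [(0, 0)] * 5
# ... versus every allele combination being equally frequent (no LD).
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(linkage_disequilibrium(linked))
print(linkage_disequilibrium(unlinked))
```

An r² close to 1 means one SNP is a good proxy for the other, which is what allows a genotyped marker to stand in for an untyped causal variant in association analysis. Unphased genotype data would need haplotype frequencies to be estimated first, which this sketch does not attempt.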
Whole-genome sequencing is the most straightforward method and provides more complete information on the genetic variation among different populations, because it can detect all the variation within the genome. Currently, the problem with whole-genome sequencing is setting up a high-throughput data analysis platform to extract useful information for the conservation and utilization of farm animals.
Barcoding is an automatic scanning and identification technology that emerged from practical computer technologies.
Biological taxonomists apply this principle to species classification in the form of the DNA barcode. The intent of DNA barcoding is to use large-scale screening of one or more reference genes in order to (i) assign unknown individuals to species and (ii) enhance the discovery of new species [ 54 , 55 ].
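The first aim, assigning an unknown individual to a species, can be sketched as a nearest-match lookup against a reference library. The example below is a toy illustration under stated assumptions: sequences are already aligned and of equal length, distance is the simple proportion of mismatched sites, and the function names, species labels, sequences, and `max_distance` threshold are all hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def assign_species(query, reference_library, max_distance):
    """Return (label, distance) for the closest reference barcode.

    The label is None when even the closest reference is further away
    than `max_distance`, which flags a possible unrecorded species.
    The threshold is an assumption the caller has to justify.
    """
    label, distance = min(
        ((name, p_distance(query, ref)) for name, ref in reference_library.items()),
        key=lambda item: item[1],
    )
    if distance <= max_distance:
        return label, distance
    return None, distance


# Toy reference library; real barcodes are hundreds of bases long.
library = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "ACGTTCGAACGTTCGAACGA",
}

print(assign_species("ACGTACGTACGTACGTACGA", library, max_distance=0.1))
print(assign_species("TTTTTTTTTTTTTTTTTTTT", library, max_distance=0.1))
```

The second call returns no label, corresponding to aim (ii): a query that matches nothing in the library is a candidate for closer examination rather than a forced assignment.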
Tautz et al. Subsequently, Hebert et al. Researchers can compile a public library of DNA barcodes linked to named specimens, which can provide a new master key for identifying species diversity [ 57 ]. Compared with time-consuming and inefficient traditional morphological classification [ 58 ], DNA barcoding has high accuracy. However, as with the other markers mentioned, the DNA barcoding technique also has some disadvantages: (1) the genome fragments are very difficult to obtain, are relatively conserved, and do not show enough variation.
The above disadvantages can be compensated for by using one or more nuclear gene barcodes together to make a standardized analysis of animal genetic resources (AnGR).
The accurate evaluation of animal genetic resources is the basis for their conservation and utilization. From the first demonstration of RFLPs to current whole-genome sequencing, many methods have been developed and tested at the DNA sequence level, providing a large number of markers and opening up new opportunities for evaluating diversity in farm animal genetic resources.
With the development of new markers, more accurate genetic evaluation is possible. The development of molecular markers will continue in the near future and provide a better understanding of animal genetic resources.