Similar articles
 20 similar articles found (search time: 15 ms)
1.

Background  

Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are considered to be faster and more cost-effective than experimental procedures. A major challenge in computational SNP discovery is distinguishing allelic variation from sequence variation between paralogous sequences, in addition to recognizing sequencing errors. For the majority of the public EST sequences, trace or quality files are lacking, which makes detection of reliable SNPs even more difficult because it has to rely on sequence comparisons only.

2.
Molecular markers are used to provide the link between genotype and phenotype, for the production of molecular genetic maps and to assess genetic diversity within and between related species. Single nucleotide polymorphisms (SNPs) are the most abundant molecular genetic marker. SNPs can be identified in silico, but care must be taken to ensure that the identified SNPs reflect true genetic variation and are not a result of errors associated with DNA sequencing. The SNP detection method autoSNP has been developed to identify SNPs from sequence data for any species. Confidence in the predicted SNPs is based on sequence redundancy, and haplotype co-segregation scores are calculated for a further independent measure of confidence. We have extended the autoSNP method to produce autoSNPdb, which integrates SNP and gene annotation information with a graphical viewer. We have applied this software to public barley expressed sequences, and the resulting database is available over the Internet. SNPs can be viewed and searched by sequence, functional annotation or predicted synteny with a reference genome, in this case rice. The correlation between SNPs and barley cultivar, expressed tissue type and development stage has been collated for ease of exploration. An average of one SNP per 240 bp was identified, with SNPs more prevalent in the 5' regions and simple sequence repeat (SSR) flanking sequences. Overall, autoSNPdb can provide a wealth of genetic polymorphism information for any species for which sequence data are available.

3.
Applications of single nucleotide polymorphisms in crop genetics (cited 26 times in total: 0 self-citations, 26 citations by others)
The discovery of single nucleotide polymorphisms (SNPs) and insertions/deletions, which are the basis of most differences between alleles, has been simplified by recent developments in sequencing technology. SNP discovery in many crop species, such as corn and soybean, is relatively straightforward because of their high level of intraspecific nucleotide diversity, and the availability of many gene and expressed sequence tag (EST) sequences. For these species, direct readout of SNP haplotypes is possible. Haplotype-based analysis is more informative than analysis based on individual SNPs, and has more power in analyzing association with phenotypes. The elite germplasm of some crops may have been subjected to bottlenecks relatively recently, increasing the amount of linkage disequilibrium (LD) present and facilitating the association of SNP haplotypes at candidate gene loci with phenotypes. Whole-genome scans may help identify genome regions that are associated with interesting phenotypes if sufficient LD is present. Technological improvements make the use of SNP and indel markers attractive for high-throughput use in marker-assisted breeding, EST mapping and the integration of genetic and physical maps.

4.
Whole-genome duplication (polyploidization) is among the most dramatic mutational processes in nature, so understanding how natural selection differs in polyploids relative to diploids is an important goal. Population genetics theory predicts that recessive deleterious mutations accumulate faster in allopolyploids than diploids due to the masking effect of redundant gene copies, but this prediction is hitherto unconfirmed. Here, we use the cotton genus (Gossypium), which contains seven allopolyploids derived from a single polyploidization event 1–2 million years ago, to investigate deleterious mutation accumulation. We use two methods of identifying deleterious mutations at the nucleotide and amino acid level, along with whole-genome resequencing of 43 individuals spanning six allopolyploid species and their two diploid progenitors, to demonstrate that deleterious mutations accumulate faster in allopolyploids than in their diploid progenitors. We find that, unlike what would be expected under models of demographic changes alone, strongly deleterious mutations show the biggest difference between ploidy levels, and this effect diminishes for moderately and mildly deleterious mutations. We further show that the proportion of nonsynonymous mutations that are deleterious differs between the two coresident subgenomes in the allopolyploids, suggesting that homoeologous masking acts unequally between subgenomes. Our results provide a genome-wide perspective on classic notions of the significance of gene duplication that likely are broadly applicable to allopolyploids, with implications for our understanding of the evolutionary fate of deleterious mutations. Finally, we note that some measures of selection (e.g., dN/dS, πN/πS) may be biased when species of different ploidy levels are compared.

5.
DNA barcodes are widely used in taxonomy, systematics, species identification, food safety, and forensic science. Most conventional DNA barcode sequences contain the complete sequence of a given barcoding gene. Most of this sequence does not vary and is therefore uninformative for a given group of taxa within a monophylum. We suggest here a method that reduces the number of noninformative nucleotides in a given barcoding sequence of a major taxon, such as the prokaryotes or eukaryotic animals, plants, or fungi. Genotyping the actual differences between sequences, the single nucleotide polymorphisms (SNPs), provides a tool for developing a rapid, reliable, and high-throughput assay for discriminating between known species. Here, we investigated SNPs as robust markers of genetic variation for identifying different pigeon species based on available cytochrome c oxidase I (COI) data. We propose a decision tree-based SNP barcoding (DTSB) algorithm in which SNP patterns are selected from the DNA barcoding sequences of several evolutionarily related species in order to identify a single species, with pigeons as an example. This approach can make use of any established barcoding system. As a first example, we applied DTSB to the mitochondrial COI gene of 17 pigeon species (Columbidae, Aves) after sequence trimming and alignment. SNPs were chosen according to the decision-tree rule and to form species-specific SNP barcodes. The shortest barcode, of about 11 bp, was then generated for discriminating the 17 pigeon species. This method provides a sequence-alignment and decision-tree approach for parsimoniously assigning a unique, shortest SNP barcode to any known species of a chosen monophyletic taxon for which a barcoding sequence is available.
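
The idea of picking a small set of alignment columns that still separates every species can be illustrated with a short sketch. This is not the DTSB algorithm itself, whose details the abstract does not give; it is a simple greedy stand-in that repeatedly takes the column separating the most still-confused species pairs. The function name `greedy_snp_barcode` and the toy sequences are hypothetical.

```python
from itertools import combinations


def greedy_snp_barcode(seqs):
    """Pick alignment columns that together distinguish every sequence.

    seqs maps a species name to an aligned sequence (all the same length).
    Greedy stand-in for illustration only, not the published DTSB method.
    """
    names = list(seqs)
    length = len(seqs[names[0]])
    unresolved = set(combinations(names, 2))  # pairs not yet told apart
    chosen = []
    while unresolved:
        best_col, best_split = None, set()
        for col in range(length):
            if col in chosen:
                continue
            split = {(a, b) for a, b in unresolved
                     if seqs[a][col] != seqs[b][col]}
            if len(split) > len(best_split):
                best_col, best_split = col, split
        if best_col is None:
            raise ValueError("some sequences are identical at every column")
        chosen.append(best_col)
        unresolved -= best_split
    return sorted(chosen)


# Hypothetical toy alignment: only columns 1 and 3 vary.
toy = {
    "sp1": "AAGCT",
    "sp2": "AAGTT",
    "sp3": "ATGCT",
    "sp4": "ATGTT",
}
cols = greedy_snp_barcode(toy)
barcodes = {name: "".join(seq[c] for c in cols) for name, seq in toy.items()}
print(cols, barcodes)
```

A greedy choice does not guarantee the shortest possible barcode; it only shows why a handful of variable positions can replace the full barcoding gene.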

6.
Whole-genome duplications (WGDs) are a prominent process of diversification in eukaryotes. The genetic and evolutionary forces that WGD imposes on cytoplasmic genomes are not well understood, despite the central role that cytonuclear interactions play in eukaryotic function and fitness. Cellular respiration and photosynthesis depend on successful interaction between the 3,000+ nuclear-encoded proteins destined for the mitochondria or plastids and the gene products of cytoplasmic genomes in multi-subunit complexes such as OXPHOS, organellar ribosomes, Photosystems I and II, and Rubisco. Allopolyploids are thus faced with the critical task of coordinating interactions between the nuclear and cytoplasmic genes that were inherited from different species. Because the cytoplasmic genomes share a more recent history of common descent with the maternal nuclear subgenome than the paternal subgenome, evolutionary “mismatches” between the paternal subgenome and the cytoplasmic genomes in allopolyploids might lead to the accelerated rates of evolution in the paternal homoeologs of allopolyploids, either through relaxed purifying selection or strong directional selection to rectify these mismatches. We report evidence from six independently formed allotetraploids that the subgenomes exhibit unequal rates of protein-sequence evolution, but we found no evidence that cytonuclear incompatibilities result in altered evolutionary trajectories of the paternal homoeologs of organelle-targeted genes. The analyses of gene content revealed mixed evidence for whether the organelle-targeted genes are lost more rapidly than the non-organelle-targeted genes. Together, these global analyses provide insights into the complex evolutionary dynamics of allopolyploids, showing that the allopolyploid subgenomes have separate evolutionary trajectories despite sharing the same nucleus, generation time, and ecological context.

7.
Single nucleotide polymorphisms (SNPs) are becoming more commonly used as molecular markers in conservation studies. However, relatively few studies have employed SNPs for species with little or no existing sequence data, partly due to the practical challenge of locating appropriate SNP loci in these species. Here we describe an application of SNP discovery via shotgun cloning that requires no pre-existing sequence data and is readily applied to all taxa. Using this method, we isolated, cloned and screened for SNP variation at 90 anonymous sequence loci (51 kb total) from the banded wren (Thryothorus pleurostictus), a Central American species with minimal pre-existing sequence data and a documented paucity of microsatellite allelic variation. We identified 168 SNPs (a mean of one SNP/305 bp, with SNPs unevenly distributed across loci). Further characterization of variation at 41 of these SNP loci among 256 individuals including 37 parent–offspring families suggests that they provide substantial information for defining the genetic mating system of this species, and that SNPs may be generally useful for this purpose when other markers are problematic.

8.
Sequence comparison of orthologous regions enables estimation of the divergence between genomes, analysis of their evolution and detection of particular features of the genomes, such as sequence rearrangements and transposable elements. Despite the economic importance of Coffea species, little genomic information is currently available. Coffea is a relatively young genus that includes more than one hundred diploid species and a single tetraploid species. Three Coffea orthologous regions of 470-900 kb were analyzed and compared: both subgenomes of allotetraploid Coffea arabica (contributed by the diploid species Coffea eugenioides and Coffea canephora) and the genome of diploid C. canephora. Sequence divergence was calculated on global alignments or on coding and non-coding sequences separately. A search for transposable elements detected 43 retrotransposons and 198 transposons in the sequences analyzed. Comparative insertion analysis made it possible to locate 165 TE insertions in the phylogenetic tree of the three genomes/subgenomes. In the tetraploid C. arabica, a homoeologous non-reciprocal transposition (HNRT) was detected and characterized: a 50 kb region of the C. eugenioides derived subgenome replaced the C. canephora derived counterpart. Comparative sequence analysis on three Coffea genomes/subgenomes revealed almost perfect gene synteny, low sequence divergence and a high number of shared transposable elements. Compared to the results of similar analysis in other genera (Aegilops/Triticum and Oryza), Coffea genomes/subgenomes appeared to be dramatically less diverged, which is consistent with the relatively recent radiation of the Coffea genus. Based on nucleotide substitution frequency, the HNRT was dated at 10,000-50,000 years BP, which is also the most recent estimation of the origin of C. arabica.

9.
A single nucleotide polymorphism (SNP) is a single-base difference in DNA sequence between individuals, and SNPs provide abundant information about genetic variation. Large-scale discovery of high-frequency SNPs is being undertaken using various methods. However, publicly available SNP data sometimes need to be verified. If only a particular gene locus is of interest, locus-specific polymerase chain reaction amplification may be useful. A problem with this method is that the secondary peak has to be measured. We have analyzed trace data from conventional sequencing equipment and found an applicable rule to discern SNPs from noise. The rule is applied to multiply aligned sequences with traces, and the peak heights of the traces are compared between samples. We have developed software that integrates this function to identify SNPs automatically. The software works accurately for high-quality sequences and can also detect SNPs in low-quality sequences. Further, it can determine allele frequency, display this information as a bar graph and assign corresponding nucleotide combinations. It is also designed so that a person can easily verify and edit sequences on the screen. It is very useful for identifying de novo SNPs in a DNA fragment of interest.

10.
The successful exploitation of natural genetic diversity requires a basic knowledge of the extent of the variation present in a species. To study natural variation in Arabidopsis thaliana, we defined nested core collections maximizing the diversity present among a worldwide set of 265 accessions. The core collections were generated based on DNA sequence data from a limited number of fragments evenly distributed in the genome and were shown to successfully capture the molecular diversity in other loci as well as the morphological diversity. The core collections are available to the scientific community and thus provide an important resource for the study of genetic variation and its functional consequences in Arabidopsis. Moreover, this strategy can be used in other species to provide a rational framework for undertaking diversity surveys, including single nucleotide polymorphism (SNP) discovery and phenotyping, allowing the utilization of genetic variation for the study of complex traits.

11.
Association mapping currently relies on the identification of genetic markers. Several technologies have been adopted for genetic marker analysis, with single nucleotide polymorphisms (SNPs) being the most popular where a reasonable quantity of genome sequence data is available. We describe several tools we have developed for the discovery, annotation, and visualization of molecular markers for association mapping. These include autoSNPdb for SNP discovery from assembled sequence data; TAGdb for the identification of gene-specific paired-read Illumina GAII data; CMap3D for the comparison of mapped genetic and physical markers; and BAC and Gene Annotator for the online annotation of genes and genomic sequences.

12.
The NanoChip electronic microarray is designed for the rapid detection of genetic variation in research and clinical diagnosis. We have developed a multiplex electronic microarray assay, specific for single nucleotide polymorphism (SNP) genotyping and mutation detection, using universal adaptor sequences tailed to the 5' end of PCR primers specific to each target. PCR products, amplified by primers directed to the universal adaptor sequence, are immobilized on the microarray either directly or via capture oligonucleotides complementary to the universal adaptor sequence. This simple modification results in a significant increase in fidelity with improved specificity and accuracy. In addition, the multiplexing of genetic variant detection allows increased throughput and significantly reduced cost per assay. This general schema can also be applied to other microarray and macroarray formats.

13.
Single-nucleotide polymorphisms (SNPs) and insertion–deletions (INDELs) are currently important classes of genetic markers for major crop species. In this study, methods for developing SNP markers in rapeseed (Brassica napus L.) and their in silico mapping and use for genotyping are demonstrated. For the development of SNP and INDEL markers, 181 fragments from 121 different gene sequences spanning 86 kb were examined. A combination of different selection methods (genome-specific amplification, hetero-duplex analysis and sequence analysis) allowed the detection of 18 singular fragments that showed a total of 87 SNPs and 6 INDELs between 6 different rapeseed varieties. The average frequency of sequence polymorphism was estimated to be one SNP every 247 bp and one INDEL every 3,583 bp. Most SNPs and INDELs were found in non-coding regions. Polymorphism information content values for SNP markers ranged between 0.02 and 0.50 in a set of 86 varieties. Using comparative genetics data for B. napus and Arabidopsis thaliana, an allocation of SNP markers to linkage groups in rapeseed was achieved: a unique location was determined for seven gene sequences; two and three possible locations were found for six and four sequences, respectively. The results demonstrate the usefulness of existing genomic resources for SNP discovery in rapeseed.
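
The reported polymorphism information content (PIC) range can be made concrete with a small calculation. The abstract does not state which PIC formula was used, so the sketch below shows two common ones as an assumption: the simplified form 1 − Σp², which reaches 0.50 for a biallelic marker with equal allele frequencies, and the Botstein form, which peaks at 0.375 for such a marker. The function names are hypothetical.

```python
def gene_diversity(freqs):
    """Simplified PIC (expected heterozygosity): 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic_botstein(freqs):
    """Botstein-style PIC: gene diversity minus 2 * p_i^2 * p_j^2 over allele pairs."""
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2 * freqs[i] ** 2 * freqs[j] ** 2
    return gene_diversity(freqs) - cross


balanced = [0.5, 0.5]        # both alleles equally common
rare = [85 / 86, 1 / 86]     # minor allele in 1 of 86 varieties (illustrative)

print(gene_diversity(balanced), pic_botstein(balanced))
print(round(gene_diversity(rare), 2))
```

Under the simplified form, the two example frequency sets give 0.50 and about 0.02, which matches the ends of the reported range; that agreement is suggestive only and does not confirm which formula the authors used.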

14.
We have developed a computer-based method to identify candidate single nucleotide polymorphisms (SNPs) and small insertions/deletions from expressed sequence tag data. Using a redundancy-based approach, valid SNPs are distinguished from erroneous sequence by their representation multiple times in an alignment of sequence reads. A second measure of validity was also calculated based on the cosegregation of the SNP pattern between multiple SNP loci in an alignment. The utility of this method was demonstrated by applying it to 102,551 maize (Zea mays) expressed sequence tag sequences. A total of 14,832 candidate polymorphisms were identified with an SNP redundancy score of two or greater. Segregation of these SNPs with haplotype indicates that candidate SNPs with high redundancy and cosegregation confidence scores are likely to represent true SNPs. This was confirmed by validation of 264 candidate SNPs from 27 loci, with a range of redundancy and cosegregation scores, in four inbred maize lines. The SNP transition/transversion ratio and insertion/deletion size frequencies correspond to those observed by direct sequencing methods of SNP discovery and suggest that the majority of predicted SNPs and insertion/deletions identified using this approach represent true genetic variation in maize.
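
The redundancy idea described above can be sketched in a few lines. This is one reading of the abstract, not the published implementation: a column of an alignment is reported as a candidate SNP only when at least two different bases each occur in `min_redundancy` or more reads, so a base seen once is treated as a likely sequencing error. The function name, the equal-length read format and the toy reads are assumptions.

```python
from collections import Counter


def candidate_snps(alignment, min_redundancy=2):
    """Return (column, {base: count}) for columns with two or more well-supported bases.

    alignment is a list of equal-length aligned reads; characters other than
    A, C, G, T (for example '-' or 'N') are ignored.
    """
    hits = []
    for col in range(len(alignment[0])):
        counts = Counter(read[col] for read in alignment if read[col] in "ACGT")
        supported = {base: n for base, n in counts.items() if n >= min_redundancy}
        if len(supported) >= 2:
            hits.append((col, supported))
    return hits


reads = [
    "ACGTAC",
    "ACGTAC",
    "ACTTAC",
    "ACTTAC",
    "ACGTAA",  # the final A occurs in one read only, so it is not reported
]
print(candidate_snps(reads))
```

The cosegregation score mentioned in the abstract is a separate measure and is not modelled here.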

15.
Phaeosphaeria species are important causal agents of Stagonospora leaf blotch diseases in cereals. In this study, the nucleotide sequence and deduced polypeptide of the trifunctional histidine biosynthesis gene (his) are used to investigate the phylogenetic relationships and provide molecular identification among cereal Phaeosphaeria species. The full-length sequences of the his gene were obtained by PCR amplification and compared among cereal Phaeosphaeria species. The coding sequence of the his gene in wheat-biotype P. nodorum (PN-w) was 2697 bp. The his genes in barley-biotype P. nodorum (PN-b), two P. avenaria f. sp. triticea isolates (homothallic Pat1 and Pat3), and Phaeosphaeria species from Polish rye and dallis grass were 2694 bp. The his gene in heterothallic isolate Pat2, however, was 2693 bp because the intron had one fewer base. In P. avenaria f. sp. avenaria (Paa), the his gene was only 2670 bp long. The differences in the size of the his gene contributed to the variation in amino acid sequences in the gap region located between the phosphoribosyl-ATP pyrophosphohydrolase and histidinol dehydrogenase sub-domains. Based on nucleotide and deduced amino acid sequences of the his gene, Pat1 was not closely related to either PN-w or the Paa clade. It appears that rates of evolution of the his gene were fast in cereal Phaeosphaeria species. The possible involvement of meiotic recombination in genetic diversity of the his gene in P. nodorum is discussed.

16.
Bahr A, Wilson AB. Gene. 2012;497(1):52-57.
Gene conversion, the unidirectional exchange of genetic material between homologous sequences, is thought to strongly influence patterns of genetic diversity. The high diversity of major histocompatibility complex (MHC) genes in many species is thought to reflect a long history of gene conversion events both within and among loci. Theoretical work suggests that intra- and interlocus gene conversion leave characteristic signatures of nucleotide diversity, but empirical studies of MHC variation have rarely been able to analyze the effects of conversion events in isolation, due to the presence of multiple gene copies in most species. The potbellied seahorse (Hippocampus abdominalis), a species with a single copy of the MH class II beta-chain gene (MHIIb), provides an ideal system in which to explore predictions on the effects of intralocus gene conversion on patterns of genetic diversity. The genetic diversity of the MHIIb peptide binding region (PBR) is high in the seahorse, similar to other vertebrate species. In contrast, the remainder of the gene shows a total absence of synonymous variation and low levels of intronic sequence diversity, concentrated in 3 short repetitive regions and 1-12 SNPs per intron. The distribution of substitutions across the gene results in a patchwork pattern of shared polymorphism between otherwise divergent sequences. The pattern of nucleotide diversity observed in the seahorse MHIIb gene is congruent with theoretical expectations for intralocus gene conversion, indicating that this evolutionary mechanism has played an important role in MHC gene evolution, contributing to both the high diversity in the PBR and the low diversity outside this region. Neutral variation at this locus may be further reduced due to biases in nucleotide composition and functional constraints.

17.
Single nucleotide polymorphisms (SNPs) have gained wide use in humans and model species and are becoming the marker of choice for applications in other species. Technology that was developed for work in model species may provide useful tools for SNP discovery and genotyping in non-model organisms. However, SNP discovery can be expensive, labour intensive, and introduce ascertainment bias. In addition, the most efficient approaches to SNP discovery will depend on the research questions that the markers are to resolve as well as the focal species. We discuss advantages and disadvantages of several past and recent technologies for SNP discovery and genotyping and summarize a variety of SNP discovery and genotyping studies in ecology and evolution.

18.
19.
Ecotilling was used as a single nucleotide polymorphism (SNP) discovery tool to examine DNA variation in natural populations of the western black cottonwood, Populus trichocarpa, and was found to be more efficient than sequencing for large-scale studies of genetic variation in this tree. A publicly available, live reference collection of P. trichocarpa from the University of British Columbia Botanical Garden was used in this study to survey variation in nine different genes among individuals from 41 different populations. A large amount of genetic variation was detected, but the level of variation appears to be less than in the related species, Populus tremula, based on reported statistics for that tree. Genes examined varied considerably in their level of variation, from PoptrTB1, which had a single SNP, to PoptrLFY, which had more than 23 in the 1000-bp region examined. Overall nucleotide diversity, measured as π(total), was relatively low at 0.00184. Linkage disequilibrium, on the other hand, was higher than reported for some woody plant species, with mean r² equal to 0.34. This study reveals the potential of Ecotilling as a rapid genotype discovery method to explore and utilize the large pool of genetic variation in tree species.
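
The two summary statistics quoted above, nucleotide diversity and r², have standard textbook definitions that a short sketch can make concrete. The code assumes phased haplotype sequences of equal length and biallelic, polymorphic sites, which is a simplification: the study worked from Ecotilling genotypes, and the abstract does not describe its estimators. The function names and toy haplotypes are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site among aligned sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * len(seqs[0]))


def r_squared(haplotypes, i, j):
    """Squared allele-frequency correlation between biallelic sites i and j.

    Both sites must be polymorphic, otherwise the denominator is zero.
    """
    a, b = haplotypes[0][i], haplotypes[0][j]   # reference allele at each site
    n = len(haplotypes)
    p_a = sum(h[i] == a for h in haplotypes) / n
    p_b = sum(h[j] == b for h in haplotypes) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haplotypes) / n
    d = p_ab - p_a * p_b                        # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


linked = ["AC", "AC", "GT", "GT"]      # alleles always travel together
unlinked = ["AC", "AT", "GC", "GT"]    # every combination equally common

print(nucleotide_diversity(linked))
print(r_squared(linked, 0, 1), r_squared(unlinked, 0, 1))
```

On these toy data r² is 1 for the perfectly associated sites and 0 for the independent ones, so a mean of 0.34 sits between the two extremes.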

20.
Single nucleotide polymorphisms (SNPs) represent the most abundant type of genetic polymorphism in plant genomes. SNP markers are valuable tools for genetic analysis of complex traits of agronomic importance, linkage and association mapping, genome-wide selection, map-based cloning, and marker-assisted selection. Current challenges for SNP genotyping in polyploid outcrossing species include multiple alleles per locus and lack of high-throughput methods suitable for variant detection. In this study, we report on a high-resolution melting (HRM) analysis system for SNP genotyping and mapping in outcrossing tetraploid genotypes. The sensitivity and utility of this technology are demonstrated by identification of the parental genotypes and segregating progeny in six alfalfa populations based on unique melting curve profiles due to differences in allelic composition at one or multiple loci. HRM using a 384-well format is a fast, consistent, and efficient approach for SNP discovery and genotyping, useful in polyploid species with uncharacterized genomes. Possible applications of this method include variation discovery, analysis of candidate genes, genotyping for comparative and association mapping, and integration of genome-wide selection in breeding programs.

