Similar articles
 Found 20 similar articles; search took 31 ms
1.
Flood EM  Tang F  Horvath MM  Pertsemlidis A  Garner HR 《BioTechniques》2002,33(4):814, 816, 818-820 passim
SNPCEQer identifies and reports SNPs in sequences obtained from the Beckman CEQ2000 DNA Analysis System. SNPCEQer aligns sequences obtained using CEQ2000 heterozygote detection analysis and reports discrepancies between individual sequences and the consensus sequence it generates from this set as SNPs when the individual base calls have high-quality values. SNPCEQer reported comparable numbers of SNPs to the UNIX-based PolyPhred (148 vs. 165, respectively) in regions amplified from eight genes. A total of 21 different SNPs were discovered. Each gene region was analyzed in 96-306 samples. SNPCEQer was designed to operate from Windows NT, making SNP detection more accessible to users without UNIX systems. SNPCEQer is available free of charge at http://innovation.swmed.edu.
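
The consensus-and-quality rule described in this abstract can be sketched roughly as follows. This is a minimal illustration, not SNPCEQer's actual code: the function name `call_snps`, the majority-vote consensus, and the quality cutoff of 20 are all assumptions, since the abstract only says that base calls must have "high-quality values".

```python
from collections import Counter

def call_snps(aligned_reads, qualities, min_quality=20):
    """Return (position, read_index, base, consensus_base) for each
    high-quality base call that differs from the column consensus.

    min_quality=20 is a hypothetical cutoff chosen for illustration.
    """
    snps = []
    for pos in range(len(aligned_reads[0])):
        column = [read[pos] for read in aligned_reads]
        counts = Counter(base for base in column if base != "-")
        if not counts:
            continue  # column contains only gaps
        consensus = counts.most_common(1)[0][0]  # majority base in this column
        for idx, base in enumerate(column):
            if base not in ("-", consensus) and qualities[idx][pos] >= min_quality:
                snps.append((pos, idx, base, consensus))
    return snps

# Three aligned reads; the third differs at position 1 with a good quality score.
reads = ["ACGT", "ACGT", "ATGT"]
quals = [[30, 30, 30, 30], [30, 30, 30, 30], [30, 35, 30, 30]]
snps = call_snps(reads, quals)
```

Lowering the quality of the discordant base below the cutoff removes it from the report, which is the filtering behaviour the abstract describes.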

2.
The single nucleotide polymorphism (SNP) is a difference in DNA sequence between individuals and provides abundant information about genetic variation. Large-scale discovery of high-frequency SNPs is being undertaken using various methods. However, the publicly available SNP data sometimes need to be verified. If only a particular gene locus is concerned, locus-specific polymerase chain reaction amplification may be useful. A problem with this method is that the secondary peak has to be measured. We have analyzed trace data from conventional sequencing equipment and found an applicable rule to discern SNPs from noise. The rule is applied to multiply aligned sequences with traces, and the peak heights of the traces are compared between samples. We have developed software that integrates this function to automatically identify SNPs. The software works accurately for high-quality sequences and can also detect SNPs in low-quality sequences. Further, it can determine allele frequency, display this information as a bar graph and assign corresponding nucleotide combinations. It is also designed so that a person can easily verify and edit sequences on the screen. It is very useful for identifying de novo SNPs in a DNA fragment of interest.

3.
MOTIVATION: Single nucleotide polymorphism (SNP) analysis is an important means to study genetic variation. A fast and cost-efficient approach to identify large numbers of novel candidates is the SNP mining of large-scale sequencing projects. The increasing availability of sequence trace data in public repositories makes it feasible to evaluate SNP predictions on the DNA chromatogram level. MAVIANT, a platform-independent Multipurpose Alignment VIewing and Annotation Tool, provides DNA chromatogram and alignment views and facilitates evaluation of predictions. In addition, it supports direct manual annotation, which is immediately accessible and can be easily shared with external collaborators. RESULTS: Large-scale mining of polymorphisms based on porcine EST sequences yielded more than 7900 candidate SNPs in coding regions (cSNPs), which were annotated relative to the human genome. Non-synonymous SNPs were analyzed for their potential effect on the protein structure/function using the PolyPhen and SIFT prediction programs. Predicted SNPs and annotations are stored in a web-based database. Using MAVIANT, SNPs can be visually verified based on the DNA sequencing traces. A subset of candidate SNPs was selected for experimental validation by resequencing and genotyping. This study provides a web-based DNA chromatogram and contig browser that facilitates the evaluation and selection of candidate SNPs, which can be applied as genetic markers for genome-wide genetic studies. AVAILABILITY: The stand-alone version of the MAVIANT program for local use is freely available under GPL license terms at http://snp.agrsci.dk/maviant. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

4.
Molecular markers are used to provide the link between genotype and phenotype, for the production of molecular genetic maps and to assess genetic diversity within and between related species. Single nucleotide polymorphisms (SNPs) are the most abundant molecular genetic marker. SNPs can be identified in silico, but care must be taken to ensure that the identified SNPs reflect true genetic variation and are not a result of errors associated with DNA sequencing. The SNP detection method autoSNP has been developed to identify SNPs from sequence data for any species. Confidence in the predicted SNPs is based on sequence redundancy, and haplotype co-segregation scores are calculated for a further independent measure of confidence. We have extended the autoSNP method to produce autoSNPdb, which integrates SNP and gene annotation information with a graphical viewer. We have applied this software to public barley expressed sequences, and the resulting database is available over the Internet. SNPs can be viewed and searched by sequence, functional annotation or predicted synteny with a reference genome, in this case rice. The correlation between SNPs and barley cultivar, expressed tissue type and development stage has been collated for ease of exploration. An average of one SNP per 240 bp was identified, with SNPs more prevalent in the 5' regions and simple sequence repeat (SSR) flanking sequences. Overall, autoSNPdb can provide a wealth of genetic polymorphism information for any species for which sequence data are available.
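
One way to picture redundancy-based confidence is to require that each allele at a candidate site be seen in more than one aligned sequence, so that an isolated sequencing error is not called as a SNP. The sketch below is only an illustration under that assumption; the function name `candidate_snp_columns` and the threshold `min_redundancy=2` are hypothetical and are not taken from autoSNP itself.

```python
from collections import Counter

def candidate_snp_columns(aligned, min_redundancy=2):
    """Return (position, alleles) for alignment columns in which at least
    two different bases are each supported by min_redundancy sequences.

    min_redundancy=2 is an illustrative assumption, not a published setting.
    """
    candidates = []
    for pos, column in enumerate(zip(*aligned)):
        counts = Counter(base for base in column if base in "ACGT")
        supported = sorted(b for b, c in counts.items() if c >= min_redundancy)
        if len(supported) >= 2:
            candidates.append((pos, supported))
    return candidates

# Position 1 has C (x3) and T (x2): both alleles are redundant, so it is kept.
# Position 3 has a single A among Ts: treated as a possible sequencing error.
alignment = ["ACGT", "ACGT", "ATGT", "ATGT", "ACGA"]
candidates = candidate_snp_columns(alignment)
```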

5.
In order to assess the extent of DNA sequence variation in cattle, introns and exons from both the leptin and Amyloid Precursor Protein (APP) genes have been sequenced in a panel of DNAs derived from 22 diverse animals. Direct DNA sequencing of PCR products was used; thus, 44 chromosomes were studied. Polymorphisms were identified by manual scanning of sequence chromatograms and computerized sequence analysis. Twenty Single Nucleotide Polymorphisms (SNPs) were detected in 1788 bp sequenced from the leptin gene, giving a frequency of 1 SNP per 89 bp. Twenty-four SNPs were detected in a 458-bp fragment of the APP gene; 23 of the polymorphisms were contained in a 302-bp intron 16 fragment. This equates to an SNP frequency of 1 per 13 bp for the intron. We can thus conclude that this portion of the bovine APP gene constitutes a hypermutable region. Nucleotide sequence diversity values of 0.019 and 0.0026 were obtained for APP and leptin respectively. Received: 22 April 1999 / Accepted: 25 July 1999
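
The SNP frequencies quoted in this abstract follow directly from dividing the sequenced length by the number of SNPs found. A short check, using only the figures given above (the helper name `bp_per_snp` is just for illustration):

```python
def bp_per_snp(length_bp, snp_count):
    """Average number of base pairs per SNP in a sequenced region."""
    return length_bp / snp_count

leptin = bp_per_snp(1788, 20)      # 20 SNPs in 1788 bp of the leptin gene
app_intron = bp_per_snp(302, 23)   # 23 SNPs in the 302-bp APP intron 16 fragment

print(f"leptin: 1 SNP per {leptin:.1f} bp")
print(f"APP intron 16: 1 SNP per {app_intron:.1f} bp")
```

Rounded to whole base pairs, these give the 1 per 89 bp and 1 per 13 bp reported in the abstract.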

6.
R Kota  M Wolf  W Michalek  A Graner 《Génome》2001,44(4):523-528
Recent advances in DNA sequence analysis and the establishment of high-throughput assays have provided the framework for large-scale discovery and analysis of DNA sequence variation. In this context, single nucleotide polymorphisms (SNPs) are of particular interest. To initiate a systematic approach to develop an SNP map of barley (Hordeum vulgare L.), we have employed denaturing high-performance liquid chromatography (DHPLC) to analyse segregating SNP patterns in a doubled-haploid (DH) mapping population. To this end, SNPs between the parental genotypes were identified using a direct sequencing approach. Once a SNP was established between the parents, the optimal melting temperature of the PCR fragment containing the SNP was predicted for its analysis by DHPLC. Following the detection of the optimal temperature, the DH lines were analysed for the presence of either of the alleles. To test the utility of the analysis, data from previously mapped RFLP markers from which these SNPs were derived were compared. Results from these experiments indicate that DHPLC can be efficiently employed in analysing SNPs on a high-throughput scale.

7.
We report on the comparative utilities of simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers for characterizing maize germplasm in terms of their informativeness, levels of missing data, repeatability and the ability to detect expected alleles in hybrids and DNA pools. Two different SNP chemistries were compared: single-base extension detected by Sequenom MassARRAY, and invasive cleavage detected by Invader chemistry with PCR. A total of 58 maize inbreds and four hybrids were genotyped with 80 SSR markers, 69 Invader SNP markers and 118 MassARRAY SNP markers, with 64 SNP loci being common to the two SNP marker chemistries. Average expected heterozygosity values were 0.62 for SSRs, 0.43 for SNPs (pre-selected for their high level of polymorphism) and 0.63 for the underlying sequence haplotypes. All individual SNP markers within the same set of sequences had an average expected heterozygosity value of 0.26. SNP marker data had more than a fourfold lower level of missing data (2.1-3.1%) compared with SSRs (13.8%). Data repeatability was higher for SNPs (98.1% for MassARRAY SNPs and 99.3% for Invader) than for SSRs (91.7%). Parental alleles were observed in hybrid genotypes in 97.0% of the cases for MassARRAY SNPs, 95.5% for Invader SNPs and 81.9% for SSRs. In pooled samples with mixtures of alleles, SSRs, MassARRAY SNPs and Invader SNPs were equally capable of detecting alleles at mid to high frequencies. However, at low frequencies, alleles were least likely to be detected using Invader SNP markers, and this technology had the highest level of missing data. Collectively, these results showed that SNP technologies can provide increased marker data quality and quantity compared with SSRs. The relative loss in polymorphism compared with SSRs can be compensated by increasing SNP numbers and by using SNP haplotypes. Determining the most appropriate SNP chemistry will be dependent upon matching the technical features of the method within the context of application, particularly in consideration of whether genotypic samples will be pooled or assayed individually.
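
Expected heterozygosity is commonly computed as one minus the sum of squared allele frequencies. The sketch below uses that standard definition; the abstract does not say whether the authors applied a sample-size correction, so treat it as an assumption. It also shows why a biallelic SNP can never exceed 0.5, whereas multi-allelic SSRs and multi-SNP haplotypes can, which is consistent with the 0.43 versus 0.62-0.63 values reported.

```python
from collections import Counter

def expected_heterozygosity(alleles):
    """Expected heterozygosity H = 1 - sum(p_i ** 2) over allele frequencies p_i.

    Uses the uncorrected textbook formula (an assumption about the method).
    """
    total = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

# A biallelic SNP with equal allele frequencies reaches the maximum of 0.5.
snp_h = expected_heterozygosity(["A", "A", "G", "G"])

# A marker with four equally frequent alleles (SSR-like) reaches 0.75.
ssr_h = expected_heterozygosity(["a1", "a2", "a3", "a4"])

# A monomorphic marker carries no information.
mono_h = expected_heterozygosity(["A", "A", "A", "A"])
```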

8.
CpG dinucleotides mutate at a high rate because cytosine is vulnerable to deamination, cytosines in CpG dinucleotides are often methylated, and deamination of 5-methylcytosine (5mC) produces thymidine. Previous experiments have shown that DNA melting is the rate-limiting step in cytosine deamination. Here we show, through the analysis of human single-nucleotide polymorphisms (SNPs), that the mutation rate produced by 5mC deamination is highly dependent on local GC content. In fact, linear regression analysis showed that the log10 of the 5mC mutation rates (inferred from SNP frequencies) had slopes of -3 when graphed with respect to the GC content of neighboring sequences. This is the ideal slope that would be expected if the correlation between CpG underrepresentation and GC content had been solely caused by DNA melting. Moreover, this same result was obtained regardless of the SNP locations (all SNPs versus only SNPs in noncoding intergenic regions, excluding CpG islands) and regardless of the lengths over which GC content was calculated (SNP sequences with a modal length of 564 bp versus genomic contigs with a modal length of 163 kb). Several alternative interpretations are discussed.
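
The regression described here fits a straight line to log10(mutation rate) against GC content. The sketch below shows the mechanics with a hand-written ordinary least-squares fit. The data are synthetic, generated so that the slope is exactly -3; they are not the paper's measurements, and expressing GC content as a fraction between 0 and 1 is an assumption, since the abstract does not state the unit.

```python
import math

def ols_slope_intercept(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic example only: rates constructed as 10 ** (-3 * GC - 1).
gc_content = [0.35, 0.40, 0.45, 0.50, 0.55]
rates = [10 ** (-3.0 * gc - 1.0) for gc in gc_content]

slope, intercept = ols_slope_intercept(gc_content, [math.log10(r) for r in rates])
```

With real SNP-derived rates the points would scatter around the line, and the fitted slope would carry a confidence interval rather than being recovered exactly.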

9.
We selected 125 candidate single nucleotide polymorphisms (SNPs) in genes belonging to the human type 1 interferon (IFN) gene family and the genes coding for proteins in the main type 1 IFN signalling pathway by screening databases and by in silico comparison of DNA sequences. Using quantitative analysis of pooled DNA samples by solid-phase mini-sequencing, we found that only 20% of the candidate SNPs were polymorphic in the Finnish and Swedish populations. To allow more effective validation of candidate SNPs, we developed a four-colour microarray-based mini-sequencing assay for multiplex, quantitative allele frequency determination in pooled DNA samples. We used cyclic mini-sequencing reactions with primers carrying 5′-tag sequences, followed by capture of the products on microarrays by hybridisation to complementary tag oligonucleotides. Standard curves prepared from mixtures of known amounts of SNP alleles demonstrate the applicability of the system to quantitative analysis, and showed that for about half of the tested SNPs the limit of detection for the minority allele was below 5%. The microarray-based genotyping system established here is universally applicable for genotyping and quantification of any SNP, and the validated system for SNPs in type 1 IFN-related genes should find many applications in genetic studies of this important immunoregulatory pathway.

10.
Single nucleotide polymorphisms (SNPs) can significantly contribute to the characterization of the genes predisposing to iron overloads or deficiencies. We report an SNP survey of coding and non-coding regions of eight genes involved in iron metabolism, by two successive methods. First, we made use of the public domain sequence data, by using assembled expressed sequence tags, non-redundant sequences, and SNP database screening. We extracted 77 potential SNPs of which only 31 could be further validated by sequencing DNA from 44 unrelated multi-ethnic individuals. Our results indicate that a bioinformatic approach may be effective only in those cases where candidate SNPs are extracted from two different data sources or in cases of experimentally confirmed SNPs. Second, additional systematic sequencing of DNA from 24 unrelated Breton subjects increased the number of SNPs over a total length of 86 kb to 96. The average distance between the SNPs and minor allele frequencies were higher than reported by other authors; this discrepancy may reflect the nature of the genes studied and the ethnic homogeneity of our test population.

11.
BACKGROUND: We have developed a rapid, high throughput method for single nucleotide polymorphism (SNP) genotyping that employs an oligonucleotide ligation assay (OLA) and flow cytometric analysis of fluorescent microspheres. METHODS: A fluoresceinated oligonucleotide reporter sequence is added to a "capture" probe by OLA. Capture probes are designed to hybridize both to genomic "targets" amplified by polymerase chain reaction and to a separate complementary DNA sequence that has been coupled to a microsphere. These sequences on the capture probes are called "ZipCodes". The OLA-modified capture probes are hybridized to ZipCode complement-coupled microspheres. The use of microspheres with different ratios of red and orange fluorescence makes a multiplexed format possible where many SNPs may be analyzed in a single tube. Flow cytometric analysis of the microspheres simultaneously identifies both the microsphere type and the fluorescent green signal associated with the SNP genotype. RESULTS: Application of this methodology is demonstrated by the multiplexed genotyping of seven CEPH DNA samples for nine SNP markers located near the ApoE locus on chromosome 19. The microsphere-based SNP analysis agreed with genotyping by sequencing in all cases. CONCLUSIONS: Multiplexed SNP genotyping by OLA with flow cytometric analysis of fluorescent microspheres is an accurate and rapid method for the analysis of SNPs.

12.
Heterodera glycines, the soybean cyst nematode (SCN), is a damaging agricultural pest that could be effectively managed if critical phenotypes, such as virulence and host range, could be understood. While SCN is amenable to genetic analysis, lack of DNA sequence data prevents the use of such methods to study this pathogen. Fortunately, new methods of DNA sequencing that produce large amounts of data and permit whole genome comparative analyses have become available. In this study, 400 million bases of genomic DNA sequence were collected from two inbred biotypes of SCN using 454 micro-bead DNA sequencing. Comparisons to a BAC, sequenced by Sanger sequencing, showed that the micro-bead sequences could identify low and high copy number regions within the BAC. Potential single nucleotide polymorphisms (SNPs) between the two SCN biotypes were identified by comparing the two sets of sequences. Selected resequencing revealed that up to 84% of the SNPs were correct. We conclude that the quality of the micro-bead sequence data was sufficient for de novo SNP identification and should be applicable to organisms with similar genome sizes and complexities. The SNPs identified will be an important starting point in associating phenotypes with specific regions of the SCN genome.

13.
The 2SNP software package implements a new, very fast, scalable algorithm for haplotype inference based on genotype statistics collected only for pairs of SNPs. This software can be used for comparatively accurate phasing of large numbers of long genome sequences, e.g. obtained from DNA arrays. As input, 2SNP takes a genotype matrix and outputs the corresponding haplotype matrix. On datasets across 79 regions from HapMap, 2SNP is several orders of magnitude faster than GERBIL and PHASE while matching them in quality measured by the number of correctly phased genotypes, single-site and switching errors. For example, 2SNP requires 41 s on a Pentium 4 2 GHz processor to phase 30 genotypes with 1381 SNPs (ENm010.7p15:2 data from HapMap), versus GERBIL and PHASE requiring more than a week and making no fewer errors than 2SNP.

14.
Migratory birds are of particular interest for population genetics because of the high connectivity between habitats and populations. A high degree of connectivity requires using many genetic markers to achieve the required statistical power, and a genome-wide SNP set can fit this purpose. Here we present the development of a genome-wide SNP set for the Barnacle Goose Branta leucopsis, a model species for the study of bird migration. We used the genome of a different waterfowl species, Mallard Anas platyrhynchos, as a reference to align Barnacle Goose second-generation sequence reads from an RRL library and detected 2188 SNPs genome wide. Furthermore, we used chimeric flanking sequences, merged from both Mallard and Barnacle Goose DNA sequence information, to create primers for validation by genotyping. Validation with a 384 SNP genotyping set resulted in 374 (97%) successfully typed SNPs in the assay, of which 358 (96%) were polymorphic. Additionally, we validated our SNPs on relatively old (30 years) museum samples, which resulted in a success rate of at least 80%. This shows that museum samples could be used in standard SNP genotyping assays. Our study also shows that the genome of a related species can be used as reference to detect genome-wide SNPs in birds, because genomes of birds are highly conserved. This is illustrated by the use of chimeric flanking sequences, which showed that the incorporation of flanking nucleotides from Mallard into Barnacle Goose sequences led to equal genotyping performance when compared to flanking sequences solely composed of Barnacle Goose sequence.
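
The two validation percentages in this abstract use different denominators, which is easy to miss: the typing success rate is relative to the 384 SNPs on the assay, while the polymorphic rate is relative to the 374 SNPs that were typed successfully. A quick check using only the counts given above:

```python
assayed = 384       # SNPs placed on the genotyping set
typed = 374         # SNPs successfully typed
polymorphic = 358   # typed SNPs that turned out to be polymorphic

typing_success_pct = 100 * typed / assayed       # share of all assayed SNPs
polymorphic_pct = 100 * polymorphic / typed      # share of successfully typed SNPs
overall_pct = 100 * polymorphic / assayed        # polymorphic share of all assayed SNPs

print(f"typed: {typing_success_pct:.1f}%  polymorphic: {polymorphic_pct:.1f}%  overall: {overall_pct:.1f}%")
```

The overall conversion rate (polymorphic SNPs out of all 384 assayed) is not stated in the abstract; it is derived here from the same counts.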

15.
BACKGROUND: A genome-wide set of single nucleotide polymorphisms (SNPs) is a valuable resource in genetic research and breeding and is usually developed by re-sequencing a genome. If a genome sequence is not available, an alternative strategy must be used. We previously reported the development of a pipeline (AGSNP) for genome-wide SNP discovery in coding sequences and other single-copy DNA without a complete genome sequence in self-pollinating (autogamous) plants. Here we updated this pipeline for SNP discovery in outcrossing (allogamous) species and demonstrated its efficacy in SNP discovery in walnut (Juglans regia L.). RESULTS: The first step in the original implementation of the AGSNP pipeline was the construction of a reference sequence and the identification of single-copy sequences in it. To identify single-copy sequences, multiple genome equivalents of short SOLiD reads of another individual were mapped to shallow genome coverage of long Sanger or Roche 454 reads making up the reference sequence. The relative depth of SOLiD reads was used to filter out repeated sequences from single-copy sequences in the reference sequence. The second step was a search for SNPs between SOLiD reads and the reference sequence. Polymorphism within the mapped SOLiD reads would have precluded SNP discovery; hence both individuals had to be homozygous. The AGSNP pipeline was updated here for using SOLiD or other types of short reads of a heterozygous individual for these two principal steps. A total of 32.6X walnut genome equivalents of SOLiD reads of vegetatively propagated walnut scion cultivar 'Chandler' were mapped to 48,661 'Chandler' bacterial artificial chromosome (BAC) end sequences (BESs) produced by Sanger sequencing during the construction of a walnut physical map. A total of 22,799 putative SNPs were initially identified. A total of 6,000 Infinium II type SNPs evenly distributed along the walnut physical map were selected for the construction of an Infinium BeadChip, which was used to genotype a walnut mapping population having 'Chandler' as one of the parents. Genotyping results were used to adjust the filtering parameters of the updated AGSNP pipeline. With the adjusted filtering criteria, 69.6% of SNPs discovered with the updated pipeline were real and could be mapped on the walnut genetic map. A total of 13,439 SNPs were discovered by BES re-sequencing. BESs harboring SNPs were in 677 FPC contigs covering 98% of the physical map of the walnut genome. CONCLUSION: The updated AGSNP pipeline is a versatile SNP discovery tool for high-throughput, genome-wide SNP discovery in both autogamous and allogamous species. With this pipeline, a large set of SNPs was identified in a single walnut cultivar.

16.
Single nucleotide polymorphisms (SNPs) are the most abundant form of genetic variation. SNPs are important markers that link sequence variations to phenotypic changes. Because of the importance of SNPs in the life and medical sciences, a great deal of effort has been devoted to developing accurate, rapid, and cost-effective technologies for SNP analysis. In this article, we describe a novel method for SNP genotyping based on differential fluorescence emission due to cleavage by Thermus thermophilus RNase HII (TthRNase HII) of DNA heteroduplexes containing an SNP site-specific chimeric DNA-rN1-DNA molecular beacon (cMB). We constructed a loop sequence for a cMB that contains a single SNP-specific ribonucleotide at the central site. When the cMB probe is hybridized to a target double-stranded DNA (dsDNA), a perfect match of the cMB/DNA duplex permits efficient cleavage with TthRNase HII, whereas a mismatch in the duplex due to an SNP greatly reduces efficiency. Cleavage efficiency is measured by the incremental difference of fluorescence emission of the beacon. We show that the genotypes of 10 individuals at 12 SNP sites across a series of human leukocyte antigen (HLA) can be determined correctly with respect to conventional DNA sequencing. This novel TthRNase HII-based method offers a platform for easy and accurate SNP analysis.

17.
Salmonid genomes are considered to be in a pseudo-tetraploid state as a result of a genome duplication event that occurred between 25 and 100 Ma. This situation complicates single-nucleotide polymorphism (SNP) discovery in rainbow trout as many putative SNPs are actually paralogous sequence variants (PSVs) and not simple allelic variants. To differentiate PSVs from simple allelic variants, we used 19 homozygous doubled haploid (DH) lines that represent a wide geographical range of rainbow trout populations. In the first phase of the study, we analysed SbfI restriction-site associated DNA (RAD) sequence data from all the 19 lines and selected 11 lines for an extended SNP discovery. In the second phase, we conducted the extended SNP discovery using PstI RAD sequence data from the selected 11 lines. The complete data set is composed of 145 168 high-quality putative SNPs that were genotyped in at least nine of the 11 lines, of which 71 446 (49%) had minor allele frequencies (MAF) of at least 18% (i.e. at least two of the 11 lines). Approximately 14% of the RAD SNPs in this data set are from expressed or coding rainbow trout sequences. Our comparison of the current data set with previous SNP discovery data sets revealed that 99% of our SNPs are novel. In the support files for this resource, we provide annotation to the positions of the SNPs in the working draft of the rainbow trout reference genome, provide the genotypes of each sample in the discovery panel and identify SNPs that are likely to be in coding sequences.
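
Because each doubled haploid line is homozygous, it contributes a single allele per SNP, so the minor allele frequency is simply the number of lines carrying the rarer allele divided by the number of lines genotyped. The sketch below reproduces the 18% threshold (two of 11 lines) and the 49% share quoted above; the helper name `minor_allele_frequency` is only for illustration.

```python
from collections import Counter

def minor_allele_frequency(line_alleles):
    """MAF for a biallelic SNP, given one allele per homozygous line.

    Lines with missing genotypes should be left out of the input list.
    """
    counts = Counter(line_alleles)
    if len(counts) < 2:
        return 0.0  # monomorphic in the genotyped lines
    return min(counts.values()) / len(line_alleles)

# Two of 11 lines carry the minor allele: the abstract's 18% threshold.
maf_11 = minor_allele_frequency(["A"] * 9 + ["G"] * 2)

# Share of SNPs meeting the threshold, from the counts in the abstract.
share_pct = 100 * 71446 / 145168
```

When only nine of the 11 lines are genotyped, two minor-allele lines give 2/9 (about 22%), so a SNP with any missing lines that meets the two-line criterion also clears the 18% cutoff.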

18.
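Single nucleotide polymorphisms in rice and their current applications   Total citations: 6, self-citations: 0, citations by others: 6
Liu Chuanguang  Zhang Guiquan 《遗传》 (Hereditas (Beijing)) 2006,28(6):737-744
Single nucleotide polymorphisms (SNPs) are numerous in rice, densely distributed, and genetically highly stable. The main methods for discovering rice SNPs are direct sequencing of PCR products from sample DNA, detection of SNPs in SSR regions, and direct searching of genomic sequences. A variety of genotyping technologies have now been applied to rice SNP detection, and the high degree of automation in SNP detection makes SNP genotyping in rice very convenient. SNPs have great potential for application in the construction of rice genetic maps, gene cloning and functional genomics research, marker-assisted selection breeding, classification of genetic resources, and studies of species evolution.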

19.
We describe a rapid and easily automated phylogenetic grouping technique based on analysis of bacterial genome single-nucleotide polymorphisms (SNPs). We selected 13 SNPs derived from a complete sequence analysis of 11 essential genes previously used for multilocus sequence typing (MLST) of 30 Escherichia coli strains representing the genetic diversity of the species. The 13 SNPs were localized in five genes, trpA, trpB, putP, icdA, and polB, and were selected to allow recovery of the main phylogenetic groups (groups A, B1, E, D, and B2) and subgroups of the species. In the first step, we validated the SNP approach in silico by extracting SNP data from the complete sequences of the five genes for a panel of 65 pathogenic strains belonging to different E. coli pathovars, which were previously analyzed by MLST. In the second step, we determined these SNPs by dideoxy single-base extension of unlabeled oligonucleotide primers for a collection of 183 commensal and extraintestinal clinical E. coli isolates and compared the SNP phylotyping method to previous well-established typing methods. This SNP phylotyping method proved to be consistent with the other methods for assigning phylogenetic groups to the different E. coli strains. In contrast to the other typing methods, such as multilocus enzyme electrophoresis, ribotyping, or PCR phylotyping using the presence/absence of three genomic DNA fragments, the SNP typing method described here is derived from a solid phylogenetic analysis, and the results obtained by this method are more meaningful. Our results indicate that similar approaches may be used for a wide variety of bacterial species.

20.
Oilseed rape (Brassica napus) is an allotetraploid species consisting of two genomes, derived from B. rapa (A genome) and B. oleracea (C genome). The presence of these two genomes makes single nucleotide polymorphism (SNP) marker identification and SNP analysis more challenging than in diploid species, as for a given locus usually two versions of a DNA sequence (based on the two ancestral genomes) have to be analyzed simultaneously during SNP identification and analysis. One hundred amplicons derived from expressed sequence tags (ESTs) were analyzed to identify SNPs in a panel of oilseed rape varieties and within two sister species representing the ancestral genomes. A total of 604 SNPs were identified, averaging one SNP in every 42 bp. It was possible to clearly discriminate SNPs that are polymorphic between different plant varieties from SNPs differentiating the two ancestral genomes. To validate the identified SNPs for their use in genetic analysis, we have developed Illumina GoldenGate assays for some of the identified SNPs. Through the analysis of a number of oilseed rape varieties and mapping populations with GoldenGate assays, we were able to identify a number of different segregation patterns in allotetraploid oilseed rape. The majority of the identified SNP markers can be readily used for genetic mapping, showing that amplicon sequencing and Illumina GoldenGate assays can be used to reliably identify SNP markers in tetraploid oilseed rape and to convert them into successful SNP assays that can be used for genetic analysis.
