Similar articles
 20 similar articles found
1.
Microsatellites, also known as simple sequence repeats (SSRs), are among the most commonly used marker types in evolutionary and ecological studies. Next‐generation sequencing techniques such as 454 pyrosequencing allow the rapid development of microsatellite markers in nonmodel organisms. 454 pyrosequencing is a straightforward approach to developing a large number of microsatellite markers and has therefore become the method of choice for marker development. Here, we describe a user‐friendly way of developing microsatellites from 454 pyrosequencing data and analyse data sets of 17 nonmodel species (plants, fungi, invertebrates, birds and a mammal) for microsatellite repeats and flanking regions suitable for primer development. We then compare the numbers of successfully lab‐tested microsatellite markers for the various species and describe diverse challenges that might arise in different study species, for example, large genome size or nonpure extraction of genomic DNA. Successful primer identification was feasible for all species. We found that in species for which large repeat numbers are uncommon, such as fungi, polymorphic markers can nevertheless be developed from 454 pyrosequencing reads containing small repeat numbers (five to six repeats). Furthermore, even with next‐generation sequencing techniques, developing microsatellite markers for species with large genomes was more costly and time‐consuming than for species with smaller genomes. In this study, we showed that, depending on the species, a different amount of 454 pyrosequencing data might be required for successful identification of a sufficient number of microsatellite markers for ecological genetic studies.
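The screening step this abstract describes (finding repeats with flanking regions suitable for primer design) can be illustrated with a short sketch. It is not the authors' pipeline: the function name `find_ssrs`, the motif lengths of 2 to 6 bp, and the 20 bp minimum flank are assumptions chosen for illustration, and only perfect (uninterrupted) repeats are detected.

```python
import re

def find_ssrs(read, min_repeats=5, min_flank=20):
    """Return (motif, repeat_count, start, end) for perfect SSRs with usable flanks.

    Hypothetical helper: motif lengths (2-6 bp) and the flank length are
    illustrative defaults, not values taken from the study.
    """
    # A 2-6 bp motif followed by at least (min_repeats - 1) further copies of itself.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(read):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as "AAAAAAAAAA"
        start, end = match.span()
        # Keep only repeats with enough sequence on both sides for primers.
        if start >= min_flank and len(read) - end >= min_flank:
            hits.append((motif, (end - start) // len(motif), start, end))
    return hits

left = "ATGCCGTAAGCTTGACCTGA"   # 20 bp flank
right = "TCGATTGCAGGCATCAAGTC"  # 20 bp flank
print(find_ssrs(left + "AC" * 6 + right))
```

Setting `min_repeats` to 5 mirrors the observation that reads with only five to six repeats can still yield polymorphic markers in species such as fungi.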

2.
3.
4.
5.
High‐density genome‐wide sequencing increases the likelihood of discovering genes of major effect and genomic structural variation in organisms. While there is an increasing availability of reference genomes across broad taxa, the greatest limitation to whole‐genome sequencing of multiple individuals continues to be the costs associated with sequencing. To alleviate excessive costs, pooling multiple individuals with similar phenotypes and sequencing the homogenized DNA (Pool‐Seq) can achieve high genome coverage, but at the loss of individual genotypes. Although Pool‐Seq has been an effective method for association mapping in model organisms, it has not been frequently utilized in natural populations. To extend bioinformatic tools for rapid implementation of Pool‐Seq data in nonmodel organisms, we developed a pipeline called PoolParty and illustrate its effectiveness in genetic association mapping. Alignment expectations based on five pooled Chinook salmon (Oncorhynchus tshawytscha) libraries showed that approximately 48% genome coverage per library could be achieved with reasonable sequencing effort. We additionally examined male and female O. tshawytscha libraries to illustrate how Pool‐Seq techniques can successfully map known genes associated with functional differences among sexes such as growth hormone 2. Finally, we compared pools of individuals of different spawning ages for each sex to discover novel genes involved with age at maturity in O. tshawytscha such as opsin4 and transmembrane protein19. While not appropriate for every system, Pool‐Seq data processed by the PoolParty pipeline is a practical method for identifying genes of major effect in nonmodel organisms when high genome coverage is necessary and cost is a limiting factor.

6.
7.
Characterization and population genetic analysis of multilocus genes, such as those found in the major histocompatibility complex (MHC), is challenging in nonmodel vertebrates. The traditional method of extensive cloning and Sanger sequencing is costly and time‐intensive, and indirect methods of assessment often underestimate total variation. Here, we explored the suitability of 454 pyrosequencing for characterizing multilocus genes for use in population genetic studies. We compared two sample tagging protocols and two bioinformatic procedures for 454 sequencing through characterization of a 185‐bp fragment of MHC DRB exon 2 in wolverines (Gulo gulo) and further compared the results with those from cloning and Sanger sequencing. We found 10 putative DRB alleles in the 88 individuals screened, with between two and four alleles per individual, suggesting amplification of a duplicated DRB gene. In addition to the putative alleles, all individuals possessed an easily identifiable pseudogene. In our system, sequence variants with a frequency below 6% in an individual sample were usually artefacts. However, we found that sample preparation and data processing procedures, in addition to the complexity of the multilocus system, can greatly affect variant frequencies. Therefore, we recommend determining a per‐amplicon‐variant frequency threshold for each unique system. The extremely deep coverage obtained in our study (approximately 5000×) coupled with the semi‐quantitative nature of pyrosequencing enabled us to assign all putative alleles to the two DRB loci, which is generally not possible using traditional methods. Our method of obtaining locus‐specific MHC genotypes will enhance population genetic analyses and studies on disease susceptibility in nonmodel wildlife species.
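The per‐amplicon frequency threshold mentioned in this abstract can be sketched as a simple filter. This is an illustration only, not the authors' bioinformatic procedure: the function name `filter_variants` is hypothetical, and the 6% default is the value reported for the wolverine DRB system, which the abstract itself says should be re‐determined for each new system.

```python
from collections import Counter

def filter_variants(reads, min_freq=0.06):
    """Return {variant: frequency} for variants at or above min_freq.

    `reads` is a list of sequence strings from one individual's amplicon.
    Hypothetical helper; 0.06 is the system-specific value from the abstract.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    return {
        seq: n / total
        for seq, n in counts.items()
        if n / total >= min_freq  # rarer variants are treated as likely artefacts
    }

# Toy data: two common variants and one rare variant (5% of reads).
reads = ["alleleA"] * 60 + ["alleleB"] * 35 + ["artefact"] * 5
print(filter_variants(reads))
```

A frequency filter of this kind only removes low‐frequency noise; it does not by itself assign the surviving variants to loci.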

8.
In plants, particular micro‐RNAs (miRNAs) induce the production of a class of small interfering RNAs (siRNA) called trans‐acting siRNA (ta‐siRNA) that lead to gene silencing. A single miRNA target is sufficient for the production of ta‐siRNAs, and this target can be incorporated into a vector to induce the production of siRNAs and, ultimately, gene silencing. The term miRNA‐induced gene silencing (MIGS) has been used to describe such vector systems in Arabidopsis. Several ta‐siRNA loci have been identified in soybean, but, prior to this work, few of the inducing miRNAs had been experimentally validated, much less used to silence genes. Nine ta‐siRNA loci and their respective miRNA targets were identified, and the abundance of the inducing miRNAs varies dramatically in different tissues. The miRNA targets were experimentally verified by silencing a transgenic GFP gene and two endogenous genes in hairy roots and transgenic plants. Small RNAs were produced in patterns consistent with the utilization of the ta‐siRNA pathway. A side‐by‐side experiment demonstrated that MIGS is as effective at inducing gene silencing as traditional hairpin vectors in soybean hairy roots. Soybean plants transformed with MIGS vectors produced siRNAs, and silencing was observed in the T1 generation. These results complement previous reports in Arabidopsis by demonstrating that MIGS is an efficient way to produce siRNAs and induce gene silencing in other species, as shown with soybean. The miRNA targets identified here are simple to incorporate into silencing vectors and offer an effective and efficient alternative to other gene silencing strategies.

9.
Developing genomic insights is challenging in nonmodel species for which resources are often scarce and prohibitively costly. Here, we explore the potential of a recently established approach using Pool‐seq data to generate a de novo genome assembly for mining exons, upon which Pool‐seq data are used to estimate population divergence and diversity. We do this for two pairs of sympatric populations of brown trout (Salmo trutta): one naturally sympatric set of populations and another pair of populations introduced to a common environment. We validate our approach by comparing the results to those from markers previously used to describe the populations (allozymes and individual‐based single nucleotide polymorphisms [SNPs]) and from mapping the Pool‐seq data to a reference genome of the closely related Atlantic salmon (Salmo salar). We find that genomic differentiation (FST) between the two introduced populations exceeds that of the naturally sympatric populations (FST = 0.13 and 0.03 between the introduced and the naturally sympatric populations, respectively), in concordance with estimates from the previously used SNPs. The same level of population divergence is found for the two genome assemblies, but estimates of average nucleotide diversity differ (≈ 0.002 and ≈ 0.001 when mapping to S. trutta and S. salar, respectively), although the relationships between population values are largely consistent. This discrepancy might be attributed to biases when mapping to a haploid condensed assembly made of highly fragmented read data compared to using a high‐quality reference assembly from a divergent species. We conclude that the Pool‐seq‐only approach can be suitable for detecting and quantifying genome‐wide population differentiation, and for comparing genomic diversity in populations of nonmodel species where reference genomes are lacking.
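To make the FST values in this abstract concrete, the sketch below computes a heterozygosity‐based FST, (HT − HS) / HT, for one biallelic SNP from the allele frequencies of two pools. The abstract does not say which estimator the authors used, so this textbook form, the equal weighting of the two populations, and the function name `fst_biallelic` are all assumptions; no correction for pool size or read depth is applied.

```python
def fst_biallelic(p1, p2):
    """FST = (HT - HS) / HT for one biallelic SNP in two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and 2.
    Illustrative estimator only; not necessarily the one used in the study.
    """
    # Mean expected heterozygosity within populations.
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the combined population.
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        return 0.0  # site is monomorphic in both populations
    return (ht - hs) / ht

print(fst_biallelic(0.2, 0.8))  # strongly differentiated SNP
print(fst_biallelic(0.5, 0.5))  # identical frequencies
```

Genome‐wide values such as those reported above are summaries over many SNPs rather than single‐site estimates.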

10.
11.
In the post-genomic era, data management and the development of bioinformatic tools are critical for the adequate exploitation of genomics data. In this review, we address the current situation for the subset of crops represented by the perennial fruit species. The agronomic singularity of these species compared to plant and crop model species poses significant challenges for implementing good practices that are generally not addressed in other species. Studies are usually performed over several years in non-controlled environments, usage of rootstock is common, and breeders rely heavily on vegetative propagation. A reference genome is now available for all the major species as well as many members of the economically important genera for breeding purposes. Development of pangenomes for these species is beginning to gain momentum, which will require a substantial effort in terms of bioinformatic tool development. The available tools for genome annotation and functional analysis are also presented.

12.
Using high-throughput sequencing, we obtained a large number of microsatellites from Podocnemis lewyana, an endemic turtle from northwestern South America. We sequenced sheared genomic DNA on the 454 Genome Sequencer FLX platform, randomly sampling approximately 17% of the haploid genome. We identified 86,501 reads (8.1% of all reads) that met our definition of microsatellite loci. AC and TC were the most abundant motifs in the P. lewyana genome. TGC and AAAC were the most abundant tri- and tetranucleotide motifs, respectively. Of the microsatellite reads, 72.7% had flanking sequence regions suitable for primer design and PCR amplification. We validated the identified potentially amplifiable loci (PAL) and tested for polymorphism by selecting 15 tetranucleotide loci. Twelve loci showed polymorphism in eight individuals. These findings demonstrate that microsatellite detection using next-generation sequencing is an efficient way of obtaining many loci for listed taxa and, in turn, will have a large impact on future genetic studies aiming to understand and implement conservation plans for this highly threatened freshwater turtle.

13.
Structural genomics projects aim to provide a sharp increase in the number of structures of functionally unannotated, and largely unstudied, proteins. Algorithms and tools capable of deriving information about the nature and location of functional sites within a structure are therefore increasingly useful. Here, a neural network is trained to identify the catalytic residues found in enzymes, based on an analysis of the structure and sequence. The neural network output and spatial clustering of the highly scoring residues are then used to predict the location of the active site. A comparison of the performance of differently trained neural networks is presented that shows how information from sequence and structure come together to improve the prediction accuracy of the network. Spatial clustering of the network results provides a reliable way of finding likely active sites. In over 69% of the test cases the active site is correctly predicted, and a further 25% are partially correctly predicted. The failures are generally due to the poor quality of the automatically generated sequence alignments. We also present predictions identifying the active site and potential functional residues in five recently solved enzyme structures not used in developing the method. The method correctly identifies the putative active site in each case. In most cases the likely functional residues are identified correctly, as well as some potentially novel functional groups.

14.
15.
16.
17.
Decreasing costs of next‐generation sequencing (NGS) experiments have made a wide range of genomic questions open for study with nonmodel organisms. However, experimental designs and analysis of NGS data from less well‐known species are challenging because of the lack of genomic resources. In this work, we investigate the performance of alternative experimental designs and bioinformatics approaches in estimating variability and neutrality tests based on the site‐frequency‐spectrum (SFS) from individual resequencing data. We pay particular attention to challenges faced in the study of nonmodel organisms, in particular the absence of a species‐specific reference genome, although phylogenetically close genomes are assumed to be available. We compare the performance of three alternative bioinformatics approaches: genotype calling, genotype–haplotype calling and direct estimation without calling genotypes. We find that relying on genotype calls provides biased estimates of population genetic statistics at low to moderate read depth (2–8×). Genotype–haplotype calling returns more accurate estimates irrespective of the divergence to the reference genome, but requires moderate depth (8–20×). Direct estimation without calling genotypes returns the most accurate estimates of variability and of most SFS tests investigated, including at low read depth (2–4×). Studies without a species‐specific reference genome should thus aim for low read depth and avoid genotype calling whenever individual genotypes are not essential. Otherwise, aiming for moderate to high depth at the expense of number of individuals, and using genotype–haplotype calling, is recommended.
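As background for the SFS‐based statistics this abstract refers to, the sketch below computes two standard variability estimators, Watterson's θ and nucleotide diversity π, from an unfolded site‐frequency spectrum. It assumes the SFS has already been estimated (by any of the three approaches compared in the study) and does not reproduce the authors' methods; the function name `sfs_diversity` is hypothetical, and values are totals for the region, not per‐site.

```python
def sfs_diversity(sfs):
    """Return (Watterson's theta, pi) from an unfolded site-frequency spectrum.

    sfs[i - 1] = number of sites where the derived allele occurs i times
    among n sampled chromosomes, for i = 1 .. n - 1.
    """
    n = len(sfs) + 1
    segregating_sites = sum(sfs)
    # Watterson's estimator: S divided by the harmonic number a_n.
    a_n = sum(1.0 / i for i in range(1, n))
    theta_w = segregating_sites / a_n
    # Pi: average number of pairwise differences, summed over sites.
    pairs = n * (n - 1) / 2
    pi = sum(i * (n - i) * count for i, count in enumerate(sfs, start=1)) / pairs
    return theta_w, pi

# Toy spectrum for n = 4 chromosomes with counts proportional to 1/i.
theta_w, pi = sfs_diversity([6, 3, 2])
print(theta_w, pi)
```

Neutrality tests such as Tajima's D are built on the difference between estimators of this kind, which is why bias in the estimated SFS at low read depth carries through to the tests.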

18.
19.
20.
High‐throughput microarray experiments often generate far more biological information than is required to test the experimental hypotheses. Many microarray analyses are considered finished after differential expression, and additional analyses are typically not performed, leaving biological information undiscovered. This is especially true if the microarray experiment is from an ecological study of multiple populations. Comparisons across populations may also contain important genomic polymorphisms, and a subset of these polymorphisms may be identified with microarrays using techniques for the detection of single feature polymorphisms (SFPs). SFPs are differences in microarray probe‐level intensities caused by genetic polymorphisms, such as single‐nucleotide polymorphisms and small insertions/deletions, rather than by expression differences. In this study, we provide a new algorithm for the detection of SFPs, evaluate the algorithm using existing data from two publicly available Affymetrix Barley (Hordeum vulgare) microarray data sets and compare it to two previously published SFP detection algorithms. Results show that our algorithm provides more consistent and sensitive calling of SFPs with a lower false discovery rate. Simultaneous analysis of SFPs and differential expression is a low‐cost method for the enhanced analysis of microarray data, enabling additional biological inferences to be made.
