Similar literature
15 similar articles found
1.
CONVERT is a user‐friendly, 32‐bit Windows program that facilitates ready transfer of codominant, diploid genotypic data amongst commonly used population genetic software packages. CONVERT reads input files in its own ‘standard’ data format, easily produced from an Excel file of diploid, codominant marker data, and can convert these to the input formats of the following programs: GDA, GENEPOP, ARLEQUIN, POPGENE, MICROSAT, PHYLIP, and STRUCTURE. CONVERT can also read input files in GENEPOP format. In addition, CONVERT can produce a summary table of allele frequencies in which private alleles and the sample sizes at each locus are indicated.

2.
There has been a great increase in both the number of population genetic analysis programs and the size of data sets being studied with them. Because the file formats required by the most popular and useful programs vary, automated reformatting or conversion between them is desirable. FORMATOMATIC is an easy‐to‐use program that can read allelic data files in GENEPOP, raw (CSV) or CONVERT formats and create data files in nine formats: raw (CSV), ARLEQUIN, GENEPOP, IMMANC/BAYESASS+, MIGRATE, NEWHYBRIDS, MSVAR, BAPS and STRUCTURE. Use of FORMATOMATIC should greatly reduce time spent reformatting data sets and avoid unnecessary errors.

3.
We have developed a software package named PEAS to facilitate analyses of large data sets of single nucleotide polymorphisms (SNPs) for population genetics and molecular phylogenetics studies. PEAS reads SNP data in various formats as input and is versatile in data formatting; using PEAS, it is easy to create input files for many popular packages, such as STRUCTURE, frappe, Arlequin, Haploview, LDhat, PLINK, EIGENSOFT, PHASE, fastPHASE, MEGA and PHYLIP. In addition, PEAS fills several analysis gaps in currently available computer programs in population genetics and molecular phylogenetics. Notably, (i) it calculates genetic distance matrices with bootstrapping for both individuals and populations from genome-wide high-density SNP data, and the output can be streamlined to the MEGA and PHYLIP programs for further processing; (ii) it calculates genetic distances from STRUCTURE output and generates a MEGA file to reconstruct component trees; (iii) it provides tools to conduct haplotype sharing analysis for phylogenetic studies based on high-density SNP data. To our knowledge, these analyses are not available in any other computer program. PEAS for Windows is freely available for academic users from http://www.picb.ac.cn/~xushua/index.files/Download_PEAS.htm.

4.
The use of diploid sequence markers is still challenging despite the good quality of the information they provide. One problem is common to all sequencing approaches [traditional cloning and sequencing of PCR amplicons as well as next-generation sequencing (NGS)]: when no variation is found within the sequences from a given individual, homozygosity can never be asserted with certainty. As a consequence, sequence data from diploid markers are mostly analysed at the population (not the individual) level, particularly in animal studies. This study aims to help solve this problem. Using Bayes' theorem and the binomial law, useful results are derived, among which: (i) the number of sequence reads per individual (or sequencing depth) which is required to ensure, at a given probability threshold, that some heterozygotes are not considered erroneously as homozygotes, as a function of the observed heterozygosity (H(o)) of the locus in the population; (ii) a way of estimating H(o) from low coverage NGS data; (iii) a way of testing the null hypothesis that a genetic marker corresponds to a single and diploid locus, in the absence of data from controlled crosses; (iv) strategies for characterizing sequence genotypes in populations while minimizing the average number of sequence reads per individual; (v) a rationale to decide which variations one needs to consider along the sequence, as a function of the sequencing depth affordable, the level of polymorphism desired and the risk of sequencing error. For traditional sequencing technology, optimal strategies appear surprisingly different from the usual empirical ones. The average number of sequence reads required to obtain 99% of fully determined genotypes never exceeds six, this value corresponding to the worst situation when H(o) equals 0.6. This threshold value of H(o) is strikingly stable when the tolerated proportion of non-fully resolved genotypes varies in a reasonable range. These results do not rely on the Hardy-Weinberg equilibrium assumption or on diallelism of nucleotidic sites.
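
The combination of Bayes' theorem and the binomial law mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's derivation: it assumes that each read from a heterozygote shows either allele with probability 1/2, that there is no sequencing error, and that the observed heterozygosity serves as the prior probability of being heterozygous. The function names are hypothetical.

```python
def prob_het_looks_homozygous(n_reads: int) -> float:
    """Binomial law: probability that all n reads from a true heterozygote
    show the same allele (either one), assuming p = 1/2 per read."""
    if n_reads < 1:
        raise ValueError("n_reads must be at least 1")
    return 2 * 0.5 ** n_reads


def posterior_het_given_uniform_reads(n_reads: int, h_obs: float) -> float:
    """Bayes' theorem: P(heterozygote | all n reads identical), using the
    observed heterozygosity h_obs as the prior (an assumption of this sketch)."""
    likelihood_het = prob_het_looks_homozygous(n_reads)
    # A true homozygote always yields identical reads (likelihood 1).
    return h_obs * likelihood_het / (h_obs * likelihood_het + (1 - h_obs))


def min_reads(h_obs: float, max_error: float) -> int:
    """Smallest read count for which the posterior probability of a hidden
    heterozygote drops to max_error or below."""
    if not 0 <= h_obs < 1:
        raise ValueError("h_obs must be in [0, 1)")
    if max_error <= 0:
        raise ValueError("max_error must be positive")
    n = 1
    while posterior_het_given_uniform_reads(n, h_obs) > max_error:
        n += 1
    return n
```

Under these assumptions, `min_reads(0.5, 0.01)` gives the depth at which an individual with identical reads can be called homozygous with at most a 1% chance of being a missed heterozygote. The figures reported in the abstract come from the authors' fuller treatment and are not reproduced by this simplification.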

5.
One of the most tedious steps in genetic data analyses is reformatting data generated with one program for use with other applications. This conversion is necessary because comprehensive evaluation of the data may be based on different algorithms included in diverse software, each requiring a distinct input format. A platform‐independent and freely available program or a web‐based tool dedicated to such reformatting can save time and effort in data processing. Here, we report WIDGETCON, a website and a program developed to quickly and easily convert among various molecular data formats commonly used in phylogenetic analysis, population genetics, and other fields. The web‐based service is available at https://www.widgetcon.net. The program and the website convert the major data formats in four basic steps in less than a minute. The resource will be a useful tool for the research community and can be updated to include more formats and features in the future.

6.
We present here a new version of the Arlequin program available in three different forms: a Windows graphical version (Winarl35), a console version of Arlequin (arlecore), and a specific console version to compute summary statistics (arlsumstat). The command-line versions run under both Linux and Windows. The main innovations of the new version include enhanced outputs in XML format, the possibility to embed graphics displaying computation results directly into output files, and the implementation of a new method to detect loci under selection from genome scans. Command-line versions are designed to handle large series of files, and arlsumstat can be used to generate summary statistics from simulated data sets within an Approximate Bayesian Computation framework.

7.
The use of genetic information is crucial in conservation programs for the establishment of breeding plans and for the evaluation of restocking success. Short tandem repeats (STRs) have been the most widely used molecular markers in such programs, but next‐generation sequencing approaches have prompted the transition to genome‐wide markers such as single nucleotide polymorphisms (SNPs). Until now, most sturgeon species have been monitored using STRs. The low diversity found in the critically endangered European sturgeon (Acipenser sturio), however, makes its future genetic monitoring challenging, and the current resolution needs to be increased. Here, we describe the discovery of a highly informative set of 79 SNPs using double‐digest restriction‐associated DNA (ddRAD) sequencing and its validation by genotyping using the MassARRAY system. Compared with STRs, the SNP panel proved to be highly efficient and reproducible, allowing for more accurate parentage and kinship assignments on 192 juveniles of known pedigree and 40 wild‐born adults. We explore the effectiveness of both markers to estimate relatedness and inbreeding, using simulated and empirical datasets. Interestingly, we found significant correlations between STRs and SNPs at individual heterozygosity and inbreeding that give support to a reasonable representation of whole genome diversity for both markers. These results are useful for the conservation program of A. sturio in building a comprehensive studbook, which will optimize conservation strategies. This approach also proves suitable for other case studies in which highly discriminatory genetic markers are needed to assess parentage and kinship.

8.
Species are considered to be the basic unit of ecological and evolutionary studies. As multilocus genomic data are increasingly available, there has been considerable interest in the use of DNA sequence data to delimit species. In this study, we show that machine learning can be used for species delimitation. Our method treats the species delimitation problem as a classification problem for identifying the category of a new observation on the basis of training data. Extensive simulation is first conducted over a broad range of evolutionary parameters for training purposes. Each pair of known populations is combined to form training samples with a label of “same species” or “different species”. We use a support vector machine (SVM) to train a classifier using a set of summary statistics computed from training samples as features. The trained classifier can classify a test sample into one of two outcomes: “same species” or “different species”. Given multilocus genomic data of multiple related organisms or populations, our method (called CLADES) performs species delimitation by first classifying pairs of populations. CLADES then delimits species by maximizing the likelihood of species assignment for multiple populations. CLADES is evaluated through extensive simulation and also tested on real genetic data. We show that CLADES is both accurate and efficient for species delimitation when compared with existing methods. CLADES can be useful especially when existing methods have difficulty in delimitation, for example with short species divergence time and gene flow.

9.
We present the computer program HYBRIDLAB 1.0 for simulating intraspecific hybrids from population samples of nuclear genetic markers such as microsatellites, allozymes or SNPs (single nucleotide polymorphisms). The program generates a user‐specified number of multilocus F1 hybrid genotypes between any pair of potentially hybridizing populations included in a standard input file of multilocus genotypes for population genetic analysis. This simple, user‐friendly program has a wide range of applications for studying natural and artificial hybridization; in particular, for evaluating the statistical power for individual assignment of parental and hybrid individuals. An example application to Atlantic cod populations is given.
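
Generating F1 hybrid genotypes from two parental samples can be sketched as follows. This is only an illustration, under the assumption that each hybrid receives, at every locus, one allele drawn at random from each parental sample in proportion to its frequency there; the abstract does not spell out the program's sampling scheme, and the function names and data layout are hypothetical.

```python
import random

# A genotype is a list of (allele1, allele2) tuples, one tuple per locus.


def allele_pools(sample):
    """Pool all allele copies observed at each locus in a population sample.
    Drawing uniformly from a pool is equivalent to drawing by allele frequency."""
    n_loci = len(sample[0])
    return [[allele for individual in sample for allele in individual[locus]]
            for locus in range(n_loci)]


def simulate_f1(pop_a, pop_b, n_hybrids, rng):
    """Simulate n_hybrids multilocus F1 genotypes: at every locus, one allele
    comes from parental population A and one from parental population B."""
    pools_a = allele_pools(pop_a)
    pools_b = allele_pools(pop_b)
    return [[(rng.choice(pool_a), rng.choice(pool_b))
             for pool_a, pool_b in zip(pools_a, pools_b)]
            for _ in range(n_hybrids)]


# Two toy parental samples with two loci each (made-up allele sizes).
pop_a = [[(100, 102), (200, 200)], [(100, 100), (200, 204)]]
pop_b = [[(110, 110), (210, 212)], [(110, 112), (210, 210)]]
hybrids = simulate_f1(pop_a, pop_b, n_hybrids=5, rng=random.Random(1))
```

Passing an explicit `random.Random` instance keeps the simulation reproducible, which matters when simulated hybrids are used to evaluate assignment power.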

10.
AFLPDAT is a collection of R functions that facilitates the handling of dominant genotypic data. It converts data from a standard tab‐separated table to the input formats of the following programs: ARLEQUIN, STRUCTURE, TREECON, PAUP and HICKORY. In addition, it calculates the proportion of polymorphic markers in each population, and estimates gene diversity as the average proportion of pairwise differences between individuals with confidence intervals based on bootstrapping across markers. It also produces summary tables of marker frequencies and the presence or absence of markers in each population. AFLPDAT can be downloaded from http://www.nhm.uio.no/ncb/.
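
The two summary statistics named in this abstract can be sketched for a single population of dominant (presence/absence) marker profiles. The package itself is written in R; this Python illustration uses hypothetical function names, omits the bootstrap confidence intervals, and assumes no missing data.

```python
from itertools import combinations

# Each profile is a list of 0/1 scores (marker absent/present), one per marker.


def proportion_polymorphic(profiles):
    """Fraction of markers that show both states (0 and 1) in the population."""
    n_markers = len(profiles[0])
    polymorphic = sum(1 for m in range(n_markers)
                      if len({profile[m] for profile in profiles}) > 1)
    return polymorphic / n_markers


def gene_diversity(profiles):
    """Average proportion of markers that differ between pairs of individuals."""
    pairs = list(combinations(profiles, 2))
    if not pairs:
        raise ValueError("need at least two individuals")
    n_markers = len(profiles[0])
    per_pair = [sum(x != y for x, y in zip(a, b)) / n_markers for a, b in pairs]
    return sum(per_pair) / len(pairs)


# Toy population: three individuals scored at four markers (made-up data).
population = [[1, 0, 1, 1],
              [1, 1, 0, 1],
              [1, 0, 0, 1]]
```

For this toy population, two of the four markers vary, and the three pairwise comparisons differ at 2, 1 and 1 markers respectively.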

11.
12.
Prioritizing and making efficient conservation plans for threatened populations requires information at both evolutionary and ecological timescales. Nevertheless, few studies integrate multidisciplinary approaches, mainly because of the difficulty for conservationists to assess simultaneously the evolutionary and ecological status of populations. Here, we sought to demonstrate how combining genetic and demographic analyses allows prioritizing and initiating conservation plans. To do so, we combined snapshot microsatellite data and a 30‐year‐long demographic survey on a threatened freshwater fish species (Parachondrostoma toxostoma) at the river basin scale. Our results revealed low levels of genetic diversity and small effective population sizes (<63 individuals) in all populations. We further detected severe bottlenecks dating back to the last centuries (200–800 years ago), which may explain the differentiation of certain populations. The demographic survey revealed a general decrease in the spatial distribution and abundance of P. toxostoma over the last three decades. We conclude that demo‐genetic approaches are essential for (1) identifying populations for which both evolutionary and ecological extinction risks are high; and (2) proposing conservation plans targeted toward these at‐risk populations, and accounting for the evolutionary history of populations. We suggest that demo‐genetic approaches should be the norm in conservation practices.

13.
With novel developments in sequencing technologies, time‐sampled data are becoming more available and accessible. Naturally, there have been efforts in parallel to infer population genetic parameters from these data sets. Here, we compare and analyse four recent approaches based on the Wright–Fisher model for inferring selection coefficients (s) given effective population size (Ne), with simulated temporal data sets. Furthermore, we demonstrate the advantage of a recently proposed approximate Bayesian computation (ABC)‐based method that is able to correctly infer genomewide average Ne from time‐serial data, which is then set as a prior for inferring per‐site selection coefficients accurately and precisely. We implement this ABC method in new software and apply it to a classical time‐serial data set of the medionigra genotype in the moth Panaxia dominula. We show that a recessive lethal model is the best explanation for the observed variation in allele frequency by implementing an estimator of the dominance ratio (h).
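
The Wright–Fisher model with selection that underlies these approaches can be sketched as a forward simulator of the kind an ABC procedure would call repeatedly. This is a textbook formulation rather than the software described in the abstract: it assumes genotype fitnesses 1 + s, 1 + hs and 1, binomial sampling of 2Ne gene copies each generation, and a positive mean fitness. The function names are hypothetical.

```python
import random


def expected_next_freq(p, s, h):
    """Deterministic allele frequency after one round of selection, with
    fitnesses AA: 1 + s, Aa: 1 + h*s, aa: 1 (mean fitness assumed > 0)."""
    q = 1 - p
    mean_fitness = p * p * (1 + s) + 2 * p * q * (1 + h * s) + q * q
    return (p * p * (1 + s) + p * q * (1 + h * s)) / mean_fitness


def wright_fisher_trajectory(p0, s, h, ne, generations, rng):
    """Simulate allele frequencies over time: selection followed by genetic
    drift, modelled as binomial sampling of 2*ne gene copies."""
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        p_selected = expected_next_freq(p, s, h)
        copies = sum(rng.random() < p_selected for _ in range(2 * ne))
        p = copies / (2 * ne)
        trajectory.append(p)
    return trajectory


# Illustrative parameter values only; none are taken from the study.
trajectory = wright_fisher_trajectory(p0=0.1, s=0.05, h=0.5, ne=100,
                                      generations=20, rng=random.Random(42))
```

In an ABC setting, many such trajectories would be simulated under candidate values of s (and h) and compared with the observed time series through summary statistics.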

14.
In species with large geographic ranges, genetic diversity of different populations may be well studied, but differences in loci and sample sizes can make the results of different studies difficult to compare. Yet, such comparisons are important for assessing the status of populations of conservation concern. We propose a simple approach of using a single well-studied reference population as a ‘yardstick’ to calibrate results of different studies to the same scale, enabling comparisons. We use a well-studied large carnivore, the brown bear (Ursus arctos), as a case study to demonstrate the approach. As a reference population, we genotyped 513 brown bears from Slovenia using 20 polymorphic microsatellite loci. We used this data set to calibrate and compare heterozygosity and allelic richness for 30 brown bear populations from 10 different studies across the global distribution of the species. The simplicity of the reference population approach makes it useful for other species, enabling comparisons of genetic diversity estimates between previously incompatible studies and improving our understanding of how genetic diversity is distributed throughout a species' range.

15.
R_ST, an analogue of F_ST, provides a convenient approach for estimating levels of genetic differentiation from microsatellite data. This paper examines current approaches for calculating estimates of R_ST and suggests a weighting scheme based on the transformation of allele sizes at loci across data sets. Combined within an analysis of variance framework, this scheme yields an estimator of R_ST analogous to the θ estimator of F_ST. Software for the IBM-PC is described which carries out such calculations and assesses the significance of R_ST or Nm estimates using bootstrap and permutation tests.
