Similar literature
 20 similar articles found (search time: 15 ms)
1.
SUMMARY: The development of statistical models linking the molecular state of a cell to its physiology is one of the most important tasks in the analysis of Functional Genomics data. Because of the large number of variables measured, a comprehensive evaluation of variable subsets cannot be performed with available computational resources. It follows that an efficient variable selection strategy is required. However, although software packages for performing univariate variable selection are available, a comprehensive software environment to develop and evaluate multivariate statistical models using a multivariate variable selection strategy is still needed. In order to address this issue, we developed GALGO, an R package based on a genetic algorithm variable selection strategy, primarily designed to develop statistical models from large-scale datasets.

2.
The canyon treefrog, Hyla arenicolor, is a wide-ranging hylid found from the southwestern US into southern Mexico. Recent studies have shown this species to have a complex evolutionary history, with several phylogeographically distinct lineages, a probable cryptic species, and multiple episodes of mitochondrial introgression with the sister group, the H. eximia complex. We aimed to use genome-wide AFLP markers to better resolve relationships within this group. As in other studies, our inferred phylogeny not only provides evidence for repeated mitochondrial introgression between H. arenicolor lineages and H. eximia/H. wrightorum, but it also affords more resolution within the main H. arenicolor clade than was previously achieved with sequence data. However, as with a previous study, the placement of a lineage of H. arenicolor whose distribution is centered in the Balsas Basin of Mexico remains poorly resolved, perhaps due to past hybridization with the H. eximia complex. Furthermore, the AFLP data set shows no differentiation among lineages from the Grand Canyon and Colorado Plateau despite their large mitochondrial sequence divergence. Finally, our results infer a well-supported sister relationship between this combined Colorado Plateau/Grand Canyon lineage and the Sonoran Desert lineage, a relationship that strongly contradicts conclusions drawn from the mtDNA evidence. Our study provides a basis for further behavioral and ecological speciation studies of this system and highlights the importance of multi-taxon (species) sampling in phylogenetic and phylogeographic studies.

3.
Tad Dallas. Ecography, 2014, 37(4): 402-405
Metacommunity theory is an extension of metapopulation theory with the goal of understanding how ecological communities vary through space and time. One off-shoot of metacommunity theory deals with understanding how community structure varies along biotic or environmental gradients. The Elements of Metacommunity Structure framework is a three-tiered analysis of metacommunity structure that enables the user to identify metacommunity properties that arise in site-by-species incidence matrices. These properties can then be related to underlying variables that influence species distributions. The EMS framework is now implemented in metacom, an open source R package that allows for the analysis and plotting of metacommunities.

4.
ABSTRACT: BACKGROUND: Linkage analysis is a useful tool for detecting genetic variants that regulate a trait of interest, especially genes associated with a given disease. Although penetrance parameters play an important role in determining gene location, they are assigned arbitrary values according to the researcher's intuition or as estimated by the maximum likelihood principle. Several methods exist by which to evaluate the maximum likelihood estimates of penetrance, although not all of these are supported by software packages and some are biased by marker genotype information, even when disease development is due solely to the genotype of a single allele. FINDINGS: Programs for exploring the maximum likelihood estimates of penetrance parameters were developed using the R statistical programming language supplemented by external C functions. The software returns a vector of polynomial coefficients of penetrance parameters, representing the likelihood of pedigree data. From the likelihood polynomial supplied by the proposed method, the likelihood value and its gradient can be precisely computed. To reduce the effect of the supplied dataset on the likelihood function, feasible parameter constraints can be introduced into maximum likelihood estimates, thus enabling flexible exploration of the penetrance estimates. An auxiliary program generates a perspective plot allowing visual validation of the model's convergence. The functions are collectively available as the MLEP R package. CONCLUSIONS: Linkage analysis using penetrance parameters estimated by the MLEP package enables feasible localization of a disease locus. This is shown through a simulation study and by demonstrating how the package is used to explore maximum likelihood estimates. Although the input dataset tends to bias the likelihood estimates, the method yields accurate results superior to the analysis using intuitive penetrance values for disease with low allele frequencies. 
MLEP is part of the Comprehensive R Archive Network and is freely available at http://cran.r-project.org/web/packages/MLEP/index.html.

5.
Statistical Analysis of Mixed-Ploidy Populations (StAMPP) is a freely available R package for calculation of population structure and differentiation based on single nucleotide polymorphism (SNP) genotype data from populations of any ploidy level, and/or mixed-ploidy levels. StAMPP provides an advance on previous similar software packages, due to an ability to calculate pairwise FST values along with confidence intervals, Nei's genetic distance and genomic relationship matrices from data sets of mixed-ploidy level. The software code is designed to efficiently handle analysis of large genotypic data sets that are typically generated by high-throughput genotyping platforms. Population differentiation studies using StAMPP are broadly applicable to studies of molecular ecology and conservation genetics, as well as animal and plant breeding.
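
As a rough illustration of one quantity named in the abstract above, Nei's standard genetic distance between two populations can be computed from per-locus allele frequencies. The sketch below assumes biallelic SNPs summarized as one reference-allele frequency per locus. The function name nei_distance and the input layout are hypothetical, the small-sample correction is omitted, and nothing here reflects StAMPP's actual R interface.

```python
import math


def nei_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance for biallelic loci (illustrative sketch).

    freqs_x, freqs_y: reference-allele frequencies per locus (assumed input
    layout, one value in [0, 1] per locus, same locus order in both lists).
    """
    if len(freqs_x) != len(freqs_y) or not freqs_x:
        raise ValueError("need two non-empty frequency lists of equal length")
    n = len(freqs_x)
    # Average homozygosity within each population.
    jx = sum(p * p + (1 - p) ** 2 for p in freqs_x) / n
    jy = sum(q * q + (1 - q) ** 2 for q in freqs_y) / n
    # Average probability that alleles drawn from x and y are identical.
    jxy = sum(p * q + (1 - p) * (1 - q) for p, q in zip(freqs_x, freqs_y)) / n
    if jxy == 0:
        return math.inf  # no shared alleles at any locus
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)
```

Identical frequency vectors give a distance of zero, and the distance grows as the two populations share fewer alleles.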

6.
In genetics, many evolutionary pathways can be modeled by the ordered accumulation of permanent changes. Mixture models of mutagenetic trees have been used to describe disease progression in cancer and in HIV. In cancer, progression is modeled by the accumulation of chromosomal gains and losses in tumor cells; in HIV, the accumulation of drug resistance-associated mutations in the viral genome is known to be associated with disease progression. From such evolutionary models, genetic progression scores can be derived that assign measures for the disease state to single patients. Rtreemix is an R package for estimating mixture models of evolutionary pathways from observed cross-sectional data and for estimating associated genetic progression scores. The package also provides extended functionality for estimating confidence intervals for estimated model parameters and for evaluating the stability of the estimated evolutionary mixture models.

7.

Background  

Researchers in the field of bioinformatics often face a challenge of combining several ordered lists in a proper and efficient manner. Rank aggregation techniques offer a general and flexible framework that allows one to objectively perform the necessary aggregation. With the rapid growth of high-throughput genomic and proteomic studies, the potential utility of rank aggregation in the context of meta-analysis becomes even more apparent. One of the major strengths of rank-based aggregation is the ability to combine lists coming from different sources and platforms, for example different microarray chips, which may or may not be directly comparable otherwise.
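
The abstract above does not say which aggregation algorithm is used, so the following is only a minimal sketch of the general idea using a Borda count, one of the simplest rank aggregation schemes. The function name borda_aggregate and the gene labels are hypothetical; items missing from a list are assumed to score zero for that list.

```python
from collections import defaultdict


def borda_aggregate(ranked_lists):
    """Combine several ordered lists into one consensus ranking (Borda count).

    Each list is ordered best-first. An item at position i in a list earns
    (n - i) points, where n is the number of distinct items overall; items
    absent from a list earn nothing from it (an assumption of this sketch).
    """
    universe = set().union(*ranked_lists)
    n = len(universe)
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for position, item in enumerate(ranking):
            scores[item] += n - position
    # Highest total first; ties broken alphabetically for reproducibility.
    return sorted(universe, key=lambda item: (-scores[item], item))


consensus = borda_aggregate([
    ["g1", "g2", "g3"],  # e.g. ranking from one platform
    ["g2", "g1", "g3"],  # ranking from a second platform
    ["g1", "g3", "g2"],  # ranking from a third platform
])
```

Because only positions are used, lists from platforms with incomparable raw scores can still be combined, which is the strength the abstract points to.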

8.
The daylily (Hemerocallis spp.) is one of the most economically important ornamental plant species in commerce. Interestingly, it is also one of the most heavily bred crops during the past 60 years. Since the American Hemerocallis Society began acting as the official registry of daylily cultivars in 1947, more than 40 000 registrations have been processed. In order to determine the effects of intensive breeding on cultivar development, and to study relationships among different species, genetic variation in the daylily was estimated using AFLP markers. Nineteen primary genotypes (species and early cultivars) and 100 modern cultivars from different time periods were evaluated using 152 unambiguous bands (average 79% polymorphism rate) derived from three AFLP primer combinations. Overall, pairwise similarity estimates between entries ranged between 0.618 and 0.926 (average=0.800). When comparing cultivar groups from different time periods (1940–1998), genetic similarity was initially increased, compared to the primary diploid genotypes, remained constant from 1940 to 1980, and then steadily increased as breeding efforts intensified and hybridizers began focusing on a limited tetraploid germplasm pool derived by colchicine conversion. Among modern (1991–1998) daylily cultivars, genetic similarity has increased by approximately 10% compared to the primary genotypes. These data were also used to evaluate recent taxonomic classifications among daylily species which, with a few minor exceptions, were generally supported by the AFLP data. Received: 15 March 2000 / Accepted: 13 June 2000

9.
Amplified fragment length polymorphism (AFLP) analysis was performed in order to evaluate genetic characteristics of one common population and two selective hatchery populations of flounder Paralichthys olivaceus. A group of 60 genotypes belonging to three populations was screened using 10 different AFLP primer combinations. A total of 491 loci were produced in the three studied populations. Of these loci, 65.78%, 61.47% and 60.92% were polymorphic over all the genotypes tested in the common, susceptible and resistant populations, respectively. The number of polymorphic loci detected by a single primer combination ranged from 21 to 43. The average heterozygosity of the common, susceptible and resistant populations was 0.1656, 0.1609 and 0.1586, respectively, which showed no significant difference. Compared with the common population, the two selective hatchery populations, susceptible and resistant, showed significant genetic differences including a smaller (P < 0.05) number of total loci, a smaller (P < 0.05) number of total polymorphic loci and a smaller (P < 0.05) percentage of low frequency (0–0.2) polymorphic loci. The AFLP banding pattern was transformed into binary data and matrices were processed with POPGENE and TFPGA software. Similarity relationships were described graphically by a dendrogram, which clustered the three populations. The AFLP fingerprinting technique was confirmed to be a reproducible and sensitive tool for the study of population genetics of flounder. The present study confirmed that it was important to detect the genetic variability of the selective hatchery populations for the conservation of natural flounder resources.
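
To make the "binary data" step above concrete, the sketch below scores each AFLP band as present (1) or absent (0) and reports the percentage of polymorphic loci. It is an illustration only: the function name and the toy matrix are hypothetical, a locus is counted as polymorphic when both states occur (published analyses sometimes apply a frequency threshold instead), and this is not how POPGENE or TFPGA are invoked.

```python
def percent_polymorphic(band_matrix):
    """Percentage of loci showing both band presence and absence.

    band_matrix: one row per individual, one 0/1 entry per locus
    (assumed layout for this sketch).
    """
    if not band_matrix or not band_matrix[0]:
        raise ValueError("band_matrix must contain at least one locus")
    n_loci = len(band_matrix[0])
    polymorphic = 0
    for locus in range(n_loci):
        states = {individual[locus] for individual in band_matrix}
        if len(states) > 1:  # both 0 and 1 observed at this locus
            polymorphic += 1
    return 100.0 * polymorphic / n_loci


# Toy data: three individuals scored at four loci.
toy = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
]
share = percent_polymorphic(toy)
```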

10.

Background  

Inference of population stratification and individual admixture from genetic markers is an integral part of studies in diverse situations, such as association mapping and evolutionary studies. Bayesian methods have been proposed for population stratification and admixture inference using multilocus genotypes and are widely used in practice. However, these Bayesian methods demand intensive computational resources and may run into convergence problems in Markov Chain Monte Carlo based posterior sampling.

11.
adegenet: a R package for the multivariate analysis of genetic markers   (Cited by: 4; self-citations: 0; citations by others: 4)
The package adegenet for the R software is dedicated to the multivariate analysis of genetic markers. It extends the ade4 package of multivariate methods by implementing formal classes and functions to manipulate and analyse genetic markers. Data can be imported from common population genetics software and exported to other software and R packages. adegenet also implements standard population genetics tools along with more original approaches for spatial genetics and hybridization. AVAILABILITY: Stable version is available from CRAN: http://cran.r-project.org/mirrors.html. Development version is available from adegenet website: http://adegenet.r-forge.r-project.org/. Both versions can be installed directly from R. adegenet is distributed under the GNU General Public Licence (v.2).

12.
It is important to preprocess high-throughput data generated from mass spectrometry experiments in order to obtain a successful proteomics analysis. Outlier detection is an important preprocessing step. A naive outlier detection approach may miss many true outliers and instead select many non-outliers because of the heterogeneity of the variability observed commonly in high-throughput data. Because of this issue, we developed an outlier detection software program accounting for the heterogeneous variability by utilizing linear, non-linear and non-parametric quantile regression techniques. Our program was developed using the R computer language. As a consequence, it can be used interactively and conveniently in the R environment. AVAILABILITY: An R package, OutlierD, is available at the Bioconductor project at http://www.bioconductor.org

13.
tossm (Testing of Spatial Structure Methods) is a package for testing the performance of genetic analytical methods in a management context. In the tossm package, any method developed to detect population genetic structure can be combined with a mechanism for creating management units (MUs) based on the genetic analysis. The resulting Boundary-Setting Algorithm (BSA) dictates harvest boundaries with a genetic basis. These BSAs can be evaluated with respect to how well the MUs they define meet management objectives.

14.
15.
Cost reduction in plant breeding and conservation programs depends largely on correctly defining the minimal sample size required for the trustworthy assessment of intra- and inter-cultivar genetic variation. White clover, an important pasture legume, was chosen for studying this aspect. In clonal plants such as white clover, an appropriate sampling scheme eliminates the redundant analysis of identical genotypes. The aim was to define an optimal sampling strategy, i.e., the minimum sample size and appropriate sampling scheme for white clover cultivars, by using AFLP data (283 loci) from three popular types. A grid-based sampling scheme, with an interplant distance of at least 40 cm, was sufficient to avoid any excess in replicates. Simulations revealed that the number of samples substantially influenced genetic diversity parameters. When using fewer than 15 samples per cultivar, the expected heterozygosity (He) and Shannon diversity index (I) were greatly underestimated, whereas with 20, more than 95% of total intra-cultivar genetic variation was covered. Based on AMOVA, a 20-cultivar sample was apparently sufficient to accurately quantify individual genetic structuring. The recommended sampling strategy facilitates the efficient characterization of diversity in white clover, for both conservation and exploitation.

16.
A recently developed molecular technique (amplified fragment length polymorphisms, AFLP) was used for characterizing genetic heterogeneity within and among populations of a critically endangered species of plant, Astragalus cremnophylax var. cremnophylax. Using AFLP, up to 50 polymorphic genetic markers per AFLP-PCR amplification were generated, and a total of 220 variable markers overall. This information was used first to assess genetic diversity within each of the three known populations of Astragalus cremnophylax var. cremnophylax from Grand Canyon National Park in Arizona, USA: North Rim (NR; n= 970), South Rim Site 1 (SR1; n= 500), and South Rim Site 2 (SR2; n= 2). Diversity in the form of average heterozygosity (H) and the proportion of polymorphic genes (P) was greatest in the NR population ((H) = 0.13 and (P) = 0.38) and least in the SR2 population ((H) = 0.02 and (P) = 0.04). Diversity was also quite low for the SR1 population ((H) = 0.04 and (P) = 0.10). In addition, substantial genetic differentiation among populations was indicated by both phenetic (AMOVA) and genetic analyses (overall corrected FST= 0.41). This finding was corroborated by the results of several multivariate analyses which utilized the genetic data, including a UPGMA cluster analysis and a principal coordinate analysis which revealed the existence of discrete groups corresponding to the populations. Population structure was further revealed within the NR population which was known to consist of four spatially separated groups of plants. Several recommendations for the future management of the species are discussed.

17.
The distribution of traits along phylogenies bears signatures of how ecological and evolutionary processes have interacted to influence phenotypic evolution, which can be deciphered using macroevolutionary models. BBMV implements a model for the evolution of continuous characters on phylogenies that generalizes existing ones, like Brownian motion and the Ornstein-Uhlenbeck model. In this model quantitative characters evolve under both random diffusion and a deterministic force that can be of any possible shape and strength. The model can be used to infer evolutionary scenarios that remained inaccessible so far, like directional trends, disruptive selection, and even bounded evolution. With this new tool at hand, researchers will be able to test complex hypothesis-driven scenarios regarding trait evolution, but they will also have the possibility to estimate the shape of the adaptive landscapes in which traits evolved. Ultimately, this will provide a way to infer how ecological processes have influenced phenotypic evolution over long timescales. The BBMV package is implemented in the R statistical language and is freely available on the CRAN repository <https://CRAN.R-project.org/package=BBMV>. All source code can also be found on <https://github.com/fcboucher/BBMV>, along with a detailed tutorial.
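
The two existing models that the abstract says are generalized, Brownian motion and Ornstein-Uhlenbeck, can be illustrated with a short simulation along a single lineage. This is a hedged sketch and not the BBMV model or its API: the function name, the parameter names (alpha, theta, sigma) and the Euler-Maruyama discretization are assumptions chosen for illustration.

```python
import math
import random


def simulate_trait(x0, alpha, theta, sigma, total_time, n_steps, rng=None):
    """Simulate dx = -alpha * (x - theta) dt + sigma dW with Euler-Maruyama.

    alpha = 0 reduces to Brownian motion (pure random diffusion);
    alpha > 0 gives an Ornstein-Uhlenbeck process pulled toward theta.
    Returns the trait value at every step, starting with x0.
    """
    rng = rng or random.Random()
    dt = total_time / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        drift = -alpha * (x - theta) * dt  # deterministic pull
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # diffusion
        x += drift + noise
        path.append(x)
    return path


# Example: an OU trajectory with a fixed seed so the run is repeatable.
ou_path = simulate_trait(0.0, alpha=1.0, theta=2.0, sigma=0.5,
                         total_time=10.0, n_steps=1000,
                         rng=random.Random(1))
```

Replacing the linear pull with a force of another shape is the kind of generalization the abstract describes; the sketch covers only the two familiar special cases.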

18.

Background

The study of discrete characters is crucial for the understanding of evolutionary processes. Even though great advances have been made in the analysis of nucleotide sequences, computer programs for non-DNA discrete characters are often dedicated to specific analyses and lack flexibility. Discrete characters often have different transition rate matrices, variable rates among sites and sometimes contain unobservable states. To obtain the ability to accurately estimate a variety of discrete characters, programs with sophisticated methodologies and flexible settings are desired.

Results

DiscML performs maximum likelihood estimation for evolutionary rates of discrete characters on a provided phylogeny with the options that correct for unobservable data, rate variations, and unknown prior root probabilities from the empirical data. It gives users options to customize the instantaneous transition rate matrices, or to choose pre-determined matrices from models such as birth-and-death (BD), birth-death-and-innovation (BDI), equal rates (ER), symmetric (SYM), general time-reversible (GTR) and all rates different (ARD). Moreover, we show application examples of DiscML on gene family data and on intron presence/absence data.

Conclusion

DiscML was developed as a unified R program for estimating evolutionary rates of discrete characters with no restriction on the number of character states, and with flexibility to use different transition models. DiscML is ideal for the analyses of binary (1s/0s) patterns, multi-gene families, and multistate discrete morphological characteristics.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-320) contains supplementary material, which is available to authorized users.
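
The transition models listed in the DiscML abstract above are all defined by an instantaneous rate matrix Q, from which transition probabilities over a branch of length t follow as P(t) = exp(Qt). The sketch below builds the equal rates (ER) matrix and evaluates the matrix exponential with a scaled Taylor series in plain Python. The function names are hypothetical and this is not DiscML's implementation; other models such as ARD would simply supply different off-diagonal rates.

```python
def equal_rates_matrix(n_states, rate):
    """Rate matrix with one shared off-diagonal rate; rows sum to zero."""
    return [[-rate * (n_states - 1) if i == j else rate
             for j in range(n_states)] for i in range(n_states)]


def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probabilities(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Q t) by scaling and squaring.

    Q t is divided by 2**squarings, exponentiated with a short Taylor
    series, and the result is squared back up.
    """
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, a)]  # A^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Binary character (e.g. presence/absence) under the ER model.
p = transition_probabilities(equal_rates_matrix(2, 0.3), t=1.5)
```

For the two-state ER case the result can be checked against the closed form P00(t) = 1/2 + 1/2 exp(-2rt).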

19.
The Chinese alligator (Alligator sinensis) is a critically endangered species in China. Wild populations of Chinese alligator are on the edge of extinction. Through a release program, some captive-bred alligators will be selected and released into the wild to supplement and renew natural populations. The purpose of this study was to investigate the genetic variation of captive-bred Chinese alligators by AFLP markers and to select individuals with maximally different genetic backgrounds for release. Forty-three captive-bred alligators of the second filial generation from the Anhui Research Center for Chinese Alligator Reproduction (ARCCAR) were surveyed using four primer combinations, yielding 117 AFLP markers. According to AFLP fingerprints, six samples had distinctly different band patterns compared to other samples. When the six samples were removed from the analysis, there were 19 monomorphic loci and 98 polymorphic loci, yielding 84% polymorphic loci. Moreover, the genetic similarity (GS) among the 37 samples varied from 0.13 to 0.97, and the average was 0.7503±0.0064 standard error (SE). When the six samples were included, the GS value among the 43 samples declined and varied from 0.06 to 0.97, and the average was 0.6523±0.0079 SE. Based on a cluster analysis using UPGMA, a dendrogram of the 43 alligators was constructed. According to the cluster analysis and gender of the 43 samples, eight Chinese alligators with very different genetic backgrounds were selected and suggested for release in two groups in the future. Zoo Biol 0:1–12, 2006. © 2006 Wiley-Liss, Inc.

20.
Knowledge in the area of genetic diversity could aid in providing useful information in the selection of material for breeding such as hybridization programs and quantitative trait loci mapping. To this end, 50 Nicotiana tabacum genotypes were genotyped with 21 primer combinations of amplified fragment length polymorphism (AFLP). A total of 480 unambiguous DNA fragments and 373 polymorphic bands were produced with an average of 17.76 per primer combination. Also, the results revealed a high polymorphic rate varying from 52.63 to 92.59 %, demonstrating that the AFLP technique utilized in this research can be a powerful and valuable tool in the breeding program of N. tabacum. Cluster analysis based on the complete linkage method using Jaccard’s genetic distance grouped the 50 tobacco genotypes into eight clusters including three relatively big clusters, one cluster including Golden gift, Burly 7022 and Burly Kreuzung, one cluster consisting of two individuals (Pereg234, R9) and three single-member clusters (Pennbel69, Coker176 and Budisher Burley E). Recent genotypes showed high genetic distance from other genotypes belonging to cluster I and II. Association analysis between seven important traits and AFLP markers was performed using four statistical models. The results revealed that the model containing both factors, population structure (Q) and general similarity in genetic background arising from shared kinship (K), reduces false positive associations between markers and phenotype. According to the results, nine markers were identified that could be considered the most interesting candidates for further studies.
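
The clustering step described above rests on two simple definitions that can be sketched directly: the Jaccard distance between two binary band profiles, and the complete linkage rule that the distance between two clusters is the largest distance between any pair of their members. The function names and the toy profiles below are hypothetical and are not taken from the study.

```python
def jaccard_distance(profile_a, profile_b):
    """1 - (bands shared by both) / (bands present in at least one)."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    if either == 0:
        return 0.0  # no bands in either profile: treat as identical
    return 1.0 - shared / either


def complete_linkage(cluster_a, cluster_b, profiles):
    """Cluster-to-cluster distance: the maximum pairwise Jaccard distance."""
    return max(jaccard_distance(profiles[x], profiles[y])
               for x in cluster_a for y in cluster_b)


# Toy band profiles for three made-up genotypes.
profiles = {
    "A": [1, 1, 0, 0],
    "B": [1, 0, 1, 0],
    "C": [1, 1, 0, 1],
}
d_ab = jaccard_distance(profiles["A"], profiles["B"])
d_cluster = complete_linkage(["A", "C"], ["B"], profiles)
```

Shared absences (0, 0) are ignored by the Jaccard coefficient, which is one common reason it is chosen for dominant markers such as AFLP.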
