Found 20 similar articles (search time: 0 ms)
1.
R. Pong-Wong, J. A. Woolliams 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》1996,93(7):1090-1097
A method for estimating major gene effects, using Gibbs sampling to infer the genotype of individuals with unknown values, was compared with a standard mixed-model analysis. The purpose of this study was to evaluate the effect of including information from individuals with unknown genotypes on the estimates of the single-gene effects and their error variances (Ve). When genotypes were known for all the individuals, results using the Gibbs method (GS) were similar to those obtained with the mixed model (MM). In the absence of selection, when information from individuals with unknown genotypes was included, GS yielded unbiased estimates of the major gene effects while reducing the Ve associated with them. This reduction in Ve depended on the gene frequency and mode of action of the major locus. For the additive effect, the reduction in Ve ranged from 29 to 69% of the total reduction which would have been obtained if all individuals had had a known genotype. Similarly, the reduction in Ve found for the dominance effect ranged from 12 to 58%. Estimates using GS generally had small detectable biases when the polygenic heritability used in the analysis was inflated or estimated simultaneously. However, the benefit of using information from individuals with unknown genotypes was still maintained when comparing the mean square error of the estimates using either GS or MM when genotypes are only known for a subset of the population. When the population has been under selection, the use of Gibbs sampling to incorporate information from individuals without genotypes substantially reduced the bias and mean square error found for MM analysis on partial data. Nevertheless, there was some bias detected using Gibbs sampling. The gene frequency of the major gene in the base population was also well estimated despite its change over generations due to selection.
2.
The Gibbs sampling method has been widely used for sequence analysis after it was successfully applied to the problem of identifying regulatory motif sequences upstream of genes. Since then, numerous variants of the original idea have emerged; however, in all cases the application has been to finding short motifs in collections of short sequences (typically less than 100 nucleotides long). In this paper, we introduce a Gibbs sampling approach for identifying genes in multiple large genomic sequences up to hundreds of kilobases long. This approach leverages the evolutionary relationships between the sequences to improve the gene predictions, without explicitly aligning the sequences. We have applied our method to the analysis of genomic sequence from 14 genomic regions, totaling roughly 1.8 Mb of sequence in each organism. We show that our approach compares favorably with existing ab initio approaches to gene finding, including pairwise comparison-based gene prediction methods which make explicit use of alignments. Furthermore, excellent performance can be obtained with as few as four organisms, and the method overcomes a number of difficulties of previous comparison-based gene finding approaches: it is robust with respect to genomic rearrangements, can work with draft sequence, and is fast (linear in the number and length of the sequences). It can also be seamlessly integrated with Gibbs sampling motif detection methods.
3.
Biclustering microarray data by Gibbs sampling (Total citations: 1, self-citations: 0, citations by others: 1)
MOTIVATION: Gibbs sampling has become a method of choice for the discovery of noisy patterns, known as motifs, in DNA and protein sequences. Because handling noise in microarray data presents similar challenges, we have adapted this strategy to the biclustering of discretized microarray data. RESULTS: In contrast with standard clustering that reveals genes that behave similarly over all the conditions, biclustering groups genes over only a subset of conditions for which those genes have a sharp probability distribution. We have opted for a simple probabilistic model of the biclusters because it has the key advantage of providing a transparent probabilistic interpretation of the biclusters in the form of an easily interpretable fingerprint. Furthermore, Gibbs sampling does not suffer from the problem of local minima that often characterizes Expectation-Maximization. We demonstrate the effectiveness of our approach on two synthetic data sets as well as a data set from leukemia patients.
4.
5.
Markov chain Monte Carlo (MCMC) methods have been widely used to overcome computational problems in linkage and segregation analyses. Many variants of this approach exist and are practiced; among the most popular is the Gibbs sampler. The Gibbs sampler is simple to implement but has (in its simplest form) mixing and reducibility problems. Furthermore, in order to initiate a Gibbs sampling chain, we need a starting genotypic or allelic configuration which is consistent with the marker data in the pedigree and which has suitable weight in the joint distribution. We outline a procedure for finding such a configuration in pedigrees which have too many loci to allow for exact peeling. We also explain how this technique could be used to implement a blocking Gibbs sampler.
6.
D. P. Gwaze, J. A. Woolliams 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2001,103(1):63-69
The use of Gibbs sampling in making decisions about the optimal selection environment was demonstrated. Marginal posterior
distributions of the efficiency of selection across sites were obtained using the Gibbs sampler, a Bayesian method, from which
the probability that the efficiency of selection lay between specified values and the variance of the distribution were computed,
providing a lot of information on which to make decisions regarding the location of genetic tests. The heritability, genetic
correlations and efficiencies of selection estimated using REML and Gibbs sampling were similar. However, the latter approach
showed that the point estimates of the efficiencies of selection were subject to substantial error. The decision regarding
selection at maturity was consistent with that obtained using point estimates from REML, but Gibbs sampling allowed the efficiencies
of selection to be interpreted with more confidence. The decision regarding early selection differed from that based on REML
point estimates. Generally, the decisions to make early selections at site B for planting at both site B and A, and to make
selections at maturity at each individual site, were robust to different priors in the Gibbs sampling.
Received: 19 June 2000 / Accepted: 18 October 2000
7.
8.
9.
Analysis of gene expression in single cells (Total citations: 4, self-citations: 0, citations by others: 4)
A cell's structural and functional characteristics are dependent on the specific complement of genes it expresses. The ability to study and compare gene usage at the cellular level will therefore provide valuable insights into cell physiology. Such analyses are complicated by problems associated with sample collection, sample size and the limited sensitivity of expression assays. Advances have been made in approaches to the collection of cellular material and the performance of single-cell gene expression analysis. Recent development in global amplification of mRNA may soon permit expression analyses of single cells to be performed on DNA microarrays.
10.
11.
Estimating single gene effects on quantitative traits (Total citations: 1, self-citations: 0, citations by others: 1)
D. G. Gilbert 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》1985,69(5-6):631-636
Summary: Experimental designs for measuring the effects of single loci on quantitative traits are compared for statistical properties. The designs tested are single population, combined strains, multiple strains, diallel of strains, and co-isogenic strains. Testing was done by simulating population genotypic and phenotypic arrays. Statistical properties measured are type I error, power, bias and efficiency. The relative ranking of designs is consistent for all properties and over eight conditions examined. The co-isogenic design is superior, followed closely by the single population method. The other three designs are similar in ability, with the diallel design somewhat superior. Based on its good statistical performance and wide feasibility, the single population method is recommended. The diallel method provides the most information on genetic components of variation.
12.
MOTIVATION: Over the last decade, a large variety of clustering algorithms have been developed to detect coregulatory relationships among genes from microarray gene expression data. Model-based clustering approaches have emerged as statistically well-grounded methods, but the properties of these algorithms when applied to large-scale data sets are not always well understood. An in-depth analysis can reveal important insights about the performance of the algorithm, the expected quality of the output clusters, and the possibilities for extracting more relevant information out of a particular data set. RESULTS: We have extended an existing algorithm for model-based clustering of genes to simultaneously cluster genes and conditions, and used three large compendia of gene expression data for Saccharomyces cerevisiae to analyze its properties. The algorithm uses a Bayesian approach and a Gibbs sampling procedure to iteratively update the cluster assignment of each gene and condition. For large-scale data sets, the posterior distribution is strongly peaked on a limited number of equiprobable clusterings. A GO annotation analysis shows that these local maxima are all biologically equally significant, and that simultaneously clustering genes and conditions performs better than only clustering genes and assuming independent conditions. A collection of distinct equivalent clusterings can be summarized as a weighted graph on the set of genes, from which we extract fuzzy, overlapping clusters using a graph spectral method. The cores of these fuzzy clusters contain tight sets of strongly coexpressed genes, while the overlaps exhibit relations between genes showing only partial coexpression. AVAILABILITY: GaneSh, a Java package for coclustering, is available under the terms of the GNU General Public License from our website at http://bioinformatics.psb.ugent.be/software
13.
The composition of the genome after introgression of a marker gene from a donor to a recipient breed was studied using analytical and simulation methods. Theoretical predictions of proportional genomic contributions, including donor linkage drag, from ancestors used at each generation of crossing after an introgression programme agreed closely with simulated results. The obligate drag, the donor genome surrounding the target locus that cannot be removed by subsequent selection, was also studied. It was shown that the number of backcross generations and the length of the chromosome affected proportional genomic contributions to the carrier chromosomes. Population structure had no significant effect on ancestral contributions and linkage drag but it did have an effect on the obligate drag whereby larger offspring groups resulted in smaller obligate drag. The implications for an introgression programme of the number of backcross generations, the population structure and the carrier chromosome length are discussed. The equations derived describing contributions to the genome from individuals from a given generation provide a framework to predict the genomic composition of a population after the introgression of a favourable donor allele. These ancestral contributions can be assigned a value and therefore allow the prediction of genetic lag.
14.
Comparison of REML and Gibbs sampling estimates of multi-trait genetic parameters in Scots pine (Total citations: 2, self-citations: 0, citations by others: 2)
Waldmann P, Ericsson T 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2006,112(8):1441-1451
Multi-trait (co)variance estimation is an important topic in plant and animal breeding. In this study we compare estimates obtained with restricted maximum likelihood (REML) and Bayesian Gibbs sampling of simulated data and of three traits (diameter, height and branch angle) from a 26-year-old partial diallel progeny test of Scots pine (Pinus sylvestris L.). Based on the results from the simulated data we can conclude that the REML estimates are accurate but the mode of posterior distributions from the Gibbs sampling can be overestimated depending on the level of the heritability. The mean and median of the posteriors were considerably higher than the expected values of the heritabilities. The confidence intervals calculated with the delta method were biased downward. The highest probability density (HPD) interval provides a better interval estimate, but could be slightly biased at the lower level. Similar differences between REML and Gibbs sampling estimates were found for the Scots pine data. We conclude that further simulation studies are needed in order to evaluate the effect of different priors on (co)variance components in the genetic individual model.
15.
16.
17.
Kannan Tharakaraman, Leonardo Mariño-Ramírez, Sergey L Sheetlin, David Landsman, John L Spouge 《BMC bioinformatics》2006,7(1):408
Background
Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set.
18.
We apply the method of "blocking Gibbs" sampling to a problem of great importance and complexity: linkage analysis. Blocking Gibbs sampling combines exact local computations with Gibbs sampling, in a way that complements the strengths of both. The method is able to handle problems with very high complexity, such as linkage analysis in large pedigrees with many loops, a task that no other known method is able to handle. New developments of the method are outlined, and it is applied to a highly complex linkage problem in a human pedigree.
19.
Kappes SM 《Theriogenology》1999,51(1):135-147
A number of recent advances in genomic research will change and improve livestock production in the near future. Genetic linkage maps have been developed for a number of livestock species including cattle, sheep, and pigs. These maps allow scientists to identify chromosomal regions that influence traits of economic importance. This information will lead to improved genetic selection practices by identifying animals with superior copies of the chromosomal regions that affect the selected trait. This mapping information will also be used to identify the genes controlling the trait. A number of genomic regions or loci have already been reported that affect production, carcass or disease traits, and in a few cases, a specific gene has been identified. Production of transgenic animals with sequence changes in these genes may be beneficial for evaluating the effect of the gene upon the selected trait and more specifically the effect of certain polymorphisms (mutations) within the gene.