1.
A prediction-based resampling method for estimating the number of clusters in a dataset (cited by: 10; self-citations: 0; citations by others: 10)
Background
Microarray technology is increasingly being applied in biological and medical research to address a wide range of problems, such as the classification of tumors. An important statistical problem associated with tumor classification is the identification of new tumor classes using gene-expression profiles. Two essential aspects of this clustering problem are: to estimate the number of clusters, if any, in a dataset; and to allocate tumor samples to these clusters, and assess the confidence of cluster assignments for individual samples. Here we address the first of these problems.
2.
On the use of resampling tests for evaluating statistical significance of binding-site co-occurrence
Background
In eukaryotes, most DNA-binding proteins exert their action as members of large effector complexes. The presence of these complexes is revealed in high-throughput genome-wide assays by the co-occurrence of the binding sites of different complex components. Resampling tests are one route by which the statistical significance of apparent co-occurrence can be assessed.
3.
A resampling approach for evaluating effects of pasture abandonment on subalpine plant species diversity (cited by: 1; self-citations: 0; citations by others: 1)
Abstract. The decline of species‐rich semi‐natural calcareous grasslands is a major conservation problem throughout Europe. Maintenance of traditional animal husbandry is often recommended as an important management strategy. However, results that underpin such management recommendations were derived predominantly from lowland studies and may not be easily applicable to high mountain areas. In this study we analyse the importance of traditional low‐intensity summer farming (cattle grazing) for vascular plant species diversity of a subalpine region in the northern calcareous Alps in Austria by resampling from an existing dataset on its vegetation. Results indicate a significant long‐term decline of plant species diversity following abandonment at the landscape scale. In contrast, within‐community effects of pasture abandonment on plant species diversity are equivocal and strongly depend on the plant community. We suppose these differences to be due to diet preferences of cattle as well as to the differential importance of competition for structuring the respective communities. From our results we infer that the main mechanism by which pasture abandonment affects vascular plant species diversity, at least during the first ca. 100 yr documented here, is not local‐scale competitive exclusion within persisting communities. Instead, post‐abandonment successional community displacements that cause a landscape‐scale homogenization of the vegetation cover seem to be primarily responsible for a decline of species diversity. We conclude that successful management of vascular plant species diversity in subalpine regions of the Northeastern Calcareous Alps will depend on the maintenance of large‐scale pasture systems with a spatially variable disturbance regime.
4.
A resampling method based on pivotal estimating functions (cited by: 6; self-citations: 0; citations by others: 6)
5.
A simple resampling method by perturbing the minimand (cited by: 3; self-citations: 0; citations by others: 3)
6.
Antithetic resampling for the bootstrap (cited by: 1; self-citations: 0; citations by others: 1)
7.
8.
Assessing genome-wide statistical significance is an important and difficult problem in multipoint linkage analysis. Due to multiple tests on the same genome, the usual pointwise significance level based on the chi-square approximation is inappropriate. Permutation is widely used to determine genome-wide significance. Theoretical approximations are available for simple experimental crosses. In this article, we propose a resampling procedure to assess the significance of genome-wide QTL mapping for experimental crosses. The proposed method is computationally much less intensive than the permutation procedure (on the order of 10^2 or higher) and is applicable to complex breeding designs and sophisticated genetic models that cannot be handled by the permutation and theoretical methods. The usefulness of the proposed method is demonstrated through simulation studies and an application to a Drosophila backcross.
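For orientation, the permutation baseline that this abstract compares against can be sketched as follows. This is a minimal illustration and not the resampling procedure the authors propose. The function names, the 0/1 marker coding, and the use of the maximum absolute marker-phenotype correlation as the genome-wide statistic are all assumptions made here for illustration. A real QTL analysis would typically use LOD scores from a mapping model.

```python
import numpy as np

def max_abs_correlation(genotypes, phenotype):
    """Largest |Pearson r| between the phenotype and any marker column.

    Assumed stand-in for a genome-wide test statistic; every marker
    column must vary (a constant column would give a zero denominator).
    """
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    denom = np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    r = (g * y[:, None]).sum(axis=0) / denom
    return np.abs(r).max()

def permutation_threshold(genotypes, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide threshold from the permutation null of the maximum statistic.

    Shuffling the phenotype breaks any marker-trait association while
    keeping the correlation structure among markers intact.
    """
    rng = np.random.default_rng(seed)
    null_max = np.array([
        max_abs_correlation(genotypes, rng.permutation(phenotype))
        for _ in range(n_perm)
    ])
    return np.quantile(null_max, 1 - alpha)
```

Because every permutation recomputes the statistic at all markers, the cost grows with the number of permutations, which is the computational burden the abstract refers to.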
9.
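Using corn stover as a representative cellulosic feedstock, we attempted to establish a new method for evaluating pretreatment effectiveness, the water retention value assay: the sample is soaked at room temperature for 1 h, and the water retention value is measured after centrifugation for 5 min at a separation factor of 1,000. The results show that, mechanistically, the water retention value of lignocellulose is consistent with its digestibility and, to some extent, positively correlated with it. As a simple and rapid new measurement, the water retention value can be used to evaluate the pretreatment of lignocellulosic biomass. Different pretreatment methods break up the complex, dense structure of lignocellulose and disrupt hydrogen and ester bonds, enlarging its pores and cavities while exposing more hydrophilic groups such as free hydroxyl groups, which ultimately increases the water retention value of the lignocellulose.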
10.
Visual crossdating of tree-ring series focusses on high-frequency variations. Automated correlation-based crossdating tools mimic this by transforming raw ring widths into indices that emphasise the high-frequency signal, prior to calculating the goodness-of-fit between series. Here we present a resampling methodology to determine the relative merits of alternative simple high-pass filters and demonstrate it using two tree-ring data sets (British Isles oak, New Zealand kauri). Results indicate that: (a) high-pass filtering is a critical step; (b) the efficacy of alternative filters is variable; and (c) efficacy appears to be species specific. These results have implications for crossdating in the two contexts investigated, and also for future software developments, especially the desirability of flexible implementations of high-pass filtering.
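The two steps named in this abstract, high-pass filtering followed by a correlation-based goodness-of-fit, might look roughly like the sketch below. The abstract does not say which filters were compared. The ratio to a centred moving average used here, the window length, and the function names `high_pass_index` and `best_offset` are assumptions chosen only to illustrate the idea.

```python
import numpy as np

def high_pass_index(widths, window=5):
    """Ratio of each ring width to its centred moving average.

    One assumed example of a simple high-pass filter; `window` must be
    odd, and the first and last window // 2 rings are dropped.
    """
    widths = np.asarray(widths, dtype=float)
    kernel = np.ones(window) / window
    smooth = np.convolve(widths, kernel, mode="valid")
    half = window // 2
    return widths[half:len(widths) - half] / smooth

def best_offset(sample_index, reference_index):
    """Slide the sample along the reference and return (offset, r)
    for the position with the highest Pearson correlation."""
    n = len(sample_index)
    scores = [
        np.corrcoef(sample_index, reference_index[k:k + n])[0, 1]
        for k in range(len(reference_index) - n + 1)
    ]
    k = int(np.argmax(scores))
    return k, scores[k]
```

In use, both the undated sample and the reference chronology would be passed through the same filter before `best_offset` is called, so that slow growth trends do not dominate the correlation.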
11.
A new method, which can be called isothermal acid-titration calorimetry (IATC), was proposed for evaluating the enthalpy of protein molecules as a function of pH using isothermal titration calorimetry (ITC). This measurement was used to analyze the acid-denaturation of bovine ribonuclease A. The enthalpy change by acid-denaturation of this protein was estimated as 310 kJ/mol at pH 2.8 and 40 °C. This value agreed well with the enthalpy change obtained by differential scanning calorimetry. The midpoint pH and proton binding-number difference observed by IATC agreed well with those of the acid transition of the three-dimensional structure monitored by circular dichroism spectrometry. The van't Hoff enthalpy of the transition was derived from the temperature dependence of the midpoint pH and the proton binding-number difference. It agreed well with the calorimetric enthalpy change directly observed by IATC, strongly indicating that there was no stable intermediate state during the acid transition of this protein.
12.
On importance resampling for the bootstrap (cited by: 1; self-citations: 0; citations by others: 1)
13.
R Antonicelli, G Coppa, M Piani, I Testa, P Russo. Bollettino della Società italiana di biologia sperimentale, 1985, 61(1): 151-158
Intracellular Na+ activity was measured in 10 ml of heparinized venous blood. The blood was first washed three times in isotonic magnesium chloride solution (114 mmol/l), and the buffy coat was thereby removed. The microhematocrit was then taken to determine the packed cell volume. Next, the erythrocytes were lysed by ultrasound, and Na+ activity was measured in the hemolysate with an ion-selective electrode. This method eliminates all pipetting operations, and Na+ activity is determined with an ion-selective electrode by indirect measurement, which is less influenced by the matrix. The reference interval determined for a healthy population was 7.3 +/- 0.6 mmol/l.
14.
15.
16.
Wright FA, Huang H, Guan X, Gamiel K, Jeffries C, Barry WT, de Villena FP, Sullivan PF, Wilhelmsen KC, Zou F. Bioinformatics (Oxford, England), 2007, 23(19): 2581-2588
MOTIVATION: Reductions in genotyping costs have heightened interest in performing whole genome association scans and in the fine mapping of candidate regions. Improvements in study design and analytic techniques will require the simulation of datasets with realistic patterns of linkage disequilibrium and allele frequencies for typed SNPs. METHODS: We describe a general approach to simulate genotyped datasets for standard case-control or affected child trio data, by resampling from existing phased datasets. The approach allows for considerable flexibility in disease models, potentially involving a large number of interacting loci. The method is most applicable for diseases caused by common variants that have not been under strong selection, a class specifically targeted by the International HapMap project. RESULTS: Using the three population Phase I/II HapMap data as a testbed for our approach, we have implemented the approach in HAP-SAMPLE, a web-based simulation tool.
17.
In the classical approach to tree reconstruction schemes, such as pair group methods, maximum parsimony or minimum spanning trees, two major problems are not addressed at a fundamental level. First, for numerous kinds of experimental data, these methods produce equivalent solutions, but provide no way of handling those degeneracies. Second, the real-life data fed to these methods is treated as exact data, and possible measurement errors cannot be taken into account. We provide a statistical solution for both the degeneracy and data imperfection problem, which is built as a framework around the clustering method. It is therefore independent of the particular choice of clustering or population modeling algorithm and is applicable to any of the presently known methods that are subject to one or both of these problems.
18.
De novo protein structure prediction requires location of the lowest energy state of the polypeptide chain among a vast set of possible conformations. Powerful approaches include conformational space annealing, in which search progressively focuses on the most promising regions of conformational space, and genetic algorithms, in which features of the best conformations thus far identified are recombined. We describe a new approach that combines the strengths of these two approaches. Protein conformations are projected onto a discrete feature space which includes backbone torsion angles, secondary structure, and beta pairings. For each of these there is one “native” value: the one found in the native structure. We begin with a large number of conformations generated in independent Monte Carlo structure prediction trajectories from Rosetta. Native values for each feature are predicted from the frequencies of feature value occurrences and the energy distribution in conformations containing them. A second round of structure prediction trajectories is then guided by the predicted native feature distributions. We show that native features can be predicted at much higher than background rates, and that using the predicted feature distributions improves structure prediction in a benchmark of 28 proteins. The advantages of our approach are that features from many different input structures can be combined simultaneously without producing atomic clashes or otherwise physically inviable models, and that the features being recombined have a relatively high chance of being correct. Proteins 2010. © 2009 Wiley‐Liss, Inc.
19.
Background
Microarray technology is often used to identify the genes that are differentially expressed between two biological conditions. On the other hand, since microarray datasets contain a small number of samples and a large number of genes, it is usually desirable to identify small gene subsets with distinct patterns between sample classes. Such gene subsets are highly discriminative in phenotype classification because of their tightly coupled features. Unfortunately, classifiers identified in this way tend to generalize poorly to test samples because of overfitting.
Results
We propose a novel approach that combines supervised and unsupervised learning techniques to generate increasingly discriminative gene clusters in an iterative manner. Our experiments on both simulated and real datasets show that our method can produce a series of robust gene clusters with good classification performance compared with existing approaches.
Conclusion
This backward approach for refining a series of highly discriminative gene clusters for classification purposes proves to be very consistent and stable when applied to various types of training samples.
20.
We propose a simple and general resampling strategy to estimate variances for parameter estimators derived from nonsmooth estimating functions. This approach applies to a wide variety of semiparametric and nonparametric problems in biostatistics. It does not require solving estimating equations and is thus much faster than the existing resampling procedures. Its usefulness is illustrated with heteroscedastic quantile regression and censored data rank regression. Numerical results based on simulated and real data are provided.