Similar literature
 20 similar articles found (search time: 31 ms)
1.
A simulation study was performed to investigate the effects of missing values, typing errors and distorted segregation ratios in molecular marker data on the construction of genetic linkage maps, and to compare the performance of three locus-ordering criteria (weighted least squares, maximum likelihood and minimum sum of adjacent recombination fractions criteria) in the presence of such effects. The study was based upon three linkage groups of 10 loci at 2, 6, and 10 cM spacings simulated from a doubled-haploid population of size 150. Criterion performance was assessed using the number of replicates with correctly estimated orders, the mean rank correlation between the estimated and the true order and the mean total map length. Bootstrap samples from replicates in the maximum likelihood analysis produced a measure of confidence in the estimated locus order. Missing values and/or typing errors in the data reduce the proportion of correctly ordered maps, and this problem worsens as the distance between loci decreases. The maximum likelihood criterion is most successful at ordering loci correctly, but gives estimated map lengths that are substantially inflated when typing errors are present. The presence of missing values in the data produces shorter map lengths for more widely spaced markers, especially under the weighted least-squares criterion. Overall, the presence of segregation distortion has little effect on this population.
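
As a hedged illustration of one of the assessment measures named in this abstract, the rank correlation between an estimated and the true locus order could be computed as sketched below. This is not the authors' code: the function name `order_rank_correlation` and the locus labels are hypothetical, and the sketch assumes both orders contain the same loci exactly once, so that Spearman's closed form without ties applies.

```python
def order_rank_correlation(true_order, estimated_order):
    """Spearman rank correlation between two orderings of the same loci.

    Assumption: each locus appears exactly once in both orders (no ties).
    """
    if len(set(true_order)) != len(true_order):
        raise ValueError("loci must be unique")
    if sorted(true_order) != sorted(estimated_order):
        raise ValueError("both orders must contain the same loci")
    n = len(true_order)
    if n < 2:
        raise ValueError("need at least two loci")
    # Position of every locus in the estimated order.
    est_rank = {locus: r for r, locus in enumerate(estimated_order)}
    # Sum of squared rank differences between the two orders.
    d2 = sum((r - est_rank[locus]) ** 2 for r, locus in enumerate(true_order))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


if __name__ == "__main__":
    loci = [f"m{i}" for i in range(1, 11)]  # hypothetical marker names
    print(order_rank_correlation(loci, loci[::-1]))
```

Because the orientation of a linkage group is arbitrary, a fully reversed order yields a correlation of -1; a study might therefore work with the absolute value, although the abstract does not say whether that was done here.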

2.
The effect of replication on gene expression microarray experiments   Total citations: 5, self-citations: 0, citations by others: 5
MOTIVATION: We examine the effect of replication on the detection of apparently differentially expressed genes in gene expression microarray experiments. Our analysis is based on a random sampling approach using real data sets from 16 published studies. We consider both the ability to find genes that meet particular statistical criteria as well as the stability of the results in the face of changing levels of replication. RESULTS: While dependent on the data source, our findings suggest that stable results are typically not obtained until at least five biological replicates have been used. Conversely, for most studies, 10-15 replicates yield results that are quite stable, and there is less improvement in stability as the number of replicates is further increased. Our methods will be of use in evaluating existing data sets and in helping to design new studies.

3.
Optimal experimental design is important for the efficient use of modern high-throughput technologies such as microarrays and proteomics. Multiple factors, including the reliability of the measurement system, which itself must be estimated from prior experimental work, could influence design decisions. In this study, we describe how the optimal number of replicate measures (technical replicates) for each biological sample (biological replicate) can be determined. Different allocations of biological and technical replicates were evaluated by minimizing the variance of the ratio of technical variance (measurement error) to the total variance (sum of sampling error and measurement error). We demonstrate that if the number of biological replicates and the number of technical replicates per biological sample are variable, while the total number of available measures is fixed, then the optimal allocation of replicates for measurement evaluation experiments requires two technical replicates for each biological replicate. Therefore, it is recommended to use two technical replicates for each biological replicate if the goal is to evaluate the reproducibility of measurements.

4.
Since AMBI was first published in 2000, it has been used in an increasing number of investigations for monitoring purposes or to analyse impacts on soft-bottom macrobenthic communities. Some guidelines for its correct use were published in 2005; however, a main issue remained unanswered: what are the minimal area and number of replicates necessary to obtain a precise estimate of AMBI? In this study, new methodologies such as bootstrap techniques have been applied to this particular problem. Data were obtained from sampling carried out in 1995, within the framework of the Littoral Water Quality Monitoring and Control Network of the Basque Country (northern Spain). The sampling strategy consisted of 11 intertidal estuarine sampling stations (0.25 m², sampled for each of six replicates) and 17 subtidal estuarine and coastal sampling stations (0.125 m², sampled for each of six replicates). Two replicates have been established as being sufficient, both for intertidal and subtidal sampling stations, to classify 80% of the pseudosamples into the same disturbance level, in terms of AMBI, for 64% of the stations. For the minimal area, it has also been determined (for both intertidal and subtidal sampling stations) that 0.25 m² is sufficient to classify 80% of the iterations into the same disturbance level, for 64% of the stations.

5.
Environmental DNA (eDNA) is DNA that has been isolated from field samples, and it is increasingly used to infer the presence or absence of particular species in an ecosystem. However, the combination of sampling procedures and subsequent molecular amplification of eDNA can lead to spurious results. As such, it is imperative that eDNA studies include a statistical framework for interpreting eDNA presence/absence data. We reviewed published literature for studies that utilized eDNA where the species density was known and compared the probability of detecting the focal species to the sampling and analysis protocols. Although biomass of the target species and the volume per sample did not impact detectability, the number of field replicates and number of samples from each replicate were positively related to detection. Additionally, increased number of PCR replicates and increased primer specificity significantly increased detectability. Accordingly, we advocate for increased use of occupancy modelling as a method to incorporate effects of sampling effort and PCR sensitivity in eDNA study design. Based on simulation results and the hierarchical nature of occupancy models, we suggest that field replicates, as opposed to molecular replicates, result in better detection probabilities of target species.

6.
Climate change is likely to become an increasingly major obstacle to slowing the rate of species extinctions. Several new assessment approaches have been proposed for identifying climate‐vulnerable species, based on the assumption that established systems such as the IUCN Red List need revising or replacing because they were not developed to explicitly consider climate change. However, no assessment approach has been tested to determine its ability to provide advanced warning time for conservation action for species that might go extinct due to climate change. To test the performance of the Red List system in this capacity, we used linked niche‐demographic models with habitat dynamics driven by a ‘business‐as‐usual’ climate change scenario. We generated replicate 100‐year trajectories for range‐restricted reptiles and amphibians endemic to the United States. For each replicate, we categorized the simulated species according to IUCN Red List criteria at annual, 5‐year, and 10‐year intervals (the latter representing current practice). For replicates that went extinct, we calculated warning time as the number of years the simulated species was continuously listed in a threatened category prior to extinction. To simulate data limitations, we repeated the analysis using a single criterion at a time (disregarding other listing criteria). Results show that when all criteria can be used, the Red List system would provide several decades of warning time (median = 62 years; >20 years for 99% of replicates), but suggest that conservation actions should begin as soon as a species is listed as Vulnerable, because 50% of replicates went extinct within 20 years of becoming uplisted to Critically Endangered. When only one criterion was used, warning times were substantially shorter, but more frequent assessments increased the warning time by about a decade. Overall, we found that the Red List criteria reliably provide a sensitive and precautionary way to assess extinction risk under climate change.

7.
Artificial substrates were compared with a Ponar grab for sampling benthic macroinvertebrates in Lake Anna, Louisa Co., Virginia. The objective was to find which technique was best for assessment of thermal effluent effects using the following criteria: 1) provide reliable data on density and composition of the macrobenthos with a reasonable number of replicates; 2) collect the most taxa; and, 3) require the least amount of time. Leaves, 3M Corporation's #200 conservation web, and limestone rocks were compared. Each material was tested separately in chicken wire baskets placed on the bottom at several depths. Three replicates of each type were retrieved monthly from each depth using SCUBA and cloth flour sacks and compared with grab samples taken from the same depths. Lesser amounts of these materials were tested separately in smaller plastic containers. All large artificial substrate samplers collected significantly more individuals (P = 0.05) and taxa than the Ponar grab. Small web and leaf samplers best met all three of the established criteria. The SCUBA system developed in the study is a fast and reliable sampling method.

8.
Y Peng  Y Zhang  G Kou  Y Shi 《PloS one》2012,7(7):e41713
Determining the number of clusters in a data set is an essential yet difficult step in cluster analysis. Since this task involves more than one criterion, it can be modeled as a multiple criteria decision making (MCDM) problem. This paper proposes an MCDM-based approach to estimate the number of clusters for a given data set. In this approach, MCDM methods consider different numbers of clusters as alternatives and the outputs of any clustering algorithm on validity measures as criteria. The proposed method is examined by an experimental study using three MCDM methods, the well-known clustering algorithm k-means, ten relative measures, and fifteen public-domain UCI machine learning data sets. The results show that MCDM methods work fairly well in estimating the number of clusters in the data and outperform the ten relative measures considered in the study.

9.
DNA from 14 Pythium species, Verrucalvus flavofaciens and Zoophagus insidians was characterized by CsCl-bisbenzimide density gradients in order to investigate its taxonomic potential. A few incomplete analyses were made for other species. All clearly assignable Pythium species produced three DNA bands in the gradient. Pythium undulatum along with Verrucalvus and Zoophagus produced only two bands. Another possible exception, which needs further investigation, is P. vexans. The DNA had a relatively constant banding pattern in CsCl gradients. The small number (eight) of DNA criteria that were available were subjected to cluster analysis to assess the relationships between replicates and species. This restricted database, similar in size to the number of criteria used in morphological taxonomy, provided an independent assessment of the values that have been attached to generic and subgeneric classifications. This approach enabled assessments to be made of relationships between species that have incomplete life-histories and which therefore lack features essential for traditional taxonomic decisions.

10.
Bogdan M  Ghosh JK  Doerge RW 《Genetics》2004,167(2):989-999
The problem of locating multiple interacting quantitative trait loci (QTL) can be addressed as a multiple regression problem, with marker genotypes being the regressor variables. An important and difficult part in fitting such a regression model is the estimation of the QTL number and respective interactions. Among the many model selection criteria that can be used to estimate the number of regressor variables, none are used to estimate the number of interactions. Our simulations demonstrate that epistatic terms appearing in a model without the related main effects cause the standard model selection criteria to have a strong tendency to overestimate the number of interactions, and so the QTL number. With this as our motivation we investigate the behavior of the Schwarz Bayesian information criterion (BIC) by explaining the phenomenon of the overestimation and proposing a novel modification of BIC that allows the detection of main effects and pairwise interactions in a backcross population. Results of an extensive simulation study demonstrate that our modified version of BIC performs very well in practice. Our methodology can be extended to general populations and higher-order interactions.
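
For orientation, the standard Schwarz criterion that this abstract takes as its starting point can be sketched as follows. This is only the textbook form for a Gaussian linear regression, n·ln(RSS/n) + k·ln(n), and not the modified BIC the authors propose, whose extra penalty terms the abstract does not give; the function name `bic_linear` is hypothetical.

```python
import math


def bic_linear(rss, n, k):
    """Standard BIC (up to an additive constant) for a Gaussian linear model.

    rss: residual sum of squares of the fitted model
    n:   number of observations
    k:   number of fitted regression terms (main effects plus interactions)
    """
    if n <= 0 or rss <= 0:
        raise ValueError("n and rss must be positive")
    return n * math.log(rss / n) + k * math.log(n)


if __name__ == "__main__":
    # Adding a term that barely lowers the RSS raises the BIC,
    # so the smaller model is preferred.
    print(bic_linear(100.0, 100, 2), bic_linear(99.0, 100, 3))
```

Lower values indicate the preferred model, so comparing candidate sets of main effects and interaction terms reduces to comparing their `bic_linear` values.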

11.
Statistical analysis of microarray data: a Bayesian approach   Total citations: 2, self-citations: 0, citations by others: 2
The potential of microarray data is enormous. It allows us to monitor the expression of thousands of genes simultaneously. A common task with microarrays is to determine which genes are differentially expressed between two samples obtained under two different conditions. Recently, several statistical methods have been proposed to perform such a task when there are replicate samples under each condition. Two major problems arise with microarray data. The first one is that the number of replicates is very small (usually 2-10), leading to noisy point estimates. As a consequence, traditional statistics that are based on the means and standard deviations, e.g. t-statistic, are not suitable. The second problem is that the number of genes is usually very large (approximately 10,000), and one is faced with an extreme multiple testing problem. Most multiple testing adjustments are relatively conservative, especially when the number of replicates is small. In this paper we present an empirical Bayes analysis that handles both problems very well. Using different parametrizations, we develop four statistics that can be used to test hypotheses about the means and/or variances of the gene expression levels in both one- and two-sample problems. The methods are illustrated using experimental data with prior knowledge. In addition, we present the result of a simulation comparing our methods to well-known statistics and multiple testing adjustments.

12.
Marker pair selection for mapping quantitative trait loci   Total citations: 10, self-citations: 0, citations by others: 10
Piepho HP  Gauch HG 《Genetics》2001,157(1):433-444
Mapping of quantitative trait loci (QTL) for backcross and F(2) populations may be set up as a multiple linear regression problem, where marker types are the regressor variables. It has been shown previously that flanking markers absorb all information on isolated QTL. Therefore, selection of pairs of markers flanking QTL is useful as a direct approach to QTL detection. Alternatively, selected pairs of flanking markers can be used as cofactors in composite interval mapping (CIM). Overfitting is a serious problem, especially if the number of regressor variables is large. We suggest a procedure denoted as marker pair selection (MPS) that uses model selection criteria for multiple linear regression. Markers enter the model in pairs, which reduces the number of models to be considered, thus alleviating the problem of overfitting and increasing the chances of detecting QTL. MPS entails an exhaustive search per chromosome to maximize the chance of finding the best-fitting models. A simulation study is conducted to study the merits of different model selection criteria for MPS. On the basis of our results, we recommend the Schwarz Bayesian criterion (SBC) for use in practice.

13.
The use of covariates in the design of experiments having a block and treatment structure is discussed. It is argued that the replicates of each treatment should cover the complete range of each quantitative covariate, so far as is possible. Two optimality criteria are discussed for this purpose, one based on a neighbourhood distance calculated from the means and standard deviations of the covariate, the other based on an extended definition of the canonical efficiencies of the design.

14.
MOTIVATION: Due to advances in experimental technologies, such as microarray, mass spectrometry and nuclear magnetic resonance, it is feasible to obtain large-scale data sets, in which measurements for a large number of features can be simultaneously collected. However, the sample sizes of these data sets are usually small due to their relatively high costs, which leads to the issue of concordance among different data sets collected for the same study: features should have consistent behavior in different data sets. There is a lack of rigorous statistical methods for evaluating this concordance or discordance. METHODS: Based on a three-component normal-mixture model, we propose two likelihood ratio tests for evaluating the concordance and discordance between two large-scale data sets with two sample groups. The parameter estimation is achieved through the expectation-maximization (E-M) algorithm. A normal-distribution-quantile-based method is used for data transformation. RESULTS: To evaluate the proposed tests, we conducted some simulation studies, which suggested their satisfactory performances. As applications, the proposed tests were applied to three SELDI-MS data sets with replicates. One data set has replicates from different platforms and the other two have replicates from the same platform. We found that data generated by SELDI-MS showed satisfactory concordance between replicates from the same platform but unsatisfactory concordance between replicates from different platforms. AVAILABILITY: The R codes are freely available at http://home.gwu.edu/~ylai/research/Concordance.

15.

BACKGROUND: Studies of differential expression that use Affymetrix GeneChip arrays are often carried out with a limited number of replicates. Reasons for this include financial considerations and limits on the available amount of RNA for sample preparation. In addition, failed hybridizations are not uncommon, leading to a further reduction in the number of replicates available for analysis. Most existing methods for studying differential expression rely on the availability of replicates and the demand for alternative methods that require few or no replicates is high.

16.
Augmentation mammaplasty: a comparative analysis   Total citations: 1, self-citations: 0, citations by others: 1
With the continuation of augmentation mammaplasty as a desirable operation for a large segment of the female population in the United States, the problem of fibrous capsular contracture that has been present since the inception of the operation has persisted. Various approaches to the problem have been entertained, and a lessening of the incidence has occurred as reviewed in our earlier report, which follows augmentation mammaplasty in our clinic from 1962 through 1979. In this retrospective study, no significant difference in contracture rate was seen based on patient smoking habits, operative approach used, or implant type. It is important to note that the total experience with the low-bleed implant was significantly lower in terms of number of patients meeting the criteria of this retrospective study than the standard gel mammary implant. Greater follow-up time and number of patients will be evaluated in future retrospective studies. We have demonstrated in this study that placement of the implant beneath the pectoral muscle has significantly diminished the incidence of capsular contracture both as Baker grades II, III, and IV and as Baker grades III and IV. The retropectoral site has become the preferred location for the prosthesis in our clinic. There is no appreciable alteration in the overall shape of the breasts from this approach, and therefore, it will continue to be the preferred method. Rates of incidence of hematoma, the most frequent adverse reaction after contracture, were not significantly different between the retropectoral and retromammary implant sites. (ABSTRACT TRUNCATED AT 250 WORDS)

17.
ABSTRACT: BACKGROUND: Variations in DNA copy number carry information on the modalities of genome evolution and mis-regulation of DNA replication in cancer cells. Their study can help localize tumor suppressor genes, distinguish different populations of cancerous cells, and identify genomic variations responsible for disease phenotypes. A number of different high throughput technologies can be used to identify copy number variable sites, and the literature documents multiple effective algorithms. We focus here on the specific problem of detecting regions where variation in copy number is relatively common in the sample at hand. This problem encompasses the cases of copy number polymorphisms, related samples, technical replicates, and cancerous sub-populations from the same individual. RESULTS: We present a segmentation method named generalized fused lasso (GFL) to reconstruct copy number variant regions, that is based on penalized estimation and is capable of processing multiple signals jointly. Our approach is computationally very attractive and leads to sensitivity and specificity levels comparable to those of state-of-the-art specialized methodologies. We illustrate its applicability with simulated and real data sets. CONCLUSIONS: The flexibility of our framework makes it applicable to data obtained with a wide range of technology. Its versatility and speed make GFL particularly useful in the initial screening stages of large data sets.

18.
Marechal JP  Hellio C  Sebire M  Clare AS 《Biofouling》2004,20(4-5):211-217
Submerged marine surfaces are rapidly colonized by fouling organisms. Current research is aimed at finding new, non-toxic, or at least environmentally benign, solutions to this problem. Barnacles are a major target organism for such control as they constitute a key component of the hard fouling community. A range of standard settlement assays is available for screening test compounds against barnacle cypris larvae, but they generally provide little information on mechanism(s) of action. Towards this end, a quick and reliable video-tracking protocol has been developed to study the behaviour of the cypris larvae of the barnacle, Balanus amphitrite, at settlement. EthoVision 3.0 was used to track individual cyprids in 30-mm Petri dishes. Experiments were run to determine the optimal conditions vis-a-vis acclimation time, tracking duration, number of replicates, temperature and lighting. A protocol was arrived at involving a two Petri dish system with backlighting, and tracking over a 5-min period after first acclimating the cyprids to test conditions for 2 min. A minimum of twenty replicates was required to account for individual variability in cyprid behaviour from the same batch of larvae. This methodology should be widely applicable to both fundamental and applied studies of larval settlement and with further refinements, to that of smaller fouling organisms such as microalgae and bacteria.

19.
The experimental variance of enzymic steady-state kinetic experiments depends on velocity as approximated by a power function, $\mathrm{Var}(v) = K_1 v^{\alpha}$ (Askelöf, P., Korsfeldt, M. and Mannervik, B. (1976) Eur. J. Biochem. 69, 61-67). The values of the constants ($K_1$, $\alpha$) can be estimated by making replicate measurements of velocity, and the inverse of the function can then be used as a weighting factor. In order to avoid measurement of a large number of replicates to establish the error structure of a kinetic data set, a different approach was tested. After a preliminary regression using a 'good model', which satisfies reasonable goodness-of-fit criteria, the residuals were taken to represent the experimental error. The neighbouring residuals were grouped together and the sum of their mean squared values was used as a measure of the variance in the neighbourhood of the corresponding measurements. The values of the constants obtained in this way agreed with those obtained by replicates.
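
The replicate-based route described in this abstract (estimate the two constants of the power-function error model, then weight by the inverse of the fitted variance) could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' procedure: the constants are fitted by ordinary least squares on the log-log scale, which the abstract does not specify, and the names `fit_variance_power_law` and `weight` are hypothetical.

```python
import math
from statistics import mean, variance


def fit_variance_power_law(replicate_groups):
    """Estimate (K1, alpha) in Var(v) = K1 * v**alpha from replicate velocities.

    replicate_groups: iterable of lists, each holding >= 2 replicate
    velocity measurements taken under the same conditions.
    Assumption: a straight-line fit of log(variance) against log(mean).
    """
    xs, ys = [], []
    for group in replicate_groups:
        m, s2 = mean(group), variance(group)  # sample variance (n - 1)
        if m > 0 and s2 > 0:  # logs are undefined otherwise
            xs.append(math.log(m))
            ys.append(math.log(s2))
    if len(xs) < 2:
        raise ValueError("need at least two usable replicate groups")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("replicate groups must differ in mean velocity")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    k1 = math.exp(y_bar - alpha * x_bar)
    return k1, alpha


def weight(v, k1, alpha):
    """Weighting factor for a velocity v: the inverse of the fitted variance."""
    return 1.0 / (k1 * v ** alpha)
```

The resulting weights would then be passed to a weighted regression of the kinetic model; the residual-grouping alternative tested in the abstract replaces the replicate variances with local mean squared residuals.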

20.
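A cellular automaton model for simulating the penicillin fed-batch fermentation process   Total citations: 2, self-citations: 0, citations by others: 2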
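Based on the growth mechanism of penicillin-producing strains and the kinetic characteristics of penicillin fed-batch fermentation, a cellular automaton model for simulating the penicillin fed-batch fermentation process was built on the morphologically structured kinetic model established by Paull et al. The model uses a three-dimensional cellular automaton as the growth space of the organism and the Moore neighbourhood as the cell neighbourhood; its evolution rules are designed according to the growth mechanism of the organism during penicillin fed-batch fermentation and a simplified structured kinetic model. Each cell in the model can represent either a single Penicillium chrysogenum cell or a specified number of such cells, and it can take different states. Simulation experiments show that the model not only consistently reproduces the evolution characteristics of penicillin fed-batch fermentation described by the morphologically structured kinetic model, but also depicts the evolution behaviour of the process more intuitively than that model does. Finally, the application of the model to actual production processes is analysed, and problems requiring further study are pointed out.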
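
To make the three-dimensional Moore neighbourhood mentioned in this abstract concrete, the sketch below enumerates the 26 neighbours of a cell and counts how many are occupied. It is an illustration only: the grid layout (nested lists of 0/1 states), the non-periodic boundary and the function name `occupied_neighbours` are assumptions, and the model's actual states and evolution rules are not given in the abstract.

```python
from itertools import product

# All offsets in {-1, 0, 1}^3 except the cell itself: 26 Moore neighbours.
MOORE_OFFSETS_3D = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]


def occupied_neighbours(grid, x, y, z):
    """Count occupied cells in the 3D Moore neighbourhood of (x, y, z).

    grid: nested lists indexed as grid[x][y][z]; truthy means occupied.
    Assumption: cells outside the grid count as empty (no wrap-around).
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    count = 0
    for dx, dy, dz in MOORE_OFFSETS_3D:
        i, j, k = x + dx, y + dy, z + dz
        if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz and grid[i][j][k]:
            count += 1
    return count
```

An evolution rule would typically read this count, together with the cell's own state, to decide the cell's next state at each time step.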
