Similar literature
 20 similar documents found (search time: 15 ms)
1.
A method is developed that caters for the application of correspondence analysis to two-way contingency tables with one and two ordered sets of categories. The method involves calculating orthogonal polynomials of the type described by EMERSON (1968), and partitioning the chi-square statistic using the method described in LANCASTER (1953). The method has all the features of simple correspondence analysis, but it also allows additional information about the structure and association of the data to be obtained by isolating location, dispersion and higher order components of the rows and columns.

2.
3.
We consider sample size determination for ordered categorical data when the alternative assumption is the proportional odds model. In this paper the sample size formula proposed by Whitehead (Statistics in Medicine, 12, 2257–2271, 1993) is compared with the methods based on exact and asymptotic linear rank tests with Wilcoxon and trend scores. We show that Whitehead's formula, which is based on a normal approximation, works well when the sample size is moderate to large, but we recommend the exact method with Wilcoxon scores for small sample sizes. The consequences of model misspecification are also investigated.
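
As an illustration only, the normal-approximation formula usually attributed to Whitehead (1993) can be sketched as follows. The abstract does not reproduce the formula, so the form used here (total N = 3(A+1)² (z₁₋α/₂ + z_power)² / (A θ² (1 − Σ p̄ᵢ³)), with allocation ratio A, log odds ratio θ and mean category probabilities p̄ᵢ) is an assumption taken from the general literature, and the function name is hypothetical.

```python
from statistics import NormalDist


def whitehead_total_n(mean_probs, log_odds_ratio, alpha=0.05, power=0.80, ratio=1.0):
    """Total sample size for a two-group comparison under proportional odds.

    mean_probs     -- anticipated category probabilities averaged over both groups
    log_odds_ratio -- the common log odds ratio (theta) to be detected
    ratio          -- allocation ratio A between the two groups (1.0 = equal)
    """
    if abs(sum(mean_probs) - 1.0) > 1e-9:
        raise ValueError("category probabilities must sum to 1")
    std_normal = NormalDist()
    # Sum of the two-sided critical value and the power quantile.
    z = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    # 1 - sum(p^3) shrinks when one category dominates, which inflates N.
    spread = 1 - sum(p ** 3 for p in mean_probs)
    return 3 * (ratio + 1) ** 2 * z ** 2 / (ratio * log_odds_ratio ** 2 * spread)
```

With two categories the expression reduces to the familiar sample size for a log odds ratio, 4z²/(θ² p(1 − p)), which is a convenient sanity check. The result should be rounded up to an integer in practice.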

4.

Purpose

A number of previous studies have shown inconsistencies between sub-scale scores and component summary scores using traditional scoring methods of the SF-36 Version 1. This study addresses the issue in Version 2 and asks if the previous problems of disagreement between the eight SF-36 Version 1 sub-scale scores and the Physical and Mental Component Summary persist in Version 2. A second study objective is to review the recommended scoring methods for the creation of factor scoring weights and the effect on producing summary scale scores.

Methods

The 2004 South Australian Health Omnibus Survey dataset was used for the production of coefficients. There were 3,014 observations with full data for the SF-36. Data were analysed in LISREL V8.71. Confirmatory factor analysis models were fit to the data producing diagonally weighted least squares estimates. Scoring coefficients were validated on an independent dataset, the 2008 South Australian Health Omnibus Survey.

Results

Problems of agreement were observed with the recommended orthogonal scoring methods which were corrected using confirmatory factor analysis.

Conclusions

Confirmatory factor analysis is the preferred method to analyse SF-36 data, allowing for the correlation between physical and mental health.

5.
  • 1 Methods used for the study of species–environment relationships can be grouped into: (i) simple indirect and direct gradient analysis and multivariate direct gradient analysis (e.g. canonical correspondence analysis), all of which search for non-symmetric patterns between environmental data sets and species data sets; and (ii) analysis of juxtaposed tables, canonical correlation analysis, and intertable ordination, which examine species–environment relationships by considering each data set equally. Different analytical techniques are appropriate for fulfilling different objectives.
  • 2 We propose a method, co-inertia analysis, that can synthesize various approaches encountered in the ecological literature. Co-inertia analysis is based on the mathematically coherent Euclidean model and can be universally reproduced (i.e. independently of software) because of its numerical stability. The method performs simultaneous analysis of two tables. The optimizing criterion in co-inertia analysis is that the resulting sample scores (environmental scores and faunistic scores) are the most covariant. Such analysis is particularly suitable for the simultaneous detection of faunistic and environmental features in studies of ecosystem structure.
  • 3 The method was demonstrated using faunistic and environmental data from Friday (Freshwater Biology 18, 87-104, 1987). In this example, non-symmetric analysis is inappropriate because of the large number of variables (species and environmental variables) compared with the small number of samples.
  • 4 Co-inertia analysis is an extension of the analysis of cross tables previously attempted by others. It serves as a general method to relate any kind of data set, using any kind of standard analysis (e.g. principal components analysis, correspondence analysis, multiple correspondence analysis) or between-class and within-class analyses.

6.
MOTIVATION: Background distribution statistics for profile-based sequence alignment algorithms cannot be calculated analytically, and hence such algorithms must resort to measuring the significance of an alignment score by assessing its location among a distribution of background alignment scores. The Gumbel parameters that describe this background distribution are usually pre-computed for a limited number of scoring systems, gap schemes, and sequence lengths and compositions. The use of such look-ups is known to introduce errors, which compromise the significance assessment of a remote homology relationship. One solution is to estimate the background distribution for each pair of interest by generating a large number of sequence shuffles and using the distribution of their scores to approximate the parameters of the underlying extreme value distribution. This is computationally very expensive, as a large number of shuffles are needed to precisely estimate the score statistics. RESULTS: Convergent Island Statistics (CIS) is a computationally efficient solution to the problem of calculating the Gumbel distribution parameters for an arbitrary pair of sequences and an arbitrary set of gap and scoring schemes. The basic idea behind our method is to recognize the lack of similarity for any pair of sequences early in the shuffling process and thus save on the search time. The method is particularly useful in the context of profile-profile alignment algorithms where the normalization of alignment scores has traditionally been a challenging task. CONTACT: aleksandar@eidogen.com SUPPLEMENTARY INFORMATION: http://www.eidogen-sertanty.com/Documents/convergent_island_stats_sup.pdf.
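
The brute-force shuffling baseline described in the MOTIVATION part (not the CIS method itself, whose details the abstract does not give) might look roughly like the sketch below. It assumes the background scores from shuffled sequences are already available as a list, and it fits the Gumbel location and scale by the method of moments, which is one of several possible estimators. The function names are hypothetical.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Estimate Gumbel (mu, beta) from background alignment scores.

    Uses the moment relations mean = mu + gamma * beta and
    standard deviation = beta * pi / sqrt(6).
    """
    beta = stdev(scores) * math.sqrt(6) / math.pi
    mu = mean(scores) - EULER_GAMMA * beta
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """Probability that a background score is at least `score`."""
    # 1 - exp(-exp(-z)), written with expm1 to keep precision for tiny p-values.
    return -math.expm1(-math.exp(-(score - mu) / beta))
```

The cost the abstract refers to lies in producing `scores`: each entry requires a full alignment of a shuffled sequence pair, and many entries are needed before the two estimates stabilise.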

7.

Background, aim and scope

A characterisation model based on multi-criteria indicators has been developed for each of four impact categories representing the labour rights according to the conventions of the International Labour Organisation (ILO) covering: forced labour, discrimination, restrictions of freedom of association and collective bargaining and child labour (Dreyer et al., Int J Life Cycle Assess, 2010a, in press). These impact categories are considered by the authors to be among the obligatory impact categories in a Social LCA. The characterisation models combine information about the way a company manages its behaviour towards some of its important stakeholders, its employees, with information about the geographical location and branch of industry of the company and the risk of violations of these workers' rights inherent in the setting of the company. The result is an indicator score which for each impact category represents the risk that violations occur in the company. In order to test the feasibility and relevance of the developed methodology, it is tested on real cases.

Materials and methods

The developed characterisation models are applied to six cases representing individual companies from three different continents. Five of the case companies are manufacturing companies while the sixth is a knowledge company. The application involves scoring the management efforts of the case company in a multi-criteria scorecard and translating the scores into an aggregated performance score, which represents the effort of the management to prevent violations of the workers' rights from occurring in the company. The company performance score is multiplied by a contextual adjustment score which reflects the risk of violations taking place in the context (in terms of geographical location or industrial branch or sector) of the company. The resulting indicator score represents the risk that violations of the labour right represented by the impact category take place.

Results

The social impact characterisation is performed for each of the six case studies using the methodology earlier developed. The procedure and outcome are documented through all the intermediary results shown for all four obligatory impact categories for each of the six case studies.

Discussion

The results are judged against the risk which was observed during visits and interviews at each of the six case companies, and their realism and relevance are discussed. They are found to be satisfactory for all four impact categories for the manufacturing companies, but there are some problems for two of the impact categories in the case company which represents knowledge work, and it is discussed how these problems may be addressed through change of the underlying scorecard or the way in which the scoring is translated into a company performance score.

Conclusions

It is concluded that it is feasible to perform a characterisation of the impacts related to the four obligatory impact categories representing the labour rights according to the conventions of the ILO covering: forced labour, discrimination, restrictions of freedom of association and collective bargaining and child labour. When compared with the observed situation in the companies, the results are also found to be relevant and realistic.

Recommendations and perspectives

The proposed characterisation method is rather time-consuming and cannot realistically be applied to all companies in the product system. It must therefore be combined with less time-consuming screening methods which can help identify the key companies in the life cycle for which a detailed analysis is required. The possibility to apply country- or industry sector-based information is discussed, and while it is found useful to identify low-risk companies and eliminate them from more detailed studies, the ability of the screening methods to discriminate between companies located in medium and high-risk contexts is questionable.

8.
A fully Bayesian analysis using Gibbs sampling and data augmentation in a multivariate model of Gaussian, right censored, and grouped Gaussian traits is described. The grouped Gaussian traits are either ordered categorical traits (with more than two categories) or binary traits, where the grouping is determined via thresholds on the underlying Gaussian scale, the liability scale. Allowances are made for unequal models, unknown covariance matrices and missing data. Having outlined the theory, strategies for implementation are reviewed. These include joint sampling of location parameters; efficient sampling from the fully conditional posterior distribution of augmented data, a multivariate truncated normal distribution; and sampling from the conditional inverse Wishart distribution, the fully conditional posterior distribution of the residual covariance matrix. Finally, a simulated dataset was analysed to illustrate the methodology. This paper concentrates on a model where residuals associated with liabilities of the binary traits are assumed to be independent. A Bayesian analysis using Gibbs sampling is outlined for the model where this assumption is relaxed.

9.
Joint plots of species and site scores in correspondence analysis can be interpreted so that points which are in the same direction from the origin are closely associated. Species and site scores can thus be compared. Within each set of scores the location of points is also meaningful. Owing to the unimodal species response model which can be recovered by the correspondence analysis, the location of the species points with respect to the site points indicates the location of the optimum when direct weighted averages are used. However, when the eigenvalue of the solution is low, very different joint plots are derived when the set which is used for computing the weighted averages for the other set is changed. This makes the interpretation of the between-set proximities in the joint plot vague. Generally, it is not possible to deduce the species composition of a site or sites where a species occurs from the joint plot, although for some pairs of species and sites this is justified. Although the proximity interpretation is not possible in every case, a joint plot display can greatly enhance the interpretation of results.
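
The "direct weighted averages" mentioned above can be illustrated with a small sketch: each species score is the abundance-weighted mean of the scores of the sites where it occurs. The data layout (a sites-by-species table as nested lists) and the function name are assumptions made for illustration; the eigenvalue-dependent rescaling that distinguishes the alternative joint plots is deliberately left out.

```python
def weighted_average_scores(abundance, site_scores):
    """Species scores as abundance-weighted averages of site scores.

    abundance[i][j] -- abundance of species j at site i
    site_scores[i]  -- ordination score of site i on one axis
    """
    n_species = len(abundance[0])
    species_scores = []
    for j in range(n_species):
        total = sum(row[j] for row in abundance)
        if total == 0:
            raise ValueError(f"species {j} does not occur at any site")
        weighted = sum(row[j] * s for row, s in zip(abundance, site_scores))
        species_scores.append(weighted / total)
    return species_scores
```

Swapping the roles of the two sets (site scores as weighted averages of species scores) uses the same arithmetic on the transposed table, which is the choice the abstract says can change the joint plot markedly when the eigenvalue is low.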

10.
This study examined two problems in the measurement of chimpanzee behavior: (1) comparability among data sets varying in length of total observation time; and (2) the longest interval for scoring reliable numbers of sample points with instantaneous sampling (this required procedures for evaluating the chi-square statistics of the sampled data). During a 4.5-month field study conducted at the Mahale Mountains National Park, Tanzania, one adult male was observed as a focal animal for about 300 hr with continuous recording. His behavior was classified into five categories. Data sets varying in total time were prepared by extraction from the raw data. Comparability among the data sets was evaluated using Pearson's correlation coefficients and Kendall's coefficients of concordance calculated from two kinds of measures obtained from the raw and simulated data sets: (a) the percentages of time spent by the focal animal in each behavior category; and (b) those of the time spent by adult males in his proximity. The results revealed that observation time of 25 hr was the critical length for scoring the above measures reliably. Sample points for the focal animal's behavior categories and for adult males in his proximity were simulated with intervals of various lengths for data sets differing in total time. The longest interval was measured by comparing the simulated scores with confidence limits calculated for the number of sample points to be scored with the respective intervals. It was found that the interval for sampling should be set at 3 min or shorter, and that chi-square statistics calculated from the data sampled with such an interval should be evaluated after their modification into the values to be obtained from the data sampled with a 5-min interval. These results may not be directly applicable to studies dealing with other behavior categories, other age/sex classes of focal animals, etc. 
However, the above problems should be examined widely in studies attempting to measure animal behavior, and the methods employed in this study are applicable to such studies.

11.
B. I. Graubard, E. L. Korn. Biometrics 1987, 43(2): 471-476
The numerous statistical methods for testing no association between a binary response (rows) and K ordered categories (columns) group naturally into two classes: those that require preassigned numerical column scores and those that do not. An example of the former would be a logistic regression analysis, and of the latter would be a Wilcoxon rank-sum test. In this paper we demonstrate that the perceived advantage of not preassigning scores is illusory. We do this by presenting an example from our consulting experience in which the midrank scores used by the rank tests that do not require preassigned scores are clearly inappropriate. Our recommendations are to assign reasonable column scores whenever possible, and to consider equally spaced scores when the choice is not apparent. Midranks as scores should always be examined for their appropriateness before a rank test is applied.
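
To make the point about midranks concrete, the sketch below computes the midrank score that a rank test implicitly assigns to each ordered column from the column totals alone, next to equally spaced scores. The function names and the example counts are illustrative assumptions, not data from the paper.

```python
def midrank_scores(column_totals):
    """Midrank implicitly assigned to each ordered category by a rank test.

    All observations in a column are tied, so each receives the average of
    the ranks that the column occupies.
    """
    scores = []
    below = 0  # number of observations in lower categories
    for n_j in column_totals:
        scores.append(below + (n_j + 1) / 2)
        below += n_j
    return scores


def equally_spaced_scores(n_categories):
    """Scores 1, 2, ..., K for K ordered categories."""
    return list(range(1, n_categories + 1))
```

For column totals of 10, 20 and 70 the midranks are 5.5, 20.5 and 65.5, so the gap between the second and third categories is three times the gap between the first and second. That spacing comes purely from the marginal counts, which is why it is worth inspecting before a rank test is applied.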

12.
Cancer Epidemiology 2014, 38(2): 200-208
Purpose

Many malignancy scores have been developed without comprehensive statistical or measurement validation, and in particular without verification of the necessary property of unidimensionality. Here, we used Rasch analysis to assess unidimensionality and identify measurement biases of malignancy scores.

Methods

The Weiss histopathological system (WHS), summing nine items of histopathological alteration, was used to evaluate 247 adrenocortical tumors. Rasch model analysis was implemented and compared to classical factor analytic methods to investigate the validity of item-score summation for both the original and modified WHS, to assess differential functioning of the WHS items across various factors related to patient and tumors, and to identify items or subtypes of tumors that could be considered for removal or exclusion from the WHS with the aims of improving measurement and relieving the burden on pathologists.

Results

The WHS does not meet the necessary property of unidimensionality and is severely affected by differential item functioning in relation to size and weight of the tumor. Moreover, items are not well distributed along the spectrum of malignancy, most being located in the upper part and several at the same place.

Conclusion

The WHS in its present form should be applied only to small or moderate size tumors, and better scoring systems could be developed by using more appropriately distributed items. Rasch analysis is a powerful method for developing, evaluating, refining and simplifying malignancy scores.

13.
Diatom composition of four Lake Erie estuaries was related to seasonal factors, year, location within the estuaries, and water quality parameters including nutrient and metals concentrations. Canonical correspondence analysis (CCA) revealed seasonality as the most important factor determining variability in diatom species composition among sites and dates. Alkalinity, pH, silicate, orthophosphate, and nitrite concentrations were water chemistry parameters correlated with diatom community composition. Eigenvalues for the first two CCA axes of nutrient/physical data and species data were higher than the first two CCA axes of metals data and species data. In addition, the water quality of these estuaries was evaluated using an index composed of Lange-Bertalot pollution tolerance values. The Lange-Bertalot index scores indicated that the Ashtabula estuary had the best water quality of the study sites. Lange-Bertalot index scores were highly correlated with a gradient of disturbance represented by the first axis of a principal components analysis of sites and nutrient data (Spearman ρ = 0.7). The Lange-Bertalot tolerance values could be useful for discriminating "good" sites from "bad" sites among the Lake Erie estuaries.

14.
Goal, Scope, and Background

Uncertainty analysis in LCA is important for sound decision support. Nevertheless, the actual influence of uncertainty on decision making in specific LCA case-studies has only been little studied so far. Therefore, we assessed the uncertainty in an LCA comparing two plant-protection products.

Methods

Uncertainty and variability in LCI flows and characterization factors (CML-baseline method) were expressed as generic uncertainty factors and subsequently propagated into impact scores using Monte-Carlo simulation. Uncertainty in assumptions on production efficiency for chemicals, which is of specific interest for the case study, was depicted by scenarios.

Results and Discussion

Impact scores concerning acidification, eutrophication, and global warming display relatively small dispersions. Differences in median impact scores of a factor of 1.6 were sufficient in the case study for a significant distinction of the products. Results of toxicity impact-categories show large dispersions due to uncertainty in characterization factors and in the composition of sum parameters. Therefore, none of the two products was found to be significantly environmentally preferable to the other. Considering the case study results and inherent characteristics of the impact categories, a tentative rule of thumb is put forward that quantifies differences in impact scores necessary to obtain significant results in product comparisons.

Conclusion

Published LCA case-studies may have overestimated the significance of results. It is therefore advisable to routinely carry out quantitative uncertainty analyses in LCA. If this is not feasible, for example due to time restrictions, the rule of thumb proposed here may be helpful to evaluate the significance of results for the impact categories of global warming, acidification, eutrophication, and photooxidant creation.
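
A minimal sketch of the kind of Monte-Carlo propagation described in the Methods part is given below. It assumes that each inventory flow and each characterization factor is lognormally distributed around its median, and that an uncertainty factor k means that 95% of values lie between median/k and median×k. That reading of "uncertainty factor", the function names and any numbers fed in are assumptions for illustration, not the study's actual settings.

```python
import math
import random


def lognormal_draw(rng, median_value, uncertainty_factor):
    """One lognormal draw whose 95% interval is [median/k, median*k]."""
    sigma = math.log(uncertainty_factor) / 1.96
    return median_value * math.exp(rng.gauss(0.0, sigma))


def simulate_impact_scores(flows, factors, n_runs=5000, seed=1):
    """Monte-Carlo distribution of one impact score.

    flows, factors -- lists of (median, uncertainty_factor) pairs, one pair
                      per substance; the score is sum(flow * factor).
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_runs):
        score = sum(
            lognormal_draw(rng, *flow) * lognormal_draw(rng, *factor)
            for flow, factor in zip(flows, factors)
        )
        scores.append(score)
    return scores
```

Running the function once per product and comparing the two score distributions run by run (for example, the share of runs in which one product scores lower) is one simple way to judge whether a difference in medians is distinguishable from the noise.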

15.
There is a paucity of data in the literature concerning the validation of the grant application peer review process, which is used to help direct billions of dollars in research funds. Ultimately, this validation will hinge upon empirical data relating the output of funded projects to the predictions implicit in the overall scientific merit scores from the peer review of submitted applications. In an effort to address this need, the American Institute of Biological Sciences (AIBS) conducted a retrospective analysis of peer review data of 2,063 applications submitted to a particular research program and the bibliometric output of the resultant 227 funded projects over an 8-year period. Peer review scores associated with applications were found to be moderately correlated with the total time-adjusted citation output of funded projects, although a high degree of variability existed in the data. Analysis over time revealed that as average annual scores of all applications (both funded and unfunded) submitted to this program improved with time, the average annual citation output per application increased. Citation impact did not correlate with the amount of funds awarded per application or with the total annual programmatic budget. However, the number of funded applications per year was found to correlate well with total annual citation impact, suggesting that improving funding success rates by reducing the size of awards may be an efficient strategy to optimize the scientific impact of research program portfolios. This strategy must be weighed against the need for a balanced research portfolio and the inherent high costs of some areas of research. The relationship observed between peer review scores and bibliometric output lays the groundwork for establishing a model system for future prospective testing of the validity of peer review formats and procedures.

16.
17.
Accurate knowledge of species’ habitat associations is important for conservation planning and policy. Assessing habitat associations is a vital precursor to selecting appropriate indicator species for prioritising sites for conservation or assessing trends in habitat quality. However, much existing knowledge is based on qualitative expert opinion or local scale studies, and may not remain accurate across different spatial scales or geographic locations. Data from biological recording schemes have the potential to provide objective measures of habitat association, with the ability to account for spatial variation. We used data on 50 British butterfly species as a test case to investigate the correspondence of data-derived measures of habitat association with expert opinion, from two different butterfly recording schemes. One scheme collected large quantities of occurrence data (c. 3 million records) and the other, lower quantities of standardised monitoring data (c. 1400 sites). We used general linear mixed effects models to derive scores of association with broad-leaf woodland for both datasets and compared them with scores canvassed from experts. Scores derived from occurrence and abundance data both showed strongly positive correlations with expert opinion. However, only for occurrence data did these fall within the range of correlations between experts. Data-derived scores showed regional spatial variation in the strength of butterfly associations with broad-leaf woodland, with a significant latitudinal trend in 26% of species. Sub-sampling of the data suggested a mean sample size of 5000 occurrence records per species to gain an accurate estimation of habitat association, although habitat specialists are likely to be readily detected using several hundred records. Occurrence data from recording schemes can thus provide easily obtained, objective, quantitative measures of habitat association.

18.
Question: We provide a method to calculate the power of ordinal regression models for detecting temporal trends in plant abundance measured as ordinal cover classes. Does power depend on the shape of the unobserved (latent) distribution of percentage cover? How do cover class schemes that differ in the number of categories affect power? Methods: We simulated cover class data by “cutting‐up” a continuous logit‐beta distributed variable using 7‐point and 15‐point cover classification schemes. We used Monte Carlo simulation to estimate power for detecting trends with two ordinal models, proportional odds logistic regression (POM) and logistic regression with cover classes re‐binned into two categories, a model we term an assessment point model (APM). We include a model fit to the logit‐transformed percentage cover data for comparison, which is a latent model. Results: The POM had equal or higher power compared to the APM and latent model, but power varied in complex ways as a function of the assumed latent beta distribution. We discovered that if the latent distribution is skewed, a cover class scheme with more categories might yield higher power to detect trend. Conclusions: Our power analysis method maintains the connection between the observed ordinal cover classes and the unmeasured (latent) percentage cover variable, allowing for a biologically meaningful trend to be defined on the percentage cover scale. Both the shape of the latent beta distribution and the alternative hypothesis should be considered carefully when determining sample size requirements for long‐term vegetation monitoring using cover class measurements.
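
The "cutting-up" step in the Methods can be sketched as follows: draw latent percentage cover from a beta distribution and map each value to an ordinal class using fixed cut points. The seven-class cut points below are hypothetical placeholders (the abstract does not list the schemes used), as are the function names; the trend component and the model fitting needed for an actual power estimate are not shown.

```python
import random
from bisect import bisect_right

# Hypothetical upper bounds (percent cover) separating seven ordinal classes.
CUTS_7 = [1, 5, 25, 50, 75, 95]


def to_cover_class(percent_cover, cuts=CUTS_7):
    """Map a percentage cover value to an ordinal class 1..len(cuts)+1."""
    return bisect_right(cuts, percent_cover) + 1


def simulate_cover_classes(a, b, n, cuts=CUTS_7, seed=0):
    """Draw n latent covers from Beta(a, b) and cut them into classes."""
    rng = random.Random(seed)
    return [to_cover_class(100 * rng.betavariate(a, b), cuts) for _ in range(n)]
```

A power estimate would repeat this simulation many times under a specified shift in the latent distribution, fit the ordinal model to each simulated data set, and record how often the trend is detected. Passing a longer list of cut points emulates a scheme with more categories.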

19.
A major limitation in identifying peptides from complex mixtures by shotgun proteomics is the ability of search programs to accurately assign peptide sequences using mass spectrometric fragmentation spectra (MS/MS spectra). Manual analysis is used to assess borderline identifications; however, it is error-prone and time-consuming, and criteria for acceptance or rejection are not well defined. Here we report a Manual Analysis Emulator (MAE) program that evaluates results from search programs by implementing two commonly used criteria: 1) consistency of fragment ion intensities with predicted gas phase chemistry and 2) whether a high proportion of the ion intensity (proportion of ion current (PIC)) in the MS/MS spectra can be derived from the peptide sequence. To evaluate chemical plausibility, MAE utilizes similarity (Sim) scoring against theoretical spectra simulated by MassAnalyzer software (Zhang, Z. (2004) Prediction of low-energy collision-induced dissociation spectra of peptides. Anal. Chem. 76, 3908-3922) using known gas phase chemical mechanisms. The results show that Sim scores provide significantly greater discrimination between correct and incorrect search results than achieved by Sequest XCorr scoring or Mascot Mowse scoring, allowing reliable automated validation of borderline cases. To evaluate PIC, MAE simplifies the DTA text files summarizing the MS/MS spectra and applies heuristic rules to classify the fragment ions. MAE output also provides data mining functions, which are illustrated by using PIC to identify spectral chimeras, where two or more peptide ions were sequenced together, as well as cases where fragmentation chemistry is not well predicted.

20.
SUMMARY: Sequence-structure alignments are a common means for protein structure prediction in the fields of fold recognition and homology modeling, and there is a broad variety of programs that provide such alignments based on sequence similarity, secondary structure or contact potentials. Nevertheless, finding the best sequence-structure alignment in a pool of alignments remains a difficult problem. QUASAR (quality of sequence-structure alignments ranking) provides a unifying framework for scoring sequence-structure alignments that aids finding well-performing combinations of well-known and custom-made scoring schemes. Those scoring functions can be benchmarked against widely accepted quality scores like MaxSub, TMScore, Touch and APDB, thus enabling users to test their own alignment scores against 'standard-of-truth' structure-based scores. Furthermore, individual score combinations can be optimized with respect to benchmark sets based on known structural relationships using QUASAR's in-built optimization routines.

