Similar literature
1.
The view is widely held that experimental methods (randomised controlled trials) are the "gold standard" for evaluation and that observational methods (cohort and case control studies) have little or no value. This ignores the limitations of randomised trials, which may prove unnecessary, inappropriate, impossible, or inadequate. Many of the problems of conducting randomised trials could often, in theory, be overcome, but the practical implications for researchers and funding bodies mean that this is often not possible. The false conflict between those who advocate randomised trials in all situations and those who believe observational data provide sufficient evidence needs to be replaced with mutual recognition of the complementary roles of the two approaches. Researchers should be united in their quest for scientific rigour in evaluation, regardless of the method used.

2.
Golder S, Loke YK, Bland M. PLoS Medicine. 2011;8(5):e1001026

Background

There is considerable debate as to the relative merits of using randomised controlled trial (RCT) data as opposed to observational data in systematic reviews of adverse effects. This meta-analysis of meta-analyses aimed to assess the level of agreement or disagreement in the estimates of harm derived from meta-analysis of RCTs as compared to meta-analysis of observational studies.

Methods and Findings

Searches were carried out in ten databases in addition to reference checking, contacting experts, citation searches, and hand-searching key journals, conference proceedings, and Web sites. Studies were included where a pooled relative measure of an adverse effect (odds ratio or risk ratio) from RCTs could be directly compared, using the ratio of odds ratios, with the pooled estimate for the same adverse effect arising from observational studies. Nineteen studies, yielding 58 meta-analyses, were identified for inclusion. The pooled ratio of odds ratios of RCTs compared to observational studies was estimated to be 1.03 (95% confidence interval 0.93–1.15). There was less discrepancy with larger studies. The symmetric funnel plot suggests that there is no consistent difference between risk estimates from meta-analysis of RCT data and those from meta-analysis of observational studies. In almost all instances, the estimates of harm from meta-analyses of the different study designs had 95% confidence intervals that overlapped (54/58, 93%). In terms of statistical significance, in nearly two-thirds (37/58, 64%), the results agreed (both studies showing a significant increase or significant decrease or both showing no significant difference). In only one meta-analysis about one adverse effect was there opposing statistical significance.

Conclusions

Empirical evidence from this overview indicates that there is no difference on average in the risk estimate of adverse effects of an intervention derived from meta-analyses of RCTs and meta-analyses of observational studies. This suggests that systematic reviews of adverse effects should not be restricted to specific study types. Please see later in the article for the Editors' Summary.
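
The ratio of odds ratios (ROR) used in entry 2 can be illustrated with a minimal sketch. It assumes the two pooled odds ratios are independent and that each standard error can be recovered from the width of a reported 95% confidence interval; the function name and the input numbers are hypothetical and are not taken from the paper.

```python
import math

def ratio_of_odds_ratios(or_rct, ci_rct, or_obs, ci_obs, z=1.96):
    """Return (ROR, lower, upper) comparing a pooled RCT odds ratio with a
    pooled observational odds ratio, assuming the two are independent."""
    # Recover each standard error of log(OR) from the width of its 95% CI.
    se_rct = (math.log(ci_rct[1]) - math.log(ci_rct[0])) / (2 * z)
    se_obs = (math.log(ci_obs[1]) - math.log(ci_obs[0])) / (2 * z)
    # Work on the log scale, where the difference of independent estimates
    # has variance equal to the sum of the two variances.
    log_ror = math.log(or_rct) - math.log(or_obs)
    se_ror = math.sqrt(se_rct ** 2 + se_obs ** 2)
    return (
        math.exp(log_ror),
        math.exp(log_ror - z * se_ror),
        math.exp(log_ror + z * se_ror),
    )

# Hypothetical pooled estimates, for illustration only.
ror, lo, hi = ratio_of_odds_ratios(1.50, (1.10, 2.05), 1.40, (1.20, 1.63))
print(f"ROR = {ror:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An ROR whose interval includes 1 indicates no detectable difference between the two study designs for that adverse effect, which is the pattern the abstract reports for the pooled estimate.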

3.
Stepped wedge cluster randomised trials introduce interventions to groups of clusters in a random order and have been used to evaluate interventions for health and wellbeing. Standardised guidance for reporting stepped wedge trials is currently absent, and a range of potential analytic approaches have been described. We systematically identified and reviewed recently published (2010 to 2014) analyses of stepped wedge trials. We extracted data and described the range of reporting and analysis approaches taken across all studies. We critically appraised the strategy described by three trials chosen to reflect a range of design characteristics. Ten reports of completed analyses were identified. Reporting varied: seven of the studies included a CONSORT diagram, and only five also included a diagram of the intervention rollout. Seven assessed the balance achieved by randomisation, and there was considerable heterogeneity among the approaches used. Only six reported the trend in the outcome over time. All used both ‘horizontal’ and ‘vertical’ information to estimate the intervention effect: eight adjusted for time with a fixed effect, one used time as a condition using a Cox proportional hazards model, and one did not account for time trends. The majority used simple random effects to account for clustering and repeat measures, assuming a common intervention effect across clusters. Outcome data from before and after the rollout period were often included in the primary analysis. Potential lags in the outcome response to the intervention were rarely investigated. We use three case studies to illustrate different approaches to analysis and reporting. There is considerable heterogeneity in the reporting of stepped wedge cluster randomised trials. Correct specification of the time-trend underlies the validity of the analytical approaches. The possibility that intervention effects vary by cluster or over time should be considered. 
Further work should be done to standardise the reporting of the design, attrition, balance, and time-trends in stepped wedge trials.
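
The rollout described in entry 3, where groups of clusters receive the intervention in a random order, can be sketched as a simple schedule generator. This is an illustrative sketch only: the function name, the all-control baseline period, and the even spread of clusters across steps are assumptions, not details from the reviewed trials.

```python
import random

def stepped_wedge_schedule(clusters, n_steps, seed=0):
    """Return {cluster: [0 or 1 per period]} for n_steps + 1 periods.

    Period 0 is an all-control baseline; clusters cross over to the
    intervention (1) in a random order and never cross back."""
    rng = random.Random(seed)
    order = list(clusters)
    rng.shuffle(order)  # random order of rollout
    schedule = {}
    for i, cluster in enumerate(order):
        # Spread clusters as evenly as possible over steps 1..n_steps.
        step = 1 + (i * n_steps) // len(order)
        schedule[cluster] = [
            1 if period >= step else 0 for period in range(n_steps + 1)
        ]
    return schedule

# Hypothetical example: six clusters rolled out over three steps.
for cluster, row in sorted(stepped_wedge_schedule("ABCDEF", n_steps=3).items()):
    print(cluster, row)
```

Reading the grid by row gives the before-and-after ('horizontal') comparison within a cluster, and reading it by column gives the between-cluster ('vertical') comparison within a period.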

4.

Background

Increasing active travel (primarily walking and cycling) has been widely advocated for reducing obesity levels and achieving other population health benefits. However, the strength of evidence underpinning this strategy is unclear. This study aimed to assess the evidence that active travel has significant health benefits.

Methods

The study design was a systematic review of (i) non-randomised and randomised controlled trials, and (ii) prospective observational studies examining either (a) the effects of interventions to promote active travel or (b) the association between active travel and health outcomes. Reports of studies were identified by searching 11 electronic databases, websites, reference lists and papers identified by experts in the field. Prospective observational and intervention studies measuring any health outcome of active travel in the general population were included. Studies of patient groups were excluded.

Results

Twenty-four studies from 12 countries were included, of which six were studies conducted with children. Five studies evaluated active travel interventions. Nineteen were prospective cohort studies which did not evaluate the impact of a specific intervention. No studies were identified with obesity as an outcome in adults; one of five prospective cohort studies in children found an association between obesity and active travel. Small positive effects on other health outcomes were found in five intervention studies, but these were all at risk of selection bias. Modest benefits for other health outcomes were identified in five prospective studies. There is suggestive evidence that active travel may have a positive effect on diabetes prevention, which may be an important area for future research.

Conclusions

Active travel may have positive effects on health outcomes, but there is little robust evidence to date of the effectiveness of active transport interventions for reducing obesity. Future evaluations of such interventions should include an assessment of their impacts on obesity and other health outcomes.

5.
Objective: To compare the results of a randomised and an observational evaluation of the same policy that restricted reimbursement for nebulised respiratory medications in adult patients in a community setting. Designs: Cluster randomised controlled trial and observational time series with historical controls. Setting: Pharmacare, the government funded drug benefits plan for elderly people and patients receiving social assistance in British Columbia, Canada. Participants: In the randomised controlled trial 104 clusters of medical practices, pair matched by geography and approximately by practice size, were randomised to the intervention group (449 patients affected by the policy on 1 March 1999), and the control group (offered a six month exemption, affecting 386 patients). The observational analysis included all Pharmacare beneficiaries (excluding the 386 exempt patients) who had used any nebulised drugs six months before the policy (4624 patients). Intervention: Pharmacare restricted reimbursement for nebulised bronchodilators, steroids, and cromoglycate to patients whose doctors applied for an individual patient's exemption, giving an appropriate clinical reason. Main outcome measures: Number of contacts with doctors and services, emergency admissions to hospital, and utilisation of and expenditure for respiratory drugs in databases of British Columbia's Ministry of Health. Results: Contacts with doctors or emergency admissions to hospital did not increase in association with the restriction, regardless of the analytical approach. In the observational analysis, we found a reduction of $C24 per patient month in all nebulised drug use (95% confidence interval 19 to 29) and an increase of $C3 per patient month in all expenditure for inhalers (1.4 to 4.5). The randomised evaluation found savings of $C8 per patient month for nebulisers (P = 0.24) and no increase in spending on inhalers (P = 0.79). Correcting for 60% non-compliance by exempt doctors in a sensitivity analysis yielded results similar to those of the observational evaluation. Conclusions: Observational as well as randomised analyses found moderate net savings and no increase in unintended healthcare outcomes after restricting reimbursement for nebulised respiratory drugs. Randomised policy trials are feasible and, if carefully implemented, likely to be concordant with observational evaluations.

6.
Objective: To develop an index to measure oral health care priority among nursing staff. Background: Nursing staff, working on hospital wards, at nursing homes and at other facilities, have to deal with oral health care and there are many reports about the low priority that is given to oral health care by nursing staff. It is difficult to measure oral health care priority among nursing staff. A Dental Coping Beliefs Scale (DCBS) index was used in an intervention study and was found to be easy to handle but did not have the ability to reveal significant differences in small study samples. A development process consisting of added items and item numbering by chance was carried out. During this process, different nursing staff test groups were used. The aim was to develop an oral health care priority index that can be used both on hospital wards and at special facilities to measure oral health care priority among nursing staff over time and between groups. Material and methods: Nursing staff at both special facilities and hospital wards and nursing students. Results: It was found that the index, the nursing DCBS, was more stable compared with the version that was used in the initial intervention study. It was also noted that its ability to discriminate between the items was improved. Conclusion: The nursing DCBS index is a suitable tool for use in further studies where the aim is to measure how different nursing staff groups give priority to and allocate responsibility for oral health care, even where study samples are small.

7.
Objective: To determine the effectiveness of programmes of screening in general practice for excessive alcohol use and providing brief interventions. Design: Systematic review and meta-analysis of randomised controlled trials that used screening as a precursor to brief intervention. Setting: General practice. Main outcome measures: Number needed to treat, proportion of patients positive on screening, proportion given brief interventions, and effect of screening. Results: The eight studies included for meta-analysis all used health questionnaires for screening, and the brief interventions included feedback, information, and advice. The studies contained several sources of bias that might lead to overestimates of the effects of intervention. External validity was compromised because typically three out of four people identified by screening as excessive users of alcohol did not qualify for the intervention after a secondary assessment. Overall, in 1000 screened patients, 90 screened positive and required further assessment, after which 25 qualified for brief intervention; after one year 2.6 (95% confidence interval 1.7 to 3.4) reported they drank less than the maximum recommended level. Conclusions: Although even brief advice can reduce excessive drinking, screening in general practice does not seem to be an effective precursor to brief interventions targeting excessive alcohol use. This meta-analysis raises questions about the feasibility of screening in general practice for excessive use of alcohol.
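
The screening cascade reported in entry 7 can be turned into simple proportions. The sketch below uses only the four numbers quoted in the abstract; the function name is hypothetical, and treating the 2.6 per 1000 figure as the net benefit attributable to screening is an assumption of this illustration.

```python
def screening_cascade(screened, positive, qualified, benefited):
    """Summarise a screening cascade as proportions and a
    'number needed to screen' for one patient to benefit."""
    return {
        "positive_rate": positive / screened,
        "qualified_given_positive": qualified / positive,
        "benefit_per_screened": benefited / screened,
        "screened_per_benefit": screened / benefited,
    }

# Figures quoted in the abstract: per 1000 screened, 90 screen positive,
# 25 qualify for brief intervention, 2.6 drink within limits after one year.
summary = screening_cascade(screened=1000, positive=90, qualified=25, benefited=2.6)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions several hundred patients would need to be screened for each one who reports drinking below the recommended maximum, which is the arithmetic behind the abstract's doubts about feasibility.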

8.
Manipulative experimentation that features random assignment of treatments, replication, and controls is an effective way to determine causal relationships. Wildlife ecologists, however, often must take a more passive approach to investigating causality. Their observational studies lack one or more of the 3 cornerstones of experimentation: controls, randomization, and replication. Although an observational study can be analyzed similarly to an experiment, one is less certain that the presumed treatment actually caused the observed response. Because the investigator does not actively manipulate the system, the chance that something other than the treatment caused the observed results is increased. We reviewed observational studies and contrasted them with experiments and, to a lesser extent, sample surveys. We identified features that distinguish each method of learning and illustrate or discuss some complications that may arise when analyzing results of observational studies. Findings from observational studies are prone to bias. Investigators can reduce the chance of reaching erroneous conclusions by formulating a priori hypotheses that can be pursued multiple ways and by evaluating the sensitivity of study conclusions to biases of various magnitudes. In the end, however, professional judgment that considers all available evidence is necessary to render a decision regarding causality based on observational studies. (JOURNAL OF WILDLIFE MANAGEMENT 72(1):4–13; 2008)

9.
Recurrent events are common in medical research, yet the best ways to measure their occurrence remain controversial. Moreover, the correct statistical techniques to compare the occurrence of such events across populations or treatment groups are not widely known. In both observational studies and randomised clinical trials one natural and intuitive measure of occurrence is the event rate, defined as the number of events (possibly including multiple events per person) divided by the total person-years of experience. This is often a more relevant and clinically interpretable measure of disease burden in a population than considering only the first event that occurs. Appropriate statistical tests to compare such event rates among treatment groups or populations require the recognition that some individuals may be especially likely to experience recurrent events. Straightforward approaches are available to account for this tendency in crude and stratified analyses. Recently developed regression models can appropriately examine the association of several variables with rates of recurrent events.
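
The event rate defined in entry 9 (all events divided by total person-years) is straightforward to compute. The sketch below is illustrative: the function name and the five-person data set are hypothetical.

```python
def event_rate(events_per_person, person_years):
    """Events per person-year: all events (including repeat events in the
    same person) divided by the total person-years of follow-up."""
    if len(events_per_person) != len(person_years):
        raise ValueError("one follow-up time is needed per person")
    total_time = sum(person_years)
    if total_time <= 0:
        raise ValueError("total follow-up time must be positive")
    return sum(events_per_person) / total_time

# Hypothetical cohort of five people; two of them account for most events.
events = [0, 3, 1, 0, 5]
years = [2.0, 1.5, 2.0, 0.5, 2.0]
print(f"{event_rate(events, years):.3f} events per person-year")
```

In this made-up cohort two people contribute eight of the nine events, the kind of clustering within individuals that the abstract says a valid comparison of rates must take into account.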

10.
There is a growing interest in assessing dietary intake more accurately across different population groups, and biomarkers have emerged as a complementary tool to replace traditional dietary assessment methods. The purpose of this study was to conduct a systematic review of the literature available and evaluate the applicability and validity of biomarkers of legume intake reported across various observational and intervention studies. A systematic search in PubMed, Scopus, and ISI Web of Knowledge identified 44 studies which met the inclusion criteria for the review. Results from observational studies focused on soy or soy-based foods and demonstrated positive correlations between soy intake and urinary, plasma or serum isoflavonoid levels in different population groups. Similarly, intervention studies demonstrated increased genistein and daidzein levels in urine and plasma following soy intake. Both genistein and daidzein exhibited dose-response relationships. Levels of other isoflavonoids, such as O-desmethylangolensin (O-DMA) and equol, were also reported to increase following soy consumption. Using a developed scoring system, genistein and daidzein can be considered as promising candidate markers for soy consumption. Furthermore, genistein and daidzein also served as good estimates of soy intake, as evidenced by long-term exposure studies, marking their status as validated biomarkers. In contrast, only a few studies proposed biomarkers for pulse intake, with pipecolic acid and S-methylcysteine reported as markers reflecting dry bean consumption, unsaturated aliphatic, hydroxyl-dicarboxylic acid related to green bean intake, and trigonelline reported as a marker of pea consumption. However, data on the criteria needed to evaluate the validity of these markers, such as specificity, dose-response and time-response relationships, reliability, and feasibility, are lacking. In conclusion, despite many studies suggesting proposed biomarkers for soy, there is a lack of information on markers of other subtypes of legumes. Further discovery and validation studies are needed in order to identify reliable biomarkers of legume intake.

11.
The establishment of cause and effect relationships is a fundamental objective of scientific research. Many lines of evidence can be used to make cause–effect inferences. When statistical data are involved, alternative explanations for the statistical relationship need to be ruled out. These include chance (apparent patterns due to random factors), confounding effects (a relationship between two variables because they are each associated with an unmeasured third variable), and sampling bias (effects due to preexisting properties of compared groups). The gold standard for managing these issues is a controlled randomized experiment. In disciplines such as biological anthropology, where controlled experiments are not possible for many research questions, causal inferences are made from observational data. Methods that statisticians recommend for this difficult objective have not been widely adopted in the biological anthropology literature. Issues involved in using statistics to make valid causal inferences from observational data are discussed.

12.

Background

The possible effects of research assessments on participant behaviour have attracted research interest, especially in studies with behavioural interventions and/or outcomes. Assessments may introduce bias in randomised controlled trials by altering receptivity to intervention in experimental groups and differentially impacting on the behaviour of control groups. In a Solomon 4-group design, participants are randomly allocated to one of four arms: (1) assessed experimental group; (2) unassessed experimental group; (3) assessed control group; or (4) unassessed control group. This design provides a test of the internal validity of effect sizes obtained in conventional two-group trials by controlling for the effects of baseline assessment, and assessing interactions between the intervention and baseline assessment. The aim of this systematic review is to evaluate evidence from Solomon 4-group studies with behavioural outcomes that baseline research assessments themselves can introduce bias into trials.

Methodology/Principal Findings

Electronic databases were searched, supplemented by citation searching. Studies were eligible if they reported appropriately analysed results in peer-reviewed journals and used Solomon 4-group designs in non-laboratory settings with behavioural outcome measures and sample sizes of 20 per group or greater. Ten studies from a range of applied areas were included. There was inconsistent evidence of main effects of assessment, sparse evidence of interactions with behavioural interventions, and a lack of convincing data in relation to the research question for this review.

Conclusions/Significance

There were too few high quality completed studies to infer conclusively that biases stemming from baseline research assessments do or do not exist. There is, therefore, a need for new rigorous Solomon 4-group studies that are purposively designed to evaluate the potential for research assessments to cause bias in behaviour change trials.
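
The four-arm allocation in entry 12 amounts to crossing two factors: intervention versus control, and baseline assessment versus none. The sketch below is a minimal illustration; the function name and the balanced allocation by shuffling are assumptions, since the abstract does not say how the reviewed studies randomised participants.

```python
import random
from collections import Counter

# The four arms of a Solomon 4-group design: (condition, baseline assessed?).
ARMS = [
    ("experimental", True),   # (1) assessed experimental group
    ("experimental", False),  # (2) unassessed experimental group
    ("control", True),        # (3) assessed control group
    ("control", False),       # (4) unassessed control group
]

def allocate_solomon(participant_ids, seed=0):
    """Randomly allocate participants to the four arms in near-equal numbers."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)  # random order, then deal participants out to the arms
    return {pid: ARMS[i % len(ARMS)] for i, pid in enumerate(ids)}

# Hypothetical trial with 80 participants, giving 20 per arm.
allocation = allocate_solomon(range(80))
print(Counter(allocation.values()))
```

Comparing arms (1) and (2), and arms (3) and (4), isolates the effect of the baseline assessment itself, while the difference between those two contrasts estimates its interaction with the intervention.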

13.
Objectives: To assess aspects of the internal validity of recently published cluster randomised trials and explore the reporting of information useful in assessing the external validity of these trials. Design: Review of 34 cluster randomised trials in primary care published in 2004 and 2005 in seven journals (British Medical Journal, British Journal of General Practice, Family Practice, Preventive Medicine, Annals of Internal Medicine, Journal of General Internal Medicine, Pediatrics). Data sources: National Library of Medicine (Medline) via PubMed. Data extraction: To assess aspects of internal validity we extracted data on appropriateness of sample size calculations and analyses, methods of identifying and recruiting individual participants, and blinding. To explore reporting of information useful in assessing external validity we extracted data on cluster eligibility, cluster inclusion and retention, cluster generalisability, and the feasibility and acceptability of the intervention to health providers in clusters. Results: 21 (62%) trials accounted for clustering in sample size calculations and 30 (88%) in the analysis; about a quarter were potentially biased because of procedures surrounding recruitment and identification of patients; individual participants were blind to allocation status in 19 (56%) and outcome assessors were blind in 15 (44%). In almost half the reports, information relating to generalisability of clusters was poorly reported, and in two fifths there was no information about the feasibility and acceptability of the intervention. Conclusions: Cluster randomised trials are essential for evaluating certain types of interventions. Issues affecting their internal validity, such as appropriate sample size calculations and analysis, have been widely disseminated and are now better addressed by researchers. Blinding of those identifying and recruiting patients to allocation status is recommended but is not always carried out. There may be fewer barriers to internal validity in trials in which individual participants are not recruited. External validity seems poorly addressed in many trials, yet is arguably as important as internal validity in judging quality as a basis for healthcare intervention.

14.
Theories of phenotypic integration have relied heavily on the concept of modularity in order to model the ways in which traits in an organism correlate and covary. Recent investigations suggest that, while some functional and developmental processes may be morphologically and ontogenetically localized, and thus modular in a developmental sense, there is a great deal of overlap among these influences on patterns of integration in the adult form. This can result in blurry boundaries between hypothesized modules constructed to test hypotheses about phenotypic integration. This investigation tests hypotheses about the contribution of pleiotropic quantitative trait loci (QTL) to phenotypic integration in the mouse mandible without using a priori categorical hypotheses about which traits constitute a module. We ask two main questions: (1) Are the effects of pleiotropic QTL localized to highly correlated traits or more spread out among traits than one might expect by chance? (2) Does the pattern of trait influence when all pleiotropic QTL are considered together deviate from what we might expect if QTL affect traits without regard for the correlations among traits? We find that a large proportion of pleiotropic QTL affect traits that are more highly correlated than we expect by chance with the remainder having effects that are distributed as if by chance. Furthermore, the overall distribution of the effects of pleiotropic QTL differs significantly from the null distribution of no association between pleiotropic effects on traits and correlations among traits. The main modular hypothesis used by earlier studies often does not predict the distribution of sets of traits sharing a common QTL. These results suggest that there is a clear tendency for pleiotropic effects of QTL to be localized but that the localization may be best thought of as occurring in a continuous space rather than being clustered in discrete modules.

15.

Background

In conventional epidemiology confounding of the exposure of interest with lifestyle or socioeconomic factors, and reverse causation whereby disease status influences exposure rather than vice versa, may invalidate causal interpretations of observed associations. Conversely, genetic variants should not be related to the confounding factors that distort associations in conventional observational epidemiological studies. Furthermore, disease onset will not influence genotype. Therefore, it has been suggested that genetic variants that are known to be associated with a modifiable (nongenetic) risk factor can be used to help determine the causal effect of this modifiable risk factor on disease outcomes. This approach, mendelian randomization, is increasingly being applied within epidemiological studies. However, there is debate about the underlying premise that associations between genotypes and disease outcomes are not confounded by other risk factors. We examined the extent to which genetic variants, on the one hand, and nongenetic environmental exposures or phenotypic characteristics on the other, tend to be associated with each other, to assess the degree of confounding that would exist in conventional epidemiological studies compared with mendelian randomization studies.

Methods and Findings

We estimated pairwise correlations between nongenetic baseline variables and genetic variables in a cross-sectional study comparing the number of correlations that were statistically significant at the 5%, 1%, and 0.01% level (α = 0.05, 0.01, and 0.0001, respectively) with the number expected by chance if all variables were in fact uncorrelated, using a two-sided binomial exact test. We demonstrate that behavioural, socioeconomic, and physiological factors are strongly interrelated, with 45% of all possible pairwise associations between 96 nongenetic characteristics (n = 4,560 correlations) being significant at the p < 0.01 level (the ratio of observed to expected significant associations was 45; p-value for difference between observed and expected < 0.000001). Similar findings were observed for other levels of significance. In contrast, genetic variants showed no greater association with each other, or with the 96 behavioural, socioeconomic, and physiological factors, than would be expected by chance.

Conclusions

These data illustrate why observational studies have produced misleading claims regarding potentially causal factors for disease. The findings demonstrate the potential power of a methodology that utilizes genetic variants as indicators of exposure level when studying environmentally modifiable risk factors.
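
The counts quoted in entry 15 can be checked with a few lines of arithmetic: the number of pairwise correlations among 96 characteristics, the number expected to reach significance by chance at a given level, and the ratio of observed to expected. The function name is hypothetical; the inputs (96 characteristics, a 0.01 significance level, 45% observed) come from the abstract.

```python
import math

def observed_vs_expected(n_traits, alpha, observed_fraction):
    """Compare the observed share of significant pairwise correlations with
    the share expected by chance if all variables were uncorrelated."""
    n_pairs = math.comb(n_traits, 2)      # all unordered pairs of traits
    expected = alpha * n_pairs            # chance expectation at level alpha
    observed = observed_fraction * n_pairs
    return n_pairs, expected, observed, observed / expected

n_pairs, expected, observed, ratio = observed_vs_expected(96, 0.01, 0.45)
print(f"pairs: {n_pairs}")
print(f"expected by chance: {expected:.1f}")
print(f"observed: {observed:.0f}")
print(f"observed / expected: {ratio:.0f}")
```

The pair count and the ratio reproduce the figures in the abstract (4,560 correlations and a ratio of 45). The two-sided exact binomial test the authors describe is not reproduced here.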

16.
A large number of observational epidemiological studies show that regular use of aspirin and other NSAIDs is associated with a reduction in the risk of developing both colorectal adenomas and cancer. Furthermore, the prodrug sulindac appears clinically to be able to reduce and reverse the growth of existing polyps in familial adenomatous polyposis (FAP). For aspirin and NSAIDs the dose, duration of effect and length of protection seen after cessation remain to be fully established. The available data for aspirin suggest that doses higher than those needed for heart disease prevention are required. It is also likely that the drug needs to be taken continuously for a number of years. With regard to randomised controlled trials to evaluate chemopreventive strategies, there are so far only limited data available. The only trial reported to date found no effect but employed a relatively low dose of 325 mg of aspirin every other day and the randomised intervention period was relatively short (5 years). Further trials of intermediate endpoints (adenomas) are currently underway in the UK and USA and are employing higher doses of aspirin. Randomised clinical studies of sulindac have been more encouraging, demonstrating that it is a useful drug for therapeutic applications in FAP patients. Its relatively greater side effects, however, prevent its consideration for primary chemoprevention. The mechanisms by which NSAIDs act are still sought. Strategies for possible primary and secondary chemoprevention in humans also require evaluation.

17.
Introduction: Within the context of Person Centred Care, the present paper shows the creation and validation process of an observational tool for the assessment of the wellbeing of people with dementia, from a perspective that seeks to highlight the effects that the physical and social environment have on the person, and how these are reflected in the well-being. Methods: The List of Wellbeing Indicators (LIBE) was created following an inductive iterative process with professionals from different disciplines, until the validated version was reached. It was then validated in two successive studies with a sample of 79 people with dementia. Discrimination capacity of the scale indicators, internal consistency, inter-rater reliability, and convergent and divergent validity were determined. Results: An internal consistency of Cronbach's alpha 0.81 was obtained. The inter-rater reliability, analysing intraclass correlation coefficient (ICC) within the 3 raters, was significant for all the indicators in the tool, with scores between 0.59 and 1.00. Convergent validity was studied comparing scores in each LIBE indicator with scores in each QUALID indicator, and some significant associations were found between response categories in both tools. For the discriminant validity, the scores obtained in each LIBE indicator were compared with the scores in each PAINAD-Sp item, and no significant associations were found. Conclusion: LIBE offers an observational measure of behaviours that can be considered well-being indicators in people with dementia living in residential care. LIBE is a valid and reliable tool that offers a different perspective of measuring a construct that has been infrequently explored in the dementia population. It is also an easy-to-apply tool, with different uses (clinical, intervention, research), and applicable for professionals of several disciplines.

18.
A large number of foetuses are scanned on a routine basis and, although it is generally assumed that prenatal ultrasound is safe, very few studies have in fact focused on possible adverse effects in humans. The epidemiological tools for studying possible adverse effects of prenatal ultrasound are randomised controlled trials and observational studies such as cohort and case–control studies. There are advantages and disadvantages with all study designs. In this review, some of the challenges that have to be met are discussed based on experiences from a randomised controlled trial, cohort studies and an ongoing case–control study.

19.
Amphibian declines are occurring on a global scale, and infectious disease has been implicated as a factor in some species. Batrachochytrium dendrobatidis (Bd) has been associated with amphibian declines and/or extinctions in many locations; however, few of the studies have actually performed detailed pathological investigations to link the emergence of the disease with mortality rates large enough to cause the declines. Many studies are based solely on the presence of infection, not disease, because of the reliance on molecular tests for Bd. The emphasis on the importance of Bd, combined with easy molecular tests, has resulted in poor investigations into amphibian mortality and declines in many areas. The line between infection and disease has been blurred, and a step back to basic pathological and biological investigations is needed as other disease risks to amphibians, such as ranaviruses, are likely being missed. In this article, starting points for proper investigative techniques for amphibian mortalities and declines are identified and areas that need to be improved, especially communication between biologists and veterinarians involved in amphibian disease research, are suggested. It is hoped that this will start a much needed discussion in the area and lead to some consensus building about methodologies used in amphibian disease research.

20.
Objective

To summarise comparisons of randomised clinical trials and non-randomised clinical trials, trials with adequately concealed random allocation versus inadequately concealed random allocation, and high quality trials versus low quality trials where the effect of randomisation could not be separated from the effects of other methodological manoeuvres.

Design

Systematic review.

Selection criteria

Cohorts or meta-analyses of clinical trials that included an empirical assessment of the relation between randomisation and estimates of effect.

Data sources

Cochrane Review Methodology Database, Medline, SciSearch, bibliographies, hand searching of journals, personal communication with methodologists, and the reference lists of relevant articles.

Main outcome measures

Relation between randomisation and estimates of effect.

Results

Eleven studies that compared randomised controlled trials with non-randomised controlled trials (eight for evaluations of the same intervention and three across different interventions), two studies that compared trials with adequately concealed random allocation and inadequately concealed random allocation, and five studies that assessed the relation between quality scores and estimates of treatment effects, were identified. Failure to use random allocation and concealment of allocation were associated with relative increases in estimates of effects of 150% or more, relative decreases of up to 90%, inversion of the estimated effect and, in some cases, no difference. On average, failure to use randomisation or adequate concealment of allocation resulted in larger estimates of effect due to a poorer prognosis in non-randomly selected control groups compared with randomly selected control groups.

Conclusions

Failure to use adequately concealed random allocation can distort the apparent effects of care in either direction, causing the effects to seem either larger or smaller than they really are. The size of these distortions can be as large as or larger than the size of the effects that are to be detected.

Key messages

  • Empirical studies support using random allocation in clinical trials and ensuring that the allocation process is concealed—that is, that assignment is impervious to any influence by the people making the allocation
  • The effect of not using concealed random allocation can be as large as or larger than the effects of worthwhile interventions
  • On average, failure to use concealed random allocation results in overestimates of effect due to a poorer prognosis in non-randomly selected control groups compared with randomly selected control groups, but it can result in underestimates of effect, reverse the direction of effect, mask an effect, or give similar estimates of effect
  • The adequacy of allocation concealment may be a more sensitive measure of bias in clinical trials than scales used to assess the quality of clinical trials
  • It is a paradox that the unpredictability of randomisation is the best protection against the unpredictability of the extent and direction of bias in clinical trials that are not properly randomised

