Similar literature
1.
Multiple imputation (MI) is increasingly popular for handling multivariate missing data. Two general approaches are available in standard computer packages: MI based on the posterior distribution of incomplete variables under a multivariate (joint) model, and fully conditional specification (FCS), which imputes missing values using univariate conditional distributions for each incomplete variable given all the others, cycling iteratively through the univariate imputation models. In the context of longitudinal or clustered data, it is not clear whether these approaches result in consistent estimates of regression coefficients and variance components when the analysis model of interest is a linear mixed effects model (LMM) that includes both random intercepts and slopes and either the covariates alone or both the covariates and the outcome contain missing information. In the current paper, we compare the performance of seven different MI methods for handling missing values in longitudinal and clustered data in the context of fitting LMMs with both random intercepts and slopes. We study the theoretical compatibility between specific imputation models fitted under each of these approaches and the LMM, and also conduct simulation studies in both the longitudinal and clustered data settings. Simulations were motivated by analyses of the association between body mass index (BMI) and quality of life (QoL) in the Longitudinal Study of Australian Children (LSAC). Our findings show that the relative performance of MI methods varies according to whether the incomplete covariate has fixed or random effects and whether there is missingness in the outcome variable. The simulations show that compatible imputation and analysis models result in consistent estimation of both regression parameters and variance components. We illustrate our findings with the analysis of LSAC data.

2.
The health effects of airborne fine particles are the subject of government regulation and scientific debate. The aerodynamics of airborne particulate matter, the deposition patterns in the human lung, and the available experimental and epidemiological data on health effects lead us to focus on airborne particulate matter with an aerodynamic mean diameter less than 2.5 μm (PM2.5) as the fraction of the particles with the largest impact on health. In this article we present a novel hypothesis to explain the continuous production of reactive oxygen species by PM2.5 when it is deposited in the lung. We find that PM2.5 contains abundant persistent free radicals, typically 10^16 to 10^17 unpaired spins/gram, and that these radicals are stable for several months. These radicals are consistent with the stability and electron paramagnetic resonance spectral characteristics of semiquinone radicals. Catalytic redox cycling by semiquinone radicals is well documented in the literature, and we have studied in detail its role in the health effects of cigarette smoke particulate matter. We believe that we have shown for the first time that the same, or similar, radicals are not confined to cigarette smoke particulate matter but are also present in PM2.5. We hypothesize that these semiquinone radicals undergo redox cycling, thereby reducing oxygen and generating reactive oxygen species while consuming tissue-reducing equivalents, such as NAD(P)H and ascorbate. These reactive oxygen species generated by particles cause oxidative stress at sites of deposition and produce the deleterious effects observed in the lung.

3.
Avian influenza virus-infected poultry can release a large amount of virus-contaminated droppings that serve as sources of infection for susceptible birds. Much research so far has focused on virus spread within flocks. However, as fecal material or manure is a major constituent of airborne poultry dust, virus-contaminated particulate matter from infected flocks may be dispersed into the environment. We collected samples of suspended particulate matter, or the inhalable dust fraction, inside, upwind and downwind of buildings holding poultry infected with low-pathogenic avian influenza virus, and tested them for the presence of endotoxins and influenza virus to characterize the potential impact of airborne influenza virus transmission during outbreaks at commercial poultry farms. Influenza viruses were detected by RT-PCR in filter-rinse fluids collected up to 60 meters downwind from the barns, but virus isolation did not yield any isolates. Viral loads in the air samples were low and below the limit of RT-PCR quantification, except for one in-barn measurement showing a virus concentration of 8.48 × 10^4 genome copies/m³. Air samples taken outside poultry barns had endotoxin concentrations of ~50 EU/m³ that declined with increasing distance from the barn. Atmospheric dispersion modeling of particulate matter, using location-specific meteorological data for the sampling days, demonstrated a positive correlation between endotoxin measurements and modeled particulate matter concentrations, with an R² varying from 0.59 to 0.88. Our data suggest that areas at high risk for human or animal exposure to airborne influenza viruses can be modeled during an outbreak to allow directed interventions following targeted surveillance.

4.
King R, Brooks SP, Coulson T. Biometrics, 2008, 64(4): 1187-1195
SUMMARY: We consider the issue of analyzing complex ecological data in the presence of covariate information and model uncertainty. Several issues can arise when analyzing such data, not least the need to account for missing covariate values. This is most acutely observed in the presence of time-varying covariates. We consider mark-recapture-recovery data, where the corresponding recapture probabilities are less than unity, so that individuals are not always observed at each capture event. This often leads to a large amount of missing time-varying individual covariate information, because the covariate cannot usually be recorded if an individual is not observed. In addition, we address the problem of model selection over these covariates with missing data. We consider a Bayesian approach, where we are able to deal with large amounts of missing data by essentially treating the missing values as auxiliary variables. This approach also allows a quantitative comparison of different models via posterior model probabilities, obtained via the reversible jump Markov chain Monte Carlo algorithm. To demonstrate this approach we analyze data relating to Soay sheep, which pose several statistical challenges in fully describing the intricacies of the system.

5.
Missing data is a common issue in research using observational studies to investigate the effect of treatments on health outcomes. When missingness occurs only in the covariates, a simple approach is to use missing indicators to handle the partially observed covariates. The missing indicator approach has been criticized for giving biased results in outcome regression. However, recent papers have suggested that the missing indicator approach can provide unbiased results in propensity score analysis under certain assumptions. We consider assumptions under which the missing indicator approach can provide valid inferences, namely, (1) no unmeasured confounding within missingness patterns; either (2a) covariate values of patients with missing data were conditionally independent of treatment or (2b) these values were conditionally independent of outcome; and (3) the outcome model is correctly specified: specifically, the true outcome model does not include interactions between missing indicators and fully observed covariates. We prove that, under the assumptions above, the missing indicator approach with outcome regression can provide unbiased estimates of the average treatment effect. We use a simulation study to investigate the extent of bias in estimates of the treatment effect when the assumptions are violated and we illustrate our findings using data from electronic health records. In conclusion, the missing indicator approach can provide valid inferences for outcome regression, but the plausibility of its assumptions must first be considered carefully.
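
The missing indicator construction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `add_missing_indicator`, the use of `None` to mark a missing value, and the fill value of 0.0 are all assumptions made for the example.

```python
def add_missing_indicator(values, fill_value=0.0):
    """Return (filled, indicator) for one partially observed covariate.

    values     : list of floats, with None marking a missing entry (assumed encoding)
    filled     : the same values with every None replaced by fill_value
    indicator  : 1 where the original entry was missing, 0 otherwise
    """
    filled = [fill_value if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator


# Hypothetical covariate with two missing entries.
covariate = [22.5, None, 30.1, None, 27.0]
covariate_filled, covariate_missing = add_missing_indicator(covariate)
```

Both columns would then enter the outcome or propensity score model in place of the original covariate; whether the resulting estimate is unbiased depends on the assumptions listed in the abstract.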

6.
Algae are used in biomonitoring systems to detect water or soil pollution. It is therefore conceivable to establish a biomonitoring system that uses algae to detect airborne pollutants (ozone and particulate matter (PM-10)) in urban habitats. Autotrophic biofilms are widely present, cover nearly every exposed surface, especially tree bark, and consist of a large variety of species of algae, cyanobacteria and fungi. To explore the diversity of green algae at different air pollution monitoring sites we chose trees with different bark structures at three locations in and near Leipzig. We compared the measured levels of air pollution with the algal species and communities present. The sites differed in the quality and amount of airborne pollutants, among which we concentrated on ozone and particulate matter (PM-10). The collection sites were Leipzig-Centre, Leipzig-West and a forest area east of Leipzig (Collmberg). Autotrophic biofilms were collected, algal cultures were established, and taxonomic and morphological studies were carried out with light microscopy. Green algae were present on tree bark at all sites, and forty-eight different algal species and cyanobacteria were isolated. Preliminary results suggested a correlation between pollutants and the occurrence of some specific algal species and the specific algal assemblages at a given site. It is concluded that this could provide the basis for a biomonitoring system involving aero-terrestrial algae for the detection of airborne pollutants. Presented at the International Symposium Biology and Taxonomy of Green Algae V, Smolenice, June 26–29, 2007, Slovakia.

7.
Exposure to airborne particulate matter has adverse effects on human health and ecosystems. Mutagenic activity of airborne particulate organic matter extracts in three time periods from total suspended particles (TSP) and particles less than 10 μm (PM10) was evaluated in an area under the influence of a petrochemical industry located in the town of Triunfo, Brazil. The extracts were investigated using the Salmonella/microsome assay, with the microsuspension method. The extracts were obtained by sonication with dichloromethane (DCM) as the solvent. The fractions were tested for mutagenicity with the Salmonella typhimurium strains TA98 (with and without metabolic activation), TA98NR and TA98/1,8DNP6; or YG1021 and YG1024. A positive frameshift mutagenic response was observed for the environmental samples during the different periods. The responses according to percentage of extractable organic matter (EOM%), EOM/m³, revertants/μg (rev/μg) and revertants/m³ (rev/m³) were lower for TSP than for PM10 extracts. The highest rev/m³ values were observed in PM10 extract samples collected in winter, July 2005, in the presence (13.79 rev/m³) or absence (6.87 rev/m³) of S9 fraction. Similarly, in the first (1995) and second (2000) periods the highest values for TSP were observed in winter, but with lower activity (3.00 and 0.89 rev/m³ respectively). The responses observed for the nitrosensitive strains suggest the contribution of nitro, amino and/or hydroxylamino derivatives of PAHs to the total mutagenicity of matter extracted from airborne particles. The Salmonella/microsome assay was a sensitive method to define areas contaminated by genotoxic compounds, even in samples with TSP or PM10 values that are acceptable according to legal environmental quality standards, favoring environmental control measures with an effective response seen in the population's improved quality of life.

8.
Dominici F. Biometrics, 2000, 56(2): 546-553
We propose a methodology for estimating the cell probabilities in a multiway contingency table by combining partial information from a number of studies when not all of the variables are recorded in all studies. We jointly model the full set of categorical variables recorded in at least one of the studies, and we treat the variables that are not reported as missing dimensions of the study-specific contingency table. For example, we might be interested in combining several cohort studies in which the incidence in the exposed and nonexposed groups is not reported for all risk factors in all studies while the overall numbers of cases and cohort size are always available. To account for study-to-study variability, we adopt a Bayesian hierarchical model. At the first stage of the model, the observation stage, data are modeled by a multinomial distribution with fixed total number of observations. At the second stage, we use the logistic normal (LN) distribution to model variability in the study-specific cells' probabilities. Using this model and data augmentation techniques, we reconstruct the contingency table for each study regardless of which dimensions are missing, and we estimate population parameters of interest. Our hierarchical procedure borrows strength from all the studies and accounts for correlations among the cells' probabilities. The main difficulty in combining studies recording different variables is in maintaining a consistent interpretation of parameters across studies. The approach proposed here overcomes this difficulty and at the same time addresses the uncertainty arising from the missing dimensions. We apply our modeling strategy to analyze data on air pollution and mortality from 1987 to 1994 for six U.S. cities by combining six cross-classifications of low, medium, and high levels of mortality counts, particulate matter, ozone, and carbon monoxide, with the complication that four of the six cities do not report all the air pollution variables.
Our goals are to investigate the association between air pollution and mortality by reconstructing the tables with missing dimensions, to determine the most harmful pollutant combinations, and to make predictions about these key issues for a city other than the six sampled. We find that, for high levels of ozone and carbon monoxide, the number of cases with a high number of deaths increases as the level of particulate matter, PM10, increases and that the most harmful combinations correspond to high levels of PM10, confirming prior findings that levels of PM10 higher than the NAAQS standard are harmful.

9.
There are a number of applied settings where a response is measured repeatedly over time, and the impact of a stimulus at one time is distributed over several subsequent response measures. In the motivating application the stimulus is an air pollutant such as airborne particulate matter and the response is mortality. However, several other variables (e.g. daily temperature) impact the response in a possibly non-linear fashion. To quantify the effect of the stimulus in the presence of covariate data we combine two established regression techniques: generalized additive models and distributed lag models. Generalized additive models extend multiple linear regression by allowing continuous covariates to be modeled as smooth, but otherwise unspecified, functions. Distributed lag models aim to relate the outcome variable to lagged values of a time-dependent predictor in a parsimonious fashion. The resulting models, which we call generalized additive distributed lag models, are seen to effectively quantify the so-called 'mortality displacement effect' in environmental epidemiology, as illustrated through air pollution/mortality data from Milan, Italy.
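
The distributed lag idea in this abstract, relating the outcome at time t to the exposure at times t, t-1, ..., t-L, can be sketched as below. This shows only the lagged design matrix and the linear lag contribution; the smooth covariate terms of the generalized additive part are not included. The function names, the exposure series and the lag coefficients are hypothetical values chosen for illustration.

```python
def lagged_design(exposure, max_lag):
    """Build rows [x_t, x_{t-1}, ..., x_{t-max_lag}] for every t with full history."""
    return [
        [exposure[t - lag] for lag in range(max_lag + 1)]
        for t in range(max_lag, len(exposure))
    ]


def distributed_lag_effect(exposure, lag_coefs):
    """Sum of lag_coefs[l] * x_{t-l} over lags l, for each usable time t."""
    max_lag = len(lag_coefs) - 1
    return [
        sum(b * x for b, x in zip(lag_coefs, row))
        for row in lagged_design(exposure, max_lag)
    ]


# Hypothetical daily pollutant series and lag weights for lags 0, 1 and 2.
pollutant = [10.0, 20.0, 30.0, 40.0]
design = lagged_design(pollutant, max_lag=2)
effect = distributed_lag_effect(pollutant, [0.5, 0.3, 0.2])
```

In a fitted model the lag coefficients would be estimated, often under a smoothness or polynomial constraint so that the lag curve stays parsimonious; here they are fixed only to show the bookkeeping.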

10.
In order to document cytogenetic damage associated with air pollution and, possibly, with health effects in the city of Catania, Sicily (Italy), we analyzed the induction of chromosomal aberrations by extractable agents from airborne particulate matter in Chinese hamster epithelial liver (CHEL) cells. These cells retain their metabolic competence to activate different classes of promutagens/procarcinogens into biologically active metabolites. Airborne particulate matter was obtained from two stationary samplers (stations I and II) in two areas with heavy car traffic in the centre of Catania. The results clearly indicated that airborne particulate matter from both stations I and II was clastogenic in CHEL cells but not in Chinese hamster ovary (CHO) cells without metabolic activation, indicating that airborne particulate mixtures need to be metabolically converted before exerting their genotoxic potential. On the basis of these results we can assert that the test system employed to identify the cytogenetic potential of airborne particulate matter is useful for environmental control, and helpful for planning specific actions aimed at reducing the hazards of exposure to polluted air.

11.
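Characteristics of bacteria in haze air and their potential health effects   (cited 4 times: 0 self-citations, 4 by others)
This paper discusses the effects of haze weather on the characteristics of airborne bacteria in terms of community structure, concentration changes and particle size distribution and, drawing on the incidence of haze-related diseases, gives an overall assessment of changes in the potential population health risk posed by airborne pathogenic bacteria in haze weather. Finally, it identifies shortcomings of previous studies and future trends. Taken together, the findings on how haze weather affects the community structure, concentration and particle size distribution of airborne bacteria differ among cities, possibly because of differences in the spatiotemporal setting of the sampling sites, meteorological factors, haze severity, and the methods used to collect, test and identify airborne bacteria. The pathogenic bacteria found so far in haze air are all opportunistic pathogens present at very low levels in the air, but under haze conditions the relative abundance of some pathogens increases and their pathogenicity is markedly enhanced. In addition, high concentrations of fine particulate matter and chemical pollutants can damage the skin and mucosal barriers and disrupt the microecological balance of the respiratory tract and skin, creating favorable conditions for pathogen invasion. The synergy of these two effects significantly increases the health risk from airborne pathogens in haze weather.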

12.
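Reduction of atmospheric particulate matter by different green space configuration patterns along urban roads in spring   (cited 1 time: 0 self-citations, 1 by others)
Yang Mao (杨貌), Zhang Zhiqiang (张志强), Chen Lixin (陈立欣), Liu Chenming (刘辰明), Zou Rui (邹瑞). Acta Ecologica Sinica (《生态学报》), 2016, 36(7): 2076-2083
Studying how different types of green space along urban roads adsorb and reduce atmospheric particulate matter is an important basis for optimizing green space configurations to improve the air pollution control function of urban green space. Taking three typical arterial roads in Haidian District, Beijing, as study sites, six typical green space configuration patterns were selected: tree, shrub, herb, tree-shrub, tree-herb and tree-shrub-herb. In spring (mid-March to early April), when particulate pollution is severe and urban vegetation completes budding, flowering and leaf expansion, a Dustmate portable particulate sampler and an NK4500 handheld automatic weather meter were used to measure simultaneously, at heights of 1.5 m and 3 m, atmospheric particulate concentrations and microclimate factors at different distances from the pollution source. Differences in the particulate reduction capacity of the configuration patterns and their main influencing factors were then analyzed. The results showed that: particulate concentrations were more stable under composite configurations than under single-type configurations, and were mainly affected by wind speed and relative humidity; the larger the particle size, the stronger the reduction by green space; the degree of ground cover was the key factor affecting vertical reduction of particulate matter across configuration patterns, with better ground cover giving better vertical reduction, and the vertical reduction rate was positively correlated with temperature; herb and shrub configurations gave better vertical reduction than the other four patterns; and, owing to the combined effects of canopy closure, porosity and species composition, the tree-herb and shrub configurations gave better horizontal reduction than the other four patterns.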

13.
Exposure to air pollution is associated with increased morbidity and mortality. Recent technological advancements permit the collection of time-resolved personal exposure data. Such data are often incomplete with missing observations and exposures below the limit of detection, which limit their use in health effects studies. In this paper, we develop an infinite hidden Markov model for multiple asynchronous multivariate time series with missing data. Our model is designed to include covariates that can inform transitions among hidden states. We implement beam sampling, a combination of slice sampling and dynamic programming, to sample the hidden states, and a Bayesian multiple imputation algorithm to impute missing data. In simulation studies, our model excels in estimating hidden states and state-specific means and imputing observations that are missing at random or below the limit of detection. We validate our imputation approach on data from the Fort Collins Commuter Study. We show that the estimated hidden states improve imputations for data that are missing at random compared to existing approaches. In a case study of the Fort Collins Commuter Study, we describe the inferential gains obtained from our model including improved imputation of missing data and the ability to identify shared patterns in activity and exposure among repeated sampling days for individuals and among distinct individuals.

14.
The analyses of observational longitudinal studies involving concurrent changes in treatment and medical conditions present difficulties because of the multitude of directions of potential relationships: past medication influences current symptoms; past symptoms influence current medication; and current medication is associated with current symptoms. In the context of a long-term study of non-randomized pharmacological treatment of schizophrenic relapse, we present an analysis of bivariate discrete-time transitional data with binary responses in an attempt to understand the transitional and concurrent relationships between schizophrenia relapse and medication use. A naive analysis does not show any association between previous medication and current relapse. However, we provide evidence suggesting that current treatment may impact current relapse for those who have previously taken medication, but not for those who have not taken medication in the past. When univariate models are specified to assess these associations, the bivariate nature of the problem requires a choice of which response, relapse or medication, should be the dependent variable. In this case, the choice of relapse or medication as a dependent variable does matter. Hence, our results derive from models where both relapse and medication are treated as dependent variables. Specifically, we specify a bivariate log odds ratio for current relapse and current medication use and a separate univariate logit component for each of these outcomes. Each of these components contains transitional associations with previous relapse and medication. Such models represent extensions of univariate transitional association models (e.g. Diggle et al. (1994)) and correspond to bivariate transitional models (e.g. Zeger and Liang (1991)).
We incorporate changes in transitional associations into the full-data parametric model for final inference, and investigate whether these temporal changes are due to learning effects or the impact of drop-out. We also perform residual analyses and sensitivity analyses in the context of missing data patterns.

15.
Marginal structural models (MSMs) have been proposed for estimating a treatment's effect in the presence of time‐dependent confounding. We aimed to evaluate the performance of the Cox MSM in the presence of missing data and to explore methods to adjust for missingness. We simulated data with a continuous time‐dependent confounder and a binary treatment. We explored two classes of missing data: (i) missed visits, which resemble clinical cohort studies; (ii) missing confounder values, which correspond to interval cohort studies. Missing data were generated under various mechanisms. In the first class, the source of the bias was the extreme treatment weights. Truncation or normalization improved estimation. Therefore, particular attention must be paid to the distribution of weights, and truncation or normalization should be applied if extreme weights are noticed. In the second class, bias was due to the misspecification of the treatment model. Last observation carried forward (LOCF), multiple imputation (MI), and inverse probability of missingness weighting (IPMW) were used to correct for the missingness. We found that the alternatives, especially the IPMW method, perform better than the classic LOCF method. Nevertheless, in situations with high marker variance and rarely recorded measurements, none of the examined methods adequately corrected the bias.
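
The weight truncation and normalization mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: truncation is taken to mean clipping at nearest-rank percentiles, normalization is taken to mean rescaling to a mean of one (the paper may define it differently), and the function names and weight values are hypothetical.

```python
def truncate_weights(weights, lower_pct=1, upper_pct=99):
    """Clip weights to the given nearest-rank percentiles (integer percentages)."""
    ordered = sorted(weights)
    n = len(ordered)

    def percentile(pct):
        # Nearest-rank percentile: rank = ceil(pct * n / 100), at least 1.
        rank = max(1, (pct * n + 99) // 100)
        return ordered[rank - 1]

    low, high = percentile(lower_pct), percentile(upper_pct)
    return [min(max(w, low), high) for w in weights]


def normalize_weights(weights):
    """Rescale weights so that their mean equals one."""
    mean_weight = sum(weights) / len(weights)
    return [w / mean_weight for w in weights]


# Hypothetical inverse probability weights with one extreme value.
raw = [0.8, 1.1, 0.9, 1.2, 25.0]
truncated = truncate_weights(raw, lower_pct=0, upper_pct=80)
normalized = normalize_weights(truncated)
```

With only five weights the 80th percentile is used so that the clipping is visible; with realistic sample sizes the 1st and 99th percentiles are a common choice.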

16.
Chen HY, Xie H, Qian Y. Biometrics, 2011, 67(3): 799-809
Multiple imputation is a practically useful approach to handling incompletely observed data in statistical analysis. Parameter estimation and inference based on imputed full data have been made easy by Rubin's rule for result combination. However, creating proper imputation that accommodates flexible models for statistical analysis in practice can be very challenging. We propose an imputation framework that uses conditional semiparametric odds ratio models to impute the missing values. The proposed imputation framework is more flexible and robust than the imputation approach based on the normal model. It is a compatible framework in comparison to the approach based on fully conditionally specified models. The proposed algorithms for multiple imputation through the Markov chain Monte Carlo sampling approach can be straightforwardly carried out. Simulation studies demonstrate that the proposed approach performs better than existing, commonly used imputation approaches. The proposed approach is applied to imputing missing values in bone fracture data.
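
The result combination step that this abstract refers to as Rubin's rule can be sketched as below. The pooling formulas (mean of the estimates, within-imputation variance plus (1 + 1/m) times the between-imputation variance) are the standard ones; the function name `pool_rubin` and the numbers are hypothetical and do not come from the paper.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m completed-data analyses into one estimate and total variance.

    estimates : point estimates, one per imputed dataset
    variances : squared standard errors, one per imputed dataset
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical results from m = 3 imputed datasets.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

At least two imputed datasets are needed, since the between-imputation variance is undefined for a single estimate.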

17.
The genotoxic activity of benzo[a]pyrene (BAP), 2-nitrofluorene (NF) and airborne particulate matter was evaluated in the DNA-repair host-mediated assay after intraperitoneal or intratracheal administration. Dimethylnitrosamine (DMNA), used as a positive control, showed a genotoxic effect after both intraperitoneal and intratracheal administration, the strongest effect being found in liver, followed by lungs and kidneys, whereas a weak effect was observed in the spleen. In general no difference in genotoxicity was found between the 2 administration routes used. For BAP, although clearly positive in vitro, a moderate dose-dependent effect was found only in the liver after intraperitoneal administration. NF, which was positive in vitro both with and without a metabolizing system, produced no genotoxic effect in any of the organs tested after intraperitoneal administration. Extracts of airborne particulate matter which were genotoxic in vitro failed to cause a genotoxic effect in vivo by either route of administration. Possible explanations for the differences between the data obtained in vitro and in vivo are discussed.

18.
In order to investigate the organic compound fraction of the Naples aerosol, a chromatographic method was used for the separation and analysis of the polycyclic aromatic hydrocarbons (PAHs). As a first step, a suitable one-step thin-layer chromatography (TLC) separation of the cyclohexane-extractable material from airborne particulate was sought. After the TLC separation the concentrated samples were analyzed by reverse-phase liquid chromatography with fluorescence detection. We obtained chromatographic separation of five PAHs on the EPA Priority Pollutant List and determined the concentrations of these PAHs in atmospheric particulate matter.

19.
Exposure to airborne particulate matter has adverse effects on human health and ecosystems. Mutagenic activity of airborne particulate organic matter extracts in three time periods from total suspended particles (TSP) and particles less than 10 μm (PM10) was evaluated in an area under the influence of a petrochemical industry located in the town of Triunfo, Brazil. The extracts were investigated using the Salmonella/microsome assay, with the microsuspension method. The extracts were obtained by sonication with dichloromethane (DCM) as the solvent. The fractions were tested for mutagenicity with the Salmonella typhimurium strains TA98 (with and without metabolic activation), TA98NR and TA98/1,8DNP6; or YG1021 and YG1024. A positive frameshift mutagenic response was observed for the environmental samples during the different periods. The responses according to percentage of extractable organic matter (EOM%), EOM/m³, revertants/μg (rev/μg) and revertants/m³ (rev/m³) were lower for TSP than for PM10 extracts. The highest rev/m³ values were observed in PM10 extract samples collected in winter, July 2005, in the presence (13.79 rev/m³) or absence (6.87 rev/m³) of S9 fraction. Similarly, in the first (1995) and second (2000) periods the highest values for TSP were observed in winter, but with lower activity (3.00 and 0.89 rev/m³ respectively). The responses observed for the nitrosensitive strains suggest the contribution of nitro, amino and/or hydroxylamino derivatives of PAHs to the total mutagenicity of matter extracted from airborne particles. The Salmonella/microsome assay was a sensitive method to define areas contaminated by genotoxic compounds, even in samples with TSP or PM10 values that are acceptable according to legal environmental quality standards, favoring environmental control measures with an effective response seen in the population's improved quality of life.

20.

Introduction

A common problem in metabolomics data analysis is the existence of a substantial number of missing values, which can complicate, bias, or even prevent certain downstream analyses. One of the most widely-used solutions to this problem is imputation of missing values using a k-nearest neighbors (kNN) algorithm to estimate missing metabolite abundances. kNN implicitly assumes that missing values are uniformly distributed at random in the dataset, but this is typically not true in metabolomics, where many values are missing because they are below the limit of detection of the analytical instrumentation.

Objectives

Here, we explore the impact of nonuniformly distributed missing values (missing not at random, or MNAR) on imputation performance. We present a new model for generating synthetic missing data and a new algorithm, No-Skip kNN (NS-kNN), that accounts for MNAR values to provide more accurate imputations.

Methods

We compare the imputation errors of the original kNN algorithm using two distance metrics, NS-kNN, and a recently developed algorithm KNN-TN, when applied to multiple experimental datasets with different types and levels of missing data.

Results

Our results show that NS-kNN typically outperforms kNN when at least 20–30% of missing values in a dataset are MNAR. NS-kNN also has lower imputation errors than KNN-TN on realistic datasets when at least 50% of missing values are MNAR.

Conclusion

Accounting for the nonuniform distribution of missing values in metabolomics data can significantly improve the results of imputation algorithms. The NS-kNN method imputes missing metabolomics data more accurately than existing kNN-based approaches when used on realistic datasets.
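
For orientation, the baseline kNN imputation that this abstract compares against can be sketched as follows. This is a generic textbook version, not the NS-kNN or KNN-TN algorithm from the paper: neighbors are taken to be rows of the table, distance is a root mean squared difference over features observed in both rows, and neighbors lacking the target feature are skipped. The function names, the `None` encoding for missing values and the data are assumptions made for the example.

```python
import math


def _distance(a, b):
    """Root mean squared difference over features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k=2):
    """Replace each None with the mean of that feature over the k nearest rows."""
    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors: other rows where feature j is observed.
            candidates = sorted(
                (_distance(row, other), other[j])
                for idx, other in enumerate(rows)
                if idx != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if d < math.inf]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Hypothetical abundance table; the first row is missing its third feature.
table = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.2, 2.1, 5.0],
    [9.0, 9.0, 50.0],
]
completed = knn_impute(table, k=2)
```

Because the imputed value is an average of observed neighbors, this baseline tends to overestimate values that are missing for being below the limit of detection, which is the situation the abstract's MNAR-aware method is designed to address.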

