Similar Articles
20 similar articles retrieved (search time: 15 ms)
1.
In order to have confidence in model-based phylogenetic analysis, the model of nucleotide substitution adopted must be selected in a statistically rigorous manner. Several model-selection methods are applicable to maximum likelihood (ML) analysis, including the hierarchical likelihood-ratio test (hLRT), Akaike information criterion (AIC), Bayesian information criterion (BIC), and decision theory (DT), but their performance relative to empirical data has not been investigated thoroughly. In this study, we use 250 phylogenetic data sets obtained from TreeBASE to examine the effects that the choice of model-selection method has on ML estimation of phylogeny, with an emphasis on optimal topology, bootstrap support, and hypothesis testing. We show that the use of different methods leads to the selection of two or more models for approximately 80% of the data sets and that the AIC typically selects more complex models than alternative approaches. Although ML estimation with different best-fit models results in incongruent tree topologies approximately 50% of the time, these differences are primarily attributable to alternative resolutions of poorly supported nodes. Furthermore, topologies and bootstrap values estimated with ML using alternative statistically supported models are more similar to each other than to topologies and bootstrap values estimated with ML under the Kimura two-parameter (K2P) model or maximum parsimony (MP). In addition, Swofford-Olsen-Waddell-Hillis (SOWH) tests indicate that ML trees estimated with alternative best-fit models are usually not significantly different from each other when evaluated with the same model. However, ML trees estimated with statistically supported models are often significantly suboptimal to ML trees made with the K2P model when both are evaluated with K2P, indicating that not all models perform in an equivalent manner. Nevertheless, the use of alternative statistically supported models generally does not affect tests of monophyletic relationships under either the Shimodaira-Hasegawa (S-H) or SOWH methods. Our results suggest that although the choice of model-selection method has a strong impact on optimal tree topology, it rarely affects evolutionary inferences drawn from the data because differences are mainly confined to poorly supported nodes. Moreover, since ML with alternative best-fit models tends to produce more similar estimates of phylogeny than ML under the K2P model or MP, the use of any statistically based model-selection method is vastly preferable to forgoing the model-selection process altogether.
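To make the information-criterion trade-off concrete, the sketch below ranks a set of hypothetical candidate models by AIC (2k - 2 ln L) and BIC (k ln n - 2 ln L). It is a minimal illustration under invented numbers, not code or results from the study: the model names are standard, but every log-likelihood, parameter count, and alignment length is a placeholder.

```python
import math

# (model name, maximized log-likelihood, number of free parameters) -- all hypothetical
candidates = [
    ("JC69",  -25312.4, 0),
    ("K2P",   -25008.1, 1),
    ("HKY85", -24841.7, 4),
    ("GTR+G", -24790.2, 9),
]
n_sites = 1140  # alignment length; hypothetical

def aic(lnL, k):
    # AIC = 2k - 2 ln L: each extra parameter costs 2 units of penalty
    return 2 * k - 2 * lnL

def bic(lnL, k, n):
    # BIC = k ln(n) - 2 ln L: the penalty grows with alignment length
    return k * math.log(n) - 2 * lnL

for name, lnL, k in candidates:
    print(f"{name:6s} AIC={aic(lnL, k):10.1f}  BIC={bic(lnL, k, n_sites):10.1f}")

print("AIC picks:", min(candidates, key=lambda m: aic(m[1], m[2]))[0])
print("BIC picks:", min(candidates, key=lambda m: bic(m[1], m[2]))[0])
```

Because the BIC penalty k ln(n) exceeds AIC's 2k whenever n > 7, BIC tends toward simpler models, consistent with the observation above that the AIC typically selects more complex models than the alternatives.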

2.
Background: Abnormal blood glucose (BG) concentrations have been associated with increased morbidity and mortality in both critically ill adults and infants. Furthermore, hypoglycaemia and glycaemic variability have both been independently linked to mortality in these patients. Continuous Glucose Monitoring (CGM) devices have the potential to improve detection and diagnosis of these glycaemic abnormalities. However, sensor noise is a trade-off of the high measurement rate and must be managed effectively if CGMs are to be used to monitor, diagnose and potentially help treat glycaemic abnormalities. Aim: To develop a tool that will aid clinicians in identifying unusual CGM behaviour and highlight CGM data that potentially need to be interpreted with care. Methods: CGM data and BG measurements from 50 infants at risk of hypoglycaemia were used. Unusual CGM measurements were classified using a stochastic model based on the kernel density method and historical CGM measurements from the cohort. CGM traces were colour coded with very unusual measurements coloured red, highlighting areas to be interpreted with care. A 5-fold validation of the model was Monte Carlo simulated 25 times to ensure an adequate model fit. Results: The stochastic model was generated using ~67,000 CGM measurements, spread across the glycaemic range ~2-10 mmol/L. A 5-fold validation showed a good model fit: the model 80% confidence interval (CI) captured 83% of clinical CGM data, the model 90% CI captured 91% of clinical CGM data, and the model 99% CI captured 99% of clinical CGM data. Three patient examples show the stochastic classification method in use: 1) a stable, low-variability patient with no unusual CGM measurements; 2) a patient with a very sudden, short hypoglycaemic event (classified as unusual); and 3) a patient with very high, potentially un-physiological glycaemic variability after day 3 of monitoring (classified as very unusual). Conclusions: This study has produced a stochastic model and classification method capable of highlighting unusual CGM behaviour. This method has the potential to classify important glycaemic events (e.g. hypoglycaemia) as true clinical events or sensor noise, and to help identify possible sensor degradation. Colour-coded CGM traces convey the information quickly and efficiently, while remaining computationally light enough to be used retrospectively or in real time.
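For intuition, here is a minimal sketch of kernel-density-based flagging of unusual CGM transitions, loosely in the spirit of the stochastic model described above. It is not the authors' model: the cohort data are synthetic, and the quantile cut-offs (1% and 0.1%) are arbitrary placeholders rather than the study's validated confidence intervals.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Synthetic "historical cohort": pairs of successive CGM readings (mmol/L).
g_t = rng.normal(4.5, 1.2, 5000).clip(2, 10)
g_next = (g_t + rng.normal(0, 0.15, g_t.size)).clip(2, 10)
kde = gaussian_kde(np.vstack([g_t, g_next]))

# Densities of the cohort's own transitions, used as the reference for quantiles.
ref_density = kde(np.vstack([g_t, g_next]))

def classify(trace, unusual=0.01, very_unusual=0.001):
    """Label each successive-measurement transition by its density quantile."""
    dens = kde(np.vstack([trace[:-1], trace[1:]]))
    labels = []
    for d in dens:
        q = (ref_density < d).mean()  # fraction of cohort transitions with lower density
        labels.append("very unusual" if q < very_unusual
                      else "unusual" if q < unusual else "typical")
    return labels

trace = np.array([4.2, 4.3, 4.1, 2.6, 4.0])  # sudden dip at index 3
print(classify(trace))                        # the 4.1 -> 2.6 jump should flag
```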

3.
Species distribution models (SDMs) relate presence/absence data to environmental variables, allowing prediction of species' environmental requirements and potential distribution. They have been increasingly used in fields such as ecology, biogeography and evolution, and often support conservation priorities and strategies. Thus, it becomes crucial to understand how trustworthy and reliable their predictions are. Different approaches, such as using ensemble methods (combining forecasts of different single models) or applying the most suitable threshold to transform continuous probability maps into species presences or absences, have been used to reduce model-based uncertainty. Taking into account the influence of sampling bias, imprecision in species locations, small datasets and species' ecological characteristics may also help to detect and compensate for uncertainty in the model-building process. To investigate the effects on model accuracy of applying an ensemble approach, several threshold-selection criteria, and different datasets representing seasonal and spatial sampling bias, SDMs were built for four estuarine fish species with distinct uses of the estuarine systems. Overall, predictions obtained with the ensemble approach were more accurate. Variability in accuracy metrics obtained with the nine threshold-selection criteria applied was more pronounced for species with low prevalence and when sensitivity was calculated. Higher values of accuracy measures were registered with the threshold that maximizes the sum of sensitivity and specificity and the threshold where the predicted prevalence equals the observed, whereas the 0.5 cut-off was unreliable, yielding the lowest values for these metrics. Accuracy of models created from a spatially biased sampling was overall higher than accuracy of models created with a seasonally biased sampling or with the multi-year database, and this pattern was consistently obtained for marine migrant species, which use estuaries as nursery areas and present a seasonal, regular use of these ecosystems. The ecological dependence between these fish species and estuaries may add difficulties to the model-building process and needs to be taken into account to improve model accuracy. The present study highlights the need for a thorough analysis of the critical issues underlying the complete model-building process when predicting the distribution of estuarine fish species, given the particular and dynamic nature of these ecosystems.
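The sketch below illustrates two of the threshold-selection criteria examined above: the threshold maximizing the sum of sensitivity and specificity (maxSSS), and the threshold at which predicted prevalence equals observed prevalence. The predicted probabilities and observations are synthetic placeholders; the code is a minimal rendering of the criteria's standard definitions, not the study's workflow.

```python
import numpy as np

rng = np.random.default_rng(1)
obs = rng.integers(0, 2, 500)                                 # observed presence (1) / absence (0)
prob = np.clip(0.35 * obs + rng.uniform(0, 0.65, 500), 0, 1)  # model output probabilities

def sens_spec(t):
    pred = prob >= t
    sens = (pred & (obs == 1)).sum() / (obs == 1).sum()   # true positives / presences
    spec = (~pred & (obs == 0)).sum() / (obs == 0).sum()  # true negatives / absences
    return sens, spec

thresholds = np.linspace(0.01, 0.99, 99)

# Criterion 1: threshold maximizing sensitivity + specificity (maxSSS).
max_sss = max(thresholds, key=lambda t: sum(sens_spec(t)))

# Criterion 2: threshold at which predicted prevalence equals observed prevalence.
prev_match = min(thresholds, key=lambda t: abs((prob >= t).mean() - obs.mean()))

s, p = sens_spec(max_sss)
print(f"maxSSS threshold {max_sss:.2f}: sensitivity {s:.2f}, specificity {p:.2f}")
print(f"prevalence-matching threshold: {prev_match:.2f}")
```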

4.
Hurst LD, Williams EJ. Gene 2000, 261(1):107-114
Many attempts to test selectionist and neutralist models employ estimates of synonymous (Ks) and non-synonymous (Ka) substitution rates of orthologous genes. For example, a stronger Ka-Ks correlation than expected under neutrality has been argued to indicate a role for selection, and the absence of a Ks-GC4 correlation has been argued to be inconsistent with neutral models for isochore evolution. However, as we have shown previously, both of these results are sensitive to the method by which Ka and Ks are estimated. Using a maximum likelihood (ML) estimator (GY94), we found a positive correlation between Ks and GC4 and only a weak correlation between Ka and Ks, lower than expected under neutrality. This ML method is computationally slow. Recently, a new ad hoc approximation of this ML method has been provided (YN00). This is effectively an extension of Li's protocol that also allows for codon usage bias. The method is computationally near-instantaneous and therefore potentially of great utility for the analysis of large datasets. Here we ask whether this method might have such applicability; to this end, we ask whether it too recovers the two unusual results. We report that when the ML and earlier ad hoc methods disagree, YN00 recovers the results described by the ML methods, i.e. a positive correlation between GC4 and Ks and only a weak correlation between Ks and Ka. If the ML method can be trusted, then YN00 can also be considered an adequately reliable method for the analysis of large datasets. Assuming this to be so, we also analyze the patterns further. We show, for example, that the positive correlation between GC4 and Ks is probably in part a mutational bias, there being more methyl-induced CpG→TpG mutations in GC-rich regions. As regards the evolution of isochores, it seems inappropriate to use the claimed lack of a correlation between GC and Ks as definitive evidence either for or against any model. If the positive correlation is real then, we argue, it is hard to reconcile with the biased gene conversion model of isochore formation, as this model predicts a negative correlation.

5.
Summary: Governments across Australia have long been investing in revegetation in an effort to restore biodiversity and, more recently, mitigate climate change. However, no readily available methods have been described to assist project leaders in identifying species and provenance material likely to be sustainable under the changing climatic conditions of coming decades. Focussing particularly on trees, as trees are important for biosequestration as well as for providing habitat for other native species, Paper 1 of this two-part series briefly reviews species distribution models and growth simulation models that could provide the scientific underpinning to improve and refine selection processes. While these previous scientific studies provide useful insights into how trees may respond to climate change, it is concluded that a readily accessible and easy-to-use approach is required to consider the potential adaptability of the many tree, shrub and ground-cover species that may be needed for biodiverse plantings. In Part 2 of this paper, the Atlas of Living Australia is used to provide preliminary information to assist species selection by assessing the climatic range of individual species based on their current distributions and, where available, cultivated locations. While using the Atlas can assist current selections, Part 2 outlines ways in which more reliable selections for changing climatic conditions could be made, building on the methods described here.

6.
Significant demographic changes in patient populations have contributed to an increasing awareness of the impact of cultural diversity on the provision of health care. For this reason, methods are being developed to improve the cultural sensitivity of persons responsible for giving health care to patients whose health beliefs may be at variance with biomedical models. Building on methods of elicitation suggested in the literature, we have developed a set of guidelines within a framework called the LEARN model. Health care providers who have been exposed to this educational framework and have incorporated this model into the normal structure of the therapeutic encounter have been able to improve communication, heighten awareness of cultural issues in medical care and obtain better patient acceptance of treatment plans. The emphasis of this teaching model is not on the dissemination of particular cultural information, though this too is helpful. The primary focus is rather on a suggested process for improved communication, which we see as the fundamental need in cross-cultural patient-physician interactions.

7.
Tegumentary leishmaniasis is an endemic protozoan disease that, in Brazil, is caused by parasites of the Viannia or Leishmania complex. The clinical forms of cutaneous disease comprise localized, disseminated, mucosal or mucocutaneous, and diffuse leishmaniasis. Viannia complex parasites are not easy to isolate from patient lesions, especially mucosal lesions, and they are difficult to culture. The aim of the present study was to compare the efficiency of ex vivo (culture) and in vivo (IFNγ-deficient mice) parasite isolation methods to improve the isolation rate and storage of stocks of New World Leishmania spp. that cause cutaneous leishmaniasis (CL) or mucosal leishmaniasis (ML). Biopsy fragments from cutaneous or mucosal lesions were inoculated into culture medium or mouse footpads. We evaluated 114 samples (86 CL, 28 ML) using both methods independently. Samples from CL patients had a higher isolation rate in ex vivo cultures than in mice (34.1% vs. 18.7%, P<0.05). Nevertheless, almost twice as many isolates from ML lesions were obtained using the mouse model as with ex vivo cultures (mouse, 6/25; culture, 3/27). The overall rates of isolation were 40.2% for CL samples and 29.6% for ML samples. Of the 43 isolations, we successfully stocked 35 isolates (81.4%; 27 CL, 8 ML). Contamination was detected more frequently in cultures of ML than of CL lesions. For comparison, both methods were used simultaneously in 74 samples of CL and 25 samples of ML, and similar results were obtained. Of the eight ML isolates, five were isolated only in mice, indicating the advantage of using the in vivo method to obtain ML parasites. All parasites obtained from in vivo isolation were cryopreserved, whereas only 68% of ex vivo isolations from CL lesions were stocked. In conclusion, the use of genetically modified mice can improve the isolation of parasites from ML. Isolation and stocking of New World Leishmania parasites, especially those from ML, which are almost absent from laboratory stocks, are critical for evaluating parasite genetic diversity as well as for studying host-parasite interactions to identify biological markers of Leishmania. In this paper, we also discuss some of the difficulties associated with isolating and stocking parasites.

8.
LARaLink 2.0 (Loci Analysis for Rearrangement Link) is an enabling web technology that permits the rapid retrieval of clinical cytogenetic and molecular data. New data-mining capabilities have been incorporated into version 2.0, building upon LARaLink 1.0, to extend the utility of the system for applications in both the clinical and basic sciences. These include access to the Chromosomal Variation in Man database and the GEO database. Together these new resources enhance the user's ability to associate genotype with phenotype and to identify potential gene candidates. Unlimited access for researchers exploring disease-gene relationships and for clinicians extending practice in patient care is available at LARaLink.bioinformatics.wayne.edu:8080/unigene.

9.

10.
Mammographic density has been established as an independent risk factor for breast cancer. Women with dense breast tissue visible on a mammogram have a much higher cancer risk than women with little density. A great research effort has been devoted to incorporating breast density into risk-prediction models to better estimate each individual's cancer risk. In recent years, the passage of breast density notification legislation in many US states requires that every mammography report provide information regarding the patient's breast density. Accurate definition and measurement of breast density are thus important and may allow all the potential clinical applications of breast density to be implemented. Because two-dimensional mammography-based measurement is subject to tissue overlap and thus cannot provide volumetric information, there is an urgent need to develop reliable quantitative measurements of breast density. Various new imaging technologies are being developed. Among these new modalities, volumetric mammographic density methods and three-dimensional magnetic resonance imaging are the best studied. In addition, emerging modalities, including different x-ray-based, optical imaging, and ultrasound-based methods, have also been investigated. All these modalities may either overcome some fundamental problems related to mammographic density or provide additional density and/or compositional information. The present review article summarizes the current established and emerging imaging techniques for the measurement of breast density and the evidence for the clinical use of these density methods from the literature.
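As a simplified illustration of the area-based measure that the volumetric methods above aim to improve upon, the toy sketch below computes percent mammographic density from binary segmentation masks. The masks are synthetic placeholders; real pipelines segment actual mammograms, and volumetric approaches integrate tissue thickness rather than counting pixels.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic segmentation masks standing in for a processed mammogram.
breast_mask = np.zeros((100, 100), dtype=bool)
breast_mask[20:90, 10:80] = True                           # whole-breast region
dense_mask = breast_mask & (rng.random((100, 100)) < 0.3)  # "dense tissue" pixels

# Area-based percent density: dense pixels over breast pixels.
percent_density = 100.0 * dense_mask.sum() / breast_mask.sum()
print(f"percent mammographic density: {percent_density:.1f}%")
```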

11.
Summary: Accurately assessing a patient's risk of a given event is essential to making informed treatment decisions. One approach is to stratify patients into two or more distinct risk groups with respect to a specific outcome using both clinical and demographic variables. Outcomes may be categorical or continuous in nature; important examples in cancer studies include level of toxicity and time to recurrence. Recursive partitioning methods are ideal for building such risk groups. Two such methods are Classification and Regression Trees (CART) and a more recent competitor known as the partitioning Deletion/Substitution/Addition (partDSA) algorithm, both of which utilize loss functions (e.g., squared error for a continuous outcome) as the basis for building, selecting, and assessing predictors but differ in the manner by which regression trees are constructed. Recently, we have shown that partDSA often outperforms CART in so-called "full data" settings (e.g., uncensored outcomes). However, when confronted with censored outcome data, the loss functions used by both procedures must be modified. There have been several attempts to adapt CART for right-censored data. This article describes two such extensions for partDSA that make use of observed-data loss functions constructed using inverse probability of censoring weights. Such loss functions are consistent estimates of their uncensored counterparts provided that the corresponding censoring model is correctly specified. The relative performance of these new methods is evaluated via simulation studies and illustrated through an analysis of clinical trial data on brain cancer patients. The implementation of partDSA for uncensored and right-censored outcomes is publicly available in the R package partDSA.
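To show the core idea behind an observed-data loss built with inverse probability of censoring weights, here is a minimal sketch: only uncensored subjects contribute, each up-weighted by the inverse of a Kaplan-Meier estimate of the censoring survival function, so that under a correctly specified censoring model the weighted loss estimates its full-data counterpart. The synthetic data, log-time outcome, and constant predictors are illustrative placeholders, not the partDSA implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
t_event = rng.exponential(5.0, n)   # latent event times
t_cens = rng.exponential(8.0, n)    # latent censoring times
time = np.minimum(t_event, t_cens)  # observed follow-up time
event = t_event <= t_cens           # True if the event was observed

def g_censoring(t):
    """Kaplan-Meier estimate of the censoring survival G(t) = P(C > t)."""
    g = 1.0
    for i in np.argsort(time):
        if time[i] > t:
            break
        if not event[i]:  # censorings are the "events" of this estimator
            g *= 1.0 - 1.0 / (time >= time[i]).sum()
    return g

def ipcw_squared_error(pred):
    """Observed-data squared-error loss for a predictor of log survival time.
    Only uncensored subjects contribute, each weighted by 1/G(T_i)."""
    w = np.array([e / max(g_censoring(t), 1e-8) for t, e in zip(time, event)])
    return np.mean(w * (np.log(time) - pred) ** 2)

# Compare two constant predictors under the IPCW loss.
naive = np.log(time).mean()            # treats censored follow-up times as events
events_only = np.log(time[event]).mean()
print("loss(naive):      ", round(ipcw_squared_error(naive), 3))
print("loss(events only):", round(ipcw_squared_error(events_only), 3))
```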

12.
With the advent of high-throughput technologies for measuring genome-wide expression profiles, a large number of methods have been proposed for discovering diagnostic markers that can accurately discriminate between different classes of a disease. However, factors such as the small sample size of typical clinical data, the inherent noise in high-throughput measurements, and the heterogeneity across different samples often make it difficult to find reliable gene markers. To overcome this problem, several studies have proposed the use of pathway-based markers, instead of individual gene markers, for building the classifier. Given a set of known pathways, these methods estimate the activity level of each pathway by summarizing the expression values of its member genes, and use the pathway activities for classification. It has been shown that pathway-based classifiers typically yield more reliable results compared to traditional gene-based classifiers. In this paper, we propose a new classification method based on probabilistic inference of pathway activities. For a given sample, we compute the log-likelihood ratio between different disease phenotypes based on the expression level of each gene. The activity of a given pathway is then inferred by combining the log-likelihood ratios of the constituent genes. We apply the proposed method to the classification of breast cancer metastasis, and show that it achieves higher accuracy and identifies more reproducible pathway markers compared to several existing pathway activity inference methods.
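The sketch below renders the per-gene log-likelihood-ratio scheme described above: each gene's expression is scored under class-conditional Gaussians fitted to training data, and a pathway's activity is the sum of its member genes' LLRs. The expression values, the Gaussian assumption, and the ten-gene pathway are synthetic placeholders; the published method's normalization and combination details may differ.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
n_genes, n1, n2 = 50, 30, 30
expr1 = rng.normal(0.0, 1.0, (n_genes, n1))  # phenotype 1 training samples
expr2 = rng.normal(0.3, 1.0, (n_genes, n2))  # phenotype 2 training samples

# Per-gene class-conditional Gaussians estimated from training data.
mu1, sd1 = expr1.mean(axis=1), expr1.std(axis=1, ddof=1)
mu2, sd2 = expr2.mean(axis=1), expr2.std(axis=1, ddof=1)

def pathway_activity(sample, member_genes):
    """Sum over the pathway of per-gene LLRs: log p(x|pheno2) - log p(x|pheno1)."""
    x = sample[member_genes]
    llr = (norm.logpdf(x, mu2[member_genes], sd2[member_genes])
           - norm.logpdf(x, mu1[member_genes], sd1[member_genes]))
    return llr.sum()

pathway = np.arange(10)                     # hypothetical 10-gene pathway
new_sample = rng.normal(0.3, 1.0, n_genes)  # drawn to resemble phenotype 2
print("pathway activity (LLR):", round(pathway_activity(new_sample, pathway), 2))
```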

13.
Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor, maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long-branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias, which is apparent under both controlled simulation conditions and in analyses of empirical sequence data, also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages: that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.

14.
15.
The models of nucleotide substitution used by most maximum likelihood-based methods assume that the evolutionary process is stationary, reversible, and homogeneous. We present an extension of the Barry and Hartigan model that can be used to estimate parameters by maximum likelihood (ML) when the data contain invariant sites and the assumptions of stationarity, reversibility, and homogeneity are violated. Unlike most ML methods for estimating invariant sites, we estimate the nucleotide composition of invariant sites separately from that of variable sites. We analyze a bacterial data set where problems due to lack of stationarity and homogeneity have previously been well documented, and use the parametric bootstrap to show that the data are consistent with our general Markov model. We also show that estimates of invariant sites obtained using our method are fairly accurate when applied to data simulated under the general Markov model.
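In standard invariant-sites notation (ours, not necessarily the authors'), the feature highlighted above, a base composition for invariant sites estimated separately from that of variable sites, enters the site likelihood roughly as follows, where p_inv is the proportion of invariant sites, pi^inv its composition, and L_s^var the variable-site likelihood under the general Markov model:

```latex
% Sketch in our notation: likelihood of site s under an invariant-sites
% mixture in which invariant sites have their own base composition.
L_s = p_{\mathrm{inv}} \, \pi^{\mathrm{inv}}_{x_s} \,
      \mathbf{1}\!\left[\text{site $s$ is constant in state } x_s\right]
      + \left(1 - p_{\mathrm{inv}}\right) L^{\mathrm{var}}_s
```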

16.
Sequence data often contain competing signals that are detected by network programs or Lento plots. Such data can arise when sequences are generated on more than one tree and the results are combined: a mixture model. We report that under such mixture models, the estimates of edge (branch) lengths from maximum likelihood (ML) methods that assume a single tree are biased. Based on the observed number of competing signals in real data, such a bias in ML is expected to occur frequently. Because network methods can recover competing signals more accurately, there is a need for ML methods that allow a network. A fundamental problem is that mixture models can have more parameters than can be recovered from the data, so that some mixtures are not, in principle, identifiable. We recommend that network programs be incorporated into best-practice analysis, along with ML and Bayesian trees.
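For reference, the site-pattern probability under a two-tree mixture of the kind described above takes the following form (our notation, with mixing weight w and per-tree parameters theta). A single-tree ML fit is mis-specified whenever 0 < w < 1, which is the source of the edge-length bias reported:

```latex
% Sketch in our notation: site-pattern probability under a two-tree mixture.
P(x_s) = w \, P\!\left(x_s \mid T_1, \theta_1\right)
         + (1 - w) \, P\!\left(x_s \mid T_2, \theta_2\right)
```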

17.
Much has been learned in recent years about the mechanisms by which breastfeeding improves child health and survival. However, there has been little progress in using these insights to improve pediatric care. The aim of this study was to review all clinical studies of lactoferrin (LF) in children in an effort to determine which interventions may improve pediatric care or require further research. We conducted a systematic and critical review of the published literature and found 19 clinical studies that have used human or bovine LF for different outcomes: iron metabolism and anemia (6 studies), fecal flora (5 studies), enteric infections (3 studies), common pediatric illnesses (1 study), immunomodulation (3 studies), and neonatal sepsis (1 study). Although efficacy has varied across trials, the main finding of all published studies is the safety of the intervention. Protection against enteric infections and neonatal sepsis are the most likely biologically relevant activities of LF in children. Future studies on neonatal sepsis should answer critically important questions. If the data from these sepsis studies are proven correct, they will profoundly affect the treatment of low-birth-weight neonates and aid in the reduction of child mortality worldwide.

18.
Effective methods developed to review and study the care of patients in hospital have not been applicable to ambulatory care, in which definitive diagnosis is the exception rather than the rule. A reasonable alternative to using diagnosis as the basis for assessing ambulatory care is to use the problems or requests presented by the patients themselves. A system has been developed for classifying and coding this information for flexible computer retrieval. Testing indicates that the system is simple in design, easily mastered by nonphysicians and provides reliable, useful data at a low cost.

19.
A model for a central equipment pool managed by a clinical engineering department has been presented. The advantages to patient care and to the clinical engineering department are many. The distribution of portable technology that has been traditionally managed by the materials management function is a logical match to the expanding role of clinical engineering departments in technology management. Accurate asset management tools have allowed us to provide reliable measures of infusion pump utilization, permitting us to predict future needs as programs expand. Thus we are more actively involved in strategic technology planning. The central equipment pool is an excellent opportunity for the clinical engineering department to increase its technology management activities.

20.