Similar articles
 20 similar articles found (search time: 15 ms)
1.
SUMMARY: We present the web-based program CREx for heuristically determining pairwise rearrangement events in unichromosomal genomes. CREx considers transpositions, reverse transpositions, reversals and tandem-duplication-random-loss (TDRL) events. It supports the user in finding parsimonious rearrangement scenarios given a phylogenetic hypothesis. CREx is based on common intervals, which reflect genes that appear consecutively in several of the input gene orders. AVAILABILITY: CREx is freely available at http://pacosy.informatik.uni-leipzig.de/crex
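
The notion of a common interval mentioned in the summary can be illustrated with a minimal sketch. It assumes two unsigned gene orders over the same gene set and uses a simple quadratic enumeration; the function name `common_intervals` is hypothetical, and this is not claimed to be how CREx itself computes them.

```python
def common_intervals(order_a, order_b):
    """List gene sets (size >= 2) that are contiguous in both gene orders."""
    # Position of every gene in the second order.
    pos_b = {gene: idx for idx, gene in enumerate(order_b)}
    found = []
    for i in range(len(order_a)):
        lo = hi = pos_b[order_a[i]]
        for j in range(i + 1, len(order_a)):
            p = pos_b[order_a[j]]
            lo, hi = min(lo, p), max(hi, p)
            # order_a[i..j] is contiguous in order_b exactly when the span
            # of its positions there equals its length minus one.
            if hi - lo == j - i:
                found.append(frozenset(order_a[i:j + 1]))
    return found


intervals = common_intervals([1, 2, 3, 4], [2, 1, 4, 3])
```

Here genes 1 and 2 stay adjacent in both orders, as do 3 and 4, so both pairs are reported along with the full gene set.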

2.
Accounting for spatial pattern when modeling organism-environment interactions   Total citations: 10, self-citations: 0, citations by others: 10
Statistical models of environment-abundance relationships may be influenced by spatial autocorrelation in abundance, environmental variables, or both. Failure to account for spatial autocorrelation can lead to incorrect conclusions regarding both the absolute and relative importance of environmental variables as determinants of abundance. We consider several classes of statistical models that are appropriate for modeling environment-abundance relationships in the presence of spatial autocorrelation, and apply these to three case studies: 1) abundance of voles in relation to habitat characteristics; 2) a plant competition experiment; and 3) abundance of oribatid mites along environmental gradients. We find that when spatial pattern is accounted for in the modeling process, conclusions about environmental control over abundance can change dramatically. We conclude with five lessons: 1) spatial models are easy to calculate with several of the most common statistical packages; 2) results from spatially-structured models may point to conclusions radically different from those suggested by a spatially independent model; 3) not all spatial autocorrelation in abundances results from spatial population dynamics; it may also result from abundance associations with environmental variables not included in the model; 4) the different spatial models do have different mechanistic interpretations in terms of ecological processes, so ecological model selection should take primacy over statistical model selection; 5) the conclusions of the different spatial models are typically fairly similar, so making any correction is more important than quibbling about which correction to make.

3.
In recent years, gene interaction networks have increasingly been used for the assessment and interpretation of biological measurements. Knowledge of the interaction partners of an unknown protein allows scientists to understand the complex relationships between genetic products, helps to reveal unknown biological functions and pathways, and gives a more detailed picture of an organism's complexity. Measuring all protein interactions under all relevant conditions is virtually impossible. Hence, computational methods that integrate different datasets to predict gene interactions are needed. However, when integrating different sources one has to account for the fact that some parts of the information may be redundant, which may lead to an overestimation of the true likelihood of an interaction. Our method integrates information derived from three different databases (Bioverse, HiMAP and STRING) for predicting human gene interactions. A Bayesian approach was implemented in order to integrate the different data sources on a common quantitative scale. An important assumption of the Bayesian integration is independence of the input data (features). Our study shows that the conditional dependency cannot be ignored when combining gene interaction databases that rely on partially overlapping input data. In addition, we show how the correlation structure between the databases can be detected and we propose a linear model to correct for this bias. Benchmarking the results against two independent reference data sets shows that the integrated model outperforms the individual datasets. Our method provides an intuitive strategy for weighting the different features while accounting for their conditional dependencies.
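
Why redundant sources inflate the estimated likelihood can be seen in a small sketch of naive Bayesian integration. The prior, the likelihood ratios and the function name `posterior_probability` are illustrative assumptions, not values or code from the study, and the linear correction the authors propose is not reproduced here.

```python
from math import prod


def posterior_probability(prior, likelihood_ratios):
    """Combine likelihood ratios under the naive independence assumption."""
    prior_odds = prior / (1 - prior)
    # Independence lets the ratios multiply; this is the step that breaks
    # down when two sources rely on overlapping input data.
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1 + posterior_odds)


# One piece of evidence with likelihood ratio 5 (hypothetical numbers).
single = posterior_probability(0.01, [5.0])
# The same evidence reported by two overlapping databases, counted twice.
double_counted = posterior_probability(0.01, [5.0, 5.0])
```

Counting the same evidence twice raises the posterior from 5/104 to 25/124, which is the kind of overestimation the abstract warns about.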

4.
5.

Background  

Segmentation of the coronary angiogram is important in computer-assisted artery motion analysis or reconstruction of 3D vascular structures from a single-plane or biplane angiographic system. Developing fully automated and accurate vessel segmentation algorithms is highly challenging, especially when extracting vascular structures with large variations in image intensities and noise, as well as with variable cross-sections or vascular lesions.

6.
7.
Reiter, Jerome P. Biometrika, 2008, 95(4): 933-946
When some of the records used to estimate the imputation models in multiple imputation are not used or available for analysis, the usual multiple imputation variance estimator has positive bias. We present an alternative approach that enables unbiased estimation of variances and, hence, calibrated inferences in such contexts. First, using all records, the imputer samples m values of the parameters of the imputation model. Second, for each parameter draw, the imputer simulates the missing values for all records n times. From these mn completed datasets, the imputer can analyse or disseminate the appropriate subset of records. We develop methods for interval estimation and significance testing for this approach. Methods are presented in the context of multiple imputation for measurement error.
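
The two-stage sampling scheme described above (m parameter draws, then n simulations of the missing values per draw) can be sketched as follows. The toy imputation model is an assumption made only for illustration: a normal distribution with known standard deviation `sigma`, with the mean drawn around the observed average. The function name `two_stage_impute` is hypothetical, and the variance estimator developed in the paper is not reproduced.

```python
import random
import statistics


def two_stage_impute(values, m, n, sigma=1.0, seed=0):
    """Return m * n completed copies of `values`, where None marks missing."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    mean_hat = statistics.fmean(observed)
    completed = []
    for _ in range(m):
        # Stage 1: draw one value of the model parameter (here, the mean).
        mu = rng.gauss(mean_hat, sigma / len(observed) ** 0.5)
        for _ in range(n):
            # Stage 2: simulate the missing values n times for this draw.
            completed.append(
                [rng.gauss(mu, sigma) if v is None else v for v in values]
            )
    return completed


datasets = two_stage_impute([1.0, None, 2.0, None], m=3, n=2)
```

The nesting is the point of the sketch: datasets that share a parameter draw are more alike than datasets from different draws, which is the structure the two-stage approach relies on.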

8.
The CBCAnalyzer (CBC = compensatory base change) is a custom-written software toolbox consisting of three parts: CTTransform, CBCDetect, and CBCTree. CTTransform reads several ct-file formats and generates a so-called "bracket-dot-bracket" format that is typically used as input for other tools such as RNAforester, RNAmovie or MARNA. The latter creates a multiple alignment based on primary sequences and secondary structures that can then be used as input for CBCDetect. CBCDetect counts CBCs in all-against-all comparisons of the aligned sequences. This is important in detecting species that are discriminated by their sexual incompatibility. The count (distance) matrix obtained by CBCDetect is used as input for CBCTree, which reconstructs a phylogram using the BIONJ algorithm. In this note we describe the features of the toolbox as well as application examples. The toolbox provides a graphical user interface. It is written in C++ and freely available at: http://cbcanalyzer.bioapps.biozentrum.uni-wuerzburg.de.
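
As a rough illustration of what counting CBCs involves, the sketch below takes two aligned RNA sequences that share one bracket-dot-bracket structure and counts paired sites where both nucleotides differ while canonical pairing is kept. The shared-structure simplification, the set of accepted pairs and the function names are assumptions for this example; the toolbox's own C++ implementation may differ.

```python
# Base pairs treated as valid here: Watson-Crick pairs plus G-U wobble.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(structure):
    """Return (i, j) index pairs for matching brackets in the structure."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def count_cbcs(seq1, seq2, structure):
    """Count paired sites where both bases changed but pairing is retained."""
    count = 0
    for i, j in paired_positions(structure):
        pair1 = (seq1[i], seq1[j])
        pair2 = (seq2[i], seq2[j])
        both_valid = pair1 in CANONICAL and pair2 in CANONICAL
        both_changed = pair1[0] != pair2[0] and pair1[1] != pair2[1]
        if both_valid and both_changed:
            count += 1
    return count


cbc_count = count_cbcs("GCAAGC", "GUAAAC", "((..))")
```

In this toy input the inner pair changes from C-G to U-A while the outer G-C pair is unchanged, so exactly one CBC is counted.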

9.

Background

Making accurate patient care decisions as early as possible is a constant challenge, especially for physicians in the emergency department. The increasing volume of electronic medical records (EMRs) opens new horizons for automatic diagnosis. In this paper, we propose to use machine learning approaches for automatic infection detection based on EMRs. Five categories of information are used for prediction: personal information, admission notes, vital signs, diagnostic test results and medical image diagnoses.

Results

Experimental results on a newly constructed EMR dataset from an emergency department show that machine learning models can achieve decent performance for infection detection, with an area under the receiver operating characteristic curve (AUC) of 0.88. Of the five types of information, the admission note in text form contributes the most, with an AUC of 0.87.

Conclusions

This study provides a state-of-the-art EMR processing system to automatically make medical decisions. It extracts five types of features associated with infection and achieves decent performance on automatic infection detection based on machine learning models.

10.
Area disease estimation based on sentinel hospital records   Total citations: 2, self-citations: 0, citations by others: 2
Wang JF, Reis BY, Hu MG, Christakos G, Yang WZ, Sun Q, Li ZJ, Li XZ, Lai SJ, Chen HY, Wang DC. PLoS ONE, 2011, 6(8): e23428

Background

Population health attributes (such as disease incidence and prevalence) are often estimated using sentinel hospital records, which are subject to multiple sources of uncertainty. When applied to these health attributes, commonly used biased estimation techniques can lead to false conclusions and ineffective disease intervention and control. Although some estimators can account for measurement error (in the form of white noise, usually after de-trending), most mainstream health statistics techniques cannot generate unbiased and minimum error variance estimates when the available data are biased.

Methods and Findings

A new technique, called the Biased Sample Hospital-based Area Disease Estimation (B-SHADE), is introduced that generates space-time population disease estimates using biased hospital records. The effectiveness of the technique is empirically evaluated in terms of hospital records of disease incidence (for hand-foot-mouth disease and fever syndrome cases) in Shanghai (China) during a two-year period. The B-SHADE technique uses a weighted summation of sentinel hospital records to derive unbiased and minimum error variance estimates of area incidence. The calculation of these weights is the outcome of a process that combines the available space-time information with a rigorous assessment of both the horizontal relationships between hospital records and the vertical links between each hospital's records and the overall disease situation in the region. In this way, the representativeness of the sentinel hospital records was improved, the possible biases of these records were corrected, and the generated area incidence estimates were best linear unbiased estimates (BLUE). Using the same hospital records, the performance of the B-SHADE technique was compared against two mainstream estimators.

Conclusions

The B-SHADE technique involves a hospital network-based model that blends the optimal estimation features of the Block Kriging method and the sample bias correction efficiency of the ratio estimator method. In this way, B-SHADE can overcome the limitations of both methods: Block Kriging's inadequacy concerning the correction of sample bias and spatial clustering, and the ratio estimator's limitation as regards error minimization. The generality of the B-SHADE technique is further demonstrated by the fact that it reduces to Block Kriging in the case of unbiased samples; to the ratio estimator if there is no correlation between hospitals; and to a simple statistic if the hospital records are neither biased nor space-time correlated. In addition to the theoretical advantages of the B-SHADE technique over the two other methods above, two real-world case studies (hand-foot-mouth disease and fever syndrome cases) demonstrated its empirical superiority as well.

11.
Problems induced by heterogeneity in species and individual detectability are now well recognized when analysing count data. Yet most recent techniques developed to handle this problem are still hardly applicable to many monitoring schemes, and do not provide abundance estimates at the point count scale. Here, we show how simple weather variables can be a useful surrogate to detect variability in species detectability. We further look for a potential bias or loss in statistical power in analyses of count data that ignore weather and time-of-day variables. We first used the French Breeding Bird Survey to test how the counts of each of the 97 most common breeding species were influenced by weather and time-of-day variables. We assessed how the estimation of each species' response to fragmentation could be influenced by correcting counts with such variables. Among 97 species, 75 were affected by at least one of the five weather and time-of-day variables considered. Despite these strong influences, the relationship between species abundance and fragmentation was not biased when counts were not controlled for weather and time-of-day variables, and we found no improvement in statistical power when accounting for these variables. Our results show that simple variables can be very powerful for assessing how species detectability is influenced by weather conditions, but they are inconsistent with any specific bias due to heterogeneous detectability. We suggest that raw count data can be used without any correction provided that the sources of variation in detectability can be considered independent of the factor of interest.

12.
13.
Data published by R. Y. Stanier, N. J. Palleroni, M. Doudoroff and their colleagues on Pseudomonas have been analysed by numerical taxonomy. Records on 401 strains were used, representing 155 characters, mostly utilization of substrates as carbon-energy sources. Twenty-nine phenons were recognized, which included 394 strains; the remaining 7 were unclustered. The results were in very good accord with the conclusions of these authors. Almost all phenons were well separated with very little overlap. Many of them corresponded to distinct species, and others corresponded to recognized biotypes. Some small groups may represent unnamed new species. Analyses by Gower's Coefficient showed five major groupings: A) the fluorescent pseudomonads; B) biochemically active species (Pseudomonas cepacia, P. pseudomallei and allies); D) P. solanacearum and allies; and E) P. mallei. P. diminuta does not appear to be clearly distinct from P. vesicularis, nor does P. alcaligenes appear clearly distinct from P. pseudoalcaligenes. There may, however, be some difference between P. multivorans and P. cepacia. Analyses using the Pattern Coefficient differed mainly in the relationships shown by a few of the metabolically active species. Of the two coefficients, the Pattern Coefficient gave results that were in somewhat better agreement with evidence from nucleic acids, but it showed an unexpectedly close relationship between P. solanacearum and P. cepacia.

14.

Background

Somatic copy number alterations (SCNAs) can be used to infer tumor subclonal populations in whole-genome sequencing studies, where their read count ratios between tumor-normal paired samples usually serve as the proxy for inference. Existing SCNA-based tools for inferring subclonal populations assume that the GC bias of the tumor and normal samples is of the same nature and can be fully offset by the read count ratio. However, we found that the read count ratio on SCNA segments shows a log-linear biased pattern, which degrades the performance of existing tools based on read count ratios. Currently, no correction tool takes this read ratio bias into account.

Results

We present Pre-SCNAClonal, a tool that improves tumor subclonal population inference by correcting GC bias at the SCNA level. Pre-SCNAClonal first corrects GC bias using a Markov chain Monte Carlo probability model, then accurately locates baseline DNA segments (not containing any SCNAs) with a hierarchical clustering model. We show Pre-SCNAClonal's superiority to existing GC-bias correction methods at any level of subclonal population.

Conclusions

Pre-SCNAClonal can be run independently or serve as a pre-processing/GC-correction step in conjunction with existing SCNA-based subclonal inference tools.

15.
Methods developed by the metrological community and principles used by the research community were integrated to provide a basis for a periodic maintenance interval analysis system. Engineering endpoints are used as measurement attributes on which to base two primary quality indicators: accuracy and reliability. Also key to establishing appropriate maintenance intervals is the ability to recognize two primary failure modes: random failure and time-related failure. The primary objective of the maintenance program is to avert predictable and preventable device failure, and understanding time-related failures enables service personnel to set intervals accordingly.

16.
The anode biofilm in a microbial fuel cell (MFC) is composed of diverse populations of bacteria, many of whose capacities for electricity generation are unknown. To identify functional populations in these exoelectrogenic communities, a culture-dependent approach based on dilution to extinction was combined with culture-independent community analysis. We analyzed the diversity and dynamics of microbial communities in single-chamber air-cathode MFCs with different anode surfaces using denaturing gradient gel electrophoresis based on the 16S rRNA gene. Phylogenetic analyses showed that the bacteria enriched in all reactors belonged primarily to five phylogenetic groups: Firmicutes, Actinobacteria, α-Proteobacteria, β-Proteobacteria, and γ-Proteobacteria. Dilution-to-extinction experiments further demonstrated that Comamonas denitrificans and Clostridium aminobutyricum were dominant members of the community. A pure culture isolated from an anode biofilm after dilution to extinction was identified as C. denitrificans DX-4 based on 16S rRNA sequence and physiological and biochemical characterizations. Strain DX-4 was unable to respire using hydrous Fe(III) oxide but produced 35 mW/m² using acetate as the electron donor in an MFC. Power generation by the facultative C. denitrificans depends on oxygen and MFC configuration, suggesting that a switch of metabolic pathway occurs for extracellular electron transfer by this denitrifying bacterium.

17.
The elucidation of the complex machinery used by the human brain to segregate and integrate information while performing high cognitive functions is a subject of imminent future consequences. The most significant contributions to date in this field, known as cognitive neuroscience, have been achieved by using innovative neuroimaging techniques, such as electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI), which measure variations in both the time and the space of some interpretable physical magnitudes. Extraordinary maps of cerebral activation involving function-restricted brain areas, as well as graphs of the functional connectivity between them, have been obtained from EEG and fMRI data by solving some spatio-temporal inverse problems, which constitutes a top-down approach. However, in many cases, a natural bridge between these maps/graphs and the causal physiological processes is lacking, leading to some misunderstandings in their interpretation. Recent advances in the comprehension of the underlying physiological mechanisms associated with different cerebral scales have provided researchers with an excellent scenario to develop sophisticated biophysical models that permit an integration of these neuroimage modalities, which must share a common aetiology. This paper proposes a bottom-up approach, involving physiological parameters in a specific mesoscopic dynamic equations system. Further observation equations encapsulating the relationship between the mesostates and the EEG/fMRI data are obtained on the basis of the physical foundations of these techniques. A methodology for the estimation of parameters from fused EEG/fMRI data is also presented. In this context, the concepts of activation and effective connectivity are carefully revised. This new approach permits us to examine and discuss some future prospects for the integration of multimodal neuroimages.

18.

Background  

As one of the most widely used parsimony methods for ancestral reconstruction, the Fitch method minimizes the total number of hypothetical substitutions along all branches of a tree to explain the evolution of a character. Due to the extensive usage of this method, it has become a scientific endeavor in recent years to study the reconstruction accuracies of the Fitch method. However, most studies are restricted to 2-state evolutionary models and a study for higher-state models is needed since DNA sequences take the format of 4-state series and protein sequences even have 20 states.
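
For readers unfamiliar with it, the bottom-up pass of the Fitch method can be sketched in a few lines. The tree encoding (nested 2-tuples with string leaf names), the example tree and the function name `fitch` are assumptions made for illustration; the sketch handles rooted binary trees and any number of character states.

```python
def fitch(tree, leaf_states):
    """Return (candidate root states, minimum number of substitutions).

    tree: a leaf name (str) or a 2-tuple of subtrees.
    leaf_states: dict mapping each leaf name to its observed state.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch(tree[0], leaf_states)
    right_set, right_cost = fitch(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        # The children can agree on a state, so no substitution is added.
        return common, left_cost + right_cost
    # Disjoint state sets: one substitution is needed on a child branch.
    return left_set | right_set, left_cost + right_cost + 1


tree = (("a", "b"), ("c", ("d", "e")))
states = {"a": "A", "b": "C", "c": "A", "d": "G", "e": "A"}
root_set, cost = fitch(tree, states)
```

With these 4-state (DNA) leaves, the pass assigns the root the single candidate state A and needs two substitutions in total.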

19.
Summary: Responses from four generations of index selection for egg production to 280 days of age in four White Leghorn populations are presented. A pedigreed randombred population derived from one of the lines was reared with the selected lines to measure the environmental trend. The magnitude of total as well as average response, although varying from population to population, was positive in all the lines studied. Close correspondence between predicted and realized gains indicated that natural selection, genotype-environment interactions and environmental fluctuations were unimportant during the course of selection. Realized heritabilities agreed fairly well with the estimated heritabilities in at least three out of four populations studied. Probable reasons for variable and insufficient response were investigated.

20.

Background  

Copy number variations (CNVs) may play an important role in disease risk by altering dosage of genes and other regulatory elements, which may have functional and, ultimately, phenotypic consequences. Therefore, determining whether or not a CNV is associated with a given disease might be relevant in understanding the genesis and progression of human diseases. Current technology yields CNV probe signals from which copy number status is inferred. Incorporating the uncertainty of CNV calling in the statistical analysis is therefore highly important. In this paper, we present a framework for assessing association between CNVs and disease in case-control studies where uncertainty is taken into account. We also indicate how to use the model to analyze continuous traits and adjust for confounding covariates.
