Similar literature
 Found 20 similar documents (search time: 0 ms)
1.
Biclustering has emerged as an important approach to the analysis of large-scale datasets. A biclustering technique identifies a subset of rows that exhibit similar patterns on a subset of columns in a data matrix. Many biclustering methods have been proposed, and most, if not all, algorithms are developed to detect regions of “coherence” patterns. These methods perform unsatisfactorily if the purpose is to identify biclusters of a constant level. This paper presents a two-step biclustering method to identify constant-level biclusters for binary or quantitative data. The algorithm identifies the maximal-dimensional submatrix such that the proportion of non-signals is less than a pre-specified tolerance δ. In the analysis of two synthetic datasets, the proposed method had much higher sensitivity and slightly lower specificity than several prominent biclustering methods. It was further compared with the Bimax method on two real datasets. The proposed method was shown to be the most robust in terms of sensitivity, number of biclusters and number of serotype-specific biclusters identified. However, dichotomization using different signal-level thresholds usually leads to different sets of biclusters; this also occurs in the present analysis.
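The tolerance criterion mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' two-step algorithm; it only checks whether a candidate submatrix of a binary matrix keeps its proportion of non-signals (zeros) below a tolerance δ. The function names and the example matrix are assumptions made for illustration.

```python
def non_signal_proportion(matrix, rows, cols):
    """Fraction of zero entries in the submatrix selected by rows x cols."""
    cells = [matrix[r][c] for r in rows for c in cols]
    return sum(1 for value in cells if value == 0) / len(cells)


def within_tolerance(matrix, rows, cols, delta):
    """True when the proportion of non-signals is below the tolerance delta."""
    return non_signal_proportion(matrix, rows, cols) < delta


# Hypothetical binary data: 1 = signal, 0 = non-signal.
data = [
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
]

# Rows 0-1 and columns 0-1 contain only signals, so they pass any delta > 0.
print(within_tolerance(data, [0, 1], [0, 1], delta=0.1))
# The full matrix has 3 zeros out of 12 cells (0.25), so it fails delta = 0.2.
print(within_tolerance(data, [0, 1, 2], [0, 1, 2, 3], delta=0.2))
```

A search procedure would then look for the largest row and column subsets that still pass this check; that search is the part the abstract attributes to the proposed method and is not reproduced here.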

2.
Biological networks, such as genetic regulatory networks and protein interaction networks, provide important information for studying gene/protein activities. In this paper, we propose a new method, NetBoosting, for incorporating a priori biological network information in analyzing high-dimensional genomics data. Specifically, we are interested in constructing prediction models for disease phenotypes of interest based on genomics data, and at the same time identifying disease-susceptible genes. We employ the gradient descent boosting procedure to build an additive tree model and propose a new algorithm to utilize the network structure in fitting small tree weak learners. We illustrate by simulation studies and a real data example that, by making use of the network information, NetBoosting outperforms a few existing methods in terms of accuracy of prediction and variable selection.

3.
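Recognition of Drosophila promoters based on quadratic discriminant analysis   Total citations: 3 (self-citations: 0, citations by others: 3)
By analyzing the sequence features of Drosophila Pol II promoters and non-promoters, we calculated the single-nucleotide conservation value M1(l) and the hexamer conservation value M6(l) at each sequence position. Hexamer frequencies from two regions were then selected as the parameters of the diversity source, and promoters were predicted by combining the increment of diversity with a quadratic discriminant function (IDQD). For non-promoter datasets drawn from coding regions and introns, the promoter prediction success rates reached 93% and 89%, respectively. Comparisons show that the IDQD model can effectively improve the success rate of promoter prediction.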

4.
Electrical transmission signals have been used for decades to characterize the internal structure of composite materials. We theoretically analyze the transmission of an electrical signal through a composite material that consists of two phases with different chemical compositions. We assume that the temperature of the biphasic system increases as a result of Joule heating and that its electrical resistivity varies linearly with temperature; this last assumption requires the electrical and thermal effects to be studied simultaneously. We propose a nonlinear conjugate thermo-electric model, which is solved numerically to obtain the current density and temperature profiles for each phase. We study the effect of frequency, resistivities and thermal conductivities on the current density and temperature. We validate the predictions of the model by comparison with experimental data obtained from rock characterization tests.

5.
An exogenous avian leukosis virus (ALV) strain, SDAU09C1, was isolated in DF-1 cells from one of 240 imported 1-day-old white meat-type grandparent breeder chicks. Inoculation of SDAU09C1 in ALV-free chickens induced antibody reactions specific to subgroup A or B. However, gp85 amino acid sequence comparisons indicated that SDAU09C1 fell into subgroup A; it had homology of 88.8%-90.3% to 6 reference strains of subgroup A, much higher than to other subgroups, including subgroup B. This is the first report for A...

6.
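Isolation and identification of an Inner Mongolia strain of subgroup J avian leukosis virus   Total citations: 4 (self-citations: 0, citations by others: 4)
In this study, a strain of avian leukosis virus subgroup J (ALV-J) was isolated from culled broiler breeders randomly selected from a broiler breeder farm in Inner Mongolia where typical cases of avian myeloid leukosis (ML) had been observed. Identification by PCR and indirect immunofluorescence assay showed that the Inner Mongolia ALV-J strain could be amplified with two pairs of ALV-J-specific primers (specific bands of about 2.2 kb and 545 bp) and gave a strongly positive fluorescence reaction in the indirect immunofluorescence assay with a specific monoclonal antibody. In addition, ALV-J was also isolated and identified from a case of myeloid leukosis in a broiler breeder in Shandong. Sequencing of the 2.2 kb PCR product showed that the two isolates shared 96.9% homology with each other and 94.2% to 95.4% homology with the previously isolated ALV-J strains SD9901 and YZ9901.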

8.
9.
Lactococcal dairy starter strains are under constant threat from phages in dairy fermentation facilities, especially by members of the so-called 936, P335, and c2 species. Among these three phage groups, members of the P335 species are the most genetically diverse. Here, we present the complete genome sequences of two P335-type phages, Q33 and BM13, isolated in North America and representing a novel lineage within this phage group. The Q33 and BM13 genomes exhibit homology, not only to P335-type, but also to elements of the 936-type phage sequences. The two phage genomes also have close relatedness to phages infecting Enterococcus and Clostridium, a heretofore unknown feature among lactococcal P335 phages. The Q33 and BM13 genomes are organized in functionally related clusters with genes encoding functions such as DNA replication and packaging, morphogenesis, and host cell lysis. Electron micrographic analysis of the two phages highlights the presence of a baseplate more reminiscent of the baseplate of 936 phages than that of the majority of members of the P335 group, with the exception of r1t and LC3.

10.
The relationship between the commonly used indirectly standardized comparative mortality indices, the standardized mortality ratio (SMR) and the standardized proportional mortality ratio (SPMR), is well known, whereas the relationship between their less commonly used directly adjusted counterparts, the standardized risk ratio (SRR) and the recently proposed externally standardized proportional mortality ratio (SePMR), has not previously been developed. This paper fills this void by demonstrating the algebraic and statistical relationship between the SRR and the SePMR and showing how, under some modest assumptions, valid inferences about the SRR can be based on analysis of SePMRs. More specifically, an asymptotic prediction interval has been developed for the SePMRi which, with high probability, contains the ratio RSRRi = SRRi/SRR. The utility of the SePMRi is supported empirically using data from a recent retrospective cohort study of mineral fiber workers.
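The difference between indirect and direct standardization that this abstract relies on can be sketched with the textbook definitions of the SMR and the SRR. The SePMR and the prediction interval from the paper are not reproduced here. All function names and numbers below are hypothetical and chosen only for illustration.

```python
def smr(observed_deaths, person_years, reference_rates):
    """Indirect standardization: observed deaths divided by the deaths
    expected if reference stratum rates applied to the study person-years."""
    expected = sum(py * rate for py, rate in zip(person_years, reference_rates))
    return sum(observed_deaths) / expected


def srr(study_rates, reference_rates, standard_weights):
    """Direct standardization: ratio of two rates that are both weighted
    by the same standard population distribution."""
    study = sum(w * r for w, r in zip(standard_weights, study_rates))
    reference = sum(w * r for w, r in zip(standard_weights, reference_rates))
    return study / reference


# Hypothetical two-stratum cohort (for example, two age groups).
observed = [4, 6]
person_years = [1000, 500]
reference_rates = [0.002, 0.008]
study_rates = [o / py for o, py in zip(observed, person_years)]

print(round(smr(observed, person_years, reference_rates), 3))        # 10 / 6
print(round(srr(study_rates, reference_rates, [0.5, 0.5]), 3))       # 0.008 / 0.005
```

The two indices differ here because the SMR weights strata by the study cohort's own person-years, while the SRR uses an external standard population.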

11.

Background

Bacteraemia is a frequent and severe condition with a high mortality rate. Despite profound knowledge about the pre-test probability of bacteraemia, blood culture analysis often results in low rates of pathogen detection and therefore increasing diagnostic costs. To improve the cost-effectiveness of blood culture sampling, we computed a risk prediction model based on highly standardizable variables, with the ultimate goal to identify via an automated decision support tool patients with very low risk for bacteraemia.

Methods

In this retrospective hospital-wide cohort study evaluating 15,985 patients with suspected bacteraemia, 51 variables were assessed for their diagnostic potency. A derivation cohort (n = 14,699) was used for feature and model selection as well as for cut-off specification. Models were established using the A2DE classifier, a supervised Bayesian classifier. Two internally validated models were further evaluated by a validation cohort (n = 1,286).

Results

The proportion of neutrophile leukocytes in differential blood count was the best individual variable to predict bacteraemia (ROC-AUC: 0.694). Applying the A2DE classifier, two models, model 1 (20 variables) and model 2 (10 variables) were established with an area under the receiver operating characteristic curve (ROC-AUC) of 0.767 and 0.759, respectively. In the validation cohort, ROC-AUCs of 0.800 and 0.786 were achieved. Using predefined cut-off points, 16% and 12% of patients were allocated to the low risk group with a negative predictive value of more than 98.8%.

Conclusion

Applying the proposed models, more than ten percent of patients with suspected blood stream infection were identified as having minimal risk for bacteraemia. Based on these data, the application of this model as an automated decision support tool for physicians is conceivable, leading to a potential increase in the cost-effectiveness of blood culture sampling. External prospective validation of the model's generalizability is needed for further appreciation of the usefulness of this tool.
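The two quantities reported for the low-risk group, the share of patients allocated to it and its negative predictive value, can be computed from risk scores and a cut-off as in the following sketch. It does not use the A2DE classifier or the study data; the scores, labels, cut-off and the function name `low_risk_summary` are all hypothetical.

```python
def low_risk_summary(scores, has_bacteraemia, cutoff):
    """Return (share of patients below the cut-off, negative predictive value).

    Patients with a score below the cut-off form the low-risk group; the NPV
    is the fraction of that group without bacteraemia."""
    low_risk = [label for s, label in zip(scores, has_bacteraemia) if s < cutoff]
    if not low_risk:
        raise ValueError("no patient falls below the cut-off")
    true_negatives = sum(1 for label in low_risk if not label)
    return len(low_risk) / len(scores), true_negatives / len(low_risk)


# Hypothetical predicted risks and culture results for eight patients.
scores = [0.02, 0.05, 0.30, 0.60, 0.04, 0.80, 0.01, 0.45]
labels = [False, False, False, True, True, True, False, True]

share, npv = low_risk_summary(scores, labels, cutoff=0.10)
print(share, npv)  # 4 of 8 patients are low risk; 3 of those 4 are negative
```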

12.
13.
This paper discusses the prediction and control of acid rain in urban Wuhan.

14.
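Hepatocellular carcinoma (HCC) is one of the malignant tumors with the highest incidence and mortality worldwide. The aim of this study was to find HCC-related miRNA prognostic biomarkers that predict the risk level and survival time of HCC patients and provide them with effective prognostic information. Four methods were used to identify differentially expressed miRNAs (DEMs) from TCGA, and Kaplan-Meier survival curves together with univariate and multivariate Cox regression analyses were used to screen the DEMs for miRNAs associated with HCC prognosis. Finally, four prognostic miRNA biomarkers for HCC (hsa-miR-132-3p, hsa-miR-139-5p, hsa-miR-3677-3p and hsa-miR-500a-3p) were selected and combined into a risk score model. There is currently no experimental evidence linking hsa-miR-3677-3p to HCC, so it is a miRNA newly identified by this study. Evaluation by several bioinformatics methods, including survival curves, ROC curves and chi-square tests, indicated that the risk score computed by the model effectively predicts patient risk level (P<0.000, hazard ratio = 2.551, 95% confidence interval = 1.751-3.717). The 1- to 5-year survival rates of low-risk HCC patients were 20%-30% higher than those of high-risk patients. Analysis together with clinical data showed that the combined biomarker had better prognostic performance than other clinical indicators and could also serve as an independent prognostic factor. Finally, the target genes of the four miRNAs were predicted, including AGO2, FOXO1, ROCK2, RAP1B and CYLD, which were enriched in biological processes such as cell proliferation, migration, apoptosis and immune response.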

15.

Background

Confident identification of microRNA-target interactions is important for studying the function of microRNA (miRNA). Although some computational miRNA target prediction methods have been proposed for plants, the results of the various methods tend to be inconsistent and usually include many false positives. To address these issues, we developed an integrated model for identifying plant miRNA–target interactions.

Results

Three online miRNA target prediction toolkits and machine learning algorithms were integrated to identify and analyze Arabidopsis thaliana miRNA-target interactions. Principal component analysis (PCA) feature extraction and self-training technology were introduced to improve the performance. Results showed that the proposed model outperformed the previously existing methods. The results were validated using Arabidopsis thaliana miRNA-target interactions supported by degradome sequencing. The proposed model constructed on Arabidopsis thaliana was run on Oryza sativa and Vitis vinifera to demonstrate that our model is effective for other plant species.

Conclusions

The integrated model of online predictors and a local PCA-SVM classifier yielded credible, high-quality miRNA-target interactions. The supervised learning algorithm of the PCA-SVM classifier was employed in plant miRNA target identification for the first time. Its performance can be substantially improved if more experimentally validated training samples are provided.

16.
Nitrous oxide (N2O) is one of the greenhouse gases that can contribute to global warming. Spatial variability of N2O can lead to large uncertainties in prediction. However, previous studies have often ignored spatial dependency when quantifying the relationships between N2O and environmental factors. Few studies have examined the impacts of various spatial correlation structures (e.g. independence, distance-based and neighbourhood-based) on spatial prediction of N2O emissions. This study aimed to assess the impact of three spatial correlation structures on spatial predictions and to calibrate the spatial prediction using Bayesian model averaging (BMA) based on replicated, irregular point-referenced data. The data were measured in 17 chambers randomly placed across a 271 m2 field between October 2007 and September 2008 in the southeast of Australia. We used a Bayesian geostatistical model and a Bayesian spatial conditional autoregressive (CAR) model to investigate and accommodate spatial dependency, and to estimate the effects of environmental variables on N2O emissions across the study site. We compared these with a Bayesian regression model with independent errors. The three approaches resulted in different derived maps of spatial prediction of N2O emissions. We found that incorporating spatial dependency in the model not only substantially improved predictions of N2O emission from soil, but also better quantified uncertainties of soil parameters in the study. The hybrid model structure obtained by BMA improved the accuracy of spatial prediction of N2O emissions across this study region.

17.
Extenics is a newly developed interdisciplinary subject combining mathematics, philosophy and engineering. It provides useful formalized qualitative tools and quantitative tools for solving contradictory problems. In this paper, extension theory is introduced briefly and the primary applications of this theory and methods in bionic engineering research are discussed. The extension model of biological coupling functional system is established. In order to identify the primary and secondary sequencing of coupling elements, the Extension Analytic Hierarchy Process (EAHP) was adopted to analyze the contribution of each coupling element to the coupling functional system. Thus, the influence weight factor of each coupling element can be determined, so as to provide a new approach for solving primary and secondary sequencing problem of coupling elements in a quantitative way, and facilitate the subsequent bionic coupling study.

18.
Objective

Insulin pump discontinuation has mostly been studied in children and adolescents living with diabetes. We aimed to assess the rate of insulin pump continuation in a population of adult patients with diabetes, at 18 months after initiation; determine the factors associated with pump discontinuation; and develop a simple prediction model.

Methods

This single-center, retrospective study included all adult patients with type 1 diabetes or type 2 diabetes who started insulin pump treatment between January 2015 and June 2018. The exclusion criteria were pregnancy, short-term pregnancy plans, and insulin pump discontinuation within the previous 6 months. The probability of insulin pump continuation after 18 months was estimated using the Kaplan-Meier method. Factors associated with insulin pump discontinuation were studied using a Cox regression model, and an exponential model was built for prediction purposes.

Results

The study included 315 patients. The mean age was 41 years, the mean duration of diabetes was 16 years, 50% were men, 74% had type 1 diabetes, and the mean hemoglobin A1c level was 9.1% (76 mmol/mol). After 18 months, the rate of insulin pump continuation was 0.80 (95% Confidence Interval (CI), 0.76-0.85). By multivariate analysis, the occurrence of severe hypoglycemia in the previous year was associated with insulin pump discontinuation (hazard ratio, 2.42; 95% CI, 1.30-4.51), while other factors did not reach statistical significance.

Conclusion

Insulin pump discontinuation occurred in 20% of patients at 18 months after initiation and was mainly associated with a recent history of severe hypoglycemia. The type of diabetes and glycemic control at baseline were not associated with treatment discontinuation.
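The Kaplan-Meier method named in this abstract estimates a continuation probability from follow-up times that may be censored. The following is a minimal sketch of the standard product-limit estimator, not the study's analysis code; the follow-up data and the function name `kaplan_meier` are hypothetical.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of the continuation probability.

    durations: follow-up time in months for each patient.
    events: True if discontinuation was observed, False if censored.
    Returns a list of (time, probability) pairs at each observed event time."""
    event_times = sorted({t for t, e in zip(durations, events) if e})
    probability = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        discontinued = sum(1 for d, e in zip(durations, events) if e and d == t)
        probability *= 1 - discontinued / at_risk
        curve.append((t, probability))
    return curve


# Hypothetical follow-up of eight patients over 18 months.
durations = [3, 6, 6, 9, 12, 18, 18, 18]
events = [True, True, False, True, False, False, False, False]

for month, probability in kaplan_meier(durations, events):
    print(month, round(probability, 3))
```

Censored patients leave the risk set without lowering the estimate, which is why the estimator is preferred over a simple proportion when follow-up lengths differ.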

19.
Personalized medicine aims to identify those patients who have good or poor prognosis for overall disease outcomes or therapeutic efficacy for a specific treatment. A well-established approach is to identify a set of biomarkers using statistical methods with a classification algorithm to identify patient subgroups for treatment selection. However, there are potential false positives and false negatives in classification, resulting in incorrect patient treatment assignment. In this paper, we propose a hybrid mixture model taking uncertainty in class labels into consideration, where the class labels are modeled by a Bernoulli random variable. An EM algorithm was developed to estimate the model parameters, and a parametric bootstrap method was used to test the significance of the predictive variables that were associated with subgroup memberships. Simulation experiments showed that, on average, the proposed method identified the subpopulations more accurately than the Naïve Bayes classifier and logistic regression. A breast cancer dataset was analyzed to illustrate the proposed hybrid mixture model.

20.
A Model for Analysis of Population Structure   Total citations: 5 (self-citations: 3, citations by others: 2)
Arguments have been presented for the appropriateness of a multinomial Dirichlet distribution for describing single-locus genotypic frequencies in a subdivided population. This distribution is defined as a function of allele frequency, the average (over the entire population) inbreeding coefficient and the correlation between genotypes within a subdivision. Alternative parameterizations and their genetic interpretations are given. We then show how information from a sample drawn from this subdivided population, in the absence of pedigrees, can be combined with the multinomial Dirichlet model to form a likelihood function. This likelihood function is then used as the basis for estimation and testing hypotheses concerning the genetic parameters of the model. Comparisons of this approach to the alternative procedure of Cockerham (1969, 1973) are made using human data obtained from Tecumseh, Michigan and Monte Carlo simulations. Finally, implications of these results for statistical inference and for mutation rates are presented.
