Similar articles
1.

Background  

With the rapid development of high-throughput genotyping technologies, efficient methods for identifying linked regions using high-density SNP genotype data have become increasingly important. Recently, a deterministic method that works very well on SNP genotyping data has been developed (Lin et al. Bioinformatics 2008, 24(1): 86–93). However, that program works on only a limited number of family structures. In particular, the results (if any) will be poor when the genotype data for the whole chromosome of one of the parents in a nuclear family is missing.

2.

Background  

Safflower (Carthamus tinctorius L.) is a diploid oilseed crop whose origin is largely unknown. Safflower is widely believed to have been domesticated over 4,000 years ago somewhere in the Fertile Crescent. Previous hypotheses regarding the origin of safflower have focused primarily on two other species from sect. Carthamus – C. oxyacanthus and C. palaestinus – as the most likely progenitors, although some attention has been paid to a third species (C. persicus) as a possible candidate. Here, we describe the results of a phylogenetic analysis of the entire section using data from seven nuclear genes.

3.

Background  

Microarray technology has become popular for gene expression profiling, and many analysis tools have been developed for data interpretation. Most of these tools require complete data, but measurement values are often missing. One way to overcome the problem of incomplete data is to impute the missing values before analysis. Many imputation methods have been suggested, some naïve and others more sophisticated, taking into account correlation in the data. However, these methods are binary in the sense that each spot is considered either missing or present. Hence, they depend on a cutoff separating poor spots from good spots. We suggest a different approach in which a continuous spot quality weight is built into the imputation methods, allowing for smooth imputation of all spots to a greater or lesser degree.

4.

Background  

Accurately estimating missing values in microarray data is an important pre-processing step, because complete datasets are required by numerous expression profile analyses in bioinformatics. Although several methods have been suggested, their performance is unsatisfactory for datasets with high percentages of missing values.

5.

Background  

Trait heterogeneity, which exists when a trait has been defined with insufficient specificity such that it is actually two or more distinct traits, has been implicated as a confounding factor in traditional statistical genetics of complex human disease. In the absence of detailed phenotypic data collected consistently in combination with genetic data, unsupervised computational methodologies offer the potential for discovering underlying trait heterogeneity. The performance of three such methods appropriate for categorical data – Bayesian Classification, Hypergraph-Based Clustering, and Fuzzy k-Modes Clustering – was compared. Also tested was the ability of these methods to detect trait heterogeneity in the presence of locus heterogeneity and/or gene-gene interaction, which are two other complicating factors in discovering genetic models of complex human disease. To determine the efficacy of applying the Bayesian Classification method to real data, the reliability of its internal clustering metrics at finding good clusterings was evaluated using permutation testing.

6.

Background  

The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analyses require a complete data set. A few imputation methods for DNA microarray data have been introduced, but the efficiency of the methods was low and the validity of imputed values in these methods had not been fully checked.

7.
8.

Background  

Many E. coli genes show pH-dependent expression during logarithmic growth in acid (pH 5–6) or in base (pH 8–9). The effect of rapid pH change, however, has rarely been tested. Rapid acid treatment could distinguish between genes responding to external pH, and genes responding to cytoplasmic acidification, which occurs transiently following rapid external acidification. It could reveal previously unknown acid-stress genes whose effects are transient, as well as show which acid-stress genes have a delayed response.

9.

Background  

Position-specific priors have been shown to be a flexible and elegant way to extend the power of Gibbs sampler-based motif discovery algorithms. Information of many types – including sequence conservation, nucleosome positioning, and negative examples – can be converted into a prior over the location of motif sites, which then guides the sequence motif discovery algorithm. This approach has been shown to confer many of the benefits of conservation-based and discriminative motif discovery approaches on Gibbs sampler-based motif discovery methods, but has not previously been studied with methods based on expectation maximization (EM).

10.

Background  

With the exception of M. tuberculosis, little has been published on the problems of cross-contamination in bacteriology laboratories. We performed a retrospective analysis of subtyping data from the National Salmonella Reference Laboratory (Ireland) from 2000–2007 to identify likely incidents of laboratory cross-contamination.

11.

Background  

Communicating risk is difficult. Although different methods have been proposed – using numbers, words, pictures or combinations – none has been extensively tested. We used electronic and bibliographic searches to review evidence concerning risk perception and presentation. People tend to underestimate common risk and overestimate rare risk; they respond to risks primarily on the basis of emotion rather than facts, seem to be risk averse when faced with medical interventions, and want information on even the rarest of adverse events.

12.
STEM: a tool for the analysis of short time series gene expression data   Total citations: 2, self-citations: 0, citations by others: 2

Background  

Time series microarray experiments are widely used to study dynamical biological processes. Due to the cost of microarray experiments, and also in some cases the limited availability of biological material, about 80% of microarray time series experiments are short (3–8 time points). Previously, short time series gene expression data have been analyzed mainly with more general gene expression analysis tools that were not designed for the unique challenges and opportunities inherent in such data.

13.

Purpose  

A phase I study was conducted to investigate the safety, tolerability, and immunological responses to vaccination with a combination of telomerase-derived peptides GV1001 (hTERT: 611–626) and p540 (hTERT: 540–548) using granulocyte–macrophage colony-stimulating factor (GM-CSF) or tuberculin as adjuvant in patients with cutaneous melanoma.

14.

Background  

Soluble Alzheimer's Aβ oligomers autoinsert into neuronal cell membranes, contributing to the pathology of Alzheimer's Disease (AD), and elevated serum cholesterol is a risk factor for AD, but the reason is unknown. We investigated potential connections between these two observations at the membrane level by testing the hypothesis that Aβ(1–42) relocates membrane cholesterol.

15.

Background  

A biomedical entity mention in articles and other free texts is often ambiguous. For example, 13% of the gene names (aliases) might refer to more than one gene. The task of Gene Symbol Disambiguation (GSD) – a special case of Word Sense Disambiguation (WSD) – is to assign a unique gene identifier to every identified gene name alias in biology-related articles. Supervised and unsupervised machine learning WSD techniques have been applied in the biomedical field with promising results. Here we examine, through graph-based semi-supervised methods for the GSD task, the potential of exploiting one of the special features of biological articles: that the authors of the documents are known.

16.

Background  

We present Pegasys – a flexible, modular and customizable software system that facilitates the execution of, and data integration from, heterogeneous biological sequence analysis tools.

17.
Missing value imputation for epistatic MAPs   Total citations: 1, self-citations: 0, citations by others: 1

Background  

Epistatic miniarray profiling (E-MAP) is a high-throughput approach capable of quantifying aggravating or alleviating genetic interactions between gene pairs. The datasets resulting from E-MAP experiments typically take the form of a symmetric pairwise matrix of interaction scores. These datasets have a significant number of missing values – up to 35% – that can reduce the effectiveness of some data analysis techniques and prevent the use of others. An effective method for imputing interactions would therefore broaden the range of possible analyses, as well as increase the potential to identify novel functional interactions between gene pairs. Several methods have been developed to handle missing values in microarray data, but it is unclear how applicable these methods are to E-MAP data because of their pairwise nature and the significantly larger number of missing values. Here we evaluate four alternative imputation strategies, three local (nearest neighbor-based) and one global (PCA-based), that have been modified to work with symmetric pairwise data.

18.

Background  

Numerous gel-based software packages exist to detect protein changes potentially associated with disease. The data, however, abound with technical and structural complexities, making statistical analysis a difficult task. A particularly important topic is how the various software packages handle missing data. To date, no one has extensively studied the impact that interpolating missing data has on subsequent analysis of protein spots.

19.

Background  

Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples.

20.

Background  

Although extensive research has been performed to control the differentiation of neural stem cells, the response of those cells to diverse cell culture conditions often appears to be random and difficult to predict. We therefore sought a stabilized protocol for the differentiation of NHA cells that would increase the percentage yield of neuronal cells.
