Similar articles
20 similar articles found (search time: 15 ms)
1.
Attempts to correlate protein abundance with mRNA expression levels have had variable success. We review the results of these comparisons, focusing on yeast. In the process, we survey experimental techniques for determining protein abundance, principally two-dimensional gel electrophoresis and mass spectrometry. We also merge many of the available yeast protein-abundance datasets, using the resulting larger 'meta-dataset' to find correlations between protein and mRNA expression, both globally and within smaller categories.

2.
Mass spectrometry (MS)-based shotgun proteomics allows protein identifications even in complex biological samples. Protein abundances can then be estimated from the counts of tandem MS (MS/MS) spectra attributable to each protein, provided one accounts for differential MS detectability of contributing peptides. We developed a method, APEX, which calculates Absolute Protein EXpression levels based upon learned correction factors, MS/MS spectral counts and each protein's probability of correct identification. This protocol describes APEX-based calculations in three parts. (i) Using training data, peptide sequences and their sequence properties, a model is built to estimate MS detectability (O(i)) for any given protein. (ii) Absolute protein abundances are calculated from spectral counts, identification probabilities and the learned O(i)-values. (iii) Simple statistics allow calculation of differential expression in two distinct biological samples, i.e., measuring relative protein abundances. APEX-based protein abundances span 3-4 orders of magnitude and are applicable to mixtures of 100s to 1,000s of proteins.
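
Step (ii) above can be illustrated with a minimal sketch. It assumes that each protein's spectral count is weighted by its identification probability, divided by its detectability O(i), and then normalized across all proteins; the normalization and the scaling constant `total` are assumptions made for this illustration, and all names and numbers are hypothetical rather than taken from the protocol.

```python
def apex_abundances(spectral_counts, id_probs, detectability, total=1.0):
    """Estimate per-protein abundance from corrected spectral counts.

    spectral_counts: protein -> observed MS/MS spectral count (n_i)
    id_probs:        protein -> probability of correct identification (p_i)
    detectability:   protein -> learned MS detectability (O_i), must be > 0
    total:           assumed scaling constant (for example, total molecules)
    """
    # Corrected count for each protein: n_i * p_i / O_i
    corrected = {
        protein: count * id_probs[protein] / detectability[protein]
        for protein, count in spectral_counts.items()
    }
    # Normalize so that the estimates sum to `total`
    norm = sum(corrected.values())
    return {protein: total * value / norm for protein, value in corrected.items()}


# Hypothetical inputs: equal spectral counts, different detectability.
example = apex_abundances(
    spectral_counts={"protA": 10, "protB": 10},
    id_probs={"protA": 1.0, "protB": 1.0},
    detectability={"protA": 1.0, "protB": 4.0},
)
print(example)
```

In this toy case the protein that is four times easier to detect receives a smaller share of the total, even though both proteins produced the same number of spectra.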

3.
With the development of genomic studies, researchers have found that quantitative mRNA data are insufficient to predict protein expression at large scale, which contradicts the traditional view that mRNA expression correlates with protein abundance at the single-gene level. To try to resolve these apparently conflicting views, we set up a series of research models and chose soluble cytokines as targets. First, human peripheral blood mononuclear cells (PBMC) from one healthy donor were treated with 16 continuously changing conditions, and the protein and mRNA profiles were analyzed by multiplex Luminex and genomic microarray, respectively. Among the tested genes, the mRNA of around half correlated well with the corresponding protein (ρ > 0.8); however, when all genes were pooled, the correlation coefficient for the 16 conditions varied from 0.29 to 0.71. Second, PBMC from 14 healthy donors were stimulated under the same condition, and the correlation coefficient went down (ρ < 0.6). Third, 28 rheumatoid arthritis (RA) patients were tested for their response to the same external stimuli and, as expected, different individuals displayed different protein expression patterns. Lastly, autoimmune disease cohorts (8 diseases including RA, 103 patients in total) were assayed as a whole. Some similarity remained in the protein profiles of patients with the same disease type, although completely different patterns were displayed across disease categories. This study builds a bridge between single-gene analysis and whole-genome studies and may offer a reasonable explanation for the two conflicting views in current biological science.
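
The two viewpoints contrasted above (one gene followed across conditions versus all genes pooled within one condition) can be made concrete with a small sketch. It assumes that ρ denotes Spearman's rank correlation, which the abstract does not state, and the gene names and values are invented purely for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x ** 0.5 * var_y ** 0.5)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: one list per gene, one entry per condition.
mrna = {"IL6": [1.0, 2.0, 3.0, 4.0], "TNF": [4.0, 3.0, 2.0, 1.0], "IL10": [2.0, 2.5, 3.5, 3.0]}
protein = {"IL6": [10.0, 25.0, 30.0, 55.0], "TNF": [80.0, 60.0, 50.0, 20.0], "IL10": [5.0, 4.0, 6.0, 3.0]}

# Single-gene view: correlate each gene across conditions.
per_gene = {gene: spearman(mrna[gene], protein[gene]) for gene in mrna}

# Pooled view: correlate across genes within each condition.
genes = sorted(mrna)
per_condition = [
    spearman([mrna[g][c] for g in genes], [protein[g][c] for g in genes])
    for c in range(4)
]
print(per_gene, per_condition)
```

The same measurements can therefore give high per-gene coefficients and much lower pooled coefficients, which is the kind of discrepancy the study examines.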

4.

Background  

Isobaric Tags for Relative and Absolute Quantitation (iTRAQ™) [Applied Biosystems] have seen increased application in differential protein expression analysis. To facilitate the growing need to analyze iTRAQ data, especially for cases involving multiple iTRAQ experiments, we have developed a modeling approach, statistical methods, and tools for estimating the relative changes in protein expression under various treatments and experimental conditions.

8.
Data analysis, not data production, is becoming the bottleneck in gene expression research. Data integration is necessary to cope with an ever-increasing amount of data, to cross-validate noisy data sets, and to gain broad interdisciplinary views of large biological data sets. New Internet resources may help researchers to combine data sets across different gene expression platforms. However, noise and disparities in experimental protocols strongly limit data integration. A detailed review of four selected studies reveals how some of these limitations may be circumvented and illustrates what can be achieved through data integration.

9.
Zhao Y, Lin YH. Proteomics, 2005, 5(4): 853-855
Instead of using the probability mean, a simple and yet effective heuristic approach was employed to treat experimentally obtained tandem mass spectrometry (MS/MS) data for protein identification. The proposed approach is based on the total number (T) of identified experimental MS/MS data. To warrant the subsequent ranking, the total number of identified b- and y-type ions (Tb+y) must be greater than 50% of T. Peptides having the same T and Tb+y are either ranked by the contiguity of identified ions or discarded during identification. When compared to other protein identification tools, good agreement with the searched results was seen.
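
The filtering and ranking rule described above might be sketched as follows. Several details are assumptions made only for this illustration: candidates are ordered primarily by T, then by Tb+y, and contiguity is represented by the longest run of consecutive matched ions. The field names and peptide values are hypothetical, and the option of discarding tied peptides is not shown.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    peptide: str
    total_matched: int  # T: total number of matched MS/MS peaks
    by_matched: int     # Tb+y: matched b- and y-type ions
    longest_run: int    # assumed contiguity measure: longest run of matched ions


def rank_candidates(candidates):
    """Keep candidates with Tb+y > 50% of T, then rank them."""
    kept = [c for c in candidates if c.by_matched > 0.5 * c.total_matched]
    # Ties in T and Tb+y are broken by contiguity (larger is better).
    return sorted(
        kept,
        key=lambda c: (c.total_matched, c.by_matched, c.longest_run),
        reverse=True,
    )


# Hypothetical candidates for a single spectrum.
candidates = [
    Candidate("PEPTIDEK", total_matched=12, by_matched=8, longest_run=4),
    Candidate("PEPTLDEK", total_matched=12, by_matched=8, longest_run=6),
    Candidate("AAAAK", total_matched=14, by_matched=6, longest_run=3),
]
ranked = rank_candidates(candidates)
print([c.peptide for c in ranked])
```

Here the third candidate has the largest T but fails the 50% threshold, so it is removed before ranking; the remaining two differ only in contiguity.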

10.
BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences
'BLAST 2 Sequences', a new BLAST-based tool for aligning two protein or nucleotide sequences, is described. While the standard BLAST program is widely used to search for homologous sequences in nucleotide and protein databases, one often needs to compare only two sequences that are already known to be homologous, coming from related species or, e.g. different isolates of the same virus. In such cases searching the entire database would be unnecessarily time-consuming. 'BLAST 2 Sequences' utilizes the BLAST algorithm for pairwise DNA-DNA or protein-protein sequence comparison. A World Wide Web version of the program can be used interactively at the NCBI WWW site (http://www.ncbi.nlm.nih.gov/gorf/bl2.html). The resulting alignments are presented in both graphical and text form. The variants of the program for PC (Windows), Mac and several UNIX-based platforms can be downloaded from the NCBI FTP site (ftp://ncbi.nlm.nih.gov).

11.
Biological networks are a topic of great current interest, particularly with the publication of a number of large genome-wide interaction datasets. They are globally characterized by a variety of graph-theoretic statistics, such as the degree distribution, clustering coefficient, characteristic path length and diameter. Moreover, real protein networks are quite complex and can often be divided into many sub-networks through systematic selection of different nodes and edges. For instance, proteins can be sub-divided by expression level, length, amino-acid composition, solubility, secondary structure and function. A challenging research question is to compare the topologies of sub-networks, looking for global differences associated with different types of proteins. TopNet is an automated web tool designed to address this question, calculating and comparing topological characteristics for different sub-networks derived from any given protein network. It provides reasonable solutions to the calculation of network statistics for sub-networks embedded within a larger network and gives simplified views of a sub-network of interest, allowing one to navigate through it. After constructing TopNet, we applied it to the interaction networks and protein classes currently available for yeast. We were able to find a number of potential biological correlations. In particular, we found that soluble proteins had more interactions than membrane proteins. Moreover, amongst soluble proteins, those that were highly expressed, had many polar amino acids, and had many alpha helices, tended to have the most interaction partners. Interestingly, TopNet also turned up some systematic biases in the current yeast interaction network: on average, proteins with a known functional classification had many more interaction partners than those without. This phenomenon may reflect the incompleteness of the experimentally determined yeast interaction network.
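
Two of the statistics named above, degree and clustering coefficient, can be computed for a sub-network with a short sketch. It treats a sub-network as the subgraph induced by a chosen set of proteins, which is an assumption and may differ from how TopNet handles sub-networks embedded in a larger network; the tiny edge list is hypothetical.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map, ignoring self-interactions."""
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def induced_subgraph(adjacency, nodes):
    """Keep only the given nodes and the edges between them."""
    nodes = set(nodes)
    return {n: adjacency.get(n, set()) & nodes for n in nodes}


def average_degree(adjacency):
    """Mean number of interaction partners per node."""
    return sum(len(neighbors) for neighbors in adjacency.values()) / len(adjacency)


def clustering_coefficient(adjacency, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))


# Hypothetical interaction network and a hypothetical protein class.
network = build_graph([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
soluble = induced_subgraph(network, ["A", "B", "C"])
print(average_degree(network), clustering_coefficient(network, "C"))
print(average_degree(soluble), clustering_coefficient(soluble, "C"))
```

Comparing the same statistic between the whole network and the sub-network is the kind of contrast the tool is described as automating.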

12.
SUMMARY: We present a new tool for the semi-automated querying of PubMed using a batch of tens to thousands of GenBank accession numbers or UniGene cluster ids. By combining information from UniGene and SWISS-PROT, microGENIE obtains information on the biological relevance of expressed genes, as identified by micro-array experiments, with minimal user intervention and time investment.
AVAILABILITY: microGENIE is freely available from http://www.cs.vu.nl/microgenie
SUPPLEMENTARY INFORMATION: The web site above supplies examples of input and output files.

13.
MOTIVATION: Clustering is one of the most widely used methods in unsupervised gene expression data analysis. The use of different clustering algorithms or different parameters often produces rather different results on the same data. Biological interpretation of multiple clustering results requires understanding how different clusters relate to each other. It is particularly non-trivial to compare the results of a hierarchical and a flat, e.g. k-means, clustering.
RESULTS: We present a new method for comparing and visualizing relationships between different clustering results, either flat versus flat, or flat versus hierarchical. When comparing a flat clustering to a hierarchical clustering, the algorithm cuts different branches in the hierarchical tree at different levels to optimize the correspondence between the clusters. The optimization function is based on graph layout aesthetics or on mutual information. The clusters are displayed using a bipartite graph where the edges are weighted proportionally to the number of common elements in the respective clusters and the weighted number of crossings is minimized. The performance of the algorithm is tested using simulated and real gene expression data. The algorithm is implemented in the online gene expression data analysis tool Expression Profiler.
AVAILABILITY: http://www.ebi.ac.uk/expressionprofiler
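
For the flat-versus-flat case, the bipartite edge weights (numbers of common elements) and a mutual-information score can be sketched as below. This covers only that part of the description; the tree-cutting and crossing-minimization steps are not attempted, and the cluster labels are hypothetical.

```python
from collections import Counter
from math import log2


def overlap_counts(labels_a, labels_b):
    """Edge weights of the bipartite graph: shared elements per cluster pair."""
    return Counter(zip(labels_a, labels_b))


def mutual_information(labels_a, labels_b):
    """Mutual information (in bits) between two flat clusterings."""
    n = len(labels_a)
    joint = overlap_counts(labels_a, labels_b)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    mi = 0.0
    for (a, b), n_ab in joint.items():
        mi += (n_ab / n) * log2(n * n_ab / (count_a[a] * count_b[b]))
    return mi


# Hypothetical cluster assignments for four genes under two clusterings.
kmeans_labels = [0, 0, 1, 1]
other_labels = ["x", "x", "y", "y"]
unrelated_labels = ["x", "y", "x", "y"]

print(overlap_counts(kmeans_labels, other_labels))
print(mutual_information(kmeans_labels, other_labels))
print(mutual_information(kmeans_labels, unrelated_labels))
```

Identical partitions give the maximum score for this toy input, and a partition that splits every cluster evenly gives zero.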

15.
The present study was designed to determine if dietary protein can alter uncoupling protein (UCP) expression in swine, as has been shown in rats, and attempt to identify the mechanism. Eight pigs (~50 kg body mass) were fed an 18% crude protein (CP) diet while another eight pigs were switched to a diet containing 12% CP, and both groups were fed these diets until 110 kg body mass. The outer (OSQ) and middle (MSQ) subcutaneous adipose tissues, liver, leaf fat, longissimus (LM), red portion of the semitendinosus (STR) and the white portion of the ST (STW) were analyzed for gene expression by real-time PCR. Feeding of 12% CP did not alter growth or carcass composition, relative to 18% CP (P > 0.05). Serum growth hormone, non-esterified fatty acids (NEFA), triglycerides and urea nitrogen were reduced with the feeding of 12% CP (P < 0.05). The UCP2 mRNA abundance was reduced in LM, STR, MSQ and OSQ with feeding of 12% CP (P < 0.05), as was UCP3 mRNA abundance in MSQ and STW (P < 0.01). Peroxisome proliferator-activated receptor α (PPARα) and PPARγ were reduced in MSQ and STR (P < 0.05) with feeding of 12% CP, as was the PPARα-regulated protein, acyl CoA oxidase (ACOX, P < 0.05). These data suggest that feeding 12% CP relative to 18% CP reduces serum NEFA, which reduces PPARα and PPARγ expression and consequently reduces UCP2 lipoperoxidation in OSQ and STR and also reduces UCP3-associated fatty acid transport in MSQ and STW.

17.
CressExpress is a user-friendly, online, coexpression analysis tool for Arabidopsis (Arabidopsis thaliana) microarray expression data that computes patterns of correlated expression between user-entered query genes and the rest of the genes in the genome. Unlike other coexpression tools, CressExpress allows characterization of tissue-specific coexpression networks through user-driven filtering of input data based on sample tissue type. CressExpress also performs pathway-level coexpression analysis on each set of query genes, identifying and ranking genes based on their common connections with two or more query genes. This allows identification of novel candidates for involvement in common processes and functions represented by the query group. Users launch experiments using an easy-to-use Web-based interface and then receive the full complement of results, along with a record of tool settings and parameters, via an e-mail link to the CressExpress Web site. Data sets featured in CressExpress are strictly versioned and include expression data from MAS5, GCRMA, and RMA array processing algorithms. To demonstrate applications for CressExpress, we present coexpression analyses of cellulose synthase genes, indolic glucosinolate biosynthesis, and flowering. We show that subselecting sample types produces a richer network for genes involved in flowering in Arabidopsis. CressExpress provides direct access to expression values via an easy-to-use URL-based Web service, allowing users to determine quickly if their query genes are coexpressed with each other and likely to yield informative pathway-level coexpression results. The tool is available at http://www.cressexpress.org.

19.

Background

High-throughput sequencing, such as ribonucleic acid sequencing (RNA-seq) and chromatin immunoprecipitation sequencing (ChIP-seq) analyses, enables various features of organisms to be compared through tag counts. Recent studies have demonstrated that the normalization step for RNA-seq data is critical for a more accurate subsequent analysis of differential gene expression. Development of a more robust normalization method is desirable for identifying the true difference in tag count data.

Results

We describe a strategy for normalizing tag count data, focusing on RNA-seq. The key concept is to remove data assigned as potential differentially expressed genes (DEGs) before calculating the normalization factor. Several R packages for identifying DEGs are currently available, and each package uses its own normalization method and gene ranking algorithm. We compared a total of eight package combinations: four R packages (edgeR, DESeq, baySeq, and NBPSeq) with their default normalization settings and with our normalization strategy. Many synthetic datasets under various scenarios were evaluated on the basis of the area under the curve (AUC) as a measure for both sensitivity and specificity. We found that packages using our strategy in the data normalization step overall performed well. This result was also observed for a real experimental dataset.

Conclusion

Our results showed that the elimination of potential DEGs is essential for more accurate normalization of RNA-seq data. The concept of this normalization strategy can be widely applied to other types of tag count data and to microarray data.
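
The key concept (set aside likely DEGs, then compute the normalization factor from the remaining genes) can be illustrated with a deliberately simplified two-sample sketch. It is not the procedure evaluated in the paper: the total-count scaling, the log-ratio ranking used to flag potential DEGs, and the `drop_fraction` and `pseudo` parameters are all assumptions chosen for brevity, and the counts are invented.

```python
from math import log2


def size_factor(counts_a, counts_b, genes):
    """Ratio of total counts in sample B to sample A over the given genes."""
    return sum(counts_b[g] for g in genes) / sum(counts_a[g] for g in genes)


def deg_eliminated_factor(counts_a, counts_b, drop_fraction=0.25, pseudo=0.5):
    """Recompute the scaling factor after removing the most extreme genes."""
    genes = list(counts_a)
    # Step 1: provisional normalization using all genes.
    initial = size_factor(counts_a, counts_b, genes)

    # Step 2: score each gene by its absolute log ratio after that scaling.
    def score(gene):
        ratio = (counts_b[gene] + pseudo) / ((counts_a[gene] + pseudo) * initial)
        return abs(log2(ratio))

    ranked = sorted(genes, key=score)
    n_keep = len(genes) - int(len(genes) * drop_fraction)
    kept = ranked[:n_keep]  # the highest-scoring genes are treated as potential DEGs
    # Step 3: final factor from the retained genes only.
    return size_factor(counts_a, counts_b, kept)


# Hypothetical counts: sample B is sequenced twice as deeply, and g8 is a true DEG.
sample_a = {f"g{i}": 100 for i in range(1, 9)}
sample_b = {f"g{i}": 200 for i in range(1, 8)}
sample_b["g8"] = 1000

naive = size_factor(sample_a, sample_b, list(sample_a))
robust = deg_eliminated_factor(sample_a, sample_b)
print(naive, robust)
```

In this toy input the single strongly induced gene inflates the all-gene factor, while the factor computed after elimination recovers the twofold depth difference. If DEGs dominate the total counts, the provisional scaling itself can be too distorted for such a simple ranking to flag them, which is one reason a more robust initial step would be needed in practice.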

20.
Time-series data resulting from surveying wild animals are often described using state-space population dynamics models, in particular with Gompertz, Beverton-Holt, or Moran-Ricker latent processes. We show how hidden Markov model methodology provides a flexible framework for fitting a wide range of models to such data. This general approach makes it possible to model abundance on the natural or log scale, include multiple observations at each sampling occasion and compare alternative models using information criteria. It also easily accommodates unequal sampling time intervals, should that possibility occur, and allows testing for density dependence using the bootstrap. The paper is illustrated by replicated time series of red kangaroo abundances, and a univariate time series of ibex counts which are an order of magnitude larger. In the analyses carried out, we fit different latent process and observation models using the hidden Markov framework. Results are robust with regard to the necessary discretization of the state variable. We find no effective difference between the three latent models of the paper in terms of maximized likelihood value for the two applications presented, and also others analyzed. Simulations suggest that ecological time series are not sufficiently informative to distinguish between alternative latent processes for modeling population survey data when data do not indicate strong density dependence.
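
The idea of discretizing the state variable and evaluating the likelihood with hidden Markov machinery can be sketched for a Gompertz latent process on the log scale. This is a minimal illustration under stated assumptions, not the authors' implementation: Gaussian process and observation errors, a uniform initial state distribution, an evenly spaced grid, and all parameter values and observations are hypothetical.

```python
from math import exp, log, pi, sqrt


def normal_pdf(x, mean, sd):
    """Density of a normal distribution."""
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2 * pi))


def gompertz_hmm_loglik(obs, a, b, sigma, tau, grid):
    """Log-likelihood of log-scale observations under a discretized model.

    Latent process: x_t = a + b * x_{t-1} + N(0, sigma^2)
    Observation:    y_t = x_t + N(0, tau^2)
    """
    m = len(grid)
    # Transition matrix between grid points, each row normalized to sum to 1.
    trans = []
    for x_prev in grid:
        row = [normal_pdf(x_next, a + b * x_prev, sigma) for x_next in grid]
        total = sum(row)
        trans.append([p / total for p in row])

    # Forward algorithm with scaling; uniform initial distribution (assumption).
    alpha = [normal_pdf(obs[0], x, tau) / m for x in grid]
    scale = sum(alpha)
    loglik = log(scale)
    alpha = [v / scale for v in alpha]
    for y in obs[1:]:
        predicted = [sum(alpha[i] * trans[i][j] for i in range(m)) for j in range(m)]
        alpha = [predicted[j] * normal_pdf(y, grid[j], tau) for j in range(m)]
        scale = sum(alpha)
        loglik += log(scale)
        alpha = [v / scale for v in alpha]
    return loglik


# Hypothetical log-abundance series fluctuating around 5.
observations = [5.0, 5.1, 4.9, 5.0, 5.05]
grid = [3.0 + 0.05 * i for i in range(81)]  # log-abundance grid from 3 to 7

ll_near = gompertz_hmm_loglik(observations, a=1.0, b=0.8, sigma=0.2, tau=0.1, grid=grid)
ll_far = gompertz_hmm_loglik(observations, a=0.0, b=0.8, sigma=0.2, tau=0.1, grid=grid)
print(ll_near, ll_far)
```

With a = 1.0 and b = 0.8 the latent process has a stationary mean of 5, matching the toy series, so its log-likelihood is higher than with a = 0.0. Maximizing this function over the parameters, and repeating with a finer grid, would be one way to check sensitivity to the discretization.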
