Similar literature (20 results)
1.
MOTIVATION: Due to advances in experimental technologies, such as microarray, mass spectrometry and nuclear magnetic resonance, it is feasible to obtain large-scale data sets, in which measurements for a large number of features can be simultaneously collected. However, the sample sizes of these data sets are usually small due to their relatively high costs, which leads to the issue of concordance among different data sets collected for the same study: features should have consistent behavior in different data sets. There is a lack of rigorous statistical methods for evaluating this concordance or discordance. METHODS: Based on a three-component normal-mixture model, we propose two likelihood ratio tests for evaluating the concordance and discordance between two large-scale data sets with two sample groups. The parameter estimation is achieved through the expectation-maximization (E-M) algorithm. A normal-distribution-quantile-based method is used for data transformation. RESULTS: To evaluate the proposed tests, we conducted some simulation studies, which suggested their satisfactory performance. As applications, the proposed tests were applied to three SELDI-MS data sets with replicates. One data set has replicates from different platforms and the other two have replicates from the same platform. We found that data generated by SELDI-MS showed satisfactory concordance between replicates from the same platform but unsatisfactory concordance between replicates from different platforms. AVAILABILITY: The R code is freely available at http://home.gwu.edu/~ylai/research/Concordance.
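The abstract above mentions a normal-distribution-quantile-based data transformation without giving its formula. The following is a minimal Python sketch of one common rank-based variant, not the authors' R implementation: the function name `normal_quantile_transform` and the offset `(rank - 0.5) / n` are assumptions made for illustration.

```python
from statistics import NormalDist


def normal_quantile_transform(values):
    """Map values to standard-normal quantiles by rank.

    Assumption: probabilities are (rank - 0.5) / n, and tied values
    share their average rank. The abstract does not specify either choice.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied values.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]
```

After such a transformation the scores are symmetric around zero regardless of the original scale, which is what makes a normal-mixture model a plausible fit.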

2.
3.

Background

Microarray technology applied to microRNA (miRNA) profiling is a promising tool in many research fields; nevertheless, independent studies characterizing the same pathology have often reported poorly overlapping results. miRNA analysis methods have only recently been systematically compared, and in only a few cases using clinical samples.

Methodology/Principal Findings

We investigated the inter-platform reproducibility of four miRNA microarray platforms (Agilent, Exiqon, Illumina, and Miltenyi), comparing nine paired tumor/normal colon tissues. The most concordant and selected discordant miRNAs were further studied by quantitative RT-PCR. Globally, a poor overlap among differentially expressed miRNAs identified by each platform was found. Nevertheless, for eight miRNAs high agreement in differential expression among the four platforms and comparability to qRT-PCR was observed. Furthermore, most of the miRNA sets identified by each platform are coherently enriched in data from the other platforms and the great majority of colon cancer associated miRNA sets derived from the literature were validated in our data, independently of the platform. Computational integration of miRNA and gene expression profiles suggested that anti-correlated predicted target genes of differentially expressed miRNAs are commonly enriched in cancer-related pathways and in genes involved in glycolysis and nutrient transport.

Conclusions

Technical and analytical challenges in measuring miRNAs still remain and further research is required in order to increase consistency between different microarray-based methodologies. However, a better inter-platform agreement was found by looking at miRNA sets instead of single miRNAs and through a miRNA-gene expression integration approach.

4.
High density oligonucleotide arrays have been used extensively for expression studies of eukaryotic organisms. We have designed a prokaryotic high density oligonucleotide array using the complete Escherichia coli genome sequence to monitor expression levels of all genes and intergenic regions in the genome. Because previously described methods for preparing labeled target nucleic acids are not useful for prokaryotic cell analysis using such arrays, an mRNA enrichment and direct labeling protocol was developed together with a cDNA synthesis protocol. The reproducibility of each labeling method was determined using high density oligonucleotide probe arrays as a read-out methodology and the expression results from direct labeling were compared to the expression results from the cDNA synthesis. About 50% of all annotated E. coli open reading frames are observed to be transcribed, as measured by both protocols, when the cells were grown in rich LB medium. Each labeling method individually showed a high degree of concordance in replica experiments (95 and 99%, respectively), but when each sample preparation method was compared to the other, ~32% of the genes observed to be expressed were discordant. However, both labeling methods can detect the same relative gene expression changes when RNA from IPTG-induced cells was labeled and compared to RNA from uninduced E. coli cells.

5.
6.

Background  

Comparison of data produced on different microarray platforms often shows surprising discordance. It is not clear whether this discrepancy is caused by noisy data or by improper probe matching between platforms. We investigated whether the significant level of inconsistency between results produced by alternative gene expression microarray platforms could be reduced by stringent sequence matching of microarray probes. We mapped the short oligo probes of the Affymetrix platform onto cDNA clones of the Stanford microarray platform. Affymetrix probes were reassigned to redefined probe sets if they mapped to the same cDNA clone sequence, regardless of the original manufacturer-defined grouping. The NCI-60 gene expression profiles produced by Affymetrix HuFL platform were recalculated using these redefined probe sets and compared to previously published cDNA measurements of the same panel of RNA samples.

7.

Background

On most common microarray platforms, many genes are represented by multiple probes. Although this is quite common, no one has systematically explored the concordance between probes mapped to the same gene.

Results

Here we present an analysis of all the cases of multiple probe sets measuring the same gene on the Affymetrix U133a GeneChip. We find that although in the majority of cases both measurements tend to agree, there are a significant number of cases in which the two measurements differ from each other. In these cases the measurements cannot simply be averaged but rather should be handled individually.

Conclusion

Our analysis allows us to provide a comprehensive list of the correlation between all pairs of probe sets that are mapped to the same gene and thus allows microarray users to sort out the cases that deserve further analysis. Comparison between the set of highly correlated pairs and the set of pairs that tend to differ from each other reveals potential factors that may affect this concordance.
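The pairwise comparison described in this abstract can be illustrated with a short Python sketch. It is not the authors' code: the abstract does not say which correlation coefficient was used, so Pearson correlation is an assumption here, and the function names and the layout of `probe_to_gene` and `expression` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def same_gene_pair_correlations(probe_to_gene, expression):
    """Correlate every pair of probe sets that map to the same gene.

    probe_to_gene: dict of probe-set ID -> gene symbol (hypothetical layout)
    expression:    dict of probe-set ID -> list of values across samples
    Returns a dict keyed by (gene, probe_a, probe_b).
    """
    by_gene = {}
    for probe, gene in probe_to_gene.items():
        by_gene.setdefault(gene, []).append(probe)
    correlations = {}
    for gene, probes in by_gene.items():
        # Genes with a single probe set contribute no pairs.
        for a, b in combinations(sorted(probes), 2):
            correlations[(gene, a, b)] = pearson(expression[a], expression[b])
    return correlations
```

Pairs with a low or negative coefficient would be the ones that, per the abstract, should be handled individually rather than averaged.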

8.
We have evaluated the performance characteristics of three quantitative gene expression technologies and correlated their expression measurements to those of five commercial microarray platforms, based on the MicroArray Quality Control (MAQC) data set. The limit of detection, assay range, precision, accuracy and fold-change correlations were assessed for 997 TaqMan Gene Expression Assays, 205 Standardized RT (Sta)RT-PCR assays and 244 QuantiGene assays. TaqMan is a registered trademark of Roche Molecular Systems, Inc. We observed high correlation between quantitative gene expression values and microarray platform results and found few discordant measurements among all platforms. The main cause of variability was differences in probe sequence and thus target location. A second source of variability was the limited and variable sensitivity of the different microarray platforms for detecting weakly expressed genes, which affected interplatform and intersite reproducibility of differentially expressed genes. From this analysis, we conclude that the MAQC microarray data set has been validated by alternative quantitative gene expression platforms thus supporting the use of microarray platforms for the quantitative characterization of gene expression.

9.
10.
The exploration of copy-number variation (CNV), notably of somatic cells, is an understudied aspect of genome biology. Any differences in the genetic makeup between twins derived from the same zygote represent an irrefutable example of somatic mosaicism. We studied 19 pairs of monozygotic twins with either concordant or discordant phenotype by using two platforms for genome-wide CNV analyses and showed that CNVs exist within pairs in both groups. These findings have an impact on our views of genotypic and phenotypic diversity in monozygotic twins and suggest that CNV analysis in phenotypically discordant monozygotic twins may provide a powerful tool for identifying disease-predisposition loci. Our results also imply that caution should be exercised when interpreting disease causality of de novo CNVs found in patients based on analysis of a single tissue in routine disease-related DNA diagnostics.

11.
12.
13.
Cancer derived microarray data sets are routinely produced by various platforms that are either commercially available or manufactured by academic groups. The fundamental difference in their probe selection strategies holds the promise that identical observations produced by more than one platform prove to be more robust when validated by biology. However, cross-platform comparison requires matching corresponding probe sets. We are introducing here sequence-based matching of probes instead of gene identifier-based matching. We analyzed breast cancer cell line derived RNA aliquots using Agilent cDNA and Affymetrix oligonucleotide microarray platforms to assess the advantage of this method. We show that, at different levels of the analysis, including gene expression ratios and difference calls, cross-platform consistency is significantly improved by sequence-based matching. We also present evidence that sequence-based probe matching produces more consistent results when comparing similar biological data sets obtained by different microarray platforms. This strategy allowed a more efficient transfer of classification of breast cancer samples between data sets produced by cDNA microarray and Affymetrix gene-chip platforms.

14.
15.
16.
For the robust practice of genomic medicine, sequencing results must be compatible, regardless of the sequencing technologies and algorithms used. Presently, genome sequencing is still an imprecise science and is complicated by differences in the chemistry, coverage, alignment, and variant-calling algorithms. We identified ∼3.33 million single nucleotide variants (SNVs) and ∼3.62 million SNVs in the SJK genome using SOLiD and Illumina data, respectively. Approximately 3 million SNVs were concordant between the two platforms while 68,532 SNVs were discordant; 219,616 SNVs were SOLiD-specific and 516,080 SNVs were Illumina-specific (i.e., platform-specific). Concordant, discordant, and platform-specific SNVs were further analyzed and characterized. Overall, a large portion of heterozygous SNVs that were discordant with genotyping calls of single nucleotide polymorphism chips were highly confident. Approximately 70% of the platform-specific SNVs were located in regions containing repetitive sequences. Such platform-specificity may arise from differences between platforms, with regard to read length (36 bp and 72 bp vs. 50 bp), insert size (∼100–300 bp vs. ∼1–2 kb), sequencing chemistry (sequencing-by-synthesis using single nucleotides vs. ligation-based sequencing using oligomers), and sequencing quality. When data from the two platforms were merged for variant calling, the proportion of callable regions of the reference genome increased to 99.66%, which was 1.43% higher than the average callability of the two platforms, representing ∼40 million bases. In this study, we compared the differences in sequencing results between two sequencing platforms. Approximately 90% of the SNVs were concordant between the two platforms, yet ∼10% of the SNVs were either discordant or platform-specific, indicating that each platform had its own strengths and weaknesses. 
When data from the two platforms were merged, both the overall callability of the reference genome and the overall accuracy of the SNVs improved, demonstrating the likelihood that a re-sequenced genome can be revised using complementary data.
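The four SNV categories reported in this abstract (concordant, discordant, and specific to each platform) can be expressed as set operations. The Python sketch below is illustrative only: it assumes that "concordant" means the same position with the same genotype call and "discordant" means the same position with different calls, which the abstract does not spell out. The function name `compare_calls` and the `(chromosome, position) -> genotype` layout are hypothetical.

```python
def compare_calls(calls_a, calls_b):
    """Split SNV calls from two platforms into four categories.

    calls_a, calls_b: dict of (chromosome, position) -> genotype string.
    Assumption: concordant = same site and same genotype on both platforms;
    discordant = same site but different genotypes.
    """
    shared_sites = calls_a.keys() & calls_b.keys()
    concordant = {site for site in shared_sites if calls_a[site] == calls_b[site]}
    return {
        "concordant": concordant,
        "discordant": shared_sites - concordant,
        "a_specific": calls_a.keys() - calls_b.keys(),
        "b_specific": calls_b.keys() - calls_a.keys(),
    }
```

With real data, `calls_a` and `calls_b` would hold the SOLiD and Illumina call sets, and the sizes of the four returned sets would correspond to the counts quoted in the abstract.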

17.

Background  

Gene clustering has been widely used to group genes with similar expression patterns in microarray data analysis. Subsequent enrichment analysis using predefined gene sets can provide clues on which functional themes or regulatory sequence motifs are associated with individual gene clusters. In spite of the potential utility, gene clustering and enrichment analysis have been used in separate platforms; thus, the development of an integrative algorithm linking both methods is highly challenging.

18.
Introduction

Metastasis is thought to be a clonal event whereby a single cell initiates the development of a new tumor at a distant site. However, the degree to which primary and metastatic tumors differ on a molecular level remains unclear. To further evaluate these concepts, we used next generation sequencing (NGS) to assess the molecular composition of paired primary and metastatic colorectal cancer tissue specimens.

Methods

468 colorectal tumor samples from a large personalized medicine initiative were assessed by targeted gene sequencing of 1,321 individual genes. Eighteen patients produced genomic profiles for 17 paired primary:metastatic (and 2 metastatic:metastatic) specimens.

Results

An average of 33.3 mutations/tumor were concordant (shared) between matched samples, including common well-known genes (APC, KRAS, TP53). An average of 2.3 mutations/tumor were discordant (unshared) among paired sites. KRAS mutational status was always concordant. The overall concordance rate for mutations was 93.5%; however, nearly all (18/19 (94.7%)) paired tumors showed at least one mutational discordance. Mutations were seen in: TTN, the largest gene (5 discordant pairs), ADAMTS20, APC, MACF1, RASA1, TP53, and WNT2 (2 discordant pairs), SMAD2, SMAD3, SMAD4, FBXW7, and 66 others (1 discordant pair).

Conclusions

Whereas primary and metastatic tumors displayed little variance overall, co-evolution produced incremental mutations in both. These results suggest that while biopsy of the primary tumor alone is likely sufficient in the chemotherapy-naïve patient, additional biopsies of primary or metastatic disease may be necessary to precisely tailor therapy following chemotherapy resistance or insensitivity in order to adequately account for tumor evolution.
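The percentages in this abstract can be checked with a few lines of Python. This is a sketch under one assumption: that the overall concordance rate is the share of concordant mutations among all mutations, computed from the reported per-tumor averages. The abstract does not state the formula, and `concordance_rate` is a hypothetical helper name.

```python
def concordance_rate(concordant, discordant):
    """Fraction of mutations shared between matched samples."""
    return concordant / (concordant + discordant)


# Per-tumor averages quoted in the abstract.
overall = concordance_rate(33.3, 2.3)

# Share of paired tumors with at least one discordant mutation.
pairs_with_discordance = 18 / 19

print(f"overall concordance: {overall:.1%}")
print(f"pairs with a discordance: {pairs_with_discordance:.1%}")
```

Under this assumption the averages reproduce the reported 93.5% figure, and 18 of 19 pairs gives the reported 94.7%.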

19.
20.
The X chromosome constitutes a unique genomic environment because it is present in one copy in males, but two copies in females. This simple fact has motivated several theoretical predictions with respect to how standing genetic variation on the X chromosome should differ from the autosomes. Unmasked expression of deleterious mutations in males and a lower census size are expected to reduce variation, while allelic variants with sexually antagonistic effects, and potentially those with a sex-specific effect, could accumulate on the X chromosome and contribute to increased genetic variation. In addition, incomplete dosage compensation of the X chromosome could potentially dampen the male-specific effects of random mutations, and promote the accumulation of X-linked alleles with sexually dimorphic phenotypic effects. Here we test both the amount and the type of genetic variation on the X chromosome within a population of Drosophila melanogaster, by comparing the proportion of X linked and autosomal trans-regulatory SNPs with a sexually concordant and discordant effect on gene expression. We find that the X chromosome is depleted for SNPs with a sexually concordant effect, but hosts comparatively more SNPs with a sexually discordant effect. Interestingly, the contrasting results for SNPs with sexually concordant and discordant effects are driven by SNPs with a larger influence on expression in females than expression in males. Furthermore, the distribution of these SNPs is shifted towards regions where dosage compensation is predicted to be less complete. These results suggest that intrinsic properties of dosage compensation influence either the accumulation of different types of trans-factors and/or their propensity to accumulate mutations. Our findings document a potential mechanistic basis for sex-specific genetic variation, and identify the X as a reservoir for sexually dimorphic phenotypic variation. 
These results have general implications for X chromosome evolution, as well as the genetic basis of sex-specific evolutionary change.
