Summaries of Affymetrix GeneChip probe level data   (cited by 9 in total: 0 self-citations, 9 by others)
High density oligonucleotide array technology is widely used in many areas of biomedical research for quantitative and highly parallel measurements of gene expression. Affymetrix GeneChip arrays are the most popular. In this technology each gene is typically represented by a set of 11–20 pairs of probes. In order to obtain expression measures it is necessary to summarize the probe level data. Using two extensive spike-in studies and a dilution study, we developed a set of tools for assessing the effectiveness of expression measures. We found that the performance of the current version of the default expression measure provided by Affymetrix Microarray Suite can be significantly improved by the use of probe level summaries derived from empirically motivated statistical models. In particular, improvements in the ability to detect differentially expressed genes are demonstrated.

affy--analysis of Affymetrix GeneChip data at the probe level   (cited by 32 in total: 0 self-citations, 32 by others)
MOTIVATION: The processing of the Affymetrix GeneChip data has been a recent focus for data analysts. Alternatives to the original procedure have been proposed and some of these new methods are widely used. RESULTS: The affy package is an R package of functions and classes for the analysis of oligonucleotide arrays manufactured by Affymetrix. The package, currently in its second release, provides the user with extreme flexibility when carrying out an analysis and makes it possible to access and manipulate probe intensity data. In this paper, we present the main classes and functions in the package and demonstrate how they can be used to process probe-level data. We also demonstrate the importance of probe-level analysis when using the Affymetrix GeneChip platform.



Cis-regulatory modules (CRMs) are short stretches of DNA that help regulate gene expression in higher eukaryotes. They have been found up to 1 megabase away from the genes they regulate and can be located upstream, downstream, and even within their target genes. Due to the difficulty of finding CRMs using biological and computational techniques, even well-studied regulatory systems may contain CRMs that have not yet been discovered.



The choice of probe set algorithm for expression summary in a GeneChip study has a great impact on subsequent gene expression data analysis. Spiked-in cRNAs with known concentrations are often used to assess the relative performance of probe set algorithms. Because the spiked-in cRNAs do not represent endogenously expressed genes in experiments, it becomes increasingly important to have methods to study whether a particular probe set algorithm is more appropriate for a specific dataset, without using such external reference data.

Hu J, Hu H, Li X. Nucleic Acids Research, 2008, 36(13): 4488-4497
The identification of cis-regulatory modules (CRMs) can greatly advance our understanding of eukaryotic regulatory mechanisms. Current methods to predict CRMs from known motifs either depend on multiple alignments or can only deal with a small number of known motifs provided by users. These methods are problematic when binding sites are not well aligned in multiple alignments or when the number of input known motifs is large. We thus developed a new CRM identification method, MOPAT (motif pair tree), which identifies CRMs through the identification of motif modules, groups of motifs co-occurring in multiple CRMs. It can identify 'orthologous' CRMs without multiple alignments. It can also find CRMs given a large number of known motifs. We have applied this method to mouse developmental genes, and have evaluated the predicted CRMs and motif modules by microarray expression data and known interacting motif pairs. We show that the expression profiles of the genes containing CRMs of the same motif module correlate significantly better than those of a random set of genes do. We also show that the known interacting motif pairs are significantly included in our predictions. Compared with several current methods, our method shows better performance in identifying meaningful CRMs.

Genomic DNA sequences contain a wealth of information about the bendability and curvature of the DNA molecule. For example, the well-known 10-11 bp periodicities within genomes can be attributed to supercoiled structures or wrapping around nucleosomes. Such periodic signals have previously been examined mainly based on mono- or dinucleotide correlations. In this study, we generalize this approach and analyze correlation functions of longer motifs such as tetramers or poly(A) sequences. Periodically placed motifs may indicate regular protein binding or curvature signals. We detected various periodic signals, e.g. strong 10-11 bp oscillations of periodically placed poly(A), poly(T) or poly(W) stretches. These observations lead to a new view on the intensively studied 10-11 bp periodicities.
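To make the idea of a motif correlation function concrete, the following is a minimal sketch, not the authors' method: it simply counts how often two occurrences of the same motif start exactly d bases apart. The function name `motif_distance_counts`, the toy sequence, and the choice of raw counts (with no normalisation against a background expectation) are assumptions made for illustration only.

```python
def motif_distance_counts(seq, motif, max_dist):
    """Count pairs of motif occurrences whose start positions differ by d.

    Returns a list `counts` where counts[d] is the number of ordered pairs
    (p, p + d) such that the motif starts at both positions, for 1 <= d <= max_dist.
    This is a raw, unnormalised pair count (a hypothetical simplification).
    """
    # All start positions of the motif, including overlapping occurrences.
    positions = [
        i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)
    ]
    position_set = set(positions)
    counts = [0] * (max_dist + 1)
    for p in positions:
        for d in range(1, max_dist + 1):
            if p + d in position_set:
                counts[d] += 1
    return counts


# Toy sequence: an "AAA" stretch placed every 10 bases.
toy_seq = "AAACCCCCCC" * 3
toy_counts = motif_distance_counts(toy_seq, "AAA", 12)
print(toy_counts)
```

On the toy sequence the only non-zero entry is at distance 10, which is the kind of peak a periodic placement of the motif would produce; a real analysis would compare such counts with what is expected by chance.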

Microarray experiments are affected by several sources of variability. This paper demonstrates the major role of day-to-day variability, underlines the importance of a randomized block design when processing replicates over several days to avoid systematic biases, and proposes a simple algorithm that minimizes the day dependence.

High mobility group (HMG) proteins are nuclear proteins believed to significantly affect DNA interactions by altering nucleic acid flexibility. Group B (HMGB) proteins contain HMG box domains known to bind to the DNA minor groove without sequence specificity, slightly intercalating base pairs and inducing a strong bend in the DNA helical axis. A dual-beam optical tweezers system is used to extend double-stranded DNA (dsDNA) in the absence as well as presence of a single box derivative of human HMGB2 [HMGB2(box A)] and a double box derivative of rat HMGB1 [HMGB1(box A+box B)]. The single box domain is observed to reduce the persistence length of the double helix, generating sharp DNA bends with an average bending angle of 99 ± 9° and, at very high concentrations, stabilizing dsDNA against denaturation. The double box protein contains two consecutive HMG box domains joined by a flexible tether. This protein also reduces the DNA persistence length, induces an average bending angle of 77 ± 7°, and stabilizes dsDNA at significantly lower concentrations. These results suggest that single and double box proteins increase DNA flexibility and stability, although both effects are achieved at much lower protein concentrations for the double box. In addition, at low concentrations, the single box protein can alter DNA flexibility without stabilizing dsDNA, whereas stabilization at higher concentrations is likely achieved through a cooperative binding mode.

Yeung AT, Holloway BP, Adams PS, Shipley GL. BioTechniques, 2004, 36(2): 266-70, 272, 274-5
Real-time PCR technology using dual-labeled fluorescent oligonucleotide probes allows for sensitive, specific, and quantitative determination of mRNA or DNA targets. Historically, dual-labeled probes have been the most expensive reagent in real-time PCR because of the postsynthesis high-performance liquid chromatography (HPLC) and/or gel purification steps required due to limitations in traditional synthesis chemistry. The recent availability of quencher reagents that allow the 3' quencher incorporation as part of the on-machine synthesis has presented the possibility that probes, when carefully synthesized, may be used without extensive postsynthesis purification. This would substantially reduce cost, making the synthesis of dual-labeled fluorescent probes affordable to any DNA synthesis laboratory. The Nucleic Acids Research Group (NARG) of the Association of Biomolecular Resource Facilities (ABRF) (Santa Fe, NM, USA) tested the hypothesis that now any DNA synthesis laboratory is capable of making quality dual-labeled fluorescent probes suitable for real-time PCRs without the need for postsynthesis purification. Members of the DNA synthesis community synthesized dual-labeled human beta-actin probes and submitted them for quality and functional analysis. We found that probes that were at least 20% pure had the same efficiency as those near 100% purity, but the sensitivity of the assay was reduced as the level of purity decreased.

MOTIVATION: Many heuristic algorithms have been designed to approximate P-values of DNA motifs described by position weight matrices, for evaluating their statistical significance. They often significantly deviate from the true P-value by orders of magnitude. Exact P-value computation is needed for ranking the motifs. Furthermore, surprisingly, the complexity of the problem is unknown. RESULTS: We show the problem to be NP-hard, and present MotifRank, software based on dynamic programming, to calculate exact P-values of motifs. We define the exact P-value on a general and more precise model. Asymptotically, MotifRank is faster than the best exact P-value computing algorithm, and is in fact practical. Our experiments clearly demonstrate that MotifRank significantly improves the accuracy of existing approximation algorithms. AVAILABILITY: MotifRank is available from http://bio.dlg.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
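For readers unfamiliar with dynamic programming over position weight matrices, here is a minimal textbook-style sketch. It is not MotifRank's algorithm and does not use the paper's more general model; it assumes integer-valued matrix scores and an i.i.d. background, and the names `pwm_pvalue`, `toy_pwm`, and `uniform_bg` are hypothetical.

```python
from collections import defaultdict


def pwm_pvalue(pwm, background, threshold):
    """Return P(score >= threshold) for a random sequence of length len(pwm).

    pwm: list of dicts, one per motif position, mapping base -> integer score.
    background: dict mapping base -> probability (assumed i.i.d. across positions).
    """
    # dist maps an achievable partial score to its probability.
    dist = {0: 1.0}
    for column in pwm:
        next_dist = defaultdict(float)
        for score, prob in dist.items():
            for base, base_prob in background.items():
                next_dist[score + column[base]] += prob * base_prob
        dist = next_dist
    return sum(prob for score, prob in dist.items() if score >= threshold)


# Toy two-column matrix that rewards "A" then "G".
toy_pwm = [
    {"A": 2, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 2, "T": 0},
]
uniform_bg = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(pwm_pvalue(toy_pwm, uniform_bg, 4))
```

With the uniform background only the dinucleotide AG reaches a score of 4, so the call above evaluates to 1/16. The number of distinct partial scores can grow quickly for real-valued matrices, which is one reason exact computation is harder than this toy case suggests.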

Single-stranded DNA or double-stranded DNA has the potential to adopt a wide variety of unusual duplex and hairpin motifs in the presence (trans) or absence (cis) of ligands. Several principles for the formation of those unusual structures have been established through the observation of a number of recurring structural motifs associated with different sequences. These include: (i) internal loops of consecutive mismatches can occur in a B-DNA duplex when sheared base pairs are adjacent to each other to confer extensive cross- and intra-strand base stacking; (ii) interdigitated (zipper-like) duplex structures form instead when sheared G·A base pairs are separated by one or two pairs of purine·purine mismatches; (iii) stacking is not restricted to bases; deoxyribose also exhibits the potential to stack; (iv) canonical G·C or A·T base pairs are flexible enough to exhibit considerable changes from the regular H-bonded conformation. The paired bases become stacked when bracketed by sheared G·A base pairs, or become extruded out and perpendicular to their neighboring bases in the presence of interacting drugs; (v) the purine-rich and pyrimidine-rich loop structures are notably different in nature. The purine-rich loops form compact triloop structures closed by a sheared G·A, A·A, A·C or sheared-like G(anti)·C(syn) base pair that is stacked by a single residue. On the other hand, the pyrimidine-rich loops with a thymidine in the first position exhibit no base pairing but are characterized by the folding of the thymidine residue into the minor groove to form a compact loop structure. Identification of such diverse duplex or hairpin motifs greatly enlarges the repertoire for unusual DNA structural formation.

BACKGROUND: Triplet repeat sequences are of considerable biological importance as the expansion of such tandem arrays can lead to the onset of a range of human diseases. Such sequences can self-pair via mismatch alignments to form higher order structures that have the potential to cause replication blocks, followed by strand slippage and sequence expansion. The all-purine d(GGA)n triplet repeat sequence is of particular interest because purines can align via G.G, A.A and G.A mismatch formation. RESULTS: We have solved the structure of the uniformly 13C,15N-labeled d(G1-G2-A3-G4-G5-A6-T7) sequence in 10 mM Na+ solution. This sequence adopts a novel twofold-symmetric duplex fold where interlocked V-shaped arrowhead motifs are aligned solely via interstrand G1.G4, G2.G5 and A3.A6 mismatch formation. The tip of the arrowhead motif is centered about the p-A3-p step, and symmetry-related local parallel-stranded duplex domains are formed by the G1-G2-A3 and G4-G5-A6 segments of partner strands. CONCLUSIONS: The purine-rich (GGA)n triplet repeat sequence is dispersed throughout the eukaryotic genome. Several features of the arrowhead duplex motif for the (GGA)2 triplet repeat provide a unique scaffold for molecular recognition. These include the large localized bend in the sugar-phosphate backbones, the segmental parallel-stranded alignment of strands and the exposure of the Watson-Crick edges of several mismatched bases.

The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed of regions with a different extent of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from input sequences and clustered to produce motif core regions, which are then extended and scored to generate seeded motifs. The combination of automated pattern discovery algorithms and different display tools for the evaluation and selection of results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different real yeast and human datasets proved the methodology to be a promising solution for finding seeded motifs. The MOST web tool is freely available at http://telethon.bio.unipd.it/bioinfo/MOST.

A general procedure for the cross-linking of enzyme to DNA has been developed for use as a nonradioactive probe. In this method, DNA is transaminated with diaminopropane to introduce primary amino groups into the cytosine residues. Then the amino groups are converted to thiol groups using a heterobifunctional cross-linker. The thiolated DNA is conjugated with the maleimide-introduced enzyme. With this method, alkaline phosphatase was cross-linked to a single-stranded DNA (sspUCRf1). The conjugate was able to detect 5 pg of target DNA (pUCf1 plasmid, 3.2 kbp) fixed onto the nitrocellulose membrane, using a colorimetric assay. The enzyme-conjugated DNA was applied to "the universal probe system," which consisted of two single-stranded DNA probes (a primary probe and a labeled secondary probe). Using alkaline phosphatase-conjugated sspUCRf1 DNA as the secondary probe, the c-myc gene and HBV DNA were detected effectively on Southern and dot-blot hybridization.

Vallon O. Proteins, 2000, 38(1): 95-114
We describe two new sequence motifs, present in several families of flavoproteins. The "GG motif" (RxGGRxxS/T) is found shortly after the βαβ dinucleotide-binding motif (DBM) in L-amino acid oxidases, achacin and aplysianin-A, monoamine oxidases, corticosteroid-binding proteins, and tryptophan 2-monooxygenases. Other dispersed sequence similarities between these families suggest a common origin. A GG motif is also found in protoporphyrinogen oxidase and carotenoid desaturases and, reduced to the central GG doublet, in the THI4 protein, dTDP-4-dehydrorhamnose reductase, soluble fumarate reductase, steroid dehydrogenases, Rab GDP-dissociation inhibitor, and in most flavoproteins with two dinucleotide-binding domains (glutathione reductase, glutamate synthase, flavin-containing monooxygenase, trimethylamine dehydrogenase...). In the latter families, an "ATG motif" (oxhhhATG) is found in both the FAD- and NAD(P)H-binding domains, forming the fourth beta-strand of the Rossmann fold and the connecting loop. On the basis of these and previously described motifs, we present a classification of dinucleotide-binding proteins that could also serve as an evolutionary scheme. Like the DBM, the ATG motif appears to predate the divergence of NAD(P)H- and FAD-binding proteins. We propose that flavoproteins have evolved from a well-differentiated NAD(P)H-binding protein. The bulk of the substrate-binding domain was formed by an insertion after the fourth beta-strand, either of a closely related NAD(P)H-binding domain or of a domain of completely different origin.

Branched DNA motifs can be designed to assume a variety of shapes and structures. These structures can be characterized by numerous solution techniques; the structures also can be inferred from atomic force microscopy of two-dimensional periodic arrays that the motifs form via cohesive interactions. Examples of these motifs are the DNA parallelogram, the bulged-junction DNA triangle, and the three-dimensional-double crossover (3D-DX) DNA triangle. The ability of these motifs to withstand stresses without changing geometrical structure is clearly of interest if the motif is to be used in nanomechanical devices or to organize other large chemical species. Metallic nanoparticles can be attached to DNA motifs, and the arrangement of these particles can be established by transmission electron microscopy. We have attached 5 nm or 10 nm gold nanoparticles to every vertex of DNA parallelograms, to two or three vertices of 3D-DX DNA triangle motifs, and to every vertex of bulged-junction DNA triangles. We demonstrate by transmission electron microscopy that the DNA parallelogram motif and the bulged-junction DNA triangle are deformed by the presence of the gold nanoparticles, whereas the structure of the 3D-DX DNA triangle motif appears to be minimally distorted. This method provides a way to estimate the robustness and potential utility of the many new DNA motifs that are becoming available.

"SPKK" motifs prefer to bind to DNA at A/T-rich sites.   (cited by 19 in total: 4 self-citations, 19 by others)
The termini of histone H1 and sea urchin spermatogenous H1 and H2B, which are essential for correct chromatin condensation, often contain repeats of the sequence SPK(R)K(R). A special type of beta-turn structural motif has been proposed for this sequence, and it has been shown that a segment of the sea urchin sperm H1 N terminus, which has six repeats of the motif (S6 peptide), binds to DNA and competes with the DNA binding drug Hoechst 33258. Here, we demonstrate by quantitative analysis of hydroxyl radical footprints that the synthetic oligopeptide, SPRKSPRK (S2), and the S6 peptide prefer to bind to the minor groove of DNA at the same A/T-rich sites. The locations of these binding sites are similar to Hoechst, but the sequence specificity of the oligopeptides is lower than that of Hoechst, and the detailed protection patterns differ slightly. We suggest that these small peptides and Hoechst recognize similar sequence-dependent features of the local architecture of DNA.
