Similar articles
 20 similar articles found (search time: 15 ms)
1.
Oligonucleotide fingerprinting is a powerful DNA array-based method to characterize cDNA and ribosomal RNA gene (rDNA) libraries and has many applications including gene expression profiling and DNA clone classification. We are especially interested in the latter application. A key step in the method is the cluster analysis of fingerprint data obtained from DNA array hybridization experiments. Most of the existing approaches to clustering use (normalized) real intensity values and thus do not treat positive and negative hybridization signals equally (positive signals are much more emphasized). In this paper, we consider a discrete approach. Fingerprint data are first normalized and binarized using control DNA clones. Because there may exist unresolved (or missing) values in this binarization process, we formulate the clustering of (binary) oligonucleotide fingerprints as a combinatorial optimization problem that attempts to identify clusters and resolve the missing values in the fingerprints simultaneously. We study the computational complexity of this clustering problem and a natural parameterized version and present an efficient greedy algorithm based on MINIMUM CLIQUE PARTITION on graphs. The algorithm takes advantage of some unique properties of the graphs considered here, which allow us to efficiently find the maximum cliques as well as some special maximal cliques. Our preliminary experimental results on simulated and real data demonstrate that the algorithm runs faster and performs better than some popular hierarchical and graph-based clustering methods. The results on real data from DNA clone classification also suggest that this discrete approach is more accurate than clustering methods based on real intensity values in terms of separating clones that have different characteristics with respect to the given oligonucleotide probes.
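To make the discrete formulation concrete, the following is a minimal sketch, not the authors' algorithm. It assumes each fingerprint is a string over 0, 1 and a hypothetical symbol N for an unresolved value, treats two fingerprints as compatible when they agree wherever both are resolved, and greedily grows cliques in the resulting compatibility graph. The function names are illustrative only.

```python
from typing import List

MISSING = "N"  # hypothetical symbol for an unresolved hybridization value


def compatible(a: str, b: str) -> bool:
    """True if a and b agree at every position where both are resolved."""
    return all(x == y or MISSING in (x, y) for x, y in zip(a, b))


def greedy_clique_partition(fps: List[str]) -> List[List[int]]:
    """Greedily partition fingerprint indices into pairwise-compatible groups.

    Each group is a clique in the compatibility graph. This simple heuristic
    does not guarantee a minimum number of cliques.
    """
    unassigned = list(range(len(fps)))
    clusters = []
    while unassigned:
        clique = [unassigned.pop(0)]          # seed a new clique
        for i in list(unassigned):            # iterate over a copy while removing
            if all(compatible(fps[i], fps[j]) for j in clique):
                clique.append(i)
                unassigned.remove(i)
        clusters.append(clique)
    return clusters


def consensus(fps: List[str], clique: List[int]) -> str:
    """Resolve missing values using the resolved values within one clique."""
    out = []
    for column in zip(*(fps[i] for i in clique)):
        resolved = {c for c in column if c != MISSING}
        out.append(resolved.pop() if resolved else MISSING)
    return "".join(out)


fingerprints = ["10N1", "1011", "0N00", "0100", "N011"]
clusters = greedy_clique_partition(fingerprints)
resolved = [consensus(fingerprints, c) for c in clusters]
```

Because every pair inside a clique agrees on its resolved positions, the consensus of a clique is well defined, which is how clustering and missing-value resolution can be handled together in this toy setting.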

2.
The major histocompatibility complex (MHC) class II DRB, DQB, DPB, and DOB gene clusters are shared by different eutherian orders. Such an orthologous relationship is not seen between the beta genes of birds and eutherians. A high degree of uncertainty surrounds the evolutionary relationship of marsupial class II beta sequences with eutherian beta gene families. In particular, it has been suggested that marsupials utilize the DRB gene cluster. A cDNA encoding an MHC class II beta molecule was isolated from a brushtail possum mesenteric lymph node cDNA library. This clone is most similar to Macropus rufogriseus DBB. Our analysis suggests that all known marsupial beta-chain genes, excluding DMB, fall into two separate clades, which are distinct from the eutherian DRB, DQB, DPB, or DOB gene clusters. We recommend that the DAB and DBB nomenclature be reinstated. DAB and DBB orthologs are not present in eutherians. It appears that the marsupial and eutherian lineages have retained different gene clusters following gene duplication events early in mammalian evolution.

3.
cDNA microarray technology and its applications   Total citations: 18, self-citations: 0, citations by others: 18
The cDNA microarray is the most powerful tool for studying gene expression in many different organisms. It has been successfully applied to the simultaneous expression analysis of many thousands of genes and to large-scale gene discovery, as well as polymorphism screening and mapping of genomic DNA clones. It is a high-throughput, highly parallel RNA expression assay technique that permits quantitative analysis of RNAs transcribed from both known and unknown genes. This technique provides diagnostic fingerprints by comparing gene expression patterns in normal and pathological cells, and because it can simultaneously track expression levels of many genes, it provides a source of operational context for inference and prediction about complex cell control systems. This review describes this recently developed cDNA microarray technology and its application to gene discovery and expression, and to diagnostics for certain diseases.

4.
MOTIVATION: Clustering sequences of a full-length cDNA library into alternative splice form candidates is a very important problem. RESULTS: We developed a new efficient algorithm to cluster sequences of a full-length cDNA library into alternative splice form candidates. Current clustering algorithms for cDNAs tend to produce too many clusters containing incorrect splice form candidates. Our algorithm is based on a spliced sequence alignment algorithm that considers splice sites. The spliced sequence alignment algorithm is a variant of an ordinary dynamic programming algorithm, which requires O(nm) time for checking a pair of sequences where n and m are the lengths of the two sequences. Since the time bound is too large to perform all-pair comparison for a large set of sequences, we developed new techniques to reduce the computation time without affecting the accuracy of the output clusters. Our algorithm was applied to 21 076 mouse cDNA sequences of the FANTOM 1.10 database to examine its performance and accuracy. In these experiments, we achieved about 2-12-fold speedup against a method using only a traditional hash-based technique. Moreover, without using any information of the mouse genome sequence data or any gene data in public databases, we succeeded in listing 87-89% of all the clusters that biologists have annotated manually. AVAILABILITY: We provide a web service for cDNA clustering located at https://access.obigrid.org/ibm/cluspa/, for which registration for the OBIGrid (http://www.obigrid.org) is required.
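The O(nm) cost mentioned above comes from filling one dynamic programming cell per pair of positions. The sketch below shows this for ordinary global alignment scoring only; it is not the spliced variant described in the abstract, and the scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration.

```python
def global_alignment_score(s: str, t: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Score of the best global alignment of s and t.

    The nested loops visit each of the n * m cells once, hence O(nm) time.
    Only two rows are kept in memory, so space is O(m).
    """
    n, m = len(s), len(t)
    prev = [j * gap for j in range(m + 1)]    # row for the empty prefix of s
    for i in range(1, n + 1):
        curr = [i * gap] + [0] * m
        for j in range(1, m + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[m]
```

For N sequences, an all-pairs comparison would call such a routine on the order of N² times, which is why the abstract emphasizes techniques that avoid most pairwise alignments.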

5.
The polyploid nature of wheat is a key characteristic of the plant. Full-length complementary DNAs (cDNAs) provide essential information that can be used to annotate the genes and provide a functional analysis of these genes and their products. We constructed a full-length cDNA library derived from young spikelets of common wheat, and obtained 24056 expressed sequence tags (ESTs) from both ends of the cDNA clones. These ESTs were grouped into 3605 contigs using the phrap method, representing expressed loci from each of the three genomes. Using BLAST, 3605 contigs were grouped into 1902 gene clusters, showing that loci of the three genomes are not always expressed. A homology search of these gene clusters against a wheat EST database (15964 gene clusters) and a rice full-length cDNA database (21447 gene clusters) revealed that a quarter of the wheat full-length cDNAs were novel. A protein database of Arabidopsis was used to examine the functional classification of these gene clusters. The GC-content in the 5′-UTR region of wheat cDNAs was compared to that of rice. Forty-three genes (3.5% of wheat cDNAs homologous to those of rice) possessed distinct GC-content in the 5′-UTR region, suggesting different breeding behaviors of wheat and rice.
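GC-content, the quantity compared between wheat and rice untranslated regions above, is the fraction of G and C bases in a sequence. A minimal sketch follows; ignoring ambiguous bases such as N and returning 0.0 for sequences without unambiguous bases are assumptions of this example, not choices stated in the abstract.

```python
def gc_content(seq: str) -> float:
    """Fraction of G/C among the unambiguous bases (A, C, G, T) of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0  # assumption: no unambiguous bases means no measurable GC
    return sum(base in "GC" for base in counted) / len(counted)
```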

6.
MOTIVATION: Cluster analysis of genome-wide expression data from DNA microarray hybridization studies has proved to be a useful tool for identifying biologically relevant groupings of genes and samples. In the present paper, we focus on several important issues related to clustering algorithms that have not yet been fully studied. RESULTS: We describe a simple and robust algorithm for the clustering of temporal gene expression profiles that is based on the simulated annealing procedure. In general, this algorithm guarantees to eventually find the globally optimal distribution of genes over clusters. We introduce an iterative scheme that serves to evaluate quantitatively the optimal number of clusters for each specific data set. The scheme is based on standard approaches used in regular statistical tests. The basic idea is to organize the search of the optimal number of clusters simultaneously with the optimization of the distribution of genes over clusters. The efficiency of the proposed algorithm has been evaluated by means of a reverse engineering experiment, that is, a situation in which the correct distribution of genes over clusters is known a priori. The employment of this statistically rigorous test has shown that our algorithm places more than 90% of genes into correct clusters. Finally, the algorithm has been tested on real gene expression data (expression changes during yeast cell cycle) for which the fundamental patterns of gene expression and the assignment of genes to clusters are well understood from numerous previous studies.
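A generic simulated annealing loop for assigning profiles to a fixed number of clusters can be sketched as follows. This is an illustrative toy, not the published algorithm: the cost function (within-cluster sum of squared distances), the geometric cooling schedule, the single-gene reassignment move and all parameter values are assumptions, and a finite run like this one carries no guarantee of reaching the global optimum.

```python
import math
import random
from typing import List, Sequence, Tuple


def within_cluster_cost(profiles: Sequence[Sequence[float]],
                        labels: Sequence[int], k: int) -> float:
    """Sum of squared distances from each profile to its cluster centroid."""
    dim = len(profiles[0])
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(profiles, labels) if lab == c]
        if not members:
            continue
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def anneal_clusters(profiles: Sequence[Sequence[float]], k: int,
                    init: Sequence[int], steps: int = 5000,
                    t0: float = 1.0, cooling: float = 0.999,
                    seed: int = 0) -> Tuple[List[int], float]:
    """Return the best labelling seen during annealing and its cost."""
    rng = random.Random(seed)
    labels = list(init)
    cost = within_cluster_cost(profiles, labels, k)
    best_labels, best_cost = labels[:], cost
    temp = t0
    for _ in range(steps):
        i = rng.randrange(len(profiles))
        old = labels[i]
        labels[i] = rng.randrange(k)              # propose moving one gene
        new_cost = within_cluster_cost(profiles, labels, k)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best_labels, best_cost = labels[:], cost
        else:
            labels[i] = old                       # reject the move
        temp *= cooling                           # geometric cooling
    return best_labels, best_cost
```

A call such as `anneal_clusters(profiles, 2, init=[0] * len(profiles))` starts from a single cluster; because the best state is tracked, the returned cost is never worse than that of the initial labelling.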

7.
This study established the utility of cross-species application of the cDNA microarray technique for investigating differential gene expression. Using both total RNA and mRNA samples recovered from two opossum cell lines derived from UVB-induced melanoma, we analyzed expression of ca. 4400 genes on the human DermArray DNA microarrays. The signals generated on the DermArrays were clear, strong, and reproducible. A cDNA dot blot consisting of differentially expressed genes representative of different functional clusters was used to validate the DermArray results. We also cloned a Monodelphis gene, keratin 18 (KRT18), and characterized its expression patterns in tumor samples of different progression stages. Up-regulated expression was observed for the KRT18 gene in advanced melanomas, a finding consistent with the DermArray analysis. These results provide evidence that cross-species application of cDNA microarrays is a useful strategy for investigating gene expression patterns in animal models for which species-specific cDNA microarrays are not available.

8.
Inulin-type fructans are the simplest and most studied fructans and have become increasingly popular as prebiotic health-improving compounds. A natural variation in the degree of polymerization (DP) of inulins is observed within the family of the Asteraceae. Globe thistle (Echinops ritro), artichoke (Cynara scolymus), and Viguiera discolor biosynthesize fructans with a considerably higher DP than Cichorium intybus (chicory), Helianthus tuberosus (Jerusalem artichoke), and Dahlia variabilis. The higher DP in some species can be explained by the presence of special fructan:fructan 1-fructosyl transferases (high DP 1-FFTs), different from the classical low DP 1-FFTs. Here, the RT-PCR-based cloning of a high DP 1-FFT cDNA from Echinops ritro is described, starting from peptide sequence information derived from the purified native high DP 1-FFT enzyme. The cDNA was successfully expressed in Pichia pastoris. A comparison is made between the native, heterodimeric enzyme and its recombinant, monomeric counterpart (mass fingerprints and kinetic analysis), showing that they have very similar properties. The recombinant enzyme is a functional 1-FFT lacking invertase and 1-SST activities, but shows a small intrinsic 1-FEH activity. The enzyme is capable of producing a high DP inulin pattern in vitro, similar to the one observed in vivo. Depending on conditions, the enzyme is able to produce fructo-oligosaccharides (FOS) as well. Therefore, the enzyme might be suitable for both FOS and high DP inulin production in bioreactors. Alternatively, introduction of the high DP 1-FFT gene in chicory, a crop widely used for inulin extraction, could lead to an increase in DP which is useful for a number of specific industrial applications. 1-FFT expression analysis correlates well with high DP fructan accumulation in vivo, suggesting that the enzyme is responsible for high DP fructan formation in planta.

9.
The use of hybridisation of synthetic oligonucleotides to cDNAs under high stringency to characterise gene sequences has been demonstrated by a number of groups. We have used two cDNA libraries of 9 and 12 day mouse embryos (24 133 and 34 783 clones respectively) in a pilot study to characterise expressed genes by hybridisation with 110 hybridisation probes. We have identified 33 369 clusters of cDNA clones, that ranged in representation from 1 to 487 copies (0.7%). 737 were assigned to known rodent genes, and a further 13 845 showed significant homologies. A total of 404 clusters were identified as significantly differentially represented (P < 0.01) between the two cDNA libraries. This study demonstrates the utility of the fingerprinting approach for the generation of comparative gene expression profiles through the analysis of cDNAs derived from different biological materials.

10.
Fibroblast growth factor (FGF) induces the notochord and mesenchyme in ascidian embryos, via extracellular signal-regulated kinase (ERK) that belongs to the mitogen-activated protein kinase (MAPK) family. A cDNA microarray analysis was carried out to identify genes affected by an inhibitor of MAPK/ERK kinase (MEK), U0126, in embryos of the ascidian Ciona intestinalis. Data obtained from the microarray and in situ hybridization suggest that the majority of genes are downregulated by U0126 treatment. Genes that were downregulated in U0126-treated embryos included Ci-Bra and Ci-Twist-like1, which are master regulatory genes of notochord and mesenchyme differentiation, respectively. The plasminogen mRNA was downregulated by U0126 in presumptive endoderm cells. This suggests that a MEK-mediated extracellular signal is necessary for gene expression in tissues whose specification does not depend on cell-to-cell interaction. Among 85 cDNA clusters that were not affected by U0126, 30 showed mitochondria-like mRNA localization in the nerve cord/muscle lineage blastomeres in the equatorial region. The expression level and asymmetric distribution of these mRNAs were independent of MEK signaling.

11.
MOTIVATION: Recent technological advances such as cDNA microarray technology have made it possible to simultaneously interrogate thousands of genes in a biological specimen. A cDNA microarray experiment produces a gene expression 'profile'. Often interest lies in discovering novel subgroupings, or 'clusters', of specimens based on their profiles, for example identification of new tumor taxonomies. Cluster analysis techniques such as hierarchical clustering and self-organizing maps have frequently been used for investigating structure in microarray data. However, clustering algorithms always detect clusters, even on random data, and it is easy to misinterpret the results without some objective measure of the reproducibility of the clusters. RESULTS: We present statistical methods for testing for overall clustering of gene expression profiles, and we define easily interpretable measures of cluster-specific reproducibility that facilitate understanding of the clustering structure. We apply these methods to elucidate structure in cDNA microarray gene expression profiles obtained on melanoma tumors and on prostate specimens.

12.
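Advances in open differential gene expression technologies   Total citations: 6, self-citations: 2, citations by others: 4
Since their development in the early 1990s, differential gene expression (DGE) technologies have been applied in many fields. DGE technologies with an "open" architecture require no prior biological or sequence information and can be applied to any species. This review introduces six open DGE technologies: cDNA representational difference analysis (cDNA RDA), serial analysis of gene expression (SAGE), tandem arrayed ligation of expressed sequence tags (TALEST), the earlier DGE technologies differential display (DD) and arbitrarily primed polymerase chain reaction (AP-PCR), and a patent-protected technology, GeneCalling. The technologies are compared on several important parameters. Although DD has critical weaknesses, it is still very widely used at present. cDNA RDA effectively enriches specific fragments and subtracts common sequences, and combining it with SAGE could further advance its development. TALEST and GeneCalling are relatively simple to perform and yield large amounts of data from a single experiment, but analyzing these data is cumbersome and requires additional analysis software. Finally, recent results obtained with DGE technologies are presented.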

13.
Microarray analysis of tiny amounts of RNA extracted from plant section samples prepared by laser microdissection (LM) can provide high-quality information on gene expression in specified plant cells at various stages of development. Having joined the LM-microarray analysis project, we utilized such genome-wide gene expression data from developing rice pollen cells to identify candidates for cis-regulatory elements for specific gene expression in these cells. We first found a few clusters of gene expression patterns based on the data from LM-microarrays. On one gene cluster in which the members were specifically expressed at the bicellular and mature pollen mitotic stages, we identified gene cluster fingerprints (GCFs), each of which consists of a short nucleotide representing the gene cluster. We expected that these GCFs would contain cis-regulatory elements for stage- and tissue-specific gene expression, and we further identified groups of GCFs with common core sequences. Some criteria, such as frequency of occurrence in the gene cluster in contrast to the total tested gene set, flanking sequence preference and distribution of combined GCF sets in the gene regions, allowed us to limit candidates for cis-regulatory sequences for specific gene expression in rice pollen cells to at least 20 sets of combined GCFs. This approach should provide a general purpose algorithm for identifying short nucleotides associated with specific gene expression.
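One of the criteria named above, frequency of occurrence in the gene cluster in contrast to the total tested gene set, can be illustrated with a simple k-mer count. The sketch below is a hypothetical simplification, not the published GCF procedure: the function names, the presence/absence counting per sequence, the ratio threshold `min_ratio`, and the assumption that the cluster sequences are a subset of the full set are all choices made for this example.

```python
from collections import Counter
from typing import Dict, List


def kmer_presence(seqs: List[str], k: int) -> Counter:
    """For each k-mer, count the number of sequences that contain it."""
    counts: Counter = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def enriched_kmers(cluster_seqs: List[str], all_seqs: List[str],
                   k: int, min_ratio: float = 1.5) -> Dict[str, float]:
    """K-mers found more often in the cluster than in the full gene set.

    Returns a mapping from k-mer to the ratio of its occurrence frequency
    in the cluster over its frequency in the full set.
    """
    in_cluster = kmer_presence(cluster_seqs, k)
    in_all = kmer_presence(all_seqs, k)
    result = {}
    for kmer, count in in_cluster.items():
        f_cluster = count / len(cluster_seqs)
        f_all = in_all[kmer] / len(all_seqs)
        if f_all > 0 and f_cluster / f_all >= min_ratio:
            result[kmer] = f_cluster / f_all
    return result


cluster = ["AAGTCA", "TTGTCA"]                 # toy upstream sequences
everything = cluster + ["AAGTTT", "CCCCCC"]    # toy full tested gene set
hits = enriched_kmers(cluster, everything, k=4)
```

In practice a statistical test and the other criteria mentioned in the abstract (flanking sequence preference, positional distribution) would be needed before treating an enriched k-mer as a regulatory candidate.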

14.
A novel factor required for the SUMO1/Smt3 conjugation of yeast septins   Total citations: 3, self-citations: 0, citations by others: 3
Takahashi Y, Toh-e A, Kikuchi Y. Gene 2001, 271(2): 223-231

15.
M. Kuhner, S. Watts, W. Klitz, G. Thomson, R. S. Goodenow. Genetics 1990, 126(4): 1115-1126
In order to better understand the role of gene conversion in the evolution of the class I gene family of the major histocompatibility complex (MHC), we have used a computer algorithm to detect clustered sequence similarities among 24 class I DNA sequences from the H-2, Qa, and Tla regions of the murine MHC. Thirty-four statistically significant clusters were detected; individual analysis of the clusters suggested at least 25 past gene conversion or recombination events. These clusters are comparable in size to the conversions observed in the spontaneously occurring H-2K(bm) and H-2K(km2) mutations, and are distributed throughout all exons of the class I gene. Thus, gene conversion does not appear to be restricted to the regions of the class I gene encoding their antigen-presentation function. Moreover, both the highly polymorphic H-2 loci and the relatively monomorphic Qa and Tla loci appear to have participated as donors and recipients in conversion events. If gene conversion is not limited to the highly polymorphic loci of the MHC, then another factor, presumably natural selection, must be responsible for maintaining the observed differences in level of variation.

16.
17.
18.
Algorithms and software for support of gene identification experiments   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: Gene annotation is the final goal of gene prediction algorithms. However, these algorithms frequently make mistakes and therefore the use of gene predictions for sequence annotation is hardly possible. As a result, biologists are forced to conduct time-consuming gene identification experiments by designing appropriate PCR primers to test cDNA libraries or applying RT-PCR, exon trapping/amplification, or other techniques. This process frequently amounts to 'guessing' PCR primers on top of unreliable gene predictions and frequently leads to wasting of experimental efforts. RESULTS: The present paper proposes a simple and reliable algorithm for experimental gene identification which bypasses the unreliable gene prediction step. Studies of the performance of the algorithm on a sample of human genes indicate that an experimental protocol based on the algorithm's predictions achieves an accurate gene identification with relatively few PCR primers. Predictions of PCR primers may be used for exon amplification in preliminary mutation analysis during an attempt to identify a gene responsible for a disease. We propose a simple approach to find a short region from a genomic sequence that with high probability overlaps with some exon of the gene. The algorithm is enhanced to find one or more segments that are probably contained in the translated region of the gene and can be used as PCR primers to select appropriate clones in cDNA libraries by selective amplification. The algorithm is further extended to locate a set of PCR primers that uniformly cover all translated regions and can be used for RT-PCR and further sequencing of (unknown) mRNA.

19.
The balhimycin biosynthetic gene cluster of the glycopeptide producer Amycolatopsis balhimycina includes a gene (orf1) with unknown function. orf1 shows high similarity to the mbtH gene from Mycobacterium tuberculosis. In almost all nonribosomal peptide synthetase (NRPS) biosynthetic gene clusters, we could identify a small mbtH-like gene whose function in peptide biosynthesis is not known. The mbtH-like gene is always colocalized with the NRPS genes; however, it does not have a specific position in the gene cluster. In all glycopeptide biosynthetic gene clusters the orf1-like gene is always located downstream of the gene encoding the last module of the NRPS. We inactivated the orf1 gene in A. balhimycina by generating a deletion mutant. The balhimycin production is not affected in the orf1-deletion mutant and is indistinguishable from that of the wild type. For the first time, we show that the inactivation of an mbtH-like gene does not impair the biosynthesis of a nonribosomal peptide.

20.