Found 20 similar articles (search time: 15 ms)
1.
Background
Cardiovascular diseases are the primary cause of death worldwide; the identification of genes specifically expressed in the heart is thus of major biomedical interest. We carried out a comprehensive analysis of gene-expression profiles using expressed sequence tags (ESTs) to identify genes overexpressed in the human adult heart. The initial set of genes expressed in the heart was constructed by clustering and assembling ESTs from heart cDNA libraries. Expression profiles were then generated for each gene by counting their cognate ESTs in all libraries. Differential expression was assessed by applying a previously published statistical procedure to these profiles.
2.
3.
Jongeneel CV 《Briefings in bioinformatics》2000,1(1):76-92
4.
Lundström J Salazar-Anton F Sherwood E Andersson B Lindh J 《PLoS neglected tropical diseases》2010,4(12):e919
Background
Neurocysticercosis is a disease caused by the oral ingestion of eggs from the human parasitic worm Taenia solium. Although drugs are available, they are controversial because of their side effects and poor efficacy. An expressed sequence tag (EST) library is a method used to describe the gene expression profile and sequence of mRNA from a specific organism and stage. Such information can be used to find new targets for drug development and to gain a better understanding of the parasite's biology.
Methods and Findings
Here, an EST library consisting of 5760 sequences from the pig cysticerca stage has been constructed. In the library, 1650 unique sequences were found and, of these, 845 sequences (52%) were novel to T. solium and not identified within other EST libraries. Furthermore, 918 sequences (55%) were of unknown function. Among the 25 most frequently expressed sequences, 6 had no relevant similarity to other sequences found in the GenBank NR DNA database. A prediction of putative signal peptides was also performed, and 4 of the 25 were predicted to carry a signal peptide. The proposed vaccine and diagnostic targets T24, Tsol18/HP6 and Tso31d could also be identified among the 25 most frequently expressed sequences.
Conclusions
An EST library has been produced from pig cysticerca and analyzed. More than half of the different ESTs sequenced contained a sequence with no suggested function, and 845 novel EST sequences have been identified. The library increases the knowledge about which genes are expressed and at what level. It can also be used to study different areas of research, such as drug and diagnostic development, together with parasite fitness via, e.g., immune modulation.
5.
A.M. Martínez-Ibeas M.J. Perteguer C. González-Lanza T. Gárate M.Y. Manga-González 《Experimental parasitology》2013
Dicrocoeliosis caused by Dicrocoelium dendriticum is an important liver disease that affects ruminants all around the world. Despite the significant economic losses caused by this trematode, molecular knowledge of it is very scarce. In fact, there is no information about the parasite in the expressed sequence tag (EST) database. Furthermore, the immunological diagnosis of dicrocoeliosis remains unsatisfactory, and no recombinant proteins are available that could be tested for diagnosis. For this reason, a cDNA library was constructed for the first time with mRNA extracted from D. dendriticum adults. A random preliminary screening of 230 phage plaques from the library resulted in the identification of 173 new ESTs. The deduced proteins expressed by these genes have been described as possible vaccine targets in other trematodes, and/or as relevant diagnostic antigens. Our goal was then to identify D. dendriticum diagnostic genes to be used as recombinant antigens in the specific immunological diagnosis of the trematodoses. A D. dendriticum cDNA encoding an 8-kDa recombinant protein has been cloned, expressed in Escherichia coli and evaluated in dicrocoeliosis diagnosis using both Western blot and enzyme-linked immunosorbent assay (ELISA). The expressed recombinant molecule has demonstrated its value as a diagnostic antigen for dicrocoeliosis, able to discriminate between positives and controls on day 30 post infection. This is the first research conducted on the identification and characterization of D. dendriticum ESTs, which can serve as a starting point for future research on the immunodiagnosis and immunoprophylaxis of dicrocoeliosis.
6.
Charlotte Lindqvist Anne-Cathrine Scheen Mi-Jeong Yoo Paris Grey David G Oppenheimer James H Leebens-Mack Douglas E Soltis Pamela S Soltis Victor A Albert 《BMC plant biology》2006,6(1):16-15
Background
The endemic Hawaiian mints represent a major island radiation that likely originated from hybridization between two North American polyploid lineages. In contrast with the extensive morphological and ecological diversity among taxa, ribosomal DNA sequence variation has been found to be remarkably low. In the past few years, expressed sequence tag (EST) projects on plant species have generated a vast amount of publicly available sequence data that can be mined for simple sequence repeats (SSRs). However, these EST projects have largely focused on crop or otherwise economically important plants, and so far only few studies have been published on the use of intragenic SSRs in natural plant populations. We constructed an EST library from developing fleshy nutlets of Stenogyne rugosa principally to identify genetic markers for the Hawaiian endemic mints.
7.
MOTIVATION: Expressed sequence tag (EST) surveys are an efficient way to characterize large numbers of genes from an organism. The rate of gene discovery in an EST survey depends on the degree of redundancy of the cDNA libraries from which sequences are obtained. However, few statistical methods have been developed to assess and compare redundancies of various libraries from preliminary EST surveys. RESULTS: We consider statistics for the comparison of EST libraries based upon the frequencies with which genes occur in subsamples of reads. These measures are useful in determining which one of several libraries is more likely to yield new genes in future reads and what proportion of additional reads one might want to take from the libraries in order to be likely to obtain new genes. One approach is to compare single sample measures that have been successfully used in species estimation problems, such as coverage of a library, defined as the proportion of the library that is represented in the given sample of reads. Another single library measure is an estimate of the expected number of additional genes that will be found in a new sample of reads. We also propose statistics that jointly use data from all the libraries. Analogous formulas for coverage and the expected numbers of new genes are presented. These measures consider coverage in a single library based upon reads from all libraries and, similarly, the expected numbers of new genes that will be discovered by taking reads from all libraries with fixed proportions. Together, the statistics presented provide useful comparative measures for the libraries that can be used to guide sampling from each of the libraries to maximize the rate of gene discovery. Finally, we present tests for whether genes are equally represented or expressed in a set of libraries. Binomial and chi-square tests are presented for gene-by-gene comparisons of expression. Overall tests of the equality of proportional representation are presented and multiple comparisons issues are addressed. These methods can be used to evaluate changes in gene expression reflected in the composition of EST libraries prepared from different tissue types or cells exposed to different environmental conditions. AVAILABILITY: Software will be made available at http://www.mathstat.dal.ca/~tsusko
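The coverage measure described in this abstract can be illustrated with a small sketch. The abstract does not give its formulas, so the code below assumes the classical Good-Turing estimator from species-estimation work (coverage ≈ 1 − n1/n, where n1 is the number of genes seen exactly once among n reads); the function name and the toy read lists are hypothetical.

```python
from collections import Counter


def good_turing_coverage(gene_counts):
    """Estimate library coverage as 1 - n1/n (Good-Turing).

    gene_counts maps a gene identifier to the number of reads assigned to it.
    This is an assumed estimator, not necessarily the one used in the paper.
    """
    n = sum(gene_counts.values())  # total number of reads sampled
    if n == 0:
        raise ValueError("at least one read is required")
    n1 = sum(1 for c in gene_counts.values() if c == 1)  # genes seen once
    return 1.0 - n1 / n


# Hypothetical preliminary surveys: each list entry is the gene hit by one read.
library_a = Counter(["g1", "g1", "g1", "g2", "g2", "g3", "g4"])
library_b = Counter(["h1", "h2", "h3", "h4", "h5", "h5"])

coverage_a = good_turing_coverage(library_a)  # 1 - 2/7
coverage_b = good_turing_coverage(library_b)  # 1 - 4/6

# The library with the lower estimated coverage is the one more likely
# to yield a previously unseen gene on the next read.
better_for_discovery = "A" if coverage_a < coverage_b else "B"
```

Under this estimator, a library whose reads are mostly singletons has low coverage and therefore remains the more productive source of new genes.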
8.
9.
Wheat has been shown to have two forms of the cap-binding protein that participate in the initiation of translation. To identify cap-binding proteins from other higher plant species, the expressed sequence tag (EST) database was searched. Several rice ESTs were identified with similarity to both forms of the wheat cap-binding proteins. Two of the rice ESTs were obtained and the cDNA sequences completed. The deduced amino acid sequences of the rice cap-binding proteins are compared to the wheat cap-binding proteins and cap-binding proteins from Saccharomyces cerevisiae, Drosophila melanogaster, Xenopus laevis and human.
10.
Expressed sequence tags (ESTs) are randomly sequenced cDNA clones. Currently, nearly 3 million human and 2 million mouse ESTs provide valuable resources that enable researchers to investigate the products of gene expression. The EST databases have proven to be useful tools for detecting homologous genes, for exon mapping, revealing differential splicing, etc. With the increasing availability of large amounts of poorly characterised eukaryotic (notably human) genomic sequence, ESTs have now become a vital tool for gene identification, sometimes yielding the only unambiguous evidence for the existence of a gene expression product. However, BLAST-based Web servers available to the general user have not kept pace with these developments and do not provide appropriate tools for querying EST databases with large highly spliced genes, often spanning 50 000-100 000 bases or more. Here we describe Gene2EST (http://woody.embl-heidelberg.de/gene2est/), a server that brings together a set of tools enabling efficient retrieval of ESTs matching large DNA queries and their subsequent analysis. RepeatMasker is used to mask dispersed repetitive sequences (such as Alu elements) in the query, BLAST2 for searching EST databases and Artemis for graphical display of the findings. Gene2EST combines these components into a Web resource targeted at the researcher who wishes to study one or a few genes to a high level of detail.
11.
An expressed sequence tag (EST) data mining strategy succeeding in the discovery of new G-protein coupled receptors (cited 7 times in total: 0 self-citations, 7 citations by others)
We have developed a comprehensive expressed sequence tag database search method and used it for the identification of new members of the G-protein coupled receptor superfamily. Our approach proved to be especially useful for the detection of expressed sequence tag sequences that do not encode conserved parts of a protein, making it an ideal tool for the identification of members of divergent protein families or of protein parts without conserved domain structures in the expressed sequence tag database. At least 14 of the expressed sequence tags found with this strategy are promising candidates for new putative G-protein coupled receptors. Here, we describe the sequence and expression analysis of five new members of this receptor superfamily, namely GPR84, GPR86, GPR87, GPR90 and GPR91. We also studied the genomic structure and chromosomal localization of the respective genes by applying in silico methods. A cluster of six closely related G-protein coupled receptors was found on human chromosome 3q24-3q25. It consists of four orphan receptors (GPR86, GPR87, GPR91, and H963), the purinergic receptor P2Y1, and the uridine 5'-diphosphoglucose receptor KIAA0001. It seems likely that these receptors evolved from a common ancestor and therefore might have related ligands. In conclusion, we describe a data mining procedure that proved to be useful for the identification and first characterization of new genes and is readily applicable to other gene families.
12.
Casadei R Piovesan A Vitale L Facchin F Pelleri MC Canaider S Bianconi E Frabetti F Strippoli P 《Genomics》2012,100(2):125-130
The "5' end mRNA artifact" issue refers to the incorrect assignment of the first AUG codon in an mRNA, due to the incomplete determination of its 5' end sequence. We performed a systematic identification of coding regions at the 5' end of all human known mRNAs, using an automated expressed sequence tag (EST)-based approach. Following parsing of more than 7 million BLAT alignments, we found 477 human loci, out of 18,665 analyzed, in which an extension of the mRNA 5' coding region was identified. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for GNB2L1, QARS and TDP2 cDNAs, and the consequences for the functional studies of these loci are discussed. We also generated a list of 20,775 human mRNAs where the presence of an in-frame stop codon upstream of the known start codon indicates completeness of the coding sequence at 5' in the current form.
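The completeness criterion mentioned at the end of this abstract (an in-frame stop codon upstream of the annotated start codon) can be sketched as follows. This is an illustrative check under simple assumptions, not the authors' pipeline: the function name, the 0-based `start_index` argument, and the toy sequences are all hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_upstream_inframe_stop(mrna, start_index):
    """Return True if a stop codon lies upstream of the start codon at
    start_index (0-based) and in the same reading frame.

    Such a stop codon means the open reading frame cannot extend further
    in the 5' direction, so the annotated start is not a truncation artifact.
    """
    seq = mrna.upper().replace("U", "T")  # accept RNA or DNA alphabets
    if seq[start_index:start_index + 3] != "ATG":
        raise ValueError("start_index does not point at an ATG/AUG codon")
    pos = start_index - 3
    while pos >= 0:  # walk backwards one codon at a time
        if seq[pos:pos + 3] in STOP_CODONS:
            return True
        pos -= 3
    return False


# Toy sequences: two leader bases, two upstream codons, the start codon, one codon.
complete = "GG" + "TAG" + "CCC" + "ATG" + "GCC"    # upstream in-frame TAG
uncertain = "GG" + "CAG" + "CCC" + "ATG" + "GCC"   # no upstream in-frame stop

complete_flag = has_upstream_inframe_stop(complete, 8)
uncertain_flag = has_upstream_inframe_stop(uncertain, 8)
```

A `False` result does not prove that the 5' end is truncated; it only means this criterion cannot confirm completeness, which is the situation in which EST evidence for a 5' extension becomes informative.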
13.
Expressed sequence tag (EST) databases contain a significant number (5-20%) of reversed, antisense, cDNA sequences that can be recognized by the label "reversed clone: similarity on wrong strand" in the annotations to the sequence. Despite this high number of altered sequences, no attempt has been made to explain the alteration in molecular terms, or to evaluate their effect on the quality of the information curated in EST databases. In this paper we try to explain how these altered sequences originate, and propose a plausible mechanism: a "double priming" of the first strand oligo-dT primer at both ends of nascent cDNAs. In this way, a symmetrical cDNA intermediate is generated, an intermediate that can be cloned after partial digestion with the restriction enzyme used for the directional cloning. Furthermore, when "secondary" priming takes place inside the cDNA, the chain synthesized is prone to be truncated prematurely, with the subsequent loss of upstream information. One of the most subtle effects of this cloning alteration is the generation of virtual open reading frames (ORFs) in sequences with no homologues available for comparison. Nevertheless, and according to our model and our data, the "double priming mechanism" does not shift the ORF affected, so antisense sequences should be considered as normal ones after a simple transformation into their inverse-complementary forms.
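The "simple transformation" the abstract refers to is the reverse complement of the sequence. A minimal sketch in plain Python follows; the function name and example sequence are illustrative, and ambiguity codes other than N are not handled.

```python
# Translation table mapping each base to its complement (case preserved).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the inverse-complementary form of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# A hypothetical EST stored on the wrong strand and its corrected form.
reversed_clone = "TTAGGCCAT"
sense_strand = reverse_complement(reversed_clone)  # "ATGGCCTAA"
```

Applying the function twice returns the original sequence, so the correction is lossless.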
14.
The ridgetail white prawn Exopalaemon carinicauda is one of the most important commercial species in eastern China. However, little information on immune genes in E. carinicauda has been reported. To identify distinctive genes associated with immunity, an expressed sequence tag (EST) library was constructed from hemocytes of E. carinicauda. A total of 3411 clones were sequenced, yielding 2853 ESTs with an average sequence length of 436 bp. The cluster and assembly analysis yielded 1053 unique sequences, including 329 contigs and 724 singletons. BLAST analysis identified 593 (56.3%) of the unique sequences as orthologs of genes from other organisms (E-value < 1e-5). Based on COG and Gene Ontology (GO), the 593 unique sequences were classified. Through comparison with previous studies, 153 genes assembled from 367 ESTs have been identified as possibly involved in defense or immune functions. These genes fall into seven categories according to their putative functions in the shrimp immune system: antimicrobial peptides, prophenoloxidase activating system, antioxidant defense systems, chaperone proteins, clottable proteins, pattern recognition receptors and other immune-related genes. According to EST abundance, the major immune-related genes were thioredoxin (141, 4.94% of all ESTs) and calmodulin (14, 0.49% of all ESTs). The EST sequences of E. carinicauda hemocytes provide important information about the immune system and lay the groundwork for the development of molecular markers related to disease resistance in prawn species.
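The summary figures in this abstract are related by simple arithmetic, which the short sketch below recomputes. All input numbers are taken from the abstract; the variable and function names are illustrative only.

```python
# Counts reported in the abstract.
total_ests = 2853
contigs = 329
singletons = 724
annotated_unique = 593
thioredoxin_ests = 141
calmodulin_ests = 14


def percent(part, whole, digits):
    """Percentage of part in whole, rounded to the given number of decimals."""
    return round(100.0 * part / whole, digits)


# Unique sequences are the contigs plus the singletons.
unique_sequences = contigs + singletons

annotated_pct = percent(annotated_unique, unique_sequences, 1)
thioredoxin_pct = percent(thioredoxin_ests, total_ests, 2)
calmodulin_pct = percent(calmodulin_ests, total_ests, 2)
```

The recomputed values agree with the reported 1053 unique sequences, 56.3%, 4.94% and 0.49%.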
15.
16.
Estimation of population heterozygosity and library construction-induced mutation rate from expressed sequence tag collections
Unigene alignments obtained from cDNA libraries made using multiple individuals are not currently used to estimate population heterozygosity, as they are known to harbor mutations created during library construction. We describe an estimator of population heterozygosity that utilizes only SNPs unlikely to be library construction artifacts.
17.
Kim S Lewers Chris A Saski Brandon J Cuthbertson David C Henry Meg E Staton Dorrie S Main Anik L Dhanaraj Lisa J Rowland Jeff P Tomkins 《BMC plant biology》2008,8(1):1-8
Background
The recent development of novel repeat-fruiting types of blackberry (Rubus L.) cultivars, combined with a long history of morphological marker-assisted selection for thornlessness by blackberry breeders, has given rise to increased interest in using molecular markers to facilitate blackberry breeding. Yet no genetic maps, molecular markers, or even sequences exist specifically for cultivated blackberry. The purpose of this study is to begin development of these tools by generating and annotating the first blackberry expressed sequence tag (EST) library, designing primers from the ESTs to amplify regions containing simple sequence repeats (SSR), and testing the usefulness of a subset of the EST-SSRs with two blackberry cultivars.
Results
A cDNA library of 18,432 clones was generated from expanding leaf tissue of the cultivar Merton Thornless, a progenitor of many thornless commercial cultivars. Among the most abundantly expressed of the 3,000 genes annotated were those involved with energy, cell structure, and defense. From individual sequences containing SSRs, 673 primer pairs were designed. Of a randomly chosen set of 33 primer pairs tested with two blackberry cultivars, 10 detected an average of 1.9 polymorphic PCR products.
Conclusion
This rate predicts that this library may yield as many as 940 SSR primer pairs detecting 1,786 polymorphisms. This may be sufficient to generate a genetic map that can be used to associate molecular markers with phenotypic traits, making possible molecular marker-assisted breeding to complement existing morphological marker-assisted breeding in blackberry.
18.
The availability of large expressed sequence tag (EST) databases has led to a revolution in the way new genes are identified. Mining of these databases using known protein sequences as queries is a powerful technique for discovering orthologous and paralogous genes. The scientist is often confronted, however, by an enormous amount of search output owing to the inherent redundancy of EST data. In addition, high search sensitivity often cannot be achieved using only a single member of a protein superfamily as a query. In this paper a technique for addressing both of these issues is described. Assembled EST databases are queried with every member of a protein superfamily, the results are integrated and false positives are pruned from the set. The result is a set of assemblies enriched in members of the protein superfamily under consideration. The technique is applied to the G protein-coupled receptor (GPCR) superfamily in the construction of a GPCR Resource. A novel full-length human GPCR identified from the GPCR Resource is presented, illustrating the utility of the method.
19.
20.
Navajas-Pérez R Robles F Molina-Luzón MJ De La Herrán R Alvarez-Dios JA Pardo BG Vera M Bouza C Martínez P 《Molecular ecology resources》2012,12(4):706-716
In this study, we identified and characterized 160 microsatellite loci from an expressed sequence tag (EST) database generated from immune-related organs of turbot (Scophthalmus maximus). A final set of 83 new polymorphic microsatellites was validated after the analysis of 40 individuals of Atlantic origin, including both wild and farmed individuals. The allele number and the expected heterozygosity ranged from 2 to 18 and from 0.021 to 0.951, respectively. Evidence of null alleles at moderate to high frequencies was detected at six loci using population data. None of the analysed loci showed deviations from Mendelian segregation after the analysis of five full-sib families including approximately 92 individuals/family. The markers are used to consolidate the turbot genetic map, and because they are mostly EST-derived, they will be very useful for comparative genomic studies within flatfishes and with model fish species. Using an in silico approach, we detected significant homologies of microsatellite sequences with the EST databases of the flatfish species with the highest genomic resources (Senegalese sole, Atlantic halibut, bastard halibut) in 31% of these turbot markers. The conservation of these microsatellites within Pleuronectiformes will pave the way for anchoring genetic maps of different species and identifying genomic regions related to productive traits.
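For readers unfamiliar with the expected heterozygosity reported in this abstract, a minimal sketch of the textbook definition (He = 1 − Σ p_i², with p_i the frequency of allele i) follows. The abstract does not say which estimator the authors used (a small-sample correction is also common), so this formula is an assumption, and the allele counts are hypothetical.

```python
def expected_heterozygosity(allele_counts):
    """Expected heterozygosity at one locus: 1 minus the sum of squared
    allele frequencies. No small-sample correction is applied."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no alleles were observed")
    return 1.0 - sum((count / total) ** 2 for count in allele_counts.values())


# Hypothetical allele counts at two loci scored in 40 diploid individuals
# (80 allele copies each); keys are allele sizes in base pairs.
balanced_locus = {"152": 40, "156": 40}   # two equally frequent alleles
skewed_locus = {"201": 79, "205": 1}      # one rare allele

he_balanced = expected_heterozygosity(balanced_locus)
he_skewed = expected_heterozygosity(skewed_locus)
```

Loci with many alleles at similar frequencies approach the upper end of the reported range, while a locus dominated by a single allele falls near the lower end.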