Similar literature
 Found 20 similar articles; search took 46 ms
1.
We have developed a functional genomics tool to identify the subset of cDNAs encoding secreted and membrane-bound proteins within a library (the ‘secretome’). A Sindbis virus replicon was engineered such that the envelope protein precursor no longer enters the secretory pathway. cDNA fragments were fused to the mutant precursor and expression screened for their ability to restore membrane localization of envelope proteins. In this way, recombinant replicons were released within infectious viral particles only if the cDNA fragment they contain encodes a secretory signal. By using engineered viral replicons to selectively export cDNAs of interest in the culture medium, the methodology reported here efficiently filters genetic information in mammalian cells without the need to select individual clones. This adaptation of the ‘signal trap’ strategy is highly sensitive (1/200 000) and efficient. Indeed, of the 2546 inserts that were retrieved after screening various libraries, more than 97% contained a putative signal peptide. These 2473 clones encoded 419 unique cDNAs, of which 77% were previously annotated. Of the 94 cDNAs encoding proteins of unknown function, 24% either had no match in databases or contained a secretory signal that could not be predicted from electronic data.

2.
3.
Nucleic acids are molecules of choice for both established and emerging nanoscale technologies. These technologies benefit from large functional densities of ‘DNA processing elements’ that can be readily manufactured. To achieve the desired functionality, polynucleotide sequences are currently designed by a process that involves tedious and laborious filtering of potential candidates against a series of requirements and parameters. Here, we present a completely novel methodology for the rapid rational design of large sets of DNA sequences. This method allows for the direct implementation of very complex and detailed requirements for the generated sequences, thus avoiding ‘brute force’ filtering. At the same time, these sequences have narrow distributions of melting temperatures. The molecular part of the design process can be done without computer assistance, using an efficient ‘human engineering’ approach by drawing a single blueprint graph that represents all generated sequences. Moreover, the method eliminates the necessity for extensive thermodynamic calculations. Melting temperature can be calculated only once (or not at all). In addition, the isostability of the sequences is independent of the selection of a particular set of thermodynamic parameters. Applications are presented for DNA sequence designs for microarrays, universal microarray zip sequences and electron transfer experiments.

4.
Gene set-based analysis of genome-wide association study (GWAS) data has recently emerged as a useful approach to examine the joint effects of multiple risk loci in complex human diseases or phenotypes. Dental caries is a common, chronic, and complex disease leading to a decrease in quality of life worldwide. In this study, we applied the approaches of gene set enrichment analysis to a major dental caries GWAS dataset, which consists of 537 cases and 605 controls. Using four complementary gene set analysis methods, we analyzed 1331 Gene Ontology (GO) terms collected from the Molecular Signatures Database (MSigDB). Setting the false discovery rate (FDR) threshold at 0.05, we identified 13 significantly associated GO terms. Additionally, 17 terms were further included as marginally associated because they were top ranked by each method, although their FDR was higher than 0.05. In total, we identified 30 promising GO terms, including ‘Sphingoid metabolic process,’ ‘Ubiquitin protein ligase activity,’ ‘Regulation of cytokine secretion,’ and ‘Ceramide metabolic process.’ These GO terms encompass broad functions that potentially interact and contribute to the oral immune response related to caries development, which have not been reported in the standard single marker based analysis. Collectively, our gene set enrichment analysis provided complementary insights into the molecular mechanisms and polygenic interactions in dental caries, revealing promising association signals that could not be detected through single marker analysis of GWAS data.
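The abstract above applies an FDR threshold of 0.05 but does not say which FDR procedure was used. As an illustration only, the following minimal sketch assumes the common Benjamini-Hochberg step-up procedure; the function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where a hypothesis is rejected.

    Assumption: Benjamini-Hochberg step-up control of the FDR at level
    alpha. The abstract does not state which FDR method the authors used.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for four gene sets.
    print(benjamini_hochberg([0.02, 0.03, 0.035, 0.9]))
```

Because the procedure is step-up, a term can be declared significant even when its own p-value exceeds its rank threshold, as long as a term ranked below it passes.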

5.
Phytoplasmas (‘Candidatus Phytoplasma’ spp.) are insect-vectored bacteria that infect a wide variety of plants, including many agriculturally important species. The infections can cause devastating yield losses by inducing morphological changes that dramatically alter inflorescence development. Detection of phytoplasma infection typically utilizes sequences located within the 16S–23S rRNA-encoding locus, and these sequences are necessary for strain identification by currently accepted standards for phytoplasma classification. However, these methods can generate PCR products >1400 bp that are less divergent in sequence than protein-encoding genes, limiting strain resolution in certain cases. We describe a method for accessing the chaperonin-60 (cpn60) gene sequence from a diverse array of ‘Ca. Phytoplasma’ spp. Two degenerate primer sets were designed based on the known sequence diversity of cpn60 from ‘Ca. Phytoplasma’ spp. and used to amplify cpn60 gene fragments from various reference samples and infected plant tissues. Forty-three cpn60 sequences were thereby determined. The cpn60 PCR-gel electrophoresis method was highly sensitive compared to 16S–23S-targeted PCR-gel electrophoresis. The topology of a phylogenetic tree generated using cpn60 sequences was congruent with that reported for 16S rRNA-encoding genes. The cpn60 sequences were used to design a hybridization array using oligonucleotide-coupled fluorescent microspheres, providing rapid diagnosis and typing of phytoplasma infections. The oligonucleotide-coupled fluorescent microsphere assay revealed samples that were infected simultaneously with two subtypes of phytoplasma. These tools were applied to show that two host plants, Brassica napus and Camelina sativa, displayed different phytoplasma infection patterns.

6.
7.
Recombinant protein translation in Escherichia coli may be limited by stable (i.e. low free energy) secondary structures in the mRNA translation initiation region. To circumvent this issue, we have set up a computer tool called ‘ExEnSo’ (Expression Enhancer Software) that generates a random library of 8192 sequences, calculates the free energy of secondary structures of each sequence in the -70/+96 region (base 1 is the translation initiation codon), and then selects the sequence having the highest free energy. The software uses this ‘optimized’ sequence to create a 5′ primer that can be used in PCR experiments to amplify the coding sequence of interest prior to sub-cloning into a prokaryotic expression vector. In this article, we report how ExEnSo was set up and the results obtained with nine coding sequences with low expression levels in E. coli. The free energy of the -70/+96 region of all these coding sequences was increased compared to the non-optimized sequences. Moreover, the protein expression of eight out of nine of these coding sequences was increased in E. coli, indicating a good correlation between in silico and in vivo results. ExEnSo is available as a free online tool.

8.
We present a novel analysis of compositional order (CO) based on the occurrence of Frequent amino-acid Triplets (FTs) that appear much more than random in protein sequences. The method captures all types of proteomic compositional order including single amino-acid runs, tandem repeats, periodic structure of motifs and otherwise low complexity amino-acid regions. We introduce new order measures, distinguishing between ‘regularity’, ‘periodicity’ and ‘vocabulary’, to quantify these phenomena and to facilitate the identification of evolutionary effects. Detailed analysis of representative species across the tree-of-life demonstrates that CO proteins exhibit numerous functional enrichments, including a wide repertoire of particular patterns of dependencies on regularity and periodicity. Comparison between human and mouse proteomes further reveals the interplay of CO with evolutionary trends, such as faster substitution rate in mouse leading to decrease of periodicity, while innovation along the human lineage leads to larger regularity. Large-scale analysis of 94 proteomes leads to systematic ordering of all major taxonomic groups according to FT-vocabulary size. This is measured by the count of Different Frequent Triplets (DFT) in proteomes. The latter provides a clear hierarchical delineation of vertebrates, invertebrates, plants, fungi and prokaryotes, with thermophiles showing the lowest level of FT-vocabulary. Among eukaryotes, this ordering correlates with phylogenetic proximity. Interestingly, in all kingdoms CO accumulation in the proteome has universal characteristics. We suggest that CO is a genomic-information correlate of both macroevolution and various protein functions. The results indicate a mechanism of genomic ‘innovation’ at the peptide level, involved in protein elongation, shaped in a universal manner by mutational and selective forces.

9.
Parallel analysis of RNA ends (PARE) is a technique utilizing high-throughput sequencing to profile uncapped, mRNA cleavage or decay products on a genome-wide basis. Tools currently available to validate miRNA targets using PARE data employ only annotated genes, whereas important targets may be found in unannotated genomic regions. To handle such cases and to scale to the growing availability of PARE data and genomes, we developed a new tool, ‘sPARTA’ (small RNA-PARE target analyzer) that utilizes a built-in, plant-focused target prediction module (aka ‘miRferno’). sPARTA not only exhibits an unprecedented gain in speed but also shows greater predictive power by validating more targets, compared to a popular alternative. In addition, the novel ‘seed-free’ mode, optimized to find targets irrespective of complementarity in the seed-region, identifies novel intergenic targets. To fully capitalize on the novelty and strengths of sPARTA, we developed a web resource, ‘comPARE’, for plant miRNA target analysis; this facilitates the systematic identification and analysis of miRNA-target interactions across multiple species, integrated with visualization tools. This collation of high-throughput small RNA and PARE datasets from different genomes further facilitates re-evaluation of existing miRNA annotations, resulting in a ‘cleaner’ set of microRNAs.

10.
A common practice in computational genomic analysis is to use a set of ‘background’ sequences as negative controls for evaluating the false-positive rates of prediction tools, such as gene identification programs and algorithms for detection of cis-regulatory elements. Such ‘background’ sequences are generally taken from regions of the genome presumed to be intergenic, or generated synthetically by ‘shuffling’ real sequences. This last method can lead to underestimation of false-positive rates. We developed a new method for generating artificial sequences that are modeled after real intergenic sequences in terms of composition, complexity and interspersed repeat content. These artificial sequences can serve as an inexhaustible source of high-quality negative controls. We used artificial sequences to evaluate the false-positive rates of a set of programs for detecting interspersed repeats, ab initio prediction of coding genes, transcribed regions and non-coding genes. We found that RepeatMasker is more accurate than PClouds, Augustus has the lowest false-positive rate of the coding gene prediction programs tested, and Infernal has a low false-positive rate for non-coding gene detection. A web service, source code and the models for human and many other species are freely available at http://repeatmasker.org/garlic/.

11.
Development of a new methodology to create protein libraries, which enable the exploration of global protein space, is an exciting challenge. In this study we have developed random multi-recombinant PCR (RM-PCR), which permits the shuffling of several DNA fragments without homologous sequences. In order to evaluate this methodology, we applied it to create two different combinatorial DNA libraries. For the construction of a ‘random shuffling library’, RM-PCR was used to shuffle six DNA fragments each encoding 25 amino acids; this affords many different fragment sequences in which every position has an equal probability of encoding any of the six blocks. For the construction of the ‘alternative splicing library’, RM-PCR was used to perform different alternative splicings at the DNA level, which also yields different block sequences. DNA sequencing of the RM-PCR products in both libraries revealed that most of the sequences were quite different, and had a long open reading frame without a frame shift or stop codon. Furthermore, no distinct bias among blocks was observed. Here we describe how to use RM-PCR for the construction of combinatorial DNA libraries, which encode protein libraries that would be suitable for selection experiments in the global protein space.

12.
We have carried out a systematic computational analysis on a representative dataset of proteins of known three-dimensional structure, in order to evaluate whether it would be possible to ‘swap’ certain short peptide sequences in naturally occurring proteins with their corresponding ‘inverted’ peptides and generate ‘artificial’ proteins that are predicted to retain a native-like protein fold. The analysis of 3,967 representative proteins from the Protein Data Bank revealed 102,677 unique identical inverted peptide sequence pairs that vary in sequence length between 5–12 and 18 amino acid residues. Our analysis illustrates with examples that such ‘artificial’ proteins may be generated by identifying peptides with ‘similar structural environment’ and by using comparative protein modeling and validation studies. Our analysis suggests that natural proteins may be tolerant to accommodating such peptides.

13.
Construction of a human full-length cDNA bank   Total citations: 14, self-citations: 0, citations by others: 14
We aimed to construct a full-length cDNA bank from an entire set of human genes and to analyze the function of a protein encoded by each cDNA. To achieve this purpose, a multifunctional phagemid shuttle vector, pKA1, was constructed for preparing a high-quality cDNA library composed of full-length cDNA clones which can be sequenced and expressed in vitro and in mammalian cells without subcloning the cDNA fragment into other vectors. Using this as a vector primer, we have prepared a prototype of the bank composed of full-length cDNAs encoding 236 human proteins whose amino acid sequences are identical or similar to known proteins. Most cDNAs contain a putative cap site sequence, some of which show a pyrimidine-rich conserved sequence exhibiting polymorphism. It was confirmed that the vector permits efficient in vitro translation, expression in mammalian cells and the preparation of nested deletion mutants.

14.
Using an in vitro selection, we have obtained oligonucleotide probes with high discriminatory power against multiple, similar nucleic acid sequences, which is often required in diagnostic applications for simultaneous testing of such sequences. We have tested this approach, referred to as iterative hybridizations, by selecting probes against six 22-nt-long sequence variants representing human papillomavirus (HPV). We have obtained probes that efficiently discriminate between HPV types that differ by 3–7 nt. The probes were found effective in recognizing HPV sequences of types 6, 11, 16 and 18 and a pair of types 31 and 33, either when immobilized on a solid support or in a reverse configuration, as well as in discriminating HPV types from clinical samples. This methodology can be extended to generate diagnostic kits that rely on nucleic acid hybridization between closely related sequences. In this approach, instead of adjusting hybridization conditions to the intended set of probe–target pairs, we ‘adjust’, through in vitro selection, the probes to the conditions we have chosen. Importantly, these conditions have to be ‘relaxed’, allowing the formation of a variety of not fully complementary complexes from which those that efficiently recognize and discriminate intended from non-intended targets can be readily selected.

15.
A meaningful set of stimuli, such as a sequence of frames from a movie, triggers a set of different experiences. By contrast, a meaningless set of stimuli, such as a sequence of ‘TV noise’ frames, always triggers the same experience (of seeing ‘TV noise’) even though the stimuli themselves are as different from each other as the movie frames. We reasoned that the differentiation of cortical responses underlying the subject’s experiences, as measured by Lempel-Ziv complexity (incompressibility) of functional MRI images, should reflect the overall meaningfulness of a set of stimuli for the subject, rather than differences among the stimuli. We tested this hypothesis by quantifying the differentiation of brain activity patterns in response to a movie sequence, to the same movie scrambled in time, and to ‘TV noise’, where the pixels from each movie frame were scrambled in space. While overall cortical activation was strong and widespread in all conditions, the differentiation (Lempel-Ziv complexity) of brain activation patterns was correlated with the meaningfulness of the stimulus set, being highest in the movie condition, intermediate in the scrambled movie condition, and minimal for ‘TV noise’. Stimulus set meaningfulness was also associated with higher information integration among cortical regions. These results suggest that the differentiation of neural responses can be used to assess the meaningfulness of a given set of stimuli for a given subject, without the need to identify the features and categories that are relevant to the subject, or the precise location of selective neural responses.
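To make the notion of Lempel-Ziv complexity concrete, here is a minimal sketch of the classic Lempel-Ziv (1976) phrase count on a symbol string. It is an illustration only: the abstract does not describe how the fMRI images were binarized or compressed, so the function name `lz76_complexity` and the string inputs are assumptions, not the authors' pipeline.

```python
def lz76_complexity(s):
    """Count the phrases in the Lempel-Ziv (1976) parsing of string s.

    Each phrase is the shortest substring starting at the current position
    that cannot be copied from the text preceding its last character.
    A trailing phrase that is still copyable counts as one phrase.
    """
    i, n, phrases = 0, len(s), 0
    while i < n:
        length = 1
        # Extend the phrase while it still occurs earlier in the text
        # (overlap with the phrase itself is allowed, as in LZ76).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases


if __name__ == "__main__":
    # Textbook example: parsed as 0 | 001 | 10 | 100 | 1000 | 101.
    print(lz76_complexity("0001101001000101"))
    # A constant sequence is highly compressible, so its count stays low.
    print(lz76_complexity("0000"))
```

A highly repetitive input yields few phrases and a varied one yields many, which is the sense in which the measure tracks incompressibility.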

16.
The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D. melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the ‘between gene’ GC content heterogeneity, which is linked to ‘isochores’, is a principal factor associated with the bias in substitution patterns in human, ‘within gene’ heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed.

17.
Genomic resources such as single nucleotide polymorphisms (SNPs), insertions and deletions (InDels) and simple sequence repeats (SSRs) are essential for crop improvement and better utilization in genetic breeding. However, the resources for the sacred lotus (Nelumbo nucifera Gaertn.) are still limited. In the present study, to dissect large-scale genomic molecular marker resources for sacred lotus, we re-sequenced a Thailand sacred lotus cultivar ‘Chiang Mai wild lotus’ and compared it with the reported lotus genome ‘Middle lake wild lotus’. A total of 3,180,059 SNPs, 328,251 InDels and 14,191 SVs were found between the two genomes. The functional impact analyses of these SNPs indicated that they may be involved in metabolic processes, binding, catalytic activity, etc. Mining the genome sequences for SSRs showed that 191,657 SSRs were identified with a frequency of one SSR per 4.23 kb and 103,656 SSR primer pairs were designed. Furthermore, 14,502 EST-SSRs were also identified using the available RNA-seq data in the NCBI. A subset of 150 SSRs (genomic and EST-SSRs) was randomly selected for validation and genetic diversity analysis. The genotypes could be easily distinguished using these SSR markers and the ‘Chiang Mai wild lotus’ was obviously differentiated from the other Chinese accessions. This study provides considerable amounts of genomic resources and markers for the quantitative trait locus (QTL) identification and molecular selection of the species, which could have a potential role in various applications in sacred lotus breeding.

18.
We have developed a novel cost-effective procedure, namely ‘chemical nanoprinting’, for oligonucleotide or cDNA chip manufacture. In this thermo-controlled process, the oligonucleotides, covalently attached to a highly loaded ‘master-chip’ through disulfide bonds, are chemically transferred to the acrylamide layer mounted on a ‘print-chip’. It is demonstrated here that multiple identical print-chips can be produced from a single master-chip. This duplication process is a few hundred times faster than any existing method, and the speed of the process and the cost incurred are independent of the scale of the DNA chips.

19.
To gain genetic insights into the early-flowering phenotype of ornamental cherry, also known as sakura, we determined the genome sequences of two early-flowering cherry (Cerasus × kanzakura) varieties, ‘Kawazu-zakura’ and ‘Atami-zakura’. Because the two varieties are interspecific hybrids, likely derived from crosses between Cerasus campanulata (early-flowering species) and Cerasus speciosa, we employed the haplotype-resolved sequence assembly strategy. Genome sequence reads obtained from each variety by single-molecule real-time sequencing (SMRT) were split into two subsets, based on the genome sequence information of the two probable ancestors, and assembled to obtain haplotype-phased genome sequences. The resultant genome assembly of ‘Kawazu-zakura’ spanned 519.8 Mb with 1,544 contigs and an N50 value of 1,220.5 kb, while that of ‘Atami-zakura’ totalled 509.6 Mb with 2,180 contigs and an N50 value of 709.1 kb. A total of 72,702 and 69,528 potential protein-coding genes were predicted in the genome assemblies of ‘Kawazu-zakura’ and ‘Atami-zakura’, respectively. Gene clustering analysis identified 2,634 clusters uniquely presented in the C. campanulata haplotype sequences, which might contribute to its early-flowering phenotype. Genome sequences determined in this study provide fundamental information for elucidating the molecular and genetic mechanisms underlying the early-flowering phenotype of ornamental cherry tree varieties and their relatives.
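The N50 values quoted above summarize assembly contiguity: N50 is the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly size. A minimal sketch of that standard definition follows; the function name `n50` and the toy contig lengths are hypothetical and unrelated to the cherry assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches at least half of
    the summed length of all contigs.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one contig length")


if __name__ == "__main__":
    # Hypothetical contig lengths (arbitrary units).
    print(n50([80, 70, 50, 40, 30, 20]))
```

By this definition, at least half of the assembled bases lie in contigs of length N50 or longer, which is why the larger N50 reported for ‘Kawazu-zakura’ indicates a more contiguous assembly.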

20.
A primate study reported the existence of neurons from the dorso-lateral prefrontal cortex which fired prior to executing categorical action sequences. The authors suggested these activities may represent abstract level information. Here, we aimed to find the neurophysiological representation of planning categorical action sequences at the population level in healthy humans. Previous human studies have shown beta-band event-related desynchronization (ERD) during action planning in humans. Some of these studies showed different levels of ERD according to different types of action preparation. In particular, the literature suggests that variations in cognitive factors rather than physical factors (force, direction, etc.) modulate the level of beta-ERD. We hypothesized that the level of beta-band power will differ according to planning of different categorical sequences. We measured magnetoencephalography (MEG) from 22 subjects performing 11 four-sequence actions, each consisting of one or two of three simple actions, in 3 categories: ‘Paired (ooxx)’, ‘Alternative (oxox)’ and ‘Repetitive (oooo)’ (‘o’ and ‘x’ each denoting one of three simple actions). Time-frequency representations were calculated for each category during the planning period, and the corresponding beta-power time-courses were compared. We found beta-ERD during the planning period for all subjects, mostly in the contralateral fronto-parietal areas shortly after visual cue onset. Power increase (transient rebound) followed ERD in 20 out of 22 subjects. Amplitudes differed among categories in 20 subjects for both ERD and transient rebound. In 18 out of 20 subjects the ‘Repetitive’ category showed the largest ERD and rebound. The current result suggests that beta-ERD in the contralateral frontal/motor/parietal areas during planning is differentiated by the category of action sequences.
