Similar literature
 20 similar articles found
1.
Chen H  Kihara D 《Proteins》2011,79(1):315-334
Computational protein structure prediction remains a challenging task in protein bioinformatics. In recent years, the importance of template-based structure prediction has been increasing because of the growing number of protein structures solved by structural genomics projects. To capitalize on the significant effort and investment made in the structural genomics projects, it is urgent to establish effective ways to use the solved structures as templates by developing methods for exploiting remotely related proteins that cannot be identified simply by homology. In this work, we examine the effect of using suboptimal alignments in template-based protein structure prediction. We show that suboptimal alignments are often more accurate than the optimal one, and that such accurate suboptimal alignments can occur even at a very low rank of the alignment score. Suboptimal alignments contain a significant number of correct amino acid residue contacts. Moreover, suboptimal alignments can improve template-based models when used as input to Modeller. Finally, we use suboptimal alignments for handling a contact potential in a probabilistic way in a threading program, SUPRB. The probabilistic contacts strategy outperforms the partly thawed approach, which only uses the optimal alignment in defining residue contacts, and also the re-ranking strategy, which uses the contact potential in re-ranking alignments. The comparison with existing methods in the template-recognition test shows that SUPRB is very competitive and outperforms existing methods.

2.
3.

Background

Human gene duplicates have been the focus of intense research since the development of array-based and targeted next-generation sequencing approaches in the last decade. These studies have primarily concentrated on determining the extant copy-number variation from a population-genomic perspective but lack a robust evolutionary framework to elucidate the early structural and genomic characteristics of gene duplicates at emergence and their subsequent evolution with increasing age.

Results

We analyzed 184 gene duplicate pairs comprising small gene families in the draft human genome with 10 % or less synonymous sequence divergence. Human gene duplicates primarily originate from DNA-mediated events, taking up genomic residence as intrachromosomal copies in direct or inverse orientation. The distribution of paralogs on autosomes follows random expectations in contrast to their significant enrichment on the sex chromosomes. Furthermore, human gene duplicates exhibit a skewed gradient of distribution along the chromosomal length with significant clustering in pericentromeric regions. Surprisingly, despite the large average length of human genes, the majority of extant duplicates (83 %) are complete duplicates, wherein the entire ORF of the ancestral copy was duplicated. The preponderance of complete duplicates is in accord with an extremely large median duplication span of 36 kb, which enhances the probability of capturing ancestral ORFs in their entirety. With increasing evolutionary age, human paralogs exhibit declines in (i) the frequency of intrachromosomal paralogs, and (ii) the proportion of complete duplicates. These changes may reflect lower survival rates of certain classes of duplicates and/or the role of purifying selection. Duplications arising from RNA-mediated events comprise a small fraction (11.4 %) of all human paralogs and are more numerous in older evolutionary cohorts of duplicates.

Conclusions

The degree of structural resemblance, genomic location and duplication span appear to influence the long-term maintenance of paralogs in the human genome. The median duplication span in the human genome far exceeds that in C. elegans and yeast and likely contributes to the high prevalence of complete duplicates relative to structurally heterogeneous duplicates (partial and chimeric). The relative roles of regulatory sequence versus exon-intron structure changes in the acquisition of novel function by human paralogs remain to be determined.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1827-3) contains supplementary material, which is available to authorized users.

4.
5.
Multiple sequence alignments have much to offer to the understanding of protein structure, evolution and function. We are developing approaches to use this information in predicting protein-binding specificity, intra-protein and protein-protein interactions, and in reconstructing protein interaction networks.

6.
Tchinda J  Lee C 《BioTechniques》2006,41(4):385, 387, 389 passim
Among human beings, it was once estimated that our genomes were 99.9% genetically identical. While this high level of genetic similarity helps to define us as a species, it is our genetic variation that contributes to our phenotypic diversity. As genomic technologies evolve to provide genome-wide analyses at higher resolution, we are beginning to appreciate that the human genome has a lot more variation than was once thought. Array-based comparative genomic hybridization (CGH) is one of these technologies that has recently revealed a newly appreciated type of genetic variation: copy number variation, in which thousands of regions of the human genome are now known to be variable in number between individuals. Some of these copy number variable regions have already been shown to predispose to certain common diseases, and others may ultimately have a significant impact on how each of us reacts to certain foods (e.g., allergic reactions), medications (e.g., pharmacogenomics), microscopic infections (i.e., immunity), and other aspects of our ever-changing environment.

7.
MOTIVATION: Experimental techniques alone cannot keep up with the production rate of protein sequences, while computational techniques for protein structure prediction have matured to a level where they can provide reliable structural characterization of proteins at large scale. Integration of multiple computational tools for protein structure prediction can complement experimental techniques. RESULTS: We present an automated pipeline for protein structure prediction. The centerpiece of the pipeline is our threading-based protein structure prediction system PROSPECT. The pipeline consists of a dozen tools for identification of protein domains and signal peptides, protein triage to determine the protein type (membrane or globular), protein fold recognition, generation of atomic structural models, prediction result validation, etc. Different processing and prediction branches are determined automatically by a prediction pipeline manager based on identified characteristics of the protein. The pipeline has been implemented to run in a heterogeneous computational environment as a client/server system with a web interface. Genome-scale applications on Caenorhabditis elegans, Pyrococcus furiosus and three cyanobacterial genomes are presented. AVAILABILITY: The pipeline is available at http://compbio.ornl.gov/proteinpipeline/

8.
《Gene》1998,215(1):143-152
Identification of all human protein–protein interactions will lead to a global human protein linkage map that will provide important information for functional genomics studies. The yeast two-hybrid system is a powerful molecular genetic approach for studying protein–protein interactions. To apply this technology to generate a human protein linkage map, the first step is to construct two-hybrid cDNA libraries that cover the entire human genome. With a homologous recombination-mediated approach, we have constructed a modular human EST-derived yeast two-hybrid library in the Gal4 activation domain-based vector, pACT2. Quality analysis of this library indicated that the approach of constructing two-hybrid cDNA libraries from individually arrayed human EST clones is feasible, and such a two-hybrid library is suitable for detecting protein–protein interactions. This is also the first time that a comprehensive two-hybrid system cDNA library has been constructed from a collection of individually arrayed EST clones.

9.
10.
We have developed software, called Expeditor, that can be used to combine known gene structure information from human and coding sequence information from farm animal species for streamlined primer design in target farm animal species. This software has many uses, which include PCR-based SNP discovery for identification of genes/markers associated with economically important traits in farm animals, comparative mapping analysis, and evolution studies. The use of this software helps minimize tedious manual operations and reduces the chance of errors associated with more conventional approaches.

11.
With the decreasing cost of DNA sequencing technology and the vast diversity of biological resources, researchers increasingly face the basic challenge of annotating a larger number of expressed sequence tags (ESTs) from a variety of species. This typically consists of a series of repetitive tasks, which should be automated and easy to use. The results of these annotation tasks need to be stored and organized in a consistent way. All these operations should be self-installing, platform independent, easy to customize and amenable to using distributed bioinformatics resources available on the Internet.

12.
13.
Sequencing of microbial genomes is important because microbes carry antibiotic and pathogenetic activities. However, even with the help of new assembly software, finishing a whole genome is a time-consuming task. In most bacteria, pathogenetic or antibiotic genes are carried in genomic islands. Therefore, a quick genomic island (GI) prediction method is useful for genomes whose sequencing is still ongoing. In this work, we built a Web server called GI-POP (http://gipop.life.nthu.edu.tw) which integrates a sequence assembling tool, a functional annotation pipeline, and a high-performance GI predicting module, in a support vector machine (SVM)-based method called genomic island genomic profile scanning (GI-GPS). The draft genomes of ongoing genome projects, in contigs or scaffolds, can be submitted to our Web server, and it provides the functional annotation and highly probable GI-predicting results. GI-POP is a comprehensive annotation Web server designed for ongoing genome project analysis. Researchers can perform annotation and obtain pre-analytic information, including possible GIs, coding/non-coding sequences and functional analysis, from their draft genomes. This pre-analytic system can provide useful information for finishing a genome sequencing project.

14.

Background

Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages.

Results

Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however, substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10⁻⁹ change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10⁻⁹ change/site/year) was approximately half of the overall rate (1.9–2.0 × 10⁻⁹ change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%.

Conclusion

This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.

15.
A first-generation EST RH comparative map of the porcine and human genome
We have constructed a first-generation EST radiation hybrid comparative map of the porcine genome by assigning 1058 markers to the IMpRH7000 panel. Chromosomal localization was determined with a 2pt LOD of 4.8 for 984 markers, using the IMpRH mapping tool. Annotated ESTs represent 46.2% or 489 of the markers. Marker distribution was not stochastic and ranged from 0.41 for SSC8 to 1.77 for SSC12. Two hundred fifty-one markers assigned to the physical map of the pig did not find a homologous sequence in V22 of the human genome assembly, indicative of gaps in the assembled human genome sequence. The comparative porcine/human map covers 3290 MB, or 98.3% of the presumed size of the human genome. However, 60 breakpoints were identified between chromosomes, as well as 90 micro-rearrangements within synteny groups. Six porcine chromosomes (SSC2, 5, 6, 7, 12, and 14) correspond to the three gene-richest human chromosomes, HSA17, 19, and 22, and show above average marker density. Porcine Chrs 1, 8, 11, and X display a low DNA/marker ratio and correspond to the 'genome deserts' on HSA 18, 4, 13, and X.

16.
As soon as whole-genome sequencing entered the scene in the mid-1990s and demonstrated its use in revealing the entire genetic potential of any given microbial organism, this technique immediately revolutionized the way pathogen (and many other fields of) research was carried out. The ability to perform whole-genome comparisons further transformed the field and allowed scientists to obtain information linking phenotypic dissimilarities among closely related organisms and their underlying genetic mechanisms. Such comparisons have become commonplace in examining strain-to-strain variability, as well as in comparing pathogens to less pathogenic or nonpathogenic near neighbors. In recent years, a bloom in novel sequencing technologies along with continuous increases in throughput has occurred, inundating the field with various types of massively parallel sequencing data and further transforming comparative genomics research. Here, we review the evolution of comparative genomics, its impact in understanding pathogen evolution and physiology and the opportunities and challenges presented by next-generation sequencing as applied to pathogen genome comparisons.

17.
One hundred and fifty-four microsatellite markers were selected for genomic scanning of the porcine genome and were grouped into amplification sets to reduce the cost and labour required. Thirty amplification sets had two markers (duplex), 20 sets had three markers (triplex) and five sets had four markers (quadruplex) while 14 markers were analysed separately. The selection criteria for microsatellites were: ease of scoring, level of polymorphism, genetic location and ability to be genotyped in a multiplexed polymerase chain reaction (PCR). The selected microsatellites were chosen to span the entire genome flanked by the porcine linkage map with intervals between adjacent markers of 15–20 cM where possible. The utility of this set of markers was demonstrated by linkage analyses with loci controlling blood plasma protein and red cell enzyme polymorphisms (n = 13), erythrocyte antigens (n = 15), the S blood group, coat colour and ryanodine receptor from 174 backcross Meishan-White Composite pigs. These loci displayed various forms of inheritance and most (24 loci) have been placed in linkage groups. Significant two-point linkages (lod > 3·0) were detected for each polymorphic marker. These results provide the first linkage assignments for phosphoglucomutase (PGM2) and erythrocyte antigen F (EAF) to SSC8; and serum amylase (AMY) and erythrocyte antigen I (EAI) to SSC18. All of the remaining polymorphic loci (n = 24) mapped to previously identified regions confirming earlier results. Most of the markers used in this study should be useful in resource populations of various breed crosses as the number of alleles detected in a multibreed reference population was one of the selection criteria.

18.
We previously established that a unidirectional site-specific recombinase, the phage φC31 integrase, can mediate integration into mammalian chromosomes. The enzyme directs integration of plasmids bearing the phage attB recognition site into pseudo attP sites, a set of native sequences related to the phage attP recognition site. Here we use two cycles of DNA shuffling and screening in Escherichia coli to obtain evolved integrases that possess significant improvements in integration frequency and sequence specificity at a pseudo attP sequence located on human chromosome 8, when measured in the native genomic environment of living human cells. Such integrases represent custom integration tools that will be useful for modifying the genomes of higher eukaryotic cells.

19.
SUMMARY: Kalign2 is one of the fastest and most accurate methods for multiple alignments. However, in contrast to other methods Kalign2 does not allow externally supplied position specific gap penalties. Here, we present a modification to Kalign2, KalignP, so that it accepts such penalties. Further, we show that KalignP using position specific gap penalties obtained from predicted secondary structures makes steady improvement over Kalign2 when tested on Balibase 3.0 as well as on a dataset derived from Pfam-A seed alignments. AVAILABILITY AND IMPLEMENTATION: KalignP is freely available at http://kalignp.cbr.su.se. The source code of KalignP is available under the GNU General Public License, Version 2 or later from the same website.

20.
Multiple members of the MDR-ADH (MDR: Medium-chain dehydrogenases/reductases; ADH: alcohol dehydrogenase) family are found in vertebrates, although the enzymes that belong to this family have also been isolated from bacteria, yeast, plant and animal sources. Initial understanding of the physiological roles and evolution of the family relied on biochemical studies, protein alignments and protein structure comparisons. Subsequently, studies at the genetic level yielded new information: the expression pattern, exon-intron distribution, in silico-derived protein sequences and murine knockout phenotypes. More recently, genomic and EST databases have revealed new family members and the chromosomal location and position in the cluster of both the first and new forms. The data now available provide a comprehensive scenario, from which a reliable picture of the evolutionary history of this family can be made.
