1.
Nguyen Nam-phuong, Nute Michael, Mirarab Siavash, Warnow Tandy. BMC Genomics, 2016, 17(10): 765-100

Background

Given a new biological sequence, detecting membership in a known family is a basic step in many bioinformatics analyses, with applications to protein structure and function prediction and metagenomic taxon identification and abundance profiling, among others. Yet family identification of sequences that are distantly related to sequences in public databases or that are fragmentary remains one of the more difficult analytical problems in bioinformatics.

Results

We present a new technique for family identification called HIPPI (Hierarchical Profile Hidden Markov Models for Protein family Identification). HIPPI uses a novel technique to represent a multiple sequence alignment for a given protein family or superfamily by an ensemble of profile hidden Markov models computed using HMMER. An evaluation of HIPPI on the Pfam database shows that HIPPI has better overall precision and recall than blastp, HMMER, and pipelines based on HHsearch, and maintains good accuracy even for fragmentary query sequences and for protein families with low average pairwise sequence identity, both conditions where other methods degrade in accuracy.

Conclusion

HIPPI provides accurate protein family identification and is robust to difficult model conditions. Our results, combined with observations from previous studies, show that ensembles of profile Hidden Markov models can better represent multiple sequence alignments than a single profile Hidden Markov model, and thus can improve downstream analyses for various bioinformatic tasks. Further research is needed to determine the best practices for building the ensemble of profile Hidden Markov models. HIPPI is available on GitHub at https://github.com/smirarab/sepp.
2.
EcoHealth - We investigated the prevalence of coronaviruses in 44 bats from four families in northeastern Eswatini using high-throughput sequencing of fecal samples. We found evidence of...
3.
This paper presents results on the design and analysis of a robust genetic Muller C-element. The Muller C-element is a standard logic gate commonly used to synchronize independent processes in most asynchronous electronic circuits. Synthetic biological logic gates have been previously demonstrated, but there remain many open issues in the design of sequential (state-holding) logic operations. Three designs are considered for the genetic Muller C-element: a majority gate, a toggle switch, and a speed-independent implementation. While the three designs are logically equivalent, each design requires different assumptions to operate correctly. The majority gate design requires the most timing assumptions, the speed-independent design requires the fewest, and the toggle switch design is a compromise between the two. This paper examines the robustness of these designs as well as the effects of parameter variation using stochastic simulation. The results show that robustness to timing assumptions does not necessarily increase reliability, suggesting that modifications to existing logic design tools will be necessary for synthetic biology. Parameter variation simulations yield further insights into the design principles necessary for building robust genetic gates. The results suggest that high gene count, cooperativity of at least two, tight repression, and balanced decay rates are necessary for robust gates. Finally, this paper presents a potential application of the genetic Muller C-element as a quorum-mediated trigger.
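For readers unfamiliar with the gate, the logical behavior of a Muller C-element can be sketched in a few lines. This is a purely Boolean illustration of the majority-gate formulation, in which the output is fed back as a third input. It is not the paper's genetic circuit or its stochastic model, and the names `majority`, `MullerCElement`, and `run` are hypothetical, chosen only for this example.

```python
def majority(a: bool, b: bool, c: bool) -> bool:
    """Return True when at least two of the three inputs are True."""
    return (a and b) or (a and c) or (b and c)


class MullerCElement:
    """Idealized Boolean C-element (illustrative only, no timing or noise).

    The output rises when both inputs are high, falls when both are low,
    and otherwise holds its previous value.
    """

    def __init__(self, state: bool = False) -> None:
        self.state = state

    def step(self, a: bool, b: bool) -> bool:
        # Majority-gate formulation: the current output is the third input.
        self.state = majority(a, b, self.state)
        return self.state


def run(inputs, initial: bool = False):
    """Apply a sequence of (a, b) input pairs and collect the outputs."""
    gate = MullerCElement(initial)
    return [gate.step(a, b) for a, b in inputs]


if __name__ == "__main__":
    trace = [(False, False), (True, False), (True, True),
             (False, True), (False, False)]
    print(run(trace))
```

With the trace above, the output stays low while only one input is high, switches high once both are high, holds that value when one input drops, and returns low only after both inputs are low.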
4.
Many biological questions, including the estimation of deep evolutionary histories and the detection of remote homology between protein sequences, rely upon multiple sequence alignments and phylogenetic trees of large datasets. However, accurate large-scale multiple sequence alignment is very difficult, especially when the dataset contains fragmentary sequences. We present UPP, a multiple sequence alignment method that uses a new machine learning technique, the ensemble of hidden Markov models, which we propose here. UPP produces highly accurate alignments for both nucleotide and amino acid sequences, even on ultra-large datasets or datasets containing fragmentary sequences. UPP is available at https://github.com/smirarab/sepp.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0688-z) contains supplementary material, which is available to authorized users.