Similar articles
 20 similar articles found; search took 15 ms
6.
Du Nan, Chen Jiao, Sun Yanni. BMC Genomics, 2019, 20(2): 49-62
Background

Single-molecule, real-time (SMRT) sequencing, developed by Pacific Biosciences (PacBio), produces longer reads than second-generation sequencing technologies such as Illumina. The increased read length enables PacBio sequencing to close gaps in genome assemblies, reveal structural variations, and characterize intra-species variation. It also holds promise for deciphering the structure of complex microbial communities, because long reads help metagenomic assembly. One key step in genome assembly using long reads is to quickly identify reads that overlap. Because PacBio data have a higher sequencing error rate and lower coverage than popular short-read sequencing technologies such as Illumina, efficient detection of true overlaps requires specially designed algorithms. In particular, there is still a need to improve the sensitivity of detecting small overlaps, or overlaps in which both reads have high error rates. Addressing this need will enable better assembly of metagenomic data produced by third-generation sequencing technologies.

Results

In this work, we designed and implemented GroupK, an overlap detection program for third-generation sequencing reads based on grouped k-mer hits. While several existing programs use k-mer hits to detect overlaps between reads, our method uses a group of short k-mer hits that satisfy statistically derived distance constraints to increase the sensitivity of small-overlap detection. Grouped k-mer hits were originally designed for homology search; we are the first to apply them to long-read overlap detection. Experiments applying our pipeline to both simulated and real third-generation sequencing data showed that GroupK enables more sensitive overlap detection, especially for datasets with low sequencing coverage.
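The idea of grouping k-mer hits can be illustrated with a minimal Python sketch. This is not GroupK's implementation: the abstract does not give its statistically derived distance constraints, so the fixed thresholds `max_gap` and `max_diag_shift`, the greedy chaining, and all function names below are hypothetical stand-ins chosen for illustration.

```python
from collections import defaultdict


def kmer_index(seq, k):
    """Map every k-mer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def kmer_hits(read_a, read_b, k):
    """Return sorted (pos_in_a, pos_in_b) pairs of exactly matching k-mers."""
    index = kmer_index(read_a, k)
    hits = []
    for j in range(len(read_b) - k + 1):
        for i in index.get(read_b[j:j + k], ()):
            hits.append((i, j))
    return sorted(hits)


def group_hits(hits, max_gap, max_diag_shift):
    """Greedily chain hits that lie close together on nearly the same diagonal.

    Assumption: two fixed thresholds replace the statistically derived
    distance constraints mentioned in the abstract.
    """
    groups = []
    for i, j in hits:  # hits are sorted by position in read_a
        for group in groups:
            last_i, last_j = group[-1]
            close = 0 < i - last_i <= max_gap
            same_diag = abs((i - j) - (last_i - last_j)) <= max_diag_shift
            if close and same_diag:
                group.append((i, j))
                break
        else:
            groups.append([(i, j)])
    return groups


def best_group(read_a, read_b, k=4, max_gap=20, max_diag_shift=2, min_hits=3):
    """Return the largest hit group with at least min_hits hits, or []."""
    groups = group_hits(kmer_hits(read_a, read_b, k), max_gap, max_diag_shift)
    groups = [g for g in groups if len(g) >= min_hits]
    return max(groups, key=len, default=[])


if __name__ == "__main__":
    shared = "ACGTACGGATCCAGT"
    read_a = "T" * 10 + shared   # overlap sits at the end of read_a
    read_b = shared + "G" * 10   # and at the start of read_b
    group = best_group(read_a, read_b)
    print(len(group), group[0], group[-1])
```

Requiring several short hits on a consistent diagonal, rather than one long exact match, is what lets this kind of scheme tolerate errors between hits while still rejecting isolated chance matches.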

Conclusions

GroupK is best suited to detecting small overlaps in third-generation sequencing data. It is a useful supplement to existing tools for more sensitive and accurate overlap detection. The source code is freely available at https://github.com/Strideradu/GroupK.

10.
ABSTRACT

Background: Tropical sand dunes are ideal systems for understanding drivers of community assembly as dunes are subject to both deterministic and stochastic processes. However, studies that evaluate the factors that mediate plant community assembly in these ecosystems are few.

Aims: We evaluated phylogenetic community structure to elucidate the role of deterministic and stochastic processes in mediating the assembly of plant communities along the north of the Yucatan Peninsula, Mexico.

Methods: We used plastid genetic markers to evaluate phylogenetic relationships in 16 sand-dune communities. To evaluate the role of climate in shaping plant community structure we carried out linear regressions between climatic variables and mean phylogenetic distance. We estimated the Net Relatedness Index and Nearest Taxon Index to identify ecological processes mediating community assembly.
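As an illustration of the Net Relatedness Index, the sketch below computes it in its commonly used form, NRI = -(MPD_obs - mean(MPD_null)) / sd(MPD_null), where MPD is the mean pairwise phylogenetic distance among the taxa in a community. The abstract does not state which null model the authors used; drawing random communities of the same size from the species pool is an assumption here, and the function names and the toy distance table are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def mean_pairwise_distance(taxa, dist):
    """Mean phylogenetic distance over all pairs of taxa.

    dist maps frozenset({taxon_a, taxon_b}) to a distance.
    """
    return mean(dist[frozenset(pair)] for pair in combinations(taxa, 2))


def net_relatedness_index(community, pool, dist, n_null=999, seed=0):
    """NRI of a community against random draws from the species pool.

    Assumed null model: communities of equal richness sampled without
    replacement from pool. Positive values indicate clustering,
    negative values indicate overdispersion.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(community, dist)
    null = [
        mean_pairwise_distance(rng.sample(pool, len(community)), dist)
        for _ in range(n_null)
    ]
    return -(observed - mean(null)) / stdev(null)


if __name__ == "__main__":
    # Toy pool: two clades, distance 1 within a clade and 10 between clades.
    pool = ["A1", "A2", "A3", "B1", "B2", "B3"]
    dist = {
        frozenset((a, b)): 1.0 if a[0] == b[0] else 10.0
        for a, b in combinations(pool, 2)
    }
    print(net_relatedness_index(["A1", "A2", "A3"], pool, dist))
    print(net_relatedness_index(["A1", "B1"], pool, dist))
```

The Nearest Taxon Index follows the same standardization but replaces MPD with the mean distance from each taxon to its nearest relative in the community.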

Results: Observed phylogenetic structure did not differ from random, suggesting that stochastic processes are the major determinants of community assembly. Climate was only weakly correlated with phylogenetic diversity, suggesting that the abiotic environment plays a minimal role in community assembly.

Conclusions: Random assembly appears to be the primary factor structuring the studied sand dune plant communities. Environmental filters may represent a secondary factor contributing to the observed phylogenetic structure. Thus, both processes may act simultaneously to mediate the assembly of sand-dune plant communities.
