
1.

Impact of the terrestrial-aquatic transition on disparity and rates of evolution in the carnivoran skull

Katrina E Jones Jeroen B Smaers Anjali Goswami 《BMC evolutionary biology》2015,15(1)

Background

Which factors influence the distribution patterns of morphological diversity among clades? The adaptive radiation model predicts that a clade entering a new ecological niche will experience high rates of evolution early in its history, followed by a gradual slowing. Here we measure disparity and rates of evolution in Carnivora, specifically focusing on the terrestrial-aquatic transition in Pinnipedia. We analyze fissiped (mostly terrestrial, arboreal, and semi-arboreal, but also including the semi-aquatic otter) and pinniped (secondarily aquatic) carnivorans as a case study of an extreme ecological transition. We used 3D geometric morphometrics to quantify cranial shape in 151 carnivoran specimens (64 fissiped, 87 pinniped) and five exceptionally-preserved fossil pinnipeds, including the stem-pinniped Enaliarctos emlongi. Range-based and variance-based disparity measures were compared between pinnipeds and fissipeds. To distinguish between evolutionary modes, a Brownian motion model was compared to selective regime shifts associated with the terrestrial-aquatic transition and at the base of Pinnipedia. Further, evolutionary patterns were estimated on individual branches using both Ornstein-Uhlenbeck and Independent Evolution models, to examine the origin of pinniped diversity.
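As a rough illustration of the two families of disparity measure named above, the sketch below computes one range-based measure (sum of per-axis ranges) and one variance-based measure (sum of per-axis variances) for specimens placed in a shape space. The specific formulas, the use of population variance, the function names, and the toy coordinates are all assumptions made for illustration; the abstract does not state which exact metrics were used.

```python
from statistics import pvariance


def sum_of_ranges(points):
    """Range-based disparity: add up (max - min) along every axis."""
    axes = list(zip(*points))  # transpose: one tuple of values per axis
    return sum(max(axis) - min(axis) for axis in axes)


def sum_of_variances(points):
    """Variance-based disparity: add up the population variance of every axis."""
    axes = list(zip(*points))
    return sum(pvariance(axis) for axis in axes)


# Hypothetical 2D shape-space scores for two groups (not data from the study).
group_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
group_b = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0), (3.0, 3.0)]

print(sum_of_ranges(group_a), sum_of_variances(group_a))
print(sum_of_ranges(group_b), sum_of_variances(group_b))
```

Range-based measures respond to the most extreme specimens, while variance-based measures reflect the average spread, which is one reason studies often report both.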

Results

Pinnipeds exhibit greater cranial disparity than fissipeds, even though they are less taxonomically diverse and, as a clade nested within fissipeds, phylogenetically younger. Despite this, there is no increase in the rate of morphological evolution at the base of Pinnipedia, as would be predicted by an adaptive radiation model, and a Brownian motion model of evolution is supported. Instead basal pinnipeds populated new areas of morphospace via low to moderate rates of evolution in new directions, followed by later bursts within the crown-group, potentially associated with ecological diversification within the marine realm.

Conclusion

The transition to an aquatic habitat in carnivorans resulted in a shift in cranial morphology without an increase in rate in the stem lineage, contrary to the adaptive radiation model. Instead, these data suggest a release from evolutionary constraint model, followed by aquatic diversifications within crown families.

Electronic supplementary material

The online version of this article (doi:10.1186/s12862-015-0285-5) contains supplementary material, which is available to authorized users.

2.

Gene bionetworks involved in the epigenetic transgenerational inheritance of altered mate preference: environmental epigenetics and evolutionary biology

Michael K Skinner Marina I Savenkova Bin Zhang Andrea C Gore David Crews 《BMC genomics》2014,15(1)

Background

Mate preference behavior is an essential first step in sexual selection and is a critical determinant in evolutionary biology. Previously an environmental compound (the fungicide vinclozolin) was found to promote the epigenetic transgenerational inheritance of an altered sperm epigenome and modified mate preference characteristics for three generations after exposure of a gestating female.

Results

The current study investigated gene networks involved in various regions of the brain that correlated with the altered mate preference behavior in the male and female. Statistically significant correlations of gene clusters and modules were identified to associate with specific mate preference behaviors. This novel systems biology approach identified gene networks (bionetworks) involved in sex-specific mate preference behavior. Observations demonstrate the ability of environmental factors to promote the epigenetic transgenerational inheritance of this altered evolutionary biology determinant.

Conclusions

Combined observations elucidate the potential molecular control of mate preference behavior and suggest environmental epigenetics can have a role in evolutionary biology.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-377) contains supplementary material, which is available to authorized users.

3.

Khat use is associated with impaired working memory and cognitive flexibility

Colzato LS Ruiz MJ van den Wildenberg WP Hommel B 《PloS one》2011,6(6):e20602

Rationale

Khat consumption has increased during the last decades in Eastern Africa and has become a global phenomenon spreading to ethnic communities in the rest of the world, such as The Netherlands, United Kingdom, Canada, and the United States. Very little is known, however, about the relation between khat use and cognitive control functions in khat users.

Objective

We studied whether khat use is associated with changes in working memory (WM) and cognitive flexibility, two central cognitive control functions.

Methods

Khat users and khat-free controls were matched in terms of sex, ethnicity, age, alcohol and cannabis consumption, and IQ (Raven's progressive matrices). Groups were tested on cognitive flexibility, as measured by a Global-Local task, and on WM using an N-back task.

Result

Khat users performed significantly worse than controls on tasks tapping into cognitive flexibility as well as monitoring of information in WM.

Conclusions

The present findings suggest that khat use impairs both cognitive flexibility and the updating of information in WM. The inability to monitor information in WM and to adjust behavior rapidly and flexibly may have repercussions for daily life activities.

4.

FastMG: a simple, fast, and accurate maximum likelihood procedure to estimate amino acid replacement rate matrices from large data sets

Cuong Cao Dang Vinh Sy Le Olivier Gascuel Bart Hazes Quang Si Le 《BMC bioinformatics》2014,15(1)

Background

Amino acid replacement rate matrices are a crucial component of many protein analysis systems such as sequence similarity search, sequence alignment, and phylogenetic inference. Ideally, the rate matrix reflects the mutational behavior of the actual data under study; however, estimating amino acid replacement rate matrices requires large protein alignments and is computationally expensive and complex. As a compromise, sub-optimal pre-calculated generic matrices are typically used for protein-based phylogeny. Sequence availability has now grown to a point where problem-specific rate matrices can often be calculated if the computational cost can be controlled.

Results

The most time consuming step in estimating rate matrices by maximum likelihood is building maximum likelihood phylogenetic trees from protein alignments. We propose a new procedure, called FastMG, to overcome this obstacle. The key innovation is the alignment-splitting algorithm that splits alignments with many sequences into non-overlapping sub-alignments prior to estimating amino acid replacement rates. Experiments with different large data sets showed that the FastMG procedure was an order of magnitude faster than without splitting. Importantly, there was no apparent loss in matrix quality if an appropriate splitting procedure is used.
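The general idea of cutting a many-sequence alignment into non-overlapping sub-alignments can be sketched as follows. This is only a minimal illustration: the fixed-size chunking rule, the function name `split_alignment`, and the `max_seqs` parameter are hypothetical, and the abstract does not describe how FastMG itself chooses which sequences go together.

```python
def split_alignment(alignment, max_seqs):
    """Split an alignment into non-overlapping sub-alignments.

    alignment: dict mapping sequence name -> aligned sequence string.
    max_seqs:  maximum number of sequences per sub-alignment (assumed knob).
    Sequences are grouped in input order, which is an illustrative
    simplification rather than the published splitting rule.
    """
    if max_seqs < 1:
        raise ValueError("max_seqs must be at least 1")
    names = list(alignment)
    sub_alignments = []
    for start in range(0, len(names), max_seqs):
        chunk = names[start:start + max_seqs]
        sub_alignments.append({name: alignment[name] for name in chunk})
    return sub_alignments


# Toy alignment with five sequences (made-up data).
toy = {f"seq{i}": "MKV-LA" for i in range(5)}
parts = split_alignment(toy, max_seqs=2)
print([sorted(p) for p in parts])
```

Each sub-alignment would then need its own, much smaller, tree, which is where the time saving described in the abstract would come from.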

Conclusions

FastMG is a simple, fast and accurate procedure to estimate amino acid replacement rate matrices from large data sets. It enables researchers to study the evolutionary relationships for specific groups of proteins or taxa with optimized, data-specific amino acid replacement rate matrices. The programs, data sets, and the new mammalian mitochondrial protein rate matrix are available at http://fastmg.codeplex.com.

5.

Cetaceans evolution: insights from the genome sequences of common minke whales

Jung Youn Park Yong-Rock An Naohisa Kanda Chul-Min An Hye Suck An Jung-Ha Kang Eun Mi Kim Du-Hae An Hojin Jung Myunghee Joung Myung Hum Park Sook Hee Yoon Bo-Young Lee Taeheon Lee Kyu-Won Kim Won Cheoul Park Dong Hyun Shin Young Sub Lee Jaemin Kim Woori Kwak Hyeon Jeong Kim Young-Jun Kwon Sunjin Moon Yuseob Kim David W Burt Seoae Cho Heebal Kim 《BMC genomics》2015,16(1)

Background

Whales have captivated the human imagination for millennia. These incredible cetaceans are the only mammals that have adapted to life in the open oceans and have been a source of human food, fuel and tools around the globe. The transition from land to water has led to various aquatic specializations related to hairless skin and ability to regulate their body temperature in cold water.

Results

We present four common minke whale (Balaenoptera acutorostrata) genomes with depth of ×13 ~ ×17 coverage and perform resequencing technology without a reference sequence. Our results indicated the time to the most recent common ancestors of common minke whales to be about 2.3574 (95% HPD, 1.1521 – 3.9212) million years ago. Further, we found that genes associated with epilation and tooth-development showed signatures of positive selection, supporting the morphological uniqueness of whales.

Conclusions

This whole-genome sequencing offers a chance to better understand the evolutionary journey of one of the largest mammals on earth.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1213-1) contains supplementary material, which is available to authorized users.

6.

Are special read alignment strategies necessary and cost-effective when handling sequencing reads from patient-derived tumor xenografts?

Kai-Yuen Tso Sau Dan Lee Kwok-Wai Lo Kevin Y Yip 《BMC genomics》2014,15(1)

Background

Patient-derived tumor xenografts in mice are widely used in cancer research and have become important in developing personalized therapies. When these xenografts are subject to DNA sequencing, the samples could contain various amounts of mouse DNA. It has been unclear how the mouse reads would affect data analyses. We conducted comprehensive simulations to compare three alignment strategies at different mutation rates, read lengths, sequencing error rates, human-mouse mixing ratios and sequenced regions. We also sequenced a nasopharyngeal carcinoma xenograft and a cell line to test how the strategies work on real data.

Results

We found the "filtering" and "combined reference" strategies performed better than aligning reads directly to the human reference in terms of alignment and variant calling accuracies. The combined reference strategy was particularly good at reducing false negative variant calls without significantly increasing the false positive rate. In some scenarios the performance gain of these two special handling strategies was too small for special handling to be cost-effective, but it was found crucial when false non-synonymous SNVs should be minimized, especially in exome sequencing.

Conclusions

Our study systematically analyzes the effects of mouse contamination in the sequencing data of human-in-mouse xenografts. Our findings provide information for designing data analysis pipelines for these data.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1172) contains supplementary material, which is available to authorized users.

7.

Methylomic profiling of human brain tissue supports a neurodevelopmental origin for schizophrenia

Ruth Pidsley Joana Viana Eilis Hannon Helen Spiers Claire Troakes Safa Al-Saraj Naguib Mechawar Gustavo Turecki Leonard C Schalkwyk Nicholas J Bray Jonathan Mill 《Genome biology》2014,15(10)

Background

Schizophrenia is a severe neuropsychiatric disorder that is hypothesized to result from disturbances in early brain development. There is mounting evidence to support a role for developmentally regulated epigenetic variation in the molecular etiology of the disorder. Here, we describe a systematic study of schizophrenia-associated methylomic variation in the adult brain and its relationship to changes in DNA methylation across human fetal brain development.

Results

We profile methylomic variation in matched prefrontal cortex and cerebellum brain tissue from schizophrenia patients and controls, identifying disease-associated differential DNA methylation at multiple loci, particularly in the prefrontal cortex, and confirming these differences in an independent set of adult brain samples. Our data reveal discrete modules of co-methylated loci associated with schizophrenia that are enriched for genes involved in neurodevelopmental processes and include loci implicated by genetic studies of the disorder. Methylomic data from human fetal cortex samples, spanning 23 to 184 days post-conception, indicates that schizophrenia-associated differentially methylated positions are significantly enriched for loci at which DNA methylation is dynamically altered during human fetal brain development.

Conclusions

Our data support the hypothesis that schizophrenia has an important early neurodevelopmental component, and suggest that epigenetic mechanisms may mediate these effects.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0483-2) contains supplementary material, which is available to authorized users.

8.

Systematic exploration of guide-tree topology effects for small protein alignments

Fabian Sievers Graham M Hughes Desmond G Higgins 《BMC bioinformatics》2014,15(1)

Background

Guide-trees are used as part of an essential heuristic to enable the calculation of multiple sequence alignments. They have been the focus of much method development, but there has been little effort at determining systematically which guide-trees, if any, give the best alignments. Some guide-tree construction schemes are based on pair-wise distances amongst unaligned sequences. Others try to emulate an underlying evolutionary tree and involve various iteration methods.

Results

We explore all possible guide-trees for a set of protein alignments of up to eight sequences. We find that pairwise distance based default guide-trees sometimes outperform evolutionary guide-trees, as measured by structure derived reference alignments. However, default guide-trees fall way short of the optimum attainable scores. On average chained guide-trees perform better than balanced ones but are not better than default guide-trees for small alignments.

Conclusions

Alignment methods that use Consistency or hidden Markov models to make alignments are less susceptible to sub-optimal guide-trees than simpler methods that basically use conventional sequence alignment between profiles. The latter appear to be affected positively by evolutionary based guide-trees for difficult alignments and negatively for easy alignments. One phylogeny aware alignment program can strongly discriminate between good and bad guide-trees. The results for randomly chained guide-trees improve with the number of sequences.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-338) contains supplementary material, which is available to authorized users.

9.

Breaking the computational barriers of pairwise genome comparison

Oscar Torreno Oswaldo Trelles 《BMC bioinformatics》2015,16(1)

Background

Conventional pairwise sequence comparison software algorithms are being used to process much larger datasets than they were originally designed for. This can result in processing bottlenecks that limit software capabilities or prevent full use of the available hardware resources. Overcoming the barriers that limit the efficient computational analysis of large biological sequence datasets by retrofitting existing algorithms or by creating new applications represents a major challenge for the bioinformatics community.

Results

We have developed C libraries for pairwise sequence comparison within diverse architectures, ranging from commodity systems to high performance and cloud computing environments. Exhaustive tests were performed using different datasets of closely- and distantly-related sequences that span from small viral genomes to large mammalian chromosomes. The tests demonstrated that our solution is capable of generating high quality results with a linear-time response and controlled memory consumption, being comparable to or faster than the current state-of-the-art methods.

Conclusions

We have addressed the problem of pairwise and all-versus-all comparison of large sequences in general, greatly increasing the limits on input data size. The approach described here is based on a modular out-of-core strategy that uses secondary storage to avoid reaching memory limits during the identification of High-scoring Segment Pairs (HSPs) between the sequences under comparison. Software engineering concepts were applied to avoid intermediate result re-calculation, to minimise the performance impact of input/output (I/O) operations and to modularise the process, thus enhancing application flexibility and extendibility. Our computationally-efficient approach allows tasks such as the massive comparison of complete genomes, evolutionary event detection, the identification of conserved synteny blocks and inter-genome distance calculations to be performed more effectively.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0679-9) contains supplementary material, which is available to authorized users.

10.

Unraveling the effect of genomic structural changes in the rhesus macaque - implications for the adaptive role of inversions

Anna Ullastres Marta Farré Laia Capilla Aurora Ruiz-Herrera 《BMC genomics》2014,15(1)

Background

By reshuffling genomes, structural genomic reorganizations provide genetic variation on which natural selection can work. Understanding the mechanisms underlying this process has been a long-standing question in evolutionary biology. In this context, our purpose in this study is to characterize the genomic regions involved in structural rearrangements between human and macaque genomes and determine their influence on meiotic recombination as a way to explore the adaptive role of genome shuffling in mammalian evolution.

Results

We first constructed a highly refined map of the structural rearrangements and evolutionary breakpoint regions in the human and rhesus macaque genomes based on orthologous genes and whole-genome sequence alignments. Using two different algorithms, we refined the genomic position of known rearrangements previously reported by cytogenetic approaches and described new putative micro-rearrangements (inversions and indels) in both genomes. A detailed analysis of the rhesus macaque genome showed that evolutionary breakpoints are in gene-rich regions, being enriched in GO terms related to the immune system. We also identified defense-response genes within a chromosome inversion fixed in the macaque lineage, underlining the relevance of structural genomic changes in evolutionary and/or adaptation processes. Moreover, by combining in silico and experimental approaches, we studied the recombination pattern of specific chromosomes that have undergone rearrangements between human and macaque lineages.

Conclusions

Our data suggest that adaptive alleles – in this case, genes involved in the immune response – might have been favored by genome rearrangements in the macaque lineage.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-530) contains supplementary material, which is available to authorized users.

11.

Structure-revealing data fusion

Evrim Acar Evangelos E Papalexakis Gözde Gürdeniz Morten A Rasmussen Anders J Lawaetz Mathias Nilsson Rasmus Bro 《BMC bioinformatics》2014,15(1)

Background

Analysis of data from multiple sources has the potential to enhance knowledge discovery by capturing underlying structures, which are, otherwise, difficult to extract. Fusing data from multiple sources has already proved useful in many applications in social network analysis, signal processing and bioinformatics. However, data fusion is challenging since data from multiple sources are often (i) heterogeneous (i.e., in the form of higher-order tensors and matrices), (ii) incomplete, and (iii) have both shared and unshared components. In order to address these challenges, in this paper, we introduce a novel unsupervised data fusion model based on joint factorization of matrices and higher-order tensors.

Results

While the traditional formulation of coupled matrix and tensor factorizations modeling only shared factors fails to capture the underlying structures in the presence of both shared and unshared factors, the proposed data fusion model has the potential to automatically reveal shared and unshared components through modeling constraints. Using numerical experiments, we demonstrate the effectiveness of the proposed approach in terms of identifying shared and unshared components. Furthermore, we measure a set of mixtures with known chemical composition using both LC-MS (Liquid Chromatography - Mass Spectrometry) and NMR (Nuclear Magnetic Resonance) and demonstrate that the structure-revealing data fusion model can (i) successfully capture the chemicals in the mixtures and extract the relative concentrations of the chemicals accurately, (ii) provide promising results in terms of identifying shared and unshared chemicals, and (iii) reveal the relevant patterns in LC-MS by coupling with the diffusion NMR data.

Conclusions

We have proposed a structure-revealing data fusion model that can jointly analyze heterogeneous, incomplete data sets with shared and unshared components and demonstrated its promising performance as well as potential limitations on both simulated and real data.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-239) contains supplementary material, which is available to authorized users.

12.

A hyper-dynamic nature of bivalent promoter states underlies coordinated developmental gene expression modules

Akshay Shah Anja Oldenburg Philippe Collas 《BMC genomics》2014,15(1)


13.

Dynamic probabilistic threshold networks to infer signaling pathways from time-course perturbation data

Narsis A Kiani Lars Kaderali 《BMC bioinformatics》2014,15(1)

Background

Network inference deals with the reconstruction of molecular networks from experimental data. Given N molecular species, the challenge is to find the underlying network. Due to data limitations, this typically is an ill-posed problem, and requires the integration of prior biological knowledge or strong regularization. We here focus on the situation when time-resolved measurements of a system’s response after systematic perturbations are available.

Results

We present a novel method to infer signaling networks from time-course perturbation data. We utilize dynamic Bayesian networks with probabilistic Boolean threshold functions to describe protein activation. The model posterior distribution is analyzed using evolutionary MCMC sampling and subsequent clustering, resulting in probability distributions over alternative networks. We evaluate our method on simulated data, and study its performance with respect to data set size and levels of noise. We then use our method to study EGF-mediated signaling in the ERBB pathway.

Conclusions

Dynamic Probabilistic Threshold Networks is a new method to infer signaling networks from time-series perturbation data. It exploits the dynamic response of a system after external perturbation for network reconstruction. On simulated data, we show that the approach outperforms current state of the art methods. On the ERBB data, our approach recovers a significant fraction of the known interactions, and predicts novel mechanisms in the ERBB pathway.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-250) contains supplementary material, which is available to authorized users.

14.

Reference genome of wild goat (capra aegagrus) and sequencing of goat breeds provide insight into genic basis of goat domestication

Yang Dong Xiaolei Zhang Min Xie Babak Arefnezhad Zongji Wang Wenliang Wang Shaohong Feng Guodong Huang Rui Guan Wenjing Shen Rowan Bunch Russell McCulloch Qiye Li Bo Li Guojie Zhang Xun Xu James W. Kijas Ghasem Hosseini Salekdeh Wen Wang Yu Jiang 《BMC genomics》2015,16(1)

Background

Domestic goats (Capra hircus) have been selected to play an essential role in agricultural production systems, since being domesticated from their wild progenitor, bezoar (Capra aegagrus). A detailed understanding of the genetic consequences imparted by the domestication process remains a key goal of evolutionary genomics.

Results

We constructed the reference genome of bezoar and sequenced representative breeds of domestic goats to search for genomic changes that likely have accompanied goat domestication and breed formation. Thirteen copy number variation genes associated with coat color were identified in domestic goats, among which ASIP gene duplication contributes to the generation of light coat-color phenotype in domestic goats. Analysis of rapidly evolving genes identified genic changes underlying behavior-related traits, immune response and production-related traits.

Conclusion

Based on the comparison studies of copy number variation genes and rapidly evolving genes between wild and domestic goat, our findings and methodology shed light on the genetic mechanism of animal domestication and will facilitate future goat breeding.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1606-1) contains supplementary material, which is available to authorized users.

15.

Computational identification of a new SelD-like family that may participate in sulfur metabolism in hyperthermophilic sulfur-reducing archaea

Gao-Peng Li Liang Jiang Jia-Zuan Ni Qiong Liu Yan Zhang 《BMC genomics》2014,15(1)

Background

Selenium (Se) and sulfur (S) are closely related elements that exhibit similar chemical properties. Some genes related to S metabolism are also involved in Se utilization in many organisms. However, the evolutionary relationship between the two utilization traits is unclear.

Results

In this study, we conducted a comparative analysis of the selenophosphate synthetase (SelD) family, a key protein for all known Se utilization traits, in all sequenced archaea. Our search showed a very limited distribution of SelD and Se utilization in this kingdom. Interestingly, a SelD-like protein was detected in two orders of Crenarchaeota: Sulfolobales and Thermoproteales. Sequence and phylogenetic analyses revealed that the SelD-like protein contains the same domain and conserved functional residues as those of SelD, and might be involved in S metabolism in these S-reducing organisms. Further genome-wide analysis of patterns of gene occurrence in different Thermoproteales suggested that several genes, including SirA-like, Prx-like and adenylylsulfate reductase, were strongly related to the SelD-like gene. Based on these findings, we proposed a simple model wherein SelD-like may play an important role in the biosynthesis of a certain thiophosphate compound.

Conclusions

Our data suggest novel genes involved in S metabolism in hyperthermophilic S-reducing archaea, and may provide a new window for understanding the complex relationship between Se and S metabolism in archaea.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-908) contains supplementary material, which is available to authorized users.

16.

The acquisition of novel N-glycosylation sites in conserved proteins during human evolution

Dong Seon Kim Yoonsoo Hahn 《BMC bioinformatics》2015,16(1)

Background

N-linked protein glycosylation plays an important role in various biological processes, including protein folding and trafficking, and cell adhesion and signaling. The acquisition of a novel N-glycosylation site may have a significant effect on protein structure and function, and therefore, on the phenotype.

Results

We analyzed the human glycoproteome data set (2,534 N-glycosylation sites in 1,027 proteins) and identified 112 novel N-glycosylation sites in 91 proteins that arose in the human lineage since the last common ancestor of Euarchonta (primates and treeshrews). Three of them, Asn-196 in adipocyte plasma membrane-associated protein (APMAP), Asn-91 in cluster of differentiation 166 (CD166/ALCAM), and Asn-76 in thyroglobulin, are human-specific. Molecular evolutionary analysis suggested that these sites were under positive selection during human evolution. Notably, the Asn-76 of thyroglobulin might be involved in the increased production of thyroid hormones in humans, especially thyroxine (T4), because the removal of the glycan moiety from this site was reported to result in a significant decrease in T4 production.

Conclusions

We propose that the novel N-glycosylation sites described in this study may be useful candidates for functional analyses to identify innovative genetic modifications for beneficial phenotypes acquired in the human lineage.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0468-5) contains supplementary material, which is available to authorized users.

17.

Transcriptome reconstruction and annotation of cynomolgus and African green monkey

Albert Lee Hossein Khiabanian Jeffrey Kugelman Oliver Elliott Elyse Nagle Guo-Yun Yu Travis Warren Gustavo Palacios Raul Rabadan 《BMC genomics》2014,15(1)


18.

DupChecker: a bioconductor package for checking high-throughput genomic data redundancy in meta-analysis

Quanhu Sheng Yu Shyr Xi Chen 《BMC bioinformatics》2014,15(1)

Background

Meta-analysis has become a popular approach for high-throughput genomic data analysis because it can often significantly increase the power to detect biological signals or patterns in datasets. However, when publicly available databases are used for meta-analysis, duplication of samples is a frequently encountered problem, especially for gene expression data. Failing to remove duplicates can lead to false-positive findings, misleading clustering patterns, or model over-fitting in subsequent data analysis.

Results

We developed a Bioconductor package, DupChecker, that efficiently identifies duplicated samples by generating MD5 fingerprints for raw data. A real data example is presented to show the usage and output of the package.
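The fingerprinting idea can be illustrated with a short sketch: hash each raw data file with MD5 and report any files that share a digest. This is not the DupChecker API (DupChecker is an R/Bioconductor package); the function names and the in-memory `samples` dictionary are hypothetical, and only Python's standard `hashlib` module is used.

```python
import hashlib
from collections import defaultdict


def md5_fingerprint(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a raw data blob."""
    return hashlib.md5(data).hexdigest()


def find_duplicates(samples):
    """Group sample names whose raw bytes have identical MD5 digests.

    samples: dict mapping sample name -> raw file content as bytes.
    Returns a list of name groups, one per digest seen more than once.
    """
    by_digest = defaultdict(list)
    for name, data in samples.items():
        by_digest[md5_fingerprint(data)].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Made-up file names and contents standing in for raw expression files.
samples = {
    "GSM1.CEL": b"abc",
    "GSM2.CEL": b"abd",
    "GSM3.CEL": b"abc",  # byte-identical to GSM1.CEL
}
print(find_duplicates(samples))
```

Because the digest is computed from the raw bytes, this kind of check finds byte-identical files only; a re-processed or re-normalised copy of the same sample would not match.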

Conclusions

Researchers may not pay enough attention to checking and removing duplicated samples, and then data contamination could make the results or conclusions from meta-analysis questionable. We suggest applying DupChecker to examine all gene expression data sets before any data analysis step.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-323) contains supplementary material, which is available to authorized users.

19.

Whole genome capture of vector-borne pathogens from mixed DNA samples: a case study of Borrelia burgdorferi

Giovanna Carpi Katharine S. Walter Stephen J. Bent Anne Gatewood Hoen Maria Diuk-Wasser Adalgisa Caccone 《BMC genomics》2015,16(1)

Background

Rapid and accurate retrieval of whole genome sequences of human pathogens from disease vectors or animal reservoirs will enable fine-resolution studies of pathogen epidemiological and evolutionary dynamics. However, next generation sequencing technologies have not yet been fully harnessed for the study of vector-borne and zoonotic pathogens, due to the difficulty of obtaining high-quality pathogen sequence data directly from field specimens with a high ratio of host to pathogen DNA.

Results

We addressed this challenge by using custom probes for multiplexed hybrid capture to enrich for and sequence 30 Borrelia burgdorferi genomes from field samples of its arthropod vector. Hybrid capture enabled sequencing of nearly the complete genome (~99.5 %) of the Borrelia burgdorferi pathogen with 132-fold coverage, and identification of up to 12,291 single nucleotide polymorphisms per genome.

Conclusions

The proposed culture-independent method enables efficient whole genome capture and sequencing of pathogens directly from arthropod vectors, thus making population genomic study of vector-borne and zoonotic infectious diseases economically feasible and scalable. Furthermore, given the similarities of invertebrate field specimens to other mixed DNA templates characterized by a high ratio of host to pathogen DNA, we discuss the potential applicability of hybrid capture for genomic study across diverse study systems.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1634-x) contains supplementary material, which is available to authorized users.

20.

Topological characterization of neuronal arbor morphology via sequence representation: II - global alignment

Todd A Gillette Parsa Hosseini Giorgio A Ascoli 《BMC bioinformatics》2015,16(1)

Background

The increasing abundance of neuromorphological data provides both the opportunity and the challenge to compare massive numbers of neurons from a wide diversity of sources efficiently and effectively. We implemented a modified global alignment algorithm representing axonal and dendritic bifurcations as strings of characters. Sequence alignment quantifies neuronal similarity by identifying branch-level correspondences between trees.
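To make the notion of global alignment of character strings concrete, here is a minimal sketch of the standard Needleman-Wunsch scoring recurrence. It is the textbook algorithm, not the modified version the authors implemented, and the scoring values and the example strings (a hypothetical one-letter-per-branch-point encoding) are assumptions for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score of strings a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table, so memory is O(len(b)).
    """
    # Row 0: aligning the empty prefix of a against prefixes of b (all gaps).
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against empty b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


# Hypothetical encodings of two small arbors as strings of branch-point symbols.
print(global_alignment_score("ACT", "ACT"))
print(global_alignment_score("ACT", "AT"))
```

A higher score means the two encoded trees can be brought into correspondence with fewer mismatches and gaps; pairwise scores of this kind are what a similarity space over neurons would be built from.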

Results

The space generated from pairwise similarities is capable of classifying neuronal arbor types as well as, or better than, traditional topological metrics. Unsupervised cluster analysis produces groups that significantly correspond with known cell classes for axons, dendrites, and pyramidal apical dendrites. Furthermore, the distinguishing consensus topology generated by multiple sequence alignment of a group of neurons reveals their shared branching blueprint. Interestingly, the axons of dendritic-targeting interneurons in the rodent cortex associate with pyramidal axons but lie apart from the (more topologically symmetric) axons of perisomatic-targeting interneurons.

Conclusions

Global pairwise and multiple sequence alignment of neurite topologies enables detailed comparison of neurites and identification of conserved topological features in alignment-defined clusters. The methods presented also provide a framework for incorporation of additional branch-level morphological features. Moreover, comparison of multiple alignment with motif analysis shows that the two techniques provide complementary information respectively revealing global and local features.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0605-1) contains supplementary material, which is available to authorized users.