Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
The emergence of next-generation sequencing (NGS) technologies has significantly improved sequencing throughput and reduced costs. However, the short read length, duplicate reads and massive volume of data make data processing much more difficult and complicated than with first-generation sequencing technology. Although some software packages have been developed to assess data quality, those packages are either not easily available to users or require bioinformatics skills and computer resources. Moreover, almost all currently available quality assessment software fails to take sequencing errors into account when assessing duplicates in NGS data. Here, we present a new user-friendly quality assessment software package called BIGpre, which works for both Illumina and 454 platforms. BIGpre contains all the functions of other quality assessment software, such as the correlation between forward and reverse reads, read GC-content distribution, and base Ns quality. More importantly, BIGpre incorporates associated programs to detect and remove duplicate reads after taking sequencing errors into account, and to trim low-quality reads from raw data as well. BIGpre is primarily written in Perl and integrates graphical capability from the statistics package R. This package produces both tabular and graphical summaries of data quality for sequencing datasets from Illumina and 454 platforms. Processing hundreds of millions of reads within minutes, this package provides immediate diagnostic information for users to manipulate sequencing data for downstream analyses. BIGpre is freely available at http://bigpre.sourceforge.net/.
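The two checks named in this abstract, read GC content and duplicate detection that tolerates sequencing errors, can be illustrated with a minimal Python sketch. This is not BIGpre's Perl implementation: the function names, the Hamming-distance criterion and the `max_mismatches` threshold are assumptions chosen for illustration, and the quadratic comparison would not scale to real datasets.

```python
def gc_content(read: str) -> float:
    """Fraction of G/C among the called (non-N) bases of a read."""
    called = [base for base in read.upper() if base in "ACGT"]
    if not called:
        return 0.0
    return sum(base in "GC" for base in called) / len(called)


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length reads."""
    return sum(x != y for x, y in zip(a, b))


def remove_near_duplicates(reads, max_mismatches=1):
    """Keep a read only if no already-kept read of the same length
    differs from it by at most max_mismatches bases (hypothetical criterion)."""
    kept = []
    for read in reads:
        is_duplicate = any(
            len(read) == len(other) and hamming(read, other) <= max_mismatches
            for other in kept
        )
        if not is_duplicate:
            kept.append(read)
    return kept
```

With `max_mismatches=0` the function reduces to exact duplicate removal; raising the threshold treats reads that differ only by a likely sequencing error as copies of the same fragment.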

2.
R is an increasingly preferred software environment for data analytics and statistical computing among scientists and practitioners. Packages markedly extend R’s utility and ameliorate inefficient solutions to data science problems. We outline 10 simple rules for finding relevant packages and determining which package is best for your desired use. We begin in Rule 1 with tips on how to consider your purpose, which will guide your search to follow, where, in Rule 2, you’ll learn best practices for finding and collecting options. Rules 3 and 4 will help you navigate packages’ profiles and explore the extent of their online resources, so that you can be confident in the quality of the package you choose and assured that you’ll be able to access support. In Rules 5 and 6, you’ll become familiar with how the R Community evaluates packages and learn how to assess the popularity and utility of packages for yourself. Rules 7 and 8 will teach you how to investigate and track package development processes, so you can further evaluate their merit. We end in Rules 9 and 10 with more hands-on approaches, which involve digging into package code.

3.
Sperm packages are widespread in the order Scorpiones, but absent in the family Buthidae. The morphology of sperm packages is diverse and apparently has phylogenetic information. The objectives of this work were to show diversity of sperm packages and to provide a quantitative basis for using sperm packages’ morphology as a taxonomic character. For this, we conducted a morphological analysis and comparison of the different sperm packages of species of the family Bothriuridae. The seminal content from males of species of Bothriuridae was studied. Specimens from Iuridae, Buthidae, Euscorpiidae, Liochelidae, Scorpionidae, Vaejovidae, Chaerilidae, and Chactidae were used for comparison. Digital images of sperm packages were measured and statistically analysed based on the following variables: total length, head width, head–body angle, total area, and head length. Pairs of variables were also contrasted, and all the variables were correlated with the current phylogenetic hypothesis for Bothriuridae. High morphological diversity and variability in measures was observed. In general, measurements were similar within each genus, but differed amongst genera. Cane‐like sperm packages are very common in species of the family Bothriuridae. Species from Bothriurus show a wide range of sperm package shapes, some of them shared with Timogenes and Vachonia species, supporting the idea of nonmonophyly of the genus. Many species showed sperm package dimorphism inside a single male. Some of the analysed features fit well with the phylogenetic hypothesis in Bothriuridae, and the general package shape shows high correlation with scorpion phylogeny in other families. Bent and round packages are the most common amongst the different families. Sperm packages are not developed in Chaerilidae, as in Buthidae. This is the first morphological and comparative analysis of sperm packages in scorpions, and reveals much greater diversity in this trait than previously known. 
Our results reinforce the idea that the study of morphology of sperm packages would contribute characters for scorpion phylogeny at different levels. © 2011 The Linnean Society of London, Zoological Journal of the Linnean Society, 2011, 161, 463–483.

4.
Integrative modeling computes a model based on varied types of input information, be it from experiments or prior models. Often, a type of input information will be best handled by a specific modeling software package. In such a case, we desire to integrate our integrative modeling software package, Integrative Modeling Platform (IMP), with software specialized to the computational demands of the modeling problem at hand. After several attempts, however, we have concluded that even in collaboration with the software’s developers, integration is either impractical or impossible. The reasons for the intractability of integration include software incompatibilities, differing modeling logic, the costs of collaboration, and academic incentives. In the integrative modeling software ecosystem, several large modeling packages exist with often redundant tools. We reason, therefore, that the other development groups have similarly concluded that the benefit of integration does not justify the cost. As a result, modelers are often restricted to the set of tools within a single software package. The inability to integrate tools from distinct software negatively impacts the quality of the models and the efficiency of the modeling. As the complexity of modeling problems grows, we seek to galvanize developers and modelers to consider the long-term benefit that software interoperability yields. In this article, we formulate a demonstrative set of software standards for implementing a model search using tools from independent software packages and discuss our efforts to integrate IMP and the crystallography suite Phenix within the Bayesian modeling framework.

5.
The fertility and parental care hypothesis interprets sex differences in some spatial-cognitive tasks as an adaptive mechanism to suppress women’s travel. In particular, the hypothesis argues that estrogens constrain travel during key reproductive periods by depressing women’s spatial-cognitive ability. Limiting travel reduces exposure to the dangers and caloric costs of navigating long distances into unfamiliar environments. Our study evaluates a collection of predictions drawn from the fertility and parental care hypothesis among the Twe and Himba people living in a remote region of Namibia. We find that nursing mothers travel more than women at any other stage of their reproductive career. This challenges the assumption that women limit travel during vulnerable and energetically demanding reproductive periods. In addition, we join previous studies in identifying a relationship between spatial ability and traveling among men, but not women. If spatial ability does not influence travel, hormonally induced changes in spatial ability cannot be used as a mechanism to reduce travel. Instead, it appears that the fitness consequences of men’s travel are a more likely target for adaptive explanations of the sex differences in spatial ability, navigation, and range size.

6.
Software for the processing of electron micrographs in structural biology suffers from incompatibility between different packages, poor definition and choice of conventions, and a lack of coherence in software development. The solution lies in adopting a common philosophy of interaction and conventions between the packages. To understand the choices required to have such common interfaces, I am developing a package called "Bsoft." Its foundations lie in the variety of different image file formats used in electron microscopy, a continually frustrating experience for user and programmer alike. In Bsoft, this problem is greatly diminished by support for many different formats (including MRC, SPIDER, IMAGIC, SUPRIM, and PIF) and by separating algorithmic issues from image format-specific issues. In addition, I implemented a generalized functionality for reading the tag-based STAR (self-defining text archiving and retrieval) parameter file format as a mechanism for exchanging parameters between different packages. Bsoft is written in highly portable code (tested on several Unix systems and under VMS) and offers a continually growing range of image processing functionality, such as Fourier transformation, cross-correlation, and interpolation. Finally, prerequisites for software collaboration are explored, which include agreements on information exchange and conventions, and tests to evaluate compatibility between packages.

7.
A decade ago, there was widespread enthusiasm for the prospects of genome-wide association studies to identify common variants related to common chronic diseases using samples of unrelated individuals from populations. Although technological advancements allow us to query more than a million SNPs across the genome at low cost, a disappointingly small fraction of the genetic portion of common disease etiology has been uncovered. This has led to the hypothesis that less frequent variants might be involved, stimulating a renaissance of the traditional approach of seeking genes using multiplex families from less diverse populations. However, by using modern genotyping and sequencing technology, we can now look not just at linkage, but jointly at linkage and linkage disequilibrium (LD) in such samples. Software methods that can look simultaneously at linkage and LD in a powerful and robust manner have been lacking. Most algorithms cannot jointly analyze datasets involving families of varying structures in a statistically or computationally efficient manner. We have implemented previously proposed statistical algorithms in a user-friendly software package, PSEUDOMARKER. This paper is an announcement of this software package. We describe the motivation behind the approach, the statistical methods, and software, and we briefly demonstrate PSEUDOMARKER's advantages over other packages by example.

8.
beadarray: R classes and methods for Illumina bead-based data   (cited 2 times in total: 0 self-citations, 2 by others)
The R/Bioconductor package beadarray allows raw data from Illumina experiments to be read and stored in convenient R classes. Users are free to choose between various methods of image processing, background correction and normalization in their analysis rather than using the defaults in Illumina's proprietary software. The package also allows quality assessment to be carried out on the raw data. The data can then be summarized and stored in a format which can be used by other R/Bioconductor packages to perform downstream analyses. Summarized data processed by Illumina's BeadStudio software can also be read and analysed in the same manner. Availability: The beadarray package is available from the Bioconductor web page at www.bioconductor.org. A user's guide and example data sets are provided with the package.

9.
SAGA: sequence alignment by genetic algorithm.   (cited 29 times in total: 0 self-citations, 29 by others)
We describe a new approach to multiple sequence alignment using genetic algorithms and an associated software package called SAGA. The method involves evolving a population of alignments in a quasi-evolutionary manner and gradually improving the fitness of the population as measured by an objective function which measures multiple alignment quality. SAGA uses an automatic scheduling scheme to control the usage of 22 different operators for combining alignments or mutating them between generations. When used to optimise the well-known sums of pairs objective function, SAGA performs better than some of the widely used alternative packages. This is seen with respect to the ability to achieve an optimal solution and with regard to the accuracy of alignment by comparison with reference alignments based on sequences of known tertiary structure. The general attraction of the approach is the ability to optimise any objective function that one can invent.
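The sums of pairs objective mentioned above scores an alignment by adding up a pairwise score over every pair of sequences in every column. The sketch below is a minimal Python illustration of that idea only; the match, mismatch and gap values are placeholder assumptions, not the substitution matrix or gap model used by SAGA.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of aligned symbols (placeholder scoring values)."""
    if a == "-" and b == "-":
        return 0  # two gaps facing each other contribute nothing
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sum_of_pairs(alignment, **scores):
    """Sum pair_score over all sequence pairs in all alignment columns."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            total += pair_score(a, b, **scores)
    return total
```

A genetic algorithm in the spirit of the abstract would treat `sum_of_pairs` as the fitness function and keep the candidate alignments that score highest; any other scoring function with the same signature could be swapped in.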

10.
X-windows based microscopy image processing package (Xmipp) is a specialized suite of image processing programs, primarily aimed at obtaining the 3D reconstruction of biological specimens from large sets of projection images acquired by transmission electron microscopy. This public-domain software package was introduced to the electron microscopy field eight years ago, and since then it has changed drastically. New methodologies for the analysis of single-particle projection images have been added to classification, contrast transfer function correction, angular assignment, 3D reconstruction, reconstruction of crystals, etc. In addition, the package has been extended with functionalities for 2D crystal and electron tomography data. Furthermore, its current implementation in C++, with a highly modular design of well-documented data structures and functions, offers a convenient environment for the development of novel algorithms. In this paper, we present a general overview of a new generation of Xmipp that has been re-engineered to maximize flexibility and modularity, potentially facilitating its integration in future standardization efforts in the field. Moreover, by focusing on those developments that distinguish Xmipp from other packages available, we illustrate its added value to the electron microscopy community.

11.
We describe functions recently added to the R package PopGenReport that can be used to perform a landscape genetic analysis (LGA) based on landscape resistance surfaces, which aims to detect the effect of landscape features on gene flow. These functions for the first time implement an LGA in a single framework. Although the approach has been shown to be a valuable tool to study gene flow in landscapes, it has not been widely used to date, despite the type of data being widely available. In part, this is likely due to the necessity to use several software packages to perform landscape genetic analyses. To apply LGA functions, two types of data sets are required: a data set with spatially referenced and genotyped individuals, and a resistance layer representing the effect of the landscape. The function outputs three pairwise distance matrices from these data: a genetic distance matrix, a cost distance matrix and a Euclidean distance matrix. Statistical tests are performed to test whether the cost matrix contributes to the understanding of the observed population structure. A full report on the analysis and outputs in the form of plots and tables of all intermediate steps of the LGA is produced. It is possible to customize the LGA to allow for different cost path approaches and measures of genetic distances. The package is written in the R language and is available through the Comprehensive R Archive Network (CRAN). Comprehensive tutorials and information on how to install and use the package are provided at the authors’ website ( www.popgenreport.org ).
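The abstract does not name the statistical tests it applies to the distance matrices. One common way to ask whether a cost distance matrix tracks a genetic distance matrix is a Mantel-style permutation test, sketched here in Python as an assumption rather than as the package's actual R code; `mantel_test`, its parameters and the seed are hypothetical names for illustration.

```python
import math
import random


def upper_triangle(matrix, order):
    """Values above the diagonal after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("distances must not all be identical")
    return cov / math.sqrt(var_x * var_y)


def mantel_test(genetic, cost, permutations=999, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    identity = list(range(len(genetic)))
    x = upper_triangle(genetic, identity)
    observed = pearson(x, upper_triangle(cost, identity))
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel individuals in the cost matrix
        if pearson(x, upper_triangle(cost, order)) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (permutations + 1)
```

Permuting row and column labels together, rather than shuffling individual cells, keeps each permuted matrix a valid distance matrix, which is why the test relabels individuals.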

12.
13.
The aim of the ecospat package is to make available novel tools and methods to support spatial analyses and modeling of species niches and distributions in a coherent workflow. The package is written in the R language (R Development Core Team) and contains several features, unique in their implementation, that are complementary to other existing R packages. Pre‐modeling analyses include species niche quantifications and comparisons between distinct ranges or time periods, measures of phylogenetic diversity, and other data exploration functionalities (e.g. extrapolation detection, ExDet). Core modeling brings together the new approach of ensemble of small models (ESM) and various implementations of the spatially‐explicit modeling of species assemblages (SESAM) framework. Post‐modeling analyses include evaluation of species predictions based on presence‐only data (Boyce index) and of community predictions, phylogenetic diversity and environmentally‐constrained species co‐occurrence analyses. The ecospat package also provides some functions to supplement the ‘biomod2’ package (e.g. data preparation, permutation tests and cross‐validation of model predictive power). With this novel package, we intend to stimulate the use of comprehensive approaches in spatial modeling of species and community distributions.

14.

Background  

Spectral processing and post-experimental data analysis are the major tasks in NMR-based metabonomics studies. While there are commercial and freely licensed software tools available to assist with these tasks, researchers usually have to use multiple software packages for their studies because software packages generally focus on specific tasks. It would be beneficial to have a highly integrated platform, in which these tasks can be completed within one package. Moreover, with open source architecture, newly proposed algorithms or methods for spectral processing and data analysis can be implemented much more easily and accessed freely by the public.

15.
In this article we describe a new Bioconductor package 'CALIB' for normalization of two-color microarray data. This approach is based on the measurements of external controls and estimates an absolute target level for each gene and condition pair, as opposed to working with log-ratios as a relative measure of expression. Moreover, this method makes no assumptions regarding the distribution of gene expression divergence. AVAILABILITY: http://bioconductor.org/packages/2.0/bioc Open Source.

16.
Bio3D is a family of R packages for the analysis of biomolecular sequence, structure, and dynamics. Major functionality includes biomolecular database searching and retrieval, sequence and structure conservation analysis, ensemble normal mode analysis, protein structure and correlation network analysis, principal component, and related multivariate analysis methods. Here, we review recent package developments, including a new underlying segregation into separate packages for distinct analysis, and introduce a new method for structure analysis named ensemble difference distance matrix analysis (eDDM). The eDDM approach calculates and compares atomic distance matrices across large sets of homologous atomic structures to help identify the residue-wise determinants underlying specific functional processes. An eDDM workflow is detailed along with an example application to a large protein family. As a new member of the Bio3D family, the Bio3D‐eddm package supports both experimental and theoretical simulation‐generated structures, is integrated with other methods for dissecting sequence‐structure‐function relationships, and can be used in a highly automated and reproducible manner. Bio3D is distributed as an integrated set of platform independent open source R packages available from: http://thegrantlab.org/bio3d/ .
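The core operation behind a difference distance matrix can be shown for just two structures. The Python sketch below is an illustration under stated assumptions, not the Bio3D-eddm R API: it assumes each structure is a list of 3D coordinates with one representative atom per residue, in the same residue order, and the function names are hypothetical.

```python
import math


def distance_matrix(coords):
    """All pairwise Euclidean distances between residue coordinates."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def difference_distance_matrix(state_a, state_b):
    """Element-wise change in residue-residue distance from state_a to state_b."""
    if len(state_a) != len(state_b):
        raise ValueError("both structures need the same number of residues")
    dist_a = distance_matrix(state_a)
    dist_b = distance_matrix(state_b)
    n = len(dist_a)
    return [[dist_b[i][j] - dist_a[i][j] for j in range(n)] for i in range(n)]
```

Because only internal distances are compared, the result does not depend on how the two structures are superposed. Large positive or negative entries flag residue pairs that move apart or together between the two states; an ensemble analysis as described in the abstract would compare such matrices across many structures rather than two.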

17.

Purpose

Life cycle assessment (LCA) software packages have proliferated and evolved as LCA has developed and grown. There are now a multitude of LCA software packages that must be critically evaluated by users. Prior to conducting a comparative LCA study on different concrete materials, it is necessary to examine a variety of software packages for this specific purpose. The paper evaluates five LCA tools in the context of the LCA of seven concrete mix designs (conventional concrete, concrete with fly ash, slag, silica fume or limestone as cement replacement, recycled aggregate concrete, and photocatalytic concrete).

Methods

Three key evaluation criteria required to assess the quality of analysis are adequate flexibility, sophistication and complexity of analysis, and usefulness of outputs. The quality of life cycle inventory (LCI) data included in each software package is also assessed for its reliability, completeness, and correlation to the scope of LCA of concrete products in Canada. A questionnaire is developed for evaluating LCA software packages and is applied to five LCA tools.

Results and discussion

The result is the selection of a software package for the specific context of LCA of concrete materials in Canada, which will be used to complete a full LCA study. The software package with the highest score is software package C (SP-C), with 44 out of a possible 48 points. Its main advantage is that it allows for the user to have a high level of control over the system being modeled and the calculation methods used.

Conclusions

This comparative study highlights the importance of selecting a software package that is appropriate for a specific research project. The ability to accurately model the chosen functional unit and system boundary is an important selection criterion. This study demonstrates a method to enable a critical and rigorous comparison without excessive and redundant duplication of efforts.

18.
The ideal free distribution (IFD) theory, which predicts that a population of individuals will match the distribution of a patchily distributed resource, is widely used in ecology to describe the spatial distribution of animals. While many studies have shown general support of its habitat matching prediction, others have described a systematic pattern of undermatching, where too many animals feed at patches with fewer resources, and too few animals feed in richer patches. These results have been attributed to deviations from several of the assumptions of the IFD. One possible variable, the cost of travelling between patches, has received little attention. Here, we investigated the impact on resource matching when travel costs were manipulated in a simple laboratory experiment involving two continuous input patches. This experiment allowed us to control for extraneous variables and decouple time costs from energetic costs of travel. Two experiments examined the impact of varying travel costs on movement rates between foraging patches and how these travel costs impact conformity to the IFD. Our data demonstrated that there was less movement between patches and greater discrepancies from the IFD predictions as the cost of travel increased.
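The habitat matching prediction can be stated numerically: under the IFD, the share of animals in each patch equals that patch's share of the total resource input. The Python sketch below illustrates that arithmetic and one simple way to express undermatching; the function names and the example numbers are assumptions for illustration and are not taken from the study.

```python
def ifd_prediction(total_animals, input_rates):
    """Animals expected per patch if numbers match resource input rates."""
    total_input = sum(input_rates)
    return [total_animals * rate / total_input for rate in input_rates]


def matching_deviation(observed_counts, input_rates):
    """Observed share of animals minus IFD-predicted share, per patch.

    A negative value in the richer patch together with a positive value
    in the poorer patch corresponds to undermatching.
    """
    total_animals = sum(observed_counts)
    total_input = sum(input_rates)
    return [
        count / total_animals - rate / total_input
        for count, rate in zip(observed_counts, input_rates)
    ]
```

For example, with 30 animals and two patches supplied at a 2:1 ratio, `ifd_prediction(30, [2, 1])` gives 20 and 10 animals; an observed split of 18 and 12 would then show the richer patch under-used and the poorer patch over-used.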

19.
20.
A maximum likelihood approach to two-dimensional crystals   (cited 1 time in total: 0 self-citations, 1 by others)
Maximum likelihood (ML) processing of transmission electron microscopy images of protein particles can produce reconstructions of superior resolution due to a reduced reference bias. We have investigated a ML processing approach to images centered on the unit cells of two-dimensional (2D) crystal images. The implemented software makes use of the predictive lattice node tracking in the MRC software, which is used to window particle stacks. These are then noise-whitened and subjected to ML processing. Resulting ML maps are translated into amplitudes and phases for further processing within the 2dx software package. Compared with ML processing for randomly oriented single particles, the required computational costs are greatly reduced as the 2D crystals restrict the parameter search space. The software was applied to images of negatively stained or frozen hydrated 2D crystals of different crystal order. We find that the ML algorithm is not free from reference bias, even though its sensitivity to noise correlation is lower than for pure cross-correlation alignment. Compared with crystallographic processing, the newly developed software yields better resolution for 2D crystal images of lower crystal quality, and it performs equally well for well-ordered crystal images.
