Similar literature
 Found 20 similar articles; search time 15 ms
1.
pyOpenMS is an open‐source, Python‐based interface to the C++ OpenMS library, providing facile access to a feature‐rich, open‐source algorithm library for MS‐based proteomics analysis. It contains Python bindings that allow raw access to the data structures and algorithms implemented in OpenMS, specifically those for file access (mzXML, mzML, TraML, mzIdentML among others), basic signal processing (smoothing, filtering, de‐isotoping, and peak‐picking) and complex data analysis (including label‐free, SILAC, iTRAQ, and SWATH analysis tools). pyOpenMS thus allows fast prototyping and efficient workflow development in a fully interactive manner (using the interactive Python interpreter) and is also ideally suited for researchers not proficient in C++. In addition, our code to wrap a complex C++ library is completely open‐source, allowing other projects to create similar bindings with ease. The pyOpenMS framework is freely available at https://pypi.python.org/pypi/pyopenms while the autowrap tool to create Cython code automatically is available at https://pypi.python.org/pypi/autowrap (both released under the 3‐clause BSD licence).

2.
Visualization of the intracellular constituents of individual bacteria while performing as live biocatalysts is in principle doable through more or less sophisticated fluorescence microscopy. Unfortunately, rigorous quantitation of the wealth of data embodied in the resulting images requires bioinformatic tools that are not widely available within the community, and those that exist are often subject to licensing that impedes software reuse. In this context we have developed CellShape, a user‐friendly platform for image analysis with subpixel precision and a double‐threshold segmentation system for quantification of fluorescent signals stemming from single cells. CellShape is entirely coded in Python, a free, open‐source programming language with widespread community support. For a developer, CellShape enhances extensibility (ease of software improvements) by acting as an interface to access and use existing Python modules; for an end‐user, CellShape presents standalone executable files ready to open without installation. We have adopted this platform to analyse in unprecedented detail the three‐dimensional distribution of the constituents of the gene expression flow (DNA, RNA polymerase, mRNA and ribosomal proteins) in individual cells of the industrial platform strain Pseudomonas putida KT2440. While the first release version of CellShape (v0.8) is readily operational, users and developers are able to expand the platform further.

3.
MOTIVATION: Microarray-based expression profiles have become a standard methodology in any high-throughput analysis. Several commercial platforms are available, each with its strengths and weaknesses. The R platform for statistical analysis and graphics is a powerful environment for the analysis of microarray data, because it has many integrated statistical methods available as well as the specialized microarray analysis project Bioconductor. Many packages have been added in the last few years, increasing the range of possible analyses. Here, we report the availability of a package for reading and analyzing data from GE Healthcare Gene Expression Bioarrays within the R environment. AVAILABILITY: The software is implemented in the R language, is open source and available for download free of charge through the Bioconductor (http://www.bioconductor.org) project.

4.
Mathematical equations are fundamental to modeling biological networks, but as networks get large and revisions frequent, it becomes difficult to manage equations directly or to combine previously developed models. Multiple simultaneous efforts to create graphical standards, rule‐based languages, and integrated software workbenches aim to simplify biological modeling but none fully meets the need for transparent, extensible, and reusable models. In this paper we describe PySB, an approach in which models are not only created using programs, they are programs. PySB draws on programmatic modeling concepts from little b and ProMot, the rule‐based languages BioNetGen and Kappa and the growing library of Python numerical tools. Central to PySB is a library of macros encoding familiar biochemical actions such as binding, catalysis, and polymerization, making it possible to use a high‐level, action‐oriented vocabulary to construct detailed models. As Python programs, PySB models leverage tools and practices from the open‐source software community, substantially advancing our ability to distribute and manage the work of testing biochemical hypotheses. We illustrate these ideas using new and previously published models of apoptosis.

5.
Industrial ecology (IE) is a maturing scientific discipline. The field is becoming more data and computation intensive, which requires IE researchers to develop scientific software to tackle novel research questions. We review the current state of software programming and use in our field and find challenges regarding transparency, reproducibility, reusability, and ease of collaboration. Our response to that problem is fourfold: First, we propose how existing general principles for the development of good scientific software could be implemented in IE and related fields. Second, we argue that collaborating on open source software could make IE research more productive and increase its quality, and we present guidelines for the development and distribution of such software. Third, we call for stricter requirements regarding general access to the source code used to produce research results and scientific claims published in the IE literature. Fourth, we describe a set of open source modules for standard IE modeling tasks that represent our first attempt at turning our recommendations into practice. We introduce a Python toolbox for IE that includes the life cycle assessment (LCA) framework Brightway2, the ecospold2matrix module that parses unallocated data in ecospold format, the pySUT and pymrio modules for building and analyzing multiregion input‐output models and supply and use tables, and the dynamic_stock_model class for dynamic stock modeling. Widespread use of open access software can, at the same time, increase quality, transparency, and reproducibility of IE research.

6.
The paper introduces a fuzzy training approach based on nonlinear regularization in an effort to avoid overtraining. The main idea is to restrict training so that the basic expert knowledge used to build the model is still visible. This is implemented by a new nonlinear regularization approach which can be applied to any kind of training data set. The approach is demonstrated using a large crop yield data set (>4500 field records) for sugar beet collected on agricultural farms over a 14-year period (1976–1989) in East Germany. The software is implemented in SAMT2, free and open source software, using the Python programming language.

7.
Shotgun proteomics workflows for database protein identification typically include a combination of search engines and postsearch validation software based mostly on machine learning algorithms. Here, a new postsearch validation tool called Scavager is presented. It employs CatBoost, an open‐source gradient boosting library, and shows improved efficiency compared with other popular algorithms such as Percolator, PeptideProphet, and Q‐ranker. The comparison is done using multiple data sets and search engines, including MSGF+, MSFragger, X!Tandem, Comet, and the recently introduced IdentiPy. Implemented in the Python programming language, Scavager is open‐source and freely available at https://bitbucket.org/markmipt/scavager.

8.
It is important to easily and efficiently obtain high quality species distribution data for predicting the potential distribution of species using species distribution models (SDMs). There is a need for a powerful software tool to automatically or semi-automatically assist in identifying and correcting errors. Here, we use Python to develop a web-based software tool (SDMdata) to easily collect occurrence data from the Global Biodiversity Information Facility (GBIF) and check species names and the accuracy of coordinates (latitude and longitude). It is open source software (licensed under the GNU Affero General Public License, AGPL), allowing anyone to access and manipulate the source code. SDMdata is available online free of charge from <http://www.sdmserialsoftware.org/sdmdata/>.
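To make the idea of a coordinate check concrete, here is a minimal sketch in Python. It is not taken from SDMdata: the `Occurrence` record, the `coordinate_problems` function, and the three rules it applies are all assumptions chosen for illustration, and the abstract does not say which checks the tool actually performs.

```python
from dataclasses import dataclass


@dataclass
class Occurrence:
    """Hypothetical occurrence record; field names are illustrative only."""
    species: str
    latitude: float
    longitude: float


def coordinate_problems(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    # Valid latitudes lie between -90 and 90 degrees (NaN also fails here).
    if not -90.0 <= record.latitude <= 90.0:
        problems.append("latitude out of range")
    # Valid longitudes lie between -180 and 180 degrees.
    if not -180.0 <= record.longitude <= 180.0:
        problems.append("longitude out of range")
    # A point at exactly (0, 0) is usually a placeholder, not a real locality.
    if record.latitude == 0.0 and record.longitude == 0.0:
        problems.append("coordinates are exactly (0, 0)")
    return problems
```

An empty list means the record passed these basic checks. A real cleaning step would likely add further rules, for example comparing the point against the country stated in the record.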

9.
MOTIVATION: Linking experimental data to mathematical models in biology is impeded by the lack of suitable software to manage and transform data. Model calibration would be facilitated and models would increase in value were it possible to preserve links to training data along with a record of all normalization, scaling, and fusion routines used to assemble the training data from primary results. RESULTS: We describe the implementation of DataRail, an open source MATLAB-based toolbox that stores experimental data in flexible multi-dimensional arrays, transforms arrays so as to maximize information content, and then constructs models using internal or external tools. Data integrity is maintained via a containment hierarchy for arrays, imposition of a metadata standard based on a newly proposed MIDAS format, assignment of semantically typed universal identifiers, and implementation of a procedure for storing the history of all transformations with the array. We illustrate the utility of DataRail by processing a newly collected set of approximately 22 000 measurements of protein activities obtained from cytokine-stimulated primary and transformed human liver cells. AVAILABILITY: DataRail is distributed under the GNU General Public License and available at http://code.google.com/p/sbpipeline/

10.
11.
12.
13.
Ecological Informatics, 2009, 4(4): 183-195
Geographic Information tools (GI tools) have become an essential component of research in landscape ecology. In this article we review the use of GIS (Geographic Information Systems) and GI tools in landscape ecology, with an emphasis on free and open source software (FOSS) projects. Specifically, we introduce the background and terms related to the free and open source software movement, then compare eight FOSS desktop GIS with proprietary GIS to analyse their utility for landscape ecology research. We also provide a summary of related landscape analysis FOSS applications, and extensions. Our results indicate that (i) all eight GIS provide the basic GIS functionality needed in landscape ecology, (ii) they all facilitate customisation, and (iii) they all provide good support via forums and email lists. Drawbacks that have been identified are related to the fact that most projects are relatively young. This currently affects the size of their user and developer communities, and their ability to include advanced spatial analysis functions and up-to-date documentation. However, we expect these drawbacks to be addressed over time, as systems mature. In general, we see great potential for the use of free and open source desktop GIS in landscape ecology research and advocate concentrated efforts by the landscape ecology community towards a common, customisable and free research platform.

14.
15.
MOTIVATION: R/qtl is free and powerful software for mapping and exploring quantitative trait loci (QTL). R/qtl provides a comprehensive range of methods for a wide range of experimental cross types. We recently added multiple QTL mapping (MQM) to R/qtl. MQM adds higher statistical power to detect and disentangle the effects of multiple linked and unlinked QTL compared with many other methods. MQM for R/qtl adds many new features including improved handling of missing data, analysis of 10,000s of molecular traits, permutation for determining significance thresholds for QTL and QTL hot spots, and visualizations for cis-trans and QTL interaction effects. MQM for R/qtl is the first free and open source implementation of MQM that is multi-platform, scalable and suitable for automated procedures and large genetical genomics datasets. AVAILABILITY: R/qtl is free and open source multi-platform software for the statistical language R, and is made available under the GPLv3 license. R/qtl can be installed from http://www.rqtl.org/. R/qtl queries should be directed to the mailing list, see http://www.rqtl.org/list/. CONTACT: kbroman@biostat.wisc.edu.

16.
17.
Increasingly, data on shape are analysed in combination with molecular genetic or ecological information, so that tools for geometric morphometric analysis are required. Morphometric studies most often use the arrangements of morphological landmarks as the data source and extract shape information from them by Procrustes superimposition. The MorphoJ software combines this approach with a wide range of methods for shape analysis in different biological contexts. The program offers an integrated and user-friendly environment for standard multivariate analyses such as principal components, discriminant analysis and multivariate regression as well as specialized applications including phylogenetics, quantitative genetics and analyses of modularity in shape data. MorphoJ is written in Java and versions for the Windows, Macintosh and Unix/Linux platforms are freely available from http://www.flywings.org.uk/MorphoJ_page.htm.
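Procrustes superimposition can be illustrated for the simplest case, two configurations of 2D landmarks. The Python sketch below is not MorphoJ code (MorphoJ is written in Java, and its own procedure is not described in this abstract). The function names are hypothetical, and the sketch assumes matched landmarks, no reflection, and non-degenerate configurations. Each landmark is treated as a complex number, so the best-fitting rotation has a closed form.

```python
import math


def center_and_scale(points):
    """Move the centroid to the origin and scale to unit centroid size."""
    z = [complex(x, y) for x, y in points]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]
    # Centroid size: root of the summed squared distances to the centroid.
    size = math.sqrt(sum(abs(p) ** 2 for p in z))
    return [p / size for p in z]


def procrustes_fit(reference, target):
    """Superimpose target on reference; return fitted landmarks and distance.

    Assumes both lists hold the same landmarks in the same order and that
    the configurations are not degenerate (all points coincident).
    """
    ref = center_and_scale(reference)
    tgt = center_and_scale(target)
    # The least-squares rotation is the unit complex number s / |s|.
    s = sum(r * t.conjugate() for r, t in zip(ref, tgt))
    rotation = s / abs(s)
    fitted = [rotation * t for t in tgt]
    # Residual difference in shape after removing position, size, rotation.
    distance = math.sqrt(sum(abs(r - f) ** 2 for r, f in zip(ref, fitted)))
    return fitted, distance
```

Two configurations that differ only in position, size, and orientation should give a distance of essentially zero; any remaining distance reflects a difference in shape. Studies with many specimens typically use a generalized version that fits all configurations to an iteratively updated mean, which this sketch does not attempt.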

18.
SUMMARY: We introduce a novel Matlab toolbox for microarray data analysis. This toolbox uses normalization based upon a normally distributed background and differential gene expression based on five statistical measures. The objects in this toolbox are open source and can be implemented to suit your application. AVAILABILITY: MDAT v1.0 is a Matlab toolbox and requires Matlab to run. MDAT is freely available at http://microarray.omrf.org/publications/2004/knowlton/MDAT.zip.

19.
ABSTRACT: BACKGROUND: Ongoing innovation in phylogenetics and evolutionary biology has been accompanied by a proliferation of software tools, data formats, analytical techniques and web servers. This brings with it the challenge of integrating phylogenetic and other related biological data found in a wide variety of formats, and underlines the need for reusable software that can read, manipulate and transform this information into the various forms required to build computational pipelines. RESULTS: We built a Python software library for working with phylogenetic data that is tightly integrated with Biopython, a broad-ranging toolkit for computational biology. Our library, Bio.Phylo, is highly interoperable with existing libraries, tools and standards, and is capable of parsing common file formats for phylogenetic trees, performing basic transformations and manipulations, attaching rich annotations, and visualizing trees. We unified the modules for working with the standard file formats Newick, NEXUS and phyloXML behind a consistent and simple API, providing a common set of functionality independent of the data source. CONCLUSIONS: Bio.Phylo meets a growing need in bioinformatics for working with heterogeneous types of phylogenetic data. By supporting interoperability with multiple file formats and leveraging existing Biopython features, this library simplifies the construction of phylogenetic workflows. We also provide examples of the benefits of building a community around a shared open-source project. Bio.Phylo is included with Biopython, available through the Biopython website, http://biopython.org.

20.
A software package, IndexToolkit, has been developed to overcome the disadvantage of FASTA-format databases for frequent searching; it uses an indexing strategy to substantially accelerate sequence queries. IndexToolkit includes user-friendly tools and an Application Programming Interface (API) to facilitate indexing, storage and retrieval of protein sequence databases. As open source software, it provides a framework for developing sequence-retrieval tools that is easily extensible for proteomic applications with high request rates, such as database searching or modification discovery. We applied IndexToolkit to the database search engine pFind to demonstrate its effect. Experimental studies show that IndexToolkit is able to support significantly faster searches of protein databases. AVAILABILITY: IndexToolkit is free to use under the open source GNU GPL license. The source code and the compiled binary can be freely accessed through the website http://pfind.jdl.ac.cn/IndexToolkit. More detailed information, including screenshots and documentation for users and developers, is also available on this website.
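The general idea of indexing a FASTA file can be sketched in a few lines of Python. This is not IndexToolkit's API or on-disk format, neither of which is described in the abstract. The functions `build_index`, `fetch_sequence`, and `demo` are hypothetical, and the sketch assumes ASCII content and a non-empty identifier after each `>`. One pass records the byte offset of every header, after which a single record can be read by seeking to it directly instead of scanning the whole file.

```python
import os
import tempfile


def build_index(fasta_path):
    """Map each sequence identifier to the byte offset of its header line."""
    index = {}
    with open(fasta_path, "rb") as handle:
        while True:
            offset = handle.tell()  # position before reading the next line
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                # The identifier is the first word after '>'.
                identifier = line[1:].split()[0].decode("ascii")
                index[identifier] = offset
    return index


def fetch_sequence(fasta_path, index, identifier):
    """Seek straight to one record and return its sequence as a string."""
    parts = []
    with open(fasta_path, "rb") as handle:
        handle.seek(index[identifier])
        handle.readline()  # skip the header line
        for line in handle:
            if line.startswith(b">"):
                break  # reached the next record
            parts.append(line.strip().decode("ascii"))
    return "".join(parts)


def demo():
    """Build an index over a tiny made-up FASTA file and fetch one record."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo.fasta")
        with open(path, "w", encoding="ascii", newline="\n") as out:
            out.write(">P1 first protein\nMKT\nAAV\n>P2 second\nGGS\n")
        index = build_index(path)
        return index, fetch_sequence(path, index, "P2")
```

The in-memory dictionary keeps the sketch short. A tool meant for repeated searches would presumably persist the index to disk so that it is built only once per database.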
