Similar literature
 20 similar documents found (search time: 46 ms)
1.
gmconvert is a platform‐independent program provided in GUI (for Apple OS X and Windows XP) and command‐line versions (for other platforms). gmconvert allows rapid reformatting of microsatellite data from output files produced by Applied Biosystems genemapper software (version 3.x). The program will re‐array data into three formats commonly used in downstream analysis: genepop, cervus, and gerud. gmconvert will greatly increase the speed of data preparation prior to analysis and aid in reducing transpositional errors associated with manual re‐arraying and reformatting steps. gmconvert is available from http://gallus.forestry.uga.edu/software/.
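
To make the re-arraying step concrete, here is a minimal Python sketch that writes diploid microsatellite genotypes in the widely used three-digit GENEPOP layout (title line, one locus name per line, a `Pop` line per population, then `sample , genotypes`). It is not gmconvert's code: the function name `to_genepop` and the in-memory input structure are assumptions made for illustration, and parsing of genemapper output is left out.

```python
def to_genepop(title, loci, populations):
    """Return GENEPOP-formatted text (three-digit allele codes).

    populations: list of populations; each population is a list of
    (sample_id, genotypes), where genotypes holds one (allele_a, allele_b)
    pair per locus. Alleles are integers 1-999; None marks a missing allele.
    This input layout is hypothetical, chosen only for the sketch.
    """
    def code(allele):
        # Missing alleles are written as 000 in the three-digit layout.
        if allele is None:
            return "000"
        if not 1 <= allele <= 999:
            raise ValueError(f"allele {allele!r} does not fit three digits")
        return f"{allele:03d}"

    lines = [title]
    lines.extend(loci)  # one locus name per line
    for population in populations:
        lines.append("Pop")
        for sample_id, genotypes in population:
            if len(genotypes) != len(loci):
                raise ValueError(f"{sample_id}: expected {len(loci)} genotypes")
            fields = " ".join(code(a) + code(b) for a, b in genotypes)
            lines.append(f"{sample_id} , {fields}")
    return "\n".join(lines) + "\n"
```

Generating the file programmatically rather than by hand is what removes the transposition errors the abstract mentions; the length check on each sample catches rows that lost or gained a locus.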

2.
convert is a user‐friendly, 32‐bit Windows program that facilitates ready transfer of codominant, diploid genotypic data amongst commonly used population genetic software packages. convert reads input files in its own ‘standard’ data format, easily produced from an excel file of diploid, codominant marker data, and can convert these to the input formats of the following programs: gda, genepop, arlequin, popgene, microsat, phylip, and structure. convert can also read input files in genepop format. In addition, convert can produce a summary table of allele frequencies in which private alleles and the sample sizes at each locus are indicated.
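
The allele-frequency summary with private alleles can be sketched in a few lines of Python. This is an illustration of the general calculation for a single locus, not convert's implementation; the function names and the `(population, (allele_a, allele_b))` input layout are assumptions.

```python
from collections import Counter, defaultdict


def allele_frequencies(genotypes):
    """Per-population allele frequencies at one locus.

    genotypes: iterable of (population, (allele_a, allele_b)); a missing
    allele is given as None and is skipped (hypothetical input layout).
    Returns {population: {allele: frequency}}.
    """
    counts = defaultdict(Counter)
    for population, pair in genotypes:
        for allele in pair:
            if allele is not None:
                counts[population][allele] += 1
    freqs = {}
    for population, counter in counts.items():
        total = sum(counter.values())  # number of scored allele copies
        freqs[population] = {a: n / total for a, n in counter.items()}
    return freqs


def private_alleles(freqs):
    """Map each allele observed in exactly one population to that population."""
    seen = defaultdict(set)
    for population, table in freqs.items():
        for allele in table:
            seen[allele].add(population)
    return {a: next(iter(pops)) for a, pops in seen.items() if len(pops) == 1}
```

Frequencies are computed over scored allele copies, so a population with missing data has a smaller denominator; that denominator corresponds to the per-locus sample size the abstract says is reported alongside the frequencies.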

3.
High-throughput genotyping chips have produced huge datasets for genome-wide association studies (GWAS) that have contributed greatly to discovering susceptibility genes for complex diseases. There are two strategies for performing data analysis for GWAS. One strategy is to use open-source or commercial packages that are designed for GWAS. The other is to take advantage of classic genetic programs with specific functions, such as linkage disequilibrium mapping, haplotype inference and transmission disequilibrium tests. However, most classic programs that are available are not suitable for analyzing chip data directly and require custom-made input, which results in the inconvenience of converting raw genotyping files into various data formats. We developed a powerful, user-friendly, lightweight program named SNPTransformer for GWAS that includes five major modules (Transformer, Operator, Previewer, Coder and Simulator). The toolkit not only works for transforming the genotyping files into ten input formats for use with classic genetics packages, but also carries out useful functions such as relational operations on IDs, previewing data files, recoding data formats and simulating marker files, among other functions. It bridges upstream raw genotyping data with downstream genetic programs, and can act as an in-hand toolkit for human geneticists, especially for non-programmers. SNPTransformer is freely available at http://snptransformer.sourceforge.net.

4.
5.
CDtool is a software package written to facilitate circular dichroism (CD) spectroscopic studies on both conventional lab-based instruments and synchrotron beamlines. It takes format-independent input data from any type of CD instrument, enables a wide range of standard and advanced processing methods, and, in a single user-friendly graphics-based package, takes raw data through the entire processing procedure and, importantly, uses data-mining techniques to retain in the final output all the information associated with the processing. It permits the facile comparison of data obtained from different instruments without the need for reformatting and displays it in graphical formats suitable for publication. It also includes the ability to automatically archive the processed data. This latter feature may be especially useful in light of recent funding institution directives with regard to data sharing and archiving and requirements for "good practice" and "traceability" within the pharmaceutical industry. In addition, CDtool includes a means of interfacing with protein data bank coordinate files and calculating secondary structures from them using alternate definitions and algorithms. This feature, along with a function that permits the facile production of new reference databases, enables the creation of specialized databases for secondary structural analyses of specific types of proteins. Thus the CDtool software not only enables rapid data processing and analyses but also includes many enhanced features not available in other CD data processing/analysis packages.

6.
Puah WC, Cheok LP, Biro M, Ng WT, Wasser M. BioTechniques 2011, 51(1): 49-50, 52-3
Automated microscopy enables in vivo studies in developmental biology over long periods of time. Time-lapse recordings in three or more dimensions to study the dynamics of developmental processes can produce huge data sets that extend into the terabyte range. However, depending on the available computational resources and software design, downstream processing of very large image data sets can become highly inefficient, if not impossible. To address the lack of available open source and commercial software tools to efficiently reorganize time-lapse data on a desktop computer with limited system resources, we developed TLM-Converter. The software either fragments oversized files or concatenates multiple files representing single time frames and saves the output files in open standard formats. Our application is undemanding on system resources as it does not require the whole data set to be loaded into the system memory. We tested our tool on time-lapse data sets of live Drosophila specimens recorded by laser scanning confocal microscopy. Image data reorganization dramatically enhances the productivity of time-lapse data processing and allows the use of downstream image analysis software that is unable to handle large data sets of ≥2 GB. In addition, saving the outputs in open standard image file formats enables data sharing between independently developed software tools.
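
The memory-saving idea, fragmenting an oversized file without ever loading it whole, can be illustrated with a generic byte-level splitter in Python. This is only a sketch of the streaming pattern under stated assumptions: TLM-Converter itself works on time frames and writes open-standard image formats, whereas the hypothetical `split_file` below cuts at arbitrary byte offsets and its `.partNNNN` naming is invented for the example.

```python
import os


def split_file(path, max_bytes, out_dir, buffer_size=1024 * 1024):
    """Split `path` into consecutive pieces of at most `max_bytes` bytes.

    At most `buffer_size` bytes are held in memory at any time.
    Returns the list of piece paths in order.
    """
    if max_bytes <= 0 or buffer_size <= 0:
        raise ValueError("max_bytes and buffer_size must be positive")
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(path)
    pieces = []
    with open(path, "rb") as src:
        index = 0
        while True:
            # Read the first buffer before creating a piece, so that no
            # empty trailing piece is written when the input is exhausted.
            chunk = src.read(min(buffer_size, max_bytes))
            if not chunk:
                break
            piece_path = os.path.join(out_dir, f"{base}.part{index:04d}")
            with open(piece_path, "wb") as dst:
                written = 0
                while chunk:
                    dst.write(chunk)
                    written += len(chunk)
                    if written >= max_bytes:
                        break
                    chunk = src.read(min(buffer_size, max_bytes - written))
            pieces.append(piece_path)
            index += 1
    return pieces
```

Concatenation, the reverse operation mentioned in the abstract, follows the same pattern: open one output file and copy each input through a fixed-size buffer.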

7.
One of the most tedious steps in genetic data analyses is reformatting data generated with one program for use with other applications. This conversion is necessary because comprehensive evaluation of the data may be based on different algorithms included in diverse software, each requiring a distinct input format. A platform‐independent and freely available program or a web‐based tool dedicated to such reformatting can save time and effort in data processing. Here, we report widgetcon, a website and a program which has been developed to quickly and easily convert among various molecular data formats commonly used in phylogenetic analysis, population genetics, and other fields. The web‐based service is available at https://www.widgetcon.net. The program and the website convert the major data formats in four basic steps in less than a minute. The resource will be a useful tool for the research community and can be updated to include more formats and features in the future.

8.
We have developed a software package named PEAS to facilitate analyses of large data sets of single nucleotide polymorphisms (SNPs) for population genetics and molecular phylogenetics studies. PEAS reads SNP data in various formats as input and is versatile in data formatting; using PEAS, it is easy to create input files for many popular packages, such as STRUCTURE, frappe, Arlequin, Haploview, LDhat, PLINK, EIGENSOFT, PHASE, fastPHASE, MEGA and PHYLIP. In addition, PEAS fills several analysis gaps in currently available computer programs in population genetics and molecular phylogenetics. Notably, (i) it calculates genetic distance matrices with bootstrapping for both individuals and populations from genome-wide high-density SNP data, and the output can be streamlined to MEGA and PHYLIP programs for further processing; (ii) it calculates genetic distances from STRUCTURE output and generates a MEGA file to reconstruct component trees; (iii) it provides tools to conduct haplotype sharing analysis for phylogenetic studies based on high-density SNP data. To our knowledge, these analyses are not available in any other computer program. PEAS for Windows is freely available for academic users from http://www.picb.ac.cn/~xushua/index.files/Download_PEAS.htm.

9.
Large-scale genome projects require the analysis of large amounts of raw data. This analysis often involves the application of a chain of biology-based programs. Many of these programs are difficult to operate because they are non-integrated, command-line driven, and platform-dependent. The problem is compounded when the number of data files involved is large, making navigation and status-tracking difficult. To demonstrate how this problem can be addressed, we have created a platform-independent Web front end that integrates a set of programs used in a genomic project analyzing gene function by transposon mutagenesis in Saccharomyces cerevisiae. In particular, these programs help define a large number of transposon insertion events within the yeast genome, identifying both the precise site of transposon insertion as well as potential open reading frames disrupted by this insertion event. Our Web interface facilitates this analysis by performing the following tasks. Firstly, it allows each of the analysis programs to be launched against multiple directories of data files. Secondly, it allows the user to view, download, and upload files generated by the programs. Thirdly, it indicates which sets of data directories have been processed by each program. Although designed specifically to aid in this project, our interface exemplifies a general approach by which independent software programs may be integrated into an efficient protocol for large-scale genomic data processing. Electronic Publication

10.
Given the growing amount of biological data, data mining methods have become an integral part of bioinformatics research. Unfortunately, standard data mining tools are often not sufficiently equipped for handling raw data such as amino acid sequences. One popular and freely available framework that contains many well-known data mining algorithms is the Waikato Environment for Knowledge Analysis (Weka). In the BioWeka project, we introduce various input formats for bioinformatics data and bioinformatics methods like alignments to Weka. This allows users to easily combine them with Weka's classification, clustering, validation and visualization facilities on a single platform and therefore reduces the overhead of converting data between different data formats as well as the need to write custom evaluation procedures that can deal with many different programs. We encourage users to participate in this project by adding their own components and data formats to BioWeka. Availability: The software, documentation and tutorial are available at http://www.bioweka.org.

11.
Mass spectrometry-based proteomics is increasingly being used in biomedical research. These experiments typically generate a large volume of highly complex data, and the volume and complexity are only increasing with time. There exist many software pipelines for analyzing these data (each typically with its own file formats), and as technology improves, these file formats change and new formats are developed. Files produced from these myriad software programs may accumulate on hard disks or tape drives over time, with older files being rendered progressively more obsolete and unusable with each successive technical advancement and data format change. Although initiatives exist to standardize the file formats used in proteomics, they do not address the core failings of a file-based data management system: (1) files are typically poorly annotated experimentally, (2) files are "organically" distributed across laboratory file systems in an ad hoc manner, (3) file formats become obsolete, and (4) searching the data and comparing and contrasting results across separate experiments is very inefficient (if possible at all). Here we present a relational database architecture and accompanying web application dubbed Mass Spectrometry Data Platform that is designed to address the failings of the file-based mass spectrometry data management approach. The database is designed such that the output of disparate software pipelines may be imported into a core set of unified tables, with these core tables being extended to support data generated by specific pipelines. Because the data are unified, they may be queried, viewed, and compared across multiple experiments using a common web interface. Mass Spectrometry Data Platform is open source and freely available at http://code.google.com/p/msdapl/.

12.
This is part two of an article that describes the properties of the image data files that are encountered routinely in digital light micrography. In the current part of the article, the differences between saving image data as large intact files and smaller files that have had some information removed, i.e., using lossy compression, are related first. Subsequently, appropriate ways of configuring computers to deal with the large intact image data files are suggested. The structures of the image data files used for recording dynamic sequences and kinematic animations of series of digital light micrographs, i.e., movie formats, are then described. Finally, some information is supplied about choosing file formats for compressing both static and dynamic image data sets.

13.
This is part two of an article that describes the properties of the image data files that are encountered routinely in digital light micrography. In the current part of the article, the differences between saving image data as large intact files and smaller files that have had some information removed, i.e., using lossy compression, are related first. Subsequently, appropriate ways of configuring computers to deal with the large intact image data files are suggested. The structures of the image data files used for recording dynamic sequences and kinematic animations of series of digital light micrographs, i.e., movie formats, are then described. Finally, some information is supplied about choosing file formats for compressing both static and dynamic image data sets.

14.
Metabolomics spectral formatting, alignment and conversion tools (MSFACTs)   Total citations: 13, self-citations: 0, citations by others: 13
MOTIVATION: The amplified interest in metabolic profiling has generated the need for additional tools to assist in the rapid analysis of complex data sets. RESULTS: A new program, metabolomics spectral formatting, alignment and conversion tools (MSFACTs), is described here for the automated import, reformatting, alignment, and export of large chromatographic data sets to allow more rapid visualization and interrogation of metabolomic data. MSFACTs incorporates two tools: one for the alignment of integrated chromatographic peak lists and another for extracting information from raw chromatographic ASCII formatted data files. MSFACTs is illustrated in the processing of GC/MS metabolomic data from different tissues of the model legume plant, Medicago truncatula. The results document that various tissues such as roots, stems, and leaves from the same plant can be easily differentiated based on metabolite profiles. Further, similar types of tissues within the same plant, such as the first to eleventh internodes of stems, could also be differentiated based on metabolite profiles. AVAILABILITY: Freely available upon request for academic and non-commercial use. Commercial use is available through licensing agreement http://www.noble.org/PlantBio/MS/MSFACTs/MSFACTs.html.

15.
Failing to open computer files that describe image data is not the most frustrating experience that the user of a computer can suffer, but it is high on the list of possible aggravations. To ameliorate this, the structure of uncompressed image data files is described here. The various ways in which information that describes a picture can be recorded are related, and a primary distinction between raster or bitmap based and vector or object based image data files is drawn. Bitmap based image data files are the more useful of the two formats for recording complicated images such as digital light micrographs, whereas object based files are better for recording illustrations and cartoons. Computer software for opening a very large variety of different formats of digital image data is recommended, and if this fails, ways are described for opening bitmap based digital image data files whose format is unknown.

16.
Failing to open computer files that describe image data is not the most frustrating experience that the user of a computer can suffer, but it is high on the list of possible aggravations. To ameliorate this, the structure of uncompressed image data files is described here. The various ways in which information that describes a picture can be recorded are related, and a primary distinction between raster or bitmap based and vector or object based image data files is drawn. Bitmap based image data files are the more useful of the two formats for recording complicated images such as digital light micrographs, whereas object based files are better for recording illustrations and cartoons. Computer software for opening a very large variety of different formats of digital image data is recommended, and if this fails, ways are described for opening bitmap based digital image data files whose format is unknown.

17.
Phylogenetic analyses today involve dealing with computer files in different formats and often several computer programs. Although some widely used applications have integrated important functionalities for such analyses, they still work with local resources only: input/output files (users have to manage them) and local computing (users sometimes have to leave their programs running on their desktop computers for extended periods of time). To address these problems we have developed 'Bosque', a multi-platform client-server software that performs standard phylogenetic tasks either locally or remotely on servers, and integrates the results on a local relational database. Bosque performs sequence alignments and graphical visualization and editing of trees, thus providing a powerful environment that integrates all the steps of phylogenetic analyses. AVAILABILITY: http://bosque.udec.cl

18.
Nmrglue, an open source Python package for working with multidimensional NMR data, is described. When used in combination with other Python scientific libraries, nmrglue provides a highly flexible and robust environment for spectral processing, analysis and visualization and includes a number of common utilities such as linear prediction, peak picking and lineshape fitting. The package also enables existing NMR software programs to be readily tied together, currently facilitating the reading, writing and conversion of data stored in Bruker, Agilent/Varian, NMRPipe, Sparky, SIMPSON, and Rowland NMR Toolkit file formats. In addition to standard applications, the versatility offered by nmrglue makes the package particularly suitable for tasks that include manipulating raw spectrometer data files, automated quantitative analysis of multidimensional NMR spectra with irregular lineshapes such as those frequently encountered in the context of biomacromolecular solid-state NMR, and rapid implementation and development of unconventional data processing methods such as covariance NMR and other non-Fourier approaches. Detailed documentation, install files and source code for nmrglue are freely available at http://nmrglue.com. The source code can be redistributed and modified under the New BSD license.

19.
Tandem mass spectrometry-based proteomics experiments produce large amounts of raw data, and different database search engines are needed to reliably identify all the proteins from this data. Here, we present Compid, an easy-to-use software tool that can be used to integrate and compare protein identification results from two search engines, Mascot and Paragon. Additionally, Compid enables extraction of information from large Mascot result files that cannot be opened via the Web interface and calculation of general statistical information about peptide and protein identifications in a data set. To demonstrate the usefulness of this tool, we used Compid to compare Mascot and Paragon database search results for a mitochondrial proteome sample of human keratinocytes. The reports generated by Compid can be exported and opened as Excel documents or as text files using configurable delimiters, allowing the analysis and further processing of Compid output with a multitude of programs. Compid is freely available and can be downloaded from http://users.utu.fi/lanatr/compid. It is released under an open source license (GPL), enabling modification of the source code. Its modular architecture allows for creation of supplementary software components, e.g. to enable support for additional input formats and report categories.

20.

Background  

Trace or chromatogram files (raw data) are produced by automatic nucleic acid sequencing equipment or sequencers. Each file contains information which can be interpreted by specialised software to reveal the sequence (base calling). This is done by the sequencer's proprietary software or by publicly available programs. Depending on the size of a sequencing project, the number of trace files can vary from just a few to thousands of files. Sequencing quality assessment on various criteria is important at the stage preceding clustering and contig assembly. Two major publicly available packages, Phred and Staden, are used by preAssemble to perform sequence quality processing.
