Similar Articles
20 similar articles found.
1.
Allan R Brasier 《BioTechniques》2002,32(1):100-2, 104, 106, 108-9
High-density oligonucleotide arrays are widely employed for detecting global changes in gene expression profiles of cells or tissues exposed to specific stimuli. Presented with large amounts of data, investigators can spend significant amounts of time analyzing and interpreting the array data. In our application of GeneChip arrays to analyze changes in gene expression in virus-infected epithelium, we have needed to develop additional computational tools that may be of use to other investigators using this methodology. Here, I describe two executable programs to facilitate data extraction and multiple data point analysis. These programs run in a virtual DOS environment on Microsoft Windows 95/98/2K operating systems on a desktop PC. Both programs can be freely downloaded from the BioTechniques Software Library (www.BioTechniques.com). The first program, Retriever, extracts primary data from an array experiment contained in an Affymetrix text file using user-supplied identification strings (e.g., the probe set identification numbers). With specific data retrieved for individual genes, hybridization profiles can be examined and the data normalized. The second program, CompareTable, facilitates comparison analysis of two experimental replicates. CompareTable compares two lists of genes, identifies common entries, extracts their data, and writes an output text file containing only those genes present in both experiments. The output files generated by these two programs can be opened and manipulated by any software application that recognizes tab-delimited text files (e.g., Microsoft NotePad or Excel).
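For illustration, here is a minimal Python sketch of the comparison step CompareTable performs (the original is a DOS executable; the file layout assumed here is two tab-delimited replicate files whose first column holds the probe set identifier):

```python
import csv

def compare_tables(file_a, file_b, out_file):
    """Keep only probe sets present in both replicate files.
    Assumed layout: first column = probe set ID, remaining columns = data."""
    with open(file_a, newline="") as fa:
        rows_a = {row[0]: row for row in csv.reader(fa, delimiter="\t") if row}
    with open(file_b, newline="") as fb:
        rows_b = {row[0]: row for row in csv.reader(fb, delimiter="\t") if row}

    common = sorted(rows_a.keys() & rows_b.keys())
    with open(out_file, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        for probe_id in common:
            # one line per gene: ID, replicate-1 data, replicate-2 data
            writer.writerow(rows_a[probe_id] + rows_b[probe_id][1:])

compare_tables("replicate1.txt", "replicate2.txt", "common_genes.txt")
```

The output remains tab-delimited text, so it can be opened directly in Excel or any similar program, as the abstract describes.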

2.
create is a Windows program for the creation of new and conversion of existing data input files for 52 genetic data analysis software programs. Programs are grouped into areas of sibship reconstruction, parentage assignment, genetic data analysis, and specialized applications. create is able to read in data from text, Microsoft Excel and Access sources and allows the user to specify columns containing individual and population identifiers, birth and death data, sex data, relationship information, and spatial location data. create's only constraints on source data are that one individual is contained in one row, and the genotypic data is contiguous. create is available for download at http://www.lsc.usgs.gov/CAFL/Ecology/Software.html.

3.
SUMMARY: Microarray data management and processing (MAD) is an integrated set of Windows software for microarray analysis. It consists of a relational database for data storage with multiple user interfaces for data manipulation, several text file parsers and Microsoft Excel macros for automation of data processing, and a generator that produces text files ready for cluster analysis. AVAILABILITY: The executable is available free of charge at http://pompous.swmed.edu. The source code is also available upon request.

4.
The collection of 4-color fluorescent genotyping data from capillary array electrophoresis microchip devices and its conversion to a format easily and rapidly analyzed by Genetic Profiler genotyping software is presented. Microchip fluorescence intensity data are acquired and stored as 4-color tab-delimited text. These files are converted to electrophoretic signal data (ESD) files using a utility program (TEXT-to-ESD) written in C. TEXT-to-ESD generates an ESD file by converting the text data to binary data and then appending a 632-byte ESD-file trailer. Up to 96 ESD files are then assembled into a run folder and imported into Genetic Profiler, where the data are reduced to 4-color electropherograms and analyzed. In this manner, DNA fragment sizing data acquired with our high-speed electrophoretic microchip devices can be rapidly analyzed using robust commercial software. Additionally, the conversion program allows Genetic Profiler to size data that have been preprocessed using other third-party software, such as BaseFinder.
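A rough Python sketch of this text-to-binary conversion idea follows. The original TEXT-to-ESD utility is written in C, and the actual ESD record layout and trailer contents are not given in the abstract, so the 16-bit packing and the zero-filled trailer below are assumptions for illustration only:

```python
import struct

TRAILER_SIZE = 632  # size of the ESD trailer appended after the binary data

def text_to_esd(text_path, esd_path):
    """Convert 4-color tab-delimited intensity data to a binary file and
    append a 632-byte trailer (contents assumed; zero-filled placeholder)."""
    with open(text_path) as src, open(esd_path, "wb") as dst:
        for line in src:
            fields = line.split("\t")
            if len(fields) < 4:
                continue  # skip malformed or header lines
            # pack the four channel intensities as unsigned 16-bit integers
            values = [int(float(v)) for v in fields[:4]]
            dst.write(struct.pack("<4H", *values))
        dst.write(bytes(TRAILER_SIZE))  # placeholder trailer

text_to_esd("lane_01.txt", "lane_01.esd")
```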

5.
Tandem mass spectrometry-based proteomics experiments produce large amounts of raw data, and different database search engines are needed to reliably identify all the proteins from these data. Here, we present Compid, an easy-to-use software tool that can be used to integrate and compare protein identification results from two search engines, Mascot and Paragon. Additionally, Compid enables extraction of information from large Mascot result files that cannot be opened via the Web interface and calculation of general statistical information about peptide and protein identifications in a data set. To demonstrate the usefulness of this tool, we used Compid to compare Mascot and Paragon database search results for a mitochondrial proteome sample of human keratinocytes. The reports generated by Compid can be exported and opened as Excel documents or as text files using configurable delimiters, allowing the analysis and further processing of Compid output with a multitude of programs. Compid is freely available and can be downloaded from http://users.utu.fi/lanatr/compid. It is released under an open source license (GPL), enabling modification of the source code. Its modular architecture allows for the creation of supplementary software components, e.g. to enable support for additional input formats and report categories.

6.
OBJECTIVE: VBA (Visual Basic for Applications) is the built-in control language of Microsoft Excel and can greatly extend Excel's data-processing capabilities. Using a simple example, this paper shows how VBA can be used to analyze large amounts of confocal line-scan image data automatically and to plot the results. METHODS AND RESULTS: The structure of the experimental data taken from confocal line-scan images and the processing requirements are described first, followed by detailed instructions for recording, modifying and running the macro (written in VBA). Macro code reads much like natural language and is easy to understand, and in most cases it can be generated automatically with the "Record Macro" function, which keeps the programming effort to a minimum. CONCLUSION: Compared with processing the data step by step by hand in Excel, using VBA in Excel takes less time, produces fewer errors and eliminates a great deal of tedious, repetitive work. This greatly improves the efficiency of data processing and lets researchers spend more time designing and refining the analysis itself, which matters most when the experimental data are large and complex. In this way, the useful information contained in the data can be extracted and displayed effectively and accurately.

7.
Data processing and analysis of proteomics data are challenging and time consuming. In this paper, we present MS Data Miner (MDM) (http://sourceforge.net/p/msdataminer), a freely available web-based software solution aimed at minimizing the time required for the analysis, validation, data comparison, and presentation of data files generated by MS software, including Mascot (Matrix Science), Mascot Distiller (Matrix Science), and ProteinPilot (AB Sciex). The program was developed to significantly decrease the time required to process large proteomic data sets for publication. This open-source system includes a spectra validation system and an automatic screenshot generation tool for Mascot-assigned spectra. In addition, a Gene Ontology term analysis function and a tool for generating comparative Excel data reports are included. We illustrate the benefits of MDM during a proteomics study comprising more than 200 LC-MS/MS analyses recorded on an AB Sciex TripleTOF 5600, identifying more than 3000 unique proteins and 3.5 million peptides.

8.
The vital impact of environmental pollution on economic, social and health dimensions is now widely recognized. Theoretical and implementation frameworks for the acquisition, modeling and analysis of environmental data, as well as tools to conceive and validate scenarios, are becoming increasingly important, and for these reasons different environmental simulation models have been developed. Researchers and stakeholders need efficient tools to store, display, compare and analyze the data produced by simulation models. One common way to manage simulation results is to use text files; however, text files make it difficult to explore the data. Spreadsheet tools (e.g., OpenOffice, MS Excel) can help to display and analyze model results, but they are not suitable for very large volumes of information. Recently, some studies have shown the feasibility of using Data Warehouse (DW) and On-Line Analytical Processing (OLAP) technologies to store model results and to facilitate their visualization, analysis and comparison. These technologies allow model users to easily produce graphical reports and charts. In this paper, we address the analysis of pesticide transfer simulation results by warehousing and OLAPing data produced by the MACRO simulation model, which simulates hydrological transfers of pesticides at the plot scale. We demonstrate how the simulation results can be managed using DW technologies and how the use of integrity constraints can improve OLAP analysis. These constraints are used to maintain the quality of the warehoused data and of the aggregations and queries, leading to better analysis, conclusions and decisions.

9.
The ability to build classification kernel models that categorize unknown data samples at large scale is extremely advantageous to the scientific community. Excel2SVM, a stand-alone Python mathematical analysis tool, bridges the gap between researchers and computer science with a simple graphical user interface that allows users to examine data and perform maximal margin classification. The ability to train support vector machines and classify unknown data files is harnessed in this fast and efficient software, giving researchers full access to this complicated, high-level algorithm. Excel2SVM converts data to the proper sparse format and supports a variety of kernel functions along with cost factors/modes, grids, cross-validation, and several other functions. The program works with any type of quantitative data, making Excel2SVM a versatile tool for analyzing a wide variety of input. The software is free and available at www.bioinformatics.org/excel2svm. A link to the software may also be found at www.kernel-machines.org. It provides a useful graphical user interface and has been shown to produce kernel models with accurate results and data classification through a decision boundary.
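As an illustration of the dense-to-sparse conversion step, here is a small Python sketch (Excel2SVM's exact implementation is not shown in the abstract; the target format is assumed to be the common LIBSVM sparse layout, `label index:value` with zero features omitted):

```python
def dense_to_sparse(label, features):
    """Convert one labeled sample to a LIBSVM-style sparse line.
    Zero-valued features are dropped; indices are 1-based."""
    parts = [str(label)]
    for i, value in enumerate(features, start=1):
        if value != 0:
            parts.append(f"{i}:{value}")
    return " ".join(parts)

# Example: rows exported from a spreadsheet as (label, feature vector)
rows = [(1, [0.0, 2.5, 0.0, 1.2]), (-1, [3.1, 0.0, 0.0, 0.0])]
with open("training.sparse", "w") as out:
    for label, feats in rows:
        out.write(dense_to_sparse(label, feats) + "\n")
# training.sparse now contains:
# 1 2:2.5 4:1.2
# -1 1:3.1
```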

10.
Sequence analysis and editing for bisulphite genomic sequencing projects
Bisulphite genomic sequencing is a widely used technique for detailed analysis of the methylation status of a region of DNA. It relies upon the selective deamination of unmethylated cytosine to uracil after treatment with sodium bisulphite, usually followed by PCR amplification of the chosen target region. Since this two-step procedure replaces all unmethylated cytosine bases with thymine, PCR products derived from unmethylated templates contain only three types of nucleotide, in unequal proportions. This can create a number of technical difficulties (e.g. for some base-calling methods) and impedes manual analysis of sequencing results (since the long runs of T or A residues are difficult to align visually with the parent sequence). To facilitate the detailed analysis of bisulphite PCR products (particularly using multiple cloned templates), we have developed a visually intuitive program that identifies the methylation status of CpG dinucleotides by analysis of raw sequence data files produced by MegaBACE or ABI sequencers as well as Staden SCF trace files and plain text files. The program then also collates and presents data derived from independent templates (e.g. separate clones). This results in a considerable reduction in the time required for completion of a detailed genomic methylation project.
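A minimal Python sketch of the underlying methylation call may help here. It is a simplification of what the program does from raw trace files; both sequences are assumed to be plain, already-aligned strings. At each CpG in the parent sequence, a retained C in the bisulphite-converted read indicates methylation, while a T indicates an unmethylated cytosine:

```python
def call_cpg_methylation(reference, bisulphite_read):
    """Classify each CpG in the reference as methylated ('C' retained),
    unmethylated (converted to 'T'), or ambiguous. Assumes the two
    sequences are the same length and already aligned."""
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2].upper() == "CG":
            base = bisulphite_read[i].upper()
            if base == "C":
                calls[i] = "methylated"
            elif base == "T":
                calls[i] = "unmethylated"
            else:
                calls[i] = "ambiguous"
    return calls

ref  = "ACGTTCGATCGA"
read = "ATGTTCGATTGA"   # CpG at 1 converted, at 5 retained, at 9 converted
print(call_cpg_methylation(ref, read))
# {1: 'unmethylated', 5: 'methylated', 9: 'unmethylated'}
```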

11.
SLControl is a computerized data acquisition and analysis system that was developed in our laboratory to help perform mechanical experiments using striated muscle preparations. It consists of a computer program (Windows 2000 or later) and a commercially available data acquisition board (16-bit resolution, DAP5216a, Microstar Laboratories, Bellevue, WA). Signals from the user's existing equipment representing force, fiber length (FL), and (if desired) sarcomere length (SL) are connected to the system through standard Bayonet Neill-Concelman (BNC) cables and saved to data files for later analysis. Output signals from the board control FL and trigger additional equipment, e.g., flash lamps. Windows dialogs drive several different experimental protocols, including slack tests and rate of tension recovery measurements. Precise measurements of muscle stiffness and force velocity/power characteristics can also be accomplished using SL and tension control, respectively. In these situations, the FL command signal is updated in real time (at rates ≥2.5 kHz) in response to changes in the measured SL or force signals. Data files can be exported as raw text or analyzed within SLControl with the use of built-in tools for cursor analysis, digital filtering, curve fitting, etc. The software is available for free download at http://www.slcontrol.com.

12.
SNPselector: a web tool for selecting SNPs for genetic association studies
SUMMARY: Single nucleotide polymorphisms (SNPs) are commonly used in association studies to find genes responsible for complex genetic diseases. With recent advances in SNP technology, researchers are able to assay thousands of SNPs in a single experiment, but the process of manually choosing thousands of genotyping SNPs for tens or hundreds of genes is time consuming. We have developed a web-based program, SNPselector, to automate the process. SNPselector takes a list of gene names or a list of genomic regions as input and searches the Ensembl genes or genomic regions for available SNPs. It prioritizes these SNPs by their tagging of linkage disequilibrium, allele frequency and source, function, regulatory potential and repeat status. SNPselector outputs its results as compressed Excel spreadsheet files for review by the user. AVAILABILITY: SNPselector is freely available at http://primer.duhs.duke.edu/

13.
In this paper, we present the package detrendeR, a graphical user interface that facilitates the visualization and analysis of dendrochronological data in the R computing environment. This package offers an easy way to perform most of the traditional tasks in dendrochronology: detrending, chronology building and graphical presentation of time series. The advantage of detrendeR over the program ARSTAN is the graphical interface, which gives the user easy access to the R language, rich in graphics and data-handling routines, with no need to type commands. detrendeR uses a simple and familiar dialog-box interface and can read Tucson decadal-format files (*.rwl and *.crn) as well as plain text files. In addition, detrendeR can test temporal changes in the common signal using moving intervals. detrendeR should make it easier to perform detrending and chronology building of tree-ring series while taking advantage of the R statistical programming environment.

14.
Here we present the Coon OMSSA Proteomic Analysis Software Suite (COMPASS): a free and open-source software pipeline for high-throughput analysis of proteomics data, designed around the Open Mass Spectrometry Search Algorithm. We detail a synergistic set of tools for protein database generation, spectral reduction, peptide false discovery rate analysis, peptide quantitation via isobaric labeling, protein parsimony and protein false discovery rate analysis, and protein quantitation. We strive for maximum ease of use, utilizing graphical user interfaces and working with data files in the original instrument vendor format. Results are stored in plain text comma-separated value files, which are easy to view and manipulate with a text editor or spreadsheet program. We illustrate the operation and efficacy of COMPASS through the use of two LC-MS/MS data sets. The first is a data set of a highly annotated mixture of standard proteins and manually validated contaminants that exhibits the identification workflow. The second is a data set of yeast peptides, labeled with isobaric stable isotope tags and mixed in known ratios, to demonstrate the quantitative workflow. For these two data sets, COMPASS performs equivalently or better than the current de facto standard, the Trans-Proteomic Pipeline.

15.
perm is a permutation program designed to detect statistical connections between grouping structures and grouping factors or correlates. Groups may be of various kinds, such as herds, flocks, schools and mating couples, provided they make up meaningful social units. Relatedness, population membership and genotypic contents are among several aggregating variables that may be processed. Typically, perm takes in a collection of grouped data and outputs a P value. The latter is computed under the null hypothesis of random membership among groups (H0). All files, including input, output and program, are of Excel type (.xls). perm can be downloaded free of charge at: http://www.bio.ulaval.ca/louisbernatchez/downloads.htm
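A bare-bones Python sketch of this kind of permutation test follows. perm itself works on Excel files, and its test statistic is not specified in the abstract, so the statistic used here (the variance of group means of an aggregating variable, tested against H0 of random membership among groups) is purely illustrative:

```python
import random
from statistics import mean, pvariance

def group_statistic(values, labels):
    """Illustrative statistic: variance of the group means."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    return pvariance([mean(members) for members in groups.values()])

def permutation_p_value(values, labels, n_perm=9999, seed=1):
    """P value = fraction of shuffled group assignments whose statistic
    is at least as extreme as the observed one (one-sided)."""
    rng = random.Random(seed)
    observed = group_statistic(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if group_statistic(values, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one correction

relatedness = [0.4, 0.5, 0.45, 0.1, 0.05, 0.12]
herd        = ["A", "A", "A", "B", "B", "B"]
print(permutation_p_value(relatedness, herd))
```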

16.
MOTIVATION: BLAST programs are very efficient at finding similarities between sequences. However, for large data sets such as ESTs, manual extraction of information from batch BLAST output is needed, which can be time consuming, inefficient and inaccurate. A parser application is therefore extremely useful for extracting information from BLAST outputs. RESULTS: We have developed a Java application, Batch Blast Extractor, with a user-friendly graphical interface to extract information from BLAST output. The application generates a tab-delimited text file that can be easily imported into any statistical package, such as Excel or SPSS, for further analysis. For each BLAST hit, the program obtains and saves the essential features from the BLAST output file that allow further analysis. The program was written in Java and is therefore OS independent. It works on both Windows and Linux with Java 1.4 or higher. It is freely available from: http://mcbc.usm.edu/BatchBlastExtractor/
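A much-reduced Python sketch of the same parsing idea is shown below. Batch Blast Extractor itself is a Java GUI; the regular expressions here assume the classic pairwise BLAST report layout (which varies between BLAST versions), and simply pull the query name and the first hit line for each query into tab-delimited text:

```python
import re, csv

def extract_top_hits(blast_report, out_tsv):
    """Write query name, top-hit description, score and E-value for each
    query in a batch BLAST text report to a tab-delimited file."""
    with open(blast_report) as fh:
        text = fh.read()
    rows = []
    # one block per query; layout assumed from the classic pairwise report
    for block in re.split(r"\nQuery= ", "\n" + text)[1:]:
        query = block.split()[0]
        match = re.search(
            r"Sequences producing significant alignments:.*?\n\n(.+)", block)
        if match:
            # split the first hit line into description, score, E-value
            hit_line = match.group(1).splitlines()[0].rsplit(None, 2)
            rows.append([query] + hit_line)
        else:
            rows.append([query, "no hits found", "", ""])
    with open(out_tsv, "w", newline="") as out:
        csv.writer(out, delimiter="\t").writerows(rows)

extract_top_hits("batch_blast.txt", "top_hits.tsv")
```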

17.
Pedro is a Java application that dynamically generates data entry forms for data models expressed in XML Schema, producing XML data files that validate against this schema. The software uses an intuitive tree-based navigation system, can supply context-sensitive help to users and features a sophisticated interface for populating data fields with terms from controlled vocabularies. The software also has the ability to import records from tab-delimited text files and features various validation routines. AVAILABILITY: The application, source code, example models from several domains and tutorials can be downloaded from http://pedro.man.ac.uk/.

18.
This paper describes the application of text compression methods to machine-readable files of nucleic acid and protein sequence data. Two main methods are used to reduce the storage requirements of such files, these being n-gram coding and run-length coding. A Pascal program combining both of these techniques resulted in a compression figure of 74.6% for the GenBank database, and a program that used only n-gram coding gave a compression figure of 42.8% for the Protein Identification Resource database. Received on November 29, 1985; accepted on February 24, 1986
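A small Python illustration of the two techniques is given below (the paper's actual Pascal coding tables are not reproduced here): n-gram coding packs each group of four nucleotides into a single byte (2 bits per base), and run-length coding replaces a run of identical characters with the character and a count.

```python
BASE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}

def ngram_encode(seq):
    """Pack 4 nucleotides per byte (2 bits each); sequence length is
    assumed to be a multiple of 4 and to contain only A, C, G, T."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASE_BITS[base]
        out.append(byte)
    return bytes(out)

def run_length_encode(text):
    """Replace runs of a repeated character with (character, run length)."""
    runs, i = [], 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1
        runs.append((text[i], j - i))
        i = j
    return runs

print(ngram_encode("ACGTACGT"))          # 2 bytes instead of 8 characters
print(run_length_encode("TTTTTTAAAAGC")) # [('T', 6), ('A', 4), ('G', 1), ('C', 1)]
```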

19.
Battye F 《Cytometry》2001,43(2):143-149
BACKGROUND: The obvious benefits of centralized data storage notwithstanding, the size of modern flow cytometry data files discourages their transmission over commonly used telephone modem connections. The proposed solution is to install at the central location a web servlet that can extract compact data arrays, in a form dependent on the requested display type, from the stored files and transmit them to a remote client program for display. METHODS: A client program and a web servlet, both written in the Java programming language, were designed to communicate over standard network connections. The client program creates familiar numerical and graphical display types and allows the creation of gates from combinations of user-defined regions. Data compression techniques further reduce transmission times for data arrays that are already much smaller than the data file itself. RESULTS: For typical data files, network transmission times were reduced more than 700-fold for extraction of one-dimensional (1-D) histograms, between 18- and 120-fold for 2-D histograms, and 6-fold for color-coded dot plots. Numerous display formats are possible without further access to the data file. CONCLUSIONS: This scheme enables telephone modem access to centrally stored data without restricting the flexibility of display format or preventing comparisons with locally stored files.

20.
MapDraw, an Excel macro for drawing genetic linkage maps
刘仁虎  孟金陵 《遗传》2003,25(3):317-321
MAPMAKER is one of the most widely used software packages for analyzing genetic linkage data, but its widely used DOS version cannot draw linkage maps, which makes linkage mapping considerably more cumbersome. To solve this problem, we used the popular data-processing program Microsoft Excel as a platform and wrote an Excel macro, MapDraw, that makes drawing genetic linkage maps straightforward. Abstract: MAPMAKER is one of the most widely used computer software packages for constructing genetic linkage maps. However, the PC version, MAPMAKER 3.0 for PC, cannot draw the genetic linkage maps that its Macintosh version, MAPMAKER 3.0 for Macintosh, can. In recent years the Macintosh has become much less common than the PC, and most geneticists analyze their genetic linkage data on a PC, so software that draws on the PC the same genetic linkage maps that MAPMAKER for Macintosh draws has been badly needed. Microsoft Excel, one component of the Microsoft Office package, is among the most popular software for laboratory data processing, and Microsoft Visual Basic for Applications (VBA) is one of its most powerful features. Using this programming language, we can take creative control of Excel, including genetic linkage map construction, automatic data processing and more. In this paper, a Microsoft Excel macro called MapDraw is presented that draws genetic linkage maps on a PC from given genetic linkage data. With this software you can construct a genetic linkage map in Excel and freely edit it or copy it into Word or other applications. The software is simply an Excel-format file. It can be copied freely from ftp://211.69.140.177 or ftp://brassica.hzau.edu.cn, and the source code can be found in Excel's Visual Basic Editor.
