Similar Articles
20 similar articles retrieved (search time: 31 ms)
1.

Background  

DNA microarrays have become the standard method for large-scale analyses of gene expression and epigenomics. The increasing complexity and inherent noisiness of the generated data make visual data exploration ever more important. Fast deployment of new methods, as well as a combination of predefined, easy-to-apply methods with programmer's access to the data, are important requirements for any analysis framework. Mayday is an open source platform with an emphasis on visual data exploration and analysis. Many built-in methods for clustering, machine learning and classification are provided for dissecting complex datasets. Plugins can easily be written to extend Mayday's functionality in a large number of ways. As a Java program, Mayday is platform-independent and can be used as a Java WebStart application without any installation. Mayday can import data from several file formats, and database connectivity is included for efficient data organization. Numerous interactive visualization tools, including box plots, profile plots, principal component plots and a heatmap, are available; they can be enhanced with metadata and exported as publication-quality vector files.

2.
A user-friendly Hypercard interface for human linkage analysis
The availability of a large number of highly informative genetic markers has made human linkage analysis faster and easier to perform. However, current linkage analysis software does not provide an organizational database in which a large body of linkage data can be easily stored and manipulated. Manual entry and editing of linkage data is often time consuming and prone to typing errors. In addition, the large number of alleles in many of these markers must be reduced in order to perform linkage analysis with multiple loci across large genetic distances. This reduction in allele number is often difficult and confusing, especially in large pedigrees. We have taken advantage of the Macintosh-based Hypercard program to develop an interface with which linkage data can be easily stored, retrieved and edited. For each family, the components of the pedigree, including ID numbers, sex and affection status, need only be entered once. The program (Linkage Interface) retrieves this information each time the data from a new polymorphic marker is entered. Linkage Interface has flexible editing capabilities that allow the user to change any portion of the pedigree, including the addition or deletion of family members, without affecting previously entered genotype data. Linkage Interface can also analyze both the pedigree and marker data and will detect any inconsistencies in inheritance patterns. In addition, the program can reduce the number of alleles for a polymorphic marker. Linkage Interface will then compare the 'reduced' data to the original marker data and assists in maintaining all informative meioses by pointing out which meioses have become non-informative. Once polymorphic marker data are entered, the pedigree data, including the marker genotypes, are easily exported to a text file. This text file can be transferred to an IBM-compatible computer for direct use with DOS-based linkage programs.
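The inheritance-consistency check described above is easy to state: at a single autosomal marker, a child's genotype is Mendelian-consistent only if one allele can have come from each parent. A minimal sketch in Python (the original is a HyperCard stack), using unordered allele pairs; real pedigree checking must also handle missing genotypes and sex-linked markers.

```python
# Minimal Mendelian consistency check for one autosomal marker.
# Genotypes are unordered pairs of allele labels; this sketch ignores
# missing data, which a real linkage tool must also handle.
def mendelian_consistent(child, mother, father):
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)

print(mendelian_consistent((1, 3), (1, 2), (3, 4)))  # True: 1 from mother, 3 from father
print(mendelian_consistent((2, 2), (1, 3), (3, 4)))  # False: neither parent carries allele 2
```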

3.
A computer program has been designed to aid the development of synthetic strategies for oligonucleotides produced by solid-phase chemical techniques. The program reduces the time required to develop a strategy and a data file from hours to minutes. The program maintains inventories, provides cost analyses, and generates and stores other associated data. To avoid duplicate synthesis, the program searches an inventory of sequences for the requested sequence. If the sequence is not in the inventory, the program devises a synthetic strategy and calculates the amounts of reagents and the labor costs necessary to complete the synthetic oligonucleotide. The program also deducts the reagents from inventory files. Physical data are also calculated. A file is generated in a sequence inventory for storage of these data, as well as other data that will be generated during the purification processes. All variable parameters can be easily edited. The programs were designed to provide a cross-referencing feature for data analysis and can use several parameters as constants.

4.
Microcomputer programs for DNA sequence analysis.
Computer programs are described which allow (a) analysis of DNA sequences to be performed on a laboratory microcomputer or (b) transfer of DNA sequences between a laboratory microcomputer and another computer system, such as a DNA library. The sequence analysis programs are interactive, do not require prior experience with computers and in many other respects resemble programs which have been written for larger computer systems (1-7). The user enters sequence data into a text file, accesses this file with the programs, and is then able to (a) search for restriction enzyme sites or other specified sequences, (b) translate in one or more reading frames in one or both directions in order to find open reading frames, or (c) determine codon usage in the sequence in one or more given reading frames. The results are given in table format and a restriction map is generated. The modem program permits collection of large amounts of data from a sequence library into a permanent file on the microcomputer disc system, or transfer of laboratory data in the reverse direction to a remote computer system.
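Two of the analyses listed, translation of reading frames and codon-usage counting, can be sketched compactly. The snippet below is an illustration in Python rather than the original microcomputer code, and its codon table is abbreviated to the codons used in the demo sequence; a real tool would carry the full 64-entry table.

```python
# Sketch of two of the analyses described: translate a DNA sequence in a
# chosen reading frame and tally codon usage. The codon table is a small
# illustrative subset of the standard genetic code.
from collections import Counter

CODON_TABLE = {
    "ATG": "M", "TTT": "F", "AAA": "K", "TAA": "*", "GGC": "G",
}

def codons(seq, frame=0):
    """Split seq into codons starting at the given frame offset (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]

def translate(seq, frame=0):
    """Translate one frame; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(c, "X") for c in codons(seq, frame))

def codon_usage(seq, frame=0):
    return Counter(codons(seq, frame))

seq = "ATGTTTAAAGGCTAA"
print(translate(seq))           # MFKG*
print(codon_usage(seq)["AAA"])  # 1
```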

5.
Battye F. Cytometry 2001;43(2):143-149
BACKGROUND: The obvious benefits of centralized data storage notwithstanding, the size of modern flow cytometry data files discourages their transmission over commonly used telephone modem connections. The proposed solution is to install at the central location a web servlet that can extract compact data arrays, of a form dependent on the requested display type, from the stored files and transmit them to a remote client computer program for display. METHODS: A client program and a web servlet, both written in the Java programming language, were designed to communicate over standard network connections. The client program creates familiar numerical and graphical display types and allows the creation of gates from combinations of user-defined regions. Data compression techniques further reduce transmission times for data arrays that are already much smaller than the data file itself. RESULTS: For typical data files, network transmission times were reduced more than 700-fold for extraction of one-dimensional (1-D) histograms, between 18- and 120-fold for 2-D histograms, and 6-fold for color-coded dot plots. Numerous display formats are possible without further access to the data file. CONCLUSIONS: This scheme enables telephone modem access to centrally stored data without restricting flexibility of display format or preventing comparisons with locally stored files.
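The size reduction reported for 1-D histograms follows directly from what is transmitted: a fixed array of bin counts instead of one value per event. A minimal sketch of that server-side extraction, with an illustrative bin count and measurement range rather than values from the paper:

```python
# Server-side extraction sketch: replace a (potentially huge) per-event
# list with a fixed-size array of bin counts. Bin count and range are
# illustrative demo choices.
def histogram_1d(values, n_bins=256, lo=0.0, hi=1024.0):
    """Bin per-event measurements into a fixed array of counts."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            counts[int((v - lo) / width)] += 1
    return counts

events = [100.0, 100.5, 500.0, 1023.9]   # stand-in for millions of real events
h = histogram_1d(events)
print(sum(h))   # 4: every in-range event lands in exactly one bin
print(h[25])    # 2: both ~100 values fall in bin 25 (100 / 4 = 25)
```

Whatever the number of events, the transmitted array stays 256 integers long, which is where the several-hundred-fold reduction comes from.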

6.
Synthetic oligonucleotides have proven to be extremely useful probes for screening cDNA and genomic libraries. Selection of the appropriate probe can be made more easily and accurately with the computer program PROBFIND. The user enters the amino acid sequence from a file or from the keyboard, then selects the minimum length allowed for the probe and the maximum allowable degeneracy. The program prints, to the screen and to a file, a list of the sequences of potential probes that meet these specifications, together with the location of the corresponding sequence in the protein. The user may modify the specifications for length and degeneracy at any time during the output of data, which allows for rapid selection of the desired probe. The program is interactive, accepts any file format with only a single modification of the file, is written in BASIC, and requires less than 6 kbytes of memory. This makes the program easy to use and adaptable even to unsophisticated microcomputers.
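The degeneracy constraint at the heart of such probe selection is easy to state: the degeneracy of a back-translated probe is the product of the number of codons encoding each residue. The sketch below (Python rather than the original BASIC) scans a protein for windows under a degeneracy limit; the per-residue codon counts follow the standard genetic code, and the function names are illustrative.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "M": 1, "W": 1, "F": 2, "K": 2, "D": 2, "E": 2, "C": 2, "N": 2,
    "Q": 2, "H": 2, "Y": 2, "I": 3, "A": 4, "G": 4, "P": 4, "T": 4,
    "V": 4, "L": 6, "R": 6, "S": 6,
}

def degeneracy(peptide):
    """Degeneracy of a fully degenerate probe back-translated from peptide."""
    d = 1
    for aa in peptide:
        d *= CODON_COUNTS[aa]
    return d

def best_windows(protein, length, max_degeneracy):
    """All (position, peptide, degeneracy) windows under the limit, best first."""
    hits = []
    for i in range(len(protein) - length + 1):
        pep = protein[i:i + length]
        d = degeneracy(pep)
        if d <= max_degeneracy:
            hits.append((i, pep, d))
    return sorted(hits, key=lambda h: h[2])

print(degeneracy("MW"))                 # 1
print(degeneracy("MKF"))                # 4
print(best_windows("MWMKF", 3, 8))      # [(0, 'MWM', 1), (1, 'WMK', 2), (2, 'MKF', 4)]
```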

7.
Eriksson J, Fenyö D. Proteomics 2002;2(3):262-270
A rapid and accurate method is presented for testing the significance of protein identities determined by mass spectrometric analysis of protein digests and genome database searching. The method is based on direct computation using a statistical model of the random matching of measured and theoretical proteolytic peptide masses. Protein identification algorithms typically rank the proteins of a genome database according to a score based on the number of matches between the masses obtained by mass spectrometry analysis and the theoretical proteolytic peptide masses of a database protein. The random matching of experimental and theoretical masses can cause false results. A result is significant only if its score deviates significantly from the score expected from a false result. A distribution of the score (number of matches) for random (false) results is computed directly from our model of the random matching, which allows significance testing under any experimental and database search constraints. In order to mimic protein identification data quality in large-scale proteome projects, low-to-high quality proteolytic peptide mass data were generated in silico and subsequently submitted to a database search program designed to include significance testing based on direct computation. This simulation demonstrates the usefulness of direct significance testing for automatically screening for samples that must be subjected to peptide sequence analysis, e.g. by tandem mass spectrometry, in order to determine the protein identity.
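The logic of the significance test can be illustrated with a small Monte Carlo stand-in: build the null distribution of match counts from purely random masses, then locate the observed score in it. The paper computes this distribution analytically from a statistical model; the simulation below, with illustrative mass ranges, tolerances and peptide counts, only demonstrates the idea.

```python
# Monte Carlo illustration of significance testing for peptide-mass matching:
# how many matches does a purely random spectrum produce against the
# theoretical masses of one database protein?
import random

random.seed(0)

def random_match_count(theoretical, n_measured, tol, lo=800.0, hi=3000.0):
    """Count how many of n random 'measured' masses match any theoretical mass."""
    hits = 0
    for _ in range(n_measured):
        m = random.uniform(lo, hi)
        if any(abs(m - t) <= tol for t in theoretical):
            hits += 1
    return hits

# One fake database protein with 30 theoretical peptide masses.
theoretical = [random.uniform(800.0, 3000.0) for _ in range(30)]
# Null distribution of the score for random 20-peak spectra.
null = [random_match_count(theoretical, 20, 0.2) for _ in range(2000)]
observed = 12  # hypothetical observed number of matches
p_value = sum(1 for s in null if s >= observed) / len(null)
print(p_value < 0.05)  # True: 12 matches is far beyond what random matching yields
```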

8.
A program (PREDITOP) for predicting the location of antigenic regions (or epitopes) on proteins is described. This program and its associated utilities are written in Turbo Pascal and run on IBM-PC compatibles. The program contains 22 normalized scales, corresponding to hydrophilicity, accessibility, flexibility, or secondary structure propensities; new scales are easily implemented. A hydrophobic moment procedure has also been implemented in order to identify amphiphilic helices. The program generates a result file whose values represent a particular physicochemical property of the studied protein. PREDITOP can display one or several result files by simple graphical superimposition. Curves can be combined with the ADDITIO or MULTIPLI routines, which create a new result file by adding or multiplying previously calculated files representing several propensities. The program is useful and efficient for identifying potential antigenic regions in a protein, with the aim of raising antibodies against synthesized peptides that cross-react with the native protein.
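The profile-and-combine workflow (one result file per propensity scale, then ADDITIO/MULTIPLI-style combination) reduces to a few lines. The two mini-scales below are invented demo values, not the program's 22 normalized scales, and the implementation is a Python sketch rather than the Turbo Pascal original.

```python
# Sketch of the PREDITOP-style workflow: windowed propensity profiles per
# scale, combined pointwise by addition or multiplication. Scale values
# are assumed demo numbers.
import math

HYDROPHILICITY = {"A": 0.3, "D": 0.9, "K": 0.9, "L": 0.1}  # assumed values
FLEXIBILITY    = {"A": 0.4, "D": 0.6, "K": 0.7, "L": 0.3}  # assumed values

def profile(seq, scale, window=3):
    """Windowed mean of a per-residue propensity scale along the sequence."""
    return [
        sum(scale[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]

def combine(profiles, mode="add"):
    """Pointwise combination of result profiles, as ADDITIO/MULTIPLI do."""
    op = sum if mode == "add" else math.prod
    return [op(vals) for vals in zip(*profiles)]

hyd = profile("ADKL", HYDROPHILICITY)
flex = profile("ADKL", FLEXIBILITY)
print([round(v, 2) for v in combine([hyd, flex], "add")])  # [1.27, 1.17]
```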

9.
A BASIC program has been devised for the hydropathic analysis of protein sequences according to the method of Kyte and Doolittle (1982). The program uses sequence data from input files that are created with a word processor and produces two types of output file: one contains a bar graph of the hydropathic profile in a format that can be easily edited; the other is a tabulation of hydropathic indices along a protein's sequence that can be used as input by the program for the production of a bar graph, or as input into other graphics and analysis software. An MS-DOS microcomputer, operating under IBM BASICA or GWBASIC, and a dot matrix printer with block graphics capabilities are the only hardware requirements for graphic display of hydropathy profiles. The program is capable of unattended analysis from a list of up to 15 input files. (Accepted March 10, 1986)
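The underlying computation is the classic Kyte-Doolittle sliding-window average, which is short enough to restate here in Python (the original is BASIC); the scale values are the published 1982 hydropathy indices.

```python
# Kyte-Doolittle (1982) hydropathy indices.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=9):
    """Mean Kyte-Doolittle index over each window along the sequence."""
    return [
        sum(KD[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]

profile = hydropathy_profile("IIIIIIIIIDDDDDDDDD", window=9)
print(round(profile[0], 2))   # 4.5  (all-Ile window, strongly hydrophobic)
print(round(profile[-1], 2))  # -3.5 (all-Asp window, strongly hydrophilic)
```

A window of 9 residues is the conventional choice for surface regions; 19-21 is typical when scanning for transmembrane segments.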

10.
Generation of subject-specific finite element (FE) models from computed tomography (CT) datasets is important for applying FE analysis to bone structures. A great remaining challenge is the automatic assignment of bone material properties from CT Hounsfield Units to finite element models. This paper proposes a new assignment approach in which material properties are directly assigned to each integration point. Instead of modifying the dataset of the FE model, the proposed approach divides the assignment procedure into two steps: generating the data file of the image intensity of a bone in a MATLAB program, and reading the file into ABAQUS via user subroutines. Its accuracy has been validated by assigning the density of a bone phantom to a FE model. The proposed approach has been applied to the FE model of a sheep tibia and its applicability tested on a variety of element types. The proposed assignment approach is simple and illustrative, and it can be easily modified to fit users' situations.
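The two-step mapping such an approach automates (Hounsfield Units to apparent density via a phantom calibration, then density to Young's modulus via a power law) can be sketched as below. All calibration constants and the power-law coefficients are illustrative literature-style values, not those of the paper, and the functions stand in for what the MATLAB script and ABAQUS user subroutine would compute per integration point.

```python
# Sketch of per-integration-point material mapping: HU -> density -> modulus.
# Calibration slope/intercept and the E = a * rho**b relation are assumptions.
def hu_to_density(hu, slope=0.0008, intercept=0.1):
    """Apparent density [g/cm^3] from a linear phantom calibration (assumed)."""
    return slope * hu + intercept

def density_to_modulus(rho, a=6850.0, b=1.49):
    """Young's modulus [MPa] from a power-law density relation (assumed)."""
    return a * rho ** b

def assign_properties(hu_at_points):
    """Per-integration-point modulus, as a user subroutine would compute it."""
    return [density_to_modulus(hu_to_density(hu)) for hu in hu_at_points]

moduli = assign_properties([200.0, 800.0, 1500.0])
print([round(e) for e in moduli])  # stiffness rises monotonically with HU
```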

11.
MOTIVATION: BLAST programs are very efficient at finding similarities between sequences. However, for large datasets such as ESTs, manual extraction of information from batch BLAST output is needed. This can be time consuming, inefficient, and inaccurate. A parser application is therefore extremely useful for extracting information from BLAST outputs. RESULTS: We have developed a Java application, Batch Blast Extractor, with a user-friendly graphical interface to extract information from BLAST output. The application generates a tab-delimited text file that can be easily imported into any statistical package, such as Excel or SPSS, for further analysis. For each BLAST hit, the program obtains and saves the essential features from the BLAST output file that allow further analysis. The program was written in Java and is therefore OS-independent. It works on both Windows and Linux with Java 1.4 or higher. It is freely available from: http://mcbc.usm.edu/BatchBlastExtractor/
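The extraction task such a parser performs can be sketched for the classic NCBI BLAST text report: find each query, then capture description, score and E-value from the one-line hit summaries, and emit one tab-delimited row per hit. The report snippet and regular expressions below cover only this simplified layout and are illustrative, not the tool's actual implementation.

```python
# Toy parser for a simplified classic BLAST text report: emits one
# tab-delimited (query, description, score, e-value) row per hit line.
import re

REPORT = """\
Query= contig_001
Sequences producing significant alignments:            Score     E
gi|12345| putative kinase                                 184    2e-47
"""

def parse_report(text):
    rows = []
    query = None
    in_hits = False
    for line in text.splitlines():
        m = re.match(r"Query=\s*(\S+)", line)
        if m:
            query = m.group(1)
            continue
        if line.startswith("Sequences producing"):
            in_hits = True
            continue
        m = re.match(r"(\S.*\S)\s+(\d+)\s+(\S+)$", line) if in_hits else None
        if m and query:
            rows.append((query, m.group(1), int(m.group(2)), m.group(3)))
    return rows

for row in parse_report(REPORT):
    print("\t".join(map(str, row)))
```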

12.
Parallel file systems have been developed in recent years to ease the I/O bottleneck of high-end computing systems. These advanced file systems offer several data layout strategies in order to meet the performance goals of specific I/O workloads. However, while a layout policy may perform well on one I/O workload, it may not perform as well on another. Peak I/O performance is rarely achieved, because data access patterns are complex and application dependent. In this study, a cost-intelligent data access strategy based on the application-specific optimization principle is proposed to improve the I/O performance of parallel file systems. We first present examples to illustrate the performance differences among data layouts. We then develop a cost model that estimates the completion time of data accesses under various data layouts, so that the layout can be matched to the application. Static layout optimization can be used for applications with dominant data access patterns, and dynamic layout selection with hybrid replications for applications with complex I/O patterns. Theoretical analysis and experimental testing have been conducted to verify the proposed cost-intelligent layout approach. Analytical and experimental results show that the proposed cost model is effective and that the application-specific data layout approach can provide up to a 74% performance improvement for data-intensive applications.
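The cost-model idea can be made concrete with a toy version: estimate the completion time of a request under each candidate layout and choose the cheapest. The three layouts (fully striped, single-server, subgroup-striped) and the startup/transfer parameters below are simplified illustrations, not the paper's model.

```python
# Toy cost model for layout selection: completion time = startup cost of the
# servers involved + transfer time, with the transfer parallelized across
# however many servers the layout uses. All parameters are illustrative.
def cost(layout, request_bytes, n_servers=4, alpha=1e-3, beta=1e-8):
    """alpha: per-server startup latency [s]; beta: per-byte transfer time [s]."""
    if layout == "1-DH":          # striped over all servers
        return alpha * n_servers + beta * request_bytes / n_servers
    if layout == "1-DV":          # whole request on one server
        return alpha + beta * request_bytes
    if layout == "2-D":           # striped over a subgroup of servers
        g = max(1, n_servers // 2)
        return alpha * g + beta * request_bytes / g
    raise ValueError(layout)

def choose_layout(request_bytes):
    return min(("1-DH", "1-DV", "2-D"), key=lambda l: cost(l, request_bytes))

print(choose_layout(4 * 1024))           # 1-DV: for small requests startup dominates
print(choose_layout(512 * 1024 * 1024))  # 1-DH: for large requests parallelism wins
```

The crossover between the two answers is exactly the kind of application dependence the cost model is meant to capture.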

13.
A computer program, "SeinFit," was created to determine the Seinhorst equation that best fits experimental data on the relationship between preplant nematode densities and plant growth. Data, which can be entered manually or imported from a text file, are displayed in a data window while the corresponding graph is shown in a graph window. Various options are available to manipulate the data and the graph settings. The best-fitting Seinhorst equation can be calculated by two methods, both based on evaluation of the residual sum of squares. Depending on the method, a range of values can be chosen for different parameters of the Seinhorst equation, as well as the number of steps in each range. Data, graphs, and values of the parameters of the Seinhorst equation can be printed. The program allows quick calculation of the damage threshold density, one of the parameters of the Seinhorst model. Versions written for Macintosh or DOS-compatible machines are currently available through the Society of Nematologists' World Wide Web site (http://ianrwww.unl.edu/ianr/plntpath/nematode/SOFTWARE/nemasoft.htm).
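The fitting strategy described (choose a range and a number of steps for each parameter, then minimize the residual sum of squares) can be sketched with the standard Seinhorst yield model, y = m + (1 - m) * z**(P - T) for preplant densities P above the tolerance limit T, and y = 1 otherwise. The ranges, step counts and synthetic data below are arbitrary demo choices.

```python
# Grid-search fit of the Seinhorst yield model by residual sum of squares,
# mirroring the "range of values + number of steps per range" strategy.
def seinhorst(P, m, T, z):
    """Relative yield at preplant density P (m: minimum yield, T: tolerance limit)."""
    return 1.0 if P <= T else m + (1.0 - m) * z ** (P - T)

def rss(data, m, T, z):
    return sum((y - seinhorst(P, m, T, z)) ** 2 for P, y in data)

def grid_fit(data, m_range, T_range, z_range, steps=20):
    def grid(lo, hi):
        return [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    candidates = (
        (m, T, z)
        for m in grid(*m_range) for T in grid(*T_range) for z in grid(*z_range)
    )
    return min(candidates, key=lambda p: rss(data, *p))

# Synthetic yields generated from m=0.2, T=2, z=0.99, then fitted back.
true = (0.2, 2.0, 0.99)
data = [(P, seinhorst(P, *true)) for P in (0, 1, 2, 4, 8, 16, 32, 64, 128, 256)]
m, T, z = grid_fit(data, (0.0, 0.5), (0.0, 5.0), (0.9, 1.0))
print(round(m, 2), round(T, 2), round(z, 3))
```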

14.
Nonstructured line scales (NLS) are widely used in sensory and consumer research, normally generating a large amount of data that must be entered into computers for statistical analysis. This process can be greatly accelerated with special hardware and software; available systems are efficient but costly. To reduce this cost, a standard mouse was modified for use as a measuring instrument, and a simple QBASIC program was developed to write the measured data to an ASCII file. The cost of the modified mouse was $60, and data input was 5 times faster than measuring distances with a ruler. Experiments designed to test the mouse showed that measurement errors were small.

15.
We describe a program (and a website) to reformat ClustalX/ClustalW outputs to a format widely used for presenting sequence alignment data in SNP analysis and molecular systematics studies. This program, CLOURE (CLustal OUtput REformatter), takes the multiple sequence alignment file (nucleic acid or protein) generated by Clustal as its input. The CLOURE-D format presents the Clustal alignment in a form that highlights only the nucleotides/residues that differ from the first (query) sequence. The program has been written in Visual Basic and runs on the Windows platform. The downloadable program, as well as a web-based server that has also been developed, can be accessed at http://imtech.res.in/~anand/cloure.html.
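The core transformation is simple to restate: keep the first sequence as the reference and replace every identical residue in the other rows with a dot, so only the differences remain visible. The sketch below works on a toy pre-parsed alignment rather than a real Clustal output file, and is an illustration, not CLOURE's Visual Basic code.

```python
# Difference-highlighting step of a Clustal reformatter: residues identical
# to the reference (first) sequence become dots; mismatches and gaps remain.
def highlight_differences(names, rows):
    ref = rows[0]
    out = [(names[0], ref)]
    for name, row in zip(names[1:], rows[1:]):
        masked = "".join(
            "." if a == b and a != "-" else b for a, b in zip(ref, row)
        )
        out.append((name, masked))
    return out

alignment = ["ACGTACGT", "ACGTACGA", "ACCTAC-T"]
for name, row in highlight_differences(["ref", "s1", "s2"], alignment):
    print(f"{name:>3}  {row}")
# ref  ACGTACGT
#  s1  .......A
#  s2  ..C...-.
```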

16.
After gas chromatography-mass spectrometry (GC-MS) analysis, data processing, including retention time correction, spectral deconvolution, peak alignment, and normalization prior to statistical analysis, is an important step in metabolomics. Several commercial or free software packages have been introduced for data processing, but most of them are vendor dependent. To provide a simple method for Agilent GC-MS data processing, we developed an in-house program, "CompExtractor", using Microsoft Visual Basic. We tailored the macro modules of the Agilent ChemStation and embedded them in the program. To verify the performance of CompExtractor processing, 30 samples from three species of the genus Papaver were analyzed with an Agilent 5973 MSD GC-MS. The results of CompExtractor processing were compared with those of AMDIS-SpectConnect processing by hierarchical cluster analysis (HCA) and principal component analysis (PCA). Both methods classified the samples well by species in HCA. The PC1+PC2 scores were 54.32-63.62% for AMDIS-SpectConnect and 56.65-85.92% for CompExtractor in PCA. Although CompExtractor is an Agilent GC-MS-specific application and the target compounds must be selected first, it can extract the target compounds more precisely from the raw data file in batch mode and simultaneously assemble the matrix text file.

17.
The mzQuantML standard from the HUPO Proteomics Standards Initiative has recently been released, capturing quantitative data about peptides and proteins following analysis of MS data. We present a Java application programming interface (API) for mzQuantML called jmzQuantML. The API provides robust bridges between Java classes and elements in mzQuantML files and allows random access to any part of the file. The API provides read and write capabilities and is designed to be embedded in other software packages, enabling mzQuantML support to be added to proteomics software tools (http://code.google.com/p/jmzquantml/). The mzQuantML standard is designed around a multilevel validation system to ensure that files are structurally and semantically correct for different proteomics quantitative techniques. In this article, we also describe a Java software tool (http://code.google.com/p/mzquantml-validator/) for validating mzQuantML files, which is a formal part of the data standard.

18.
Improvements in assay technology have reduced the amount of random variation in measured responses to the point where even slight asymmetry of the assay data can be more significant than random variation. Use of the five-parameter logistic (5PL) function to fit dose-response data easily accommodates such asymmetry. The 5PL can dramatically improve the accuracy of asymmetric assays over symmetric models such as the four-parameter logistic (4PL) function. Until recently, however, fitting the 5PL function has been difficult, with the result that the 4PL function has continued to be used even for highly asymmetric data. Various ad hoc modifications of the 4PL method have been developed in an attempt to address asymmetric data. However, recent advances in numerical methods and assay analysis software have made fitting the 5PL much easier. This paper demonstrates how use of the 5PL function can improve assay performance over the 4PL and its variants. Specifically, we study the improvement in the accuracy of concentration estimates obtainable with the 5PL over the 4PL as a function of the asymmetry present in the data. The behavior of the 5PL curve and how it differs from the 4PL curve are discussed. Common experimental designs, which can lead to ill-conditioned regression problems, are also examined.
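For reference, the 5PL function adds one asymmetry parameter g to the 4PL; setting g = 1 recovers the symmetric 4PL, in which the response at the mid-range dose c is exactly halfway between the two asymptotes. A small numeric illustration with arbitrary parameter values:

```python
# Five-parameter logistic (5PL) dose-response curve; g = 1 gives the 4PL.
def logistic_5pl(x, a, d, c, b, g):
    """a: response at zero dose, d: at infinite dose, c: mid-range dose,
    b: slope factor, g: asymmetry factor (g = 1 gives the symmetric 4PL)."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g

a, d, c, b = 0.05, 2.0, 100.0, 1.2   # arbitrary demo parameters
sym = [logistic_5pl(x, a, d, c, b, 1.0) for x in (10, 100, 1000)]
asym = [logistic_5pl(x, a, d, c, b, 0.4) for x in (10, 100, 1000)]

# At x = c the symmetric (4PL) curve sits exactly halfway between a and d;
# the asymmetric curve (g != 1) does not.
print(round(sym[1], 3))   # 1.025 = (a + d) / 2
print(round(asym[1], 3))
```

Forcing a symmetric 4PL through data generated by an asymmetric process shifts the apparent mid-range response, which is precisely the source of the concentration-estimate bias the paper quantifies.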

19.
The collection of 4-color fluorescent genotyping data from capillary array electrophoresis microchip devices, and its conversion to a format easily and rapidly analyzed by the Genetic Profiler genotyping software, are presented. Microchip fluorescence intensity data are acquired and stored as 4-color tab-delimited text. These files are converted to electrophoretic signal data (ESD) files using a utility program (TEXT-to-ESD) written in C. TEXT-to-ESD generates an ESD file by converting text data to binary data and then appending a 632-byte ESD-file trailer. Up to 96 ESD files are then assembled into a run folder and imported into Genetic Profiler, where data are reduced to 4-color electropherograms and analyzed. In this manner, DNA fragment sizing data acquired with our high-speed electrophoretic microchip devices can be rapidly analyzed using robust commercial software. Additionally, the conversion program allows sizing with Genetic Profiler of data that have been preprocessed using other third-party software, such as BaseFinder.
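The TEXT-to-ESD step is conceptually "pack the tab-delimited samples as binary and append a fixed-size trailer". The 632-byte trailer length comes from the abstract; the 16-bit little-endian sample format and zero-filled trailer below are assumptions for illustration (the real utility is written in C and the ESD trailer has internal structure not described here).

```python
# Sketch of a text-to-binary conversion with a fixed-size appended trailer,
# in the spirit of TEXT-to-ESD. Sample width/endianness are assumptions.
import struct

TRAILER_SIZE = 632

def text_to_esd(text, trailer=b""):
    """Pack tab-delimited integer intensities as 16-bit little-endian samples,
    then append a trailer padded (or truncated) to exactly TRAILER_SIZE bytes."""
    samples = []
    for line in text.strip().splitlines():
        samples.extend(int(v) for v in line.split("\t"))
    body = struct.pack(f"<{len(samples)}H", *samples)
    return body + trailer.ljust(TRAILER_SIZE, b"\x00")[:TRAILER_SIZE]

esd = text_to_esd("10\t20\t30\t40\n11\t21\t31\t41\n")
print(len(esd))  # 8 samples * 2 bytes + 632-byte trailer = 648
```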

20.