Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
The microarray technology allows the high-throughput quantification of the mRNA level of thousands of genes under dozens of conditions, generating a wealth of data which must be analyzed using some form of computational means. A popular framework for such analysis is Matlab, a powerful computing language for which many functions have been written. However, although complex topics like neural networks or principal component analysis are freely available in Matlab, functions to perform more basic tasks like data normalization or hierarchical clustering in an efficient manner are not. The MatArray toolbox aims at filling this gap by offering efficient implementations of the most needed functions for microarray analysis. The functions in the toolbox are command-line only, since it is geared toward seasoned Matlab users.

2.
MArray is a Matlab toolbox with a graphical user interface that allows the user to analyse single or paired microarray datasets by direct input of the raw data output file from image analysis packages, such as QuantArray or GenePiX. The application provides simple procedures to manually evaluate the quality of each measurement, multiple approaches to both ratio normalization (simple normalization, intensity dependent normalization) and evaluation of the reproducibility of paired experiments (using the techniques 'simple statistical method' and 'quality control ellipse' and 'significance analysis of microarrays'). Specifically, interactive spot evaluation functions are available in MArray and an online gene information database (NCBI UniGene) is linked. The application may provide a valuable aid in selecting and optimizing experimental procedures, as well as serving as an analytical tool for two-state biological comparisons, such as a study of single-dose activation. It is entirely platform independent, and only requires Matlab installed. AVAILABILITY: http://matrise.uio.no/marray/marray.html

3.

Background

Proteins are composed of domains, protein segments that fold independently from the rest of the protein and have a specific function. During evolution the arrangement of domains can change: domains are gained, lost or their order is rearranged. To facilitate the analysis of these changes we propose the use of multiple domain alignments.

Results

We developed an alignment program, called MDAT, which aligns multiple domain arrangements. MDAT extends earlier programs which perform pairwise alignments of domain arrangements. MDAT uses a domain similarity matrix to score domain pairs and aligns the domain arrangements using a consistency supported progressive alignment method.

Conclusion

MDAT will be useful for analysing changes in domain arrangements within and between protein families and will thus provide valuable insights into the evolution of proteins and their domains. MDAT is coded in C++, and the source code is freely available for download at http://www.bornberglab.org/pages/mdat.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0442-7) contains supplementary material, which is available to authorized users.
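
The abstract above describes scoring domain pairs with a similarity matrix and aligning domain arrangements. As a rough illustration of the pairwise building block only, the following sketch globally aligns two domain arrangements by dynamic programming (Needleman-Wunsch). It is not MDAT's consistency-supported progressive method; the function names, the gap penalty, and the identity-based scoring function are all assumptions made for the example.

```python
def identity_score(x, y):
    """Hypothetical stand-in for a domain similarity matrix lookup."""
    return 2 if x == y else -2


def align_domains(a, b, score=identity_score, gap=-1):
    """Globally align two domain arrangements (lists of domain names).

    Returns (total score, aligned a, aligned b); "-" marks a gap.
    The gap penalty is an illustrative value, not taken from MDAT.
    """
    n, m = len(a), len(b)
    # dp[i][j] holds the best score for aligning a[:i] with b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # pair two domains
                dp[i - 1][j] + gap,                            # gap in b
                dp[i][j - 1] + gap,                            # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return dp[n][m], out_a[::-1], out_b[::-1]
```

For example, aligning `["SH3", "SH2", "Kinase"]` with `["SH2", "Kinase"]` places a gap opposite `SH3`, which is how a domain gain or loss would show up in such an alignment.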

4.
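Based on the TDT neural electrophysiology hardware and software platform and the Matlab environment, real-time analysis software dedicated to auditory electrophysiology research was developed. Through online processing and analysis of extracellularly recorded neuronal signals, quantitative results such as the post-stimulus time histogram of firing activity, the mean firing rate, and the first-spike latency can be obtained during the experiment, along with curves showing how a neuron's firing rate changes as stimulus parameters vary, such as firing rate versus stimulus intensity curves. The software was used to study auditory information coding by neurons in the rat inferior colliculus, where different temporal response patterns of inferior colliculus neurons to pure-tone and noise stimuli were observed, as well as the encoding of sound stimulus intensity by firing rate and first-spike latency.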
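
The three quantities named in this abstract (post-stimulus time histogram, mean firing rate, first-spike latency) can be sketched in a few lines. The code below is a minimal illustration, assuming spike times are given in seconds relative to stimulus onset, one list per trial; it is not the software described in the abstract, and all function names are hypothetical.

```python
def psth(trials, t_start, t_stop, bin_width):
    """Post-stimulus time histogram in spikes/s, averaged over trials."""
    n_bins = int(round((t_stop - t_start) / bin_width))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            if t_start <= t < t_stop:
                # Clamp the index to guard against floating-point edge cases.
                idx = min(int((t - t_start) / bin_width), n_bins - 1)
                counts[idx] += 1
    return [c / (len(trials) * bin_width) for c in counts]


def mean_rate(trials, t_start, t_stop):
    """Mean firing rate (spikes/s) in the window [t_start, t_stop)."""
    n = sum(1 for spikes in trials for t in spikes if t_start <= t < t_stop)
    return n / (len(trials) * (t_stop - t_start))


def first_spike_latency(trials):
    """Mean latency of the first spike at or after onset (t = 0).

    Trials without any post-onset spike are skipped; returns None if
    no trial contains one.
    """
    firsts = [min(t for t in spikes if t >= 0)
              for spikes in trials
              if any(t >= 0 for t in spikes)]
    return sum(firsts) / len(firsts) if firsts else None
```

Repeating `mean_rate` over recordings made at different stimulus intensities would give the points of a rate versus intensity curve.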

5.
MOTIVATION: A serious limitation in microarray analysis is the unreliability of the data generated from low signal intensities. Such data may produce erroneous gene expression ratios and cause unnecessary validation or post-analysis follow-up tasks. Therefore, the elimination of unreliable signal intensities will enhance reproducibility and reliability of gene expression ratios produced from microarray data. In this study, we applied fuzzy c-means (FCM) and normal mixture modeling (NMM) based classification methods to separate microarray data into reliable and unreliable signal intensity populations. RESULTS: We compared the results of FCM classification with those of classification based on NMM. Both approaches were validated against reference sets of biological data consisting of only true positives and true negatives. We observed that both methods performed equally well in terms of sensitivity and specificity. Although a comparison of the computation times indicated that the fuzzy approach is computationally more efficient, other considerations support the use of NMM for the reliability analysis of microarray data. AVAILABILITY: The classification approaches described in this paper and sample microarray data are available as Matlab™ (The MathWorks Inc., Natick, MA) programs (mfiles) and text files, respectively, at http://rc.kfshrc.edu.sa/bssc/staff/MusaAsyali/Downloads.asp. The programs can be run/tested on many different computer platforms where Matlab is available. CONTACT: asyali@kfshrc.edu.sa.

6.
Adjustment of systematic microarray data biases (cited 6 times: 0 self-citations, 6 by others)
MOTIVATION: Systematic differences due to experimental features of microarray experiments are present in most large microarray data sets. Many different experimental features can cause biases including different sources of RNA, different production lots of microarrays or different microarray platforms. These systematic effects present a substantial hurdle to the analysis of microarray data. RESULTS: We present here a new method for the identification and adjustment of systematic biases that are present within microarray data sets. Our approach is based on modern statistical discrimination methods and is shown to be very effective in removing systematic biases present in a previously published breast tumor cDNA microarray data set. The new method of 'Distance Weighted Discrimination (DWD)' is shown to be better than Support Vector Machines and Singular Value Decomposition for the adjustment of systematic microarray effects. In addition, it is shown to be of general use as a tool for the discrimination of systematic problems present in microarray data sets, including the merging of two breast tumor data sets completed on different microarray platforms. AVAILABILITY: Matlab software to perform DWD can be retrieved from https://genome.unc.edu/pubsup/dwd/

7.
Improving missing value estimation in microarray data with gene ontology (cited 3 times: 0 self-citations, 3 by others)
MOTIVATION: Gene expression microarray experiments produce datasets with frequent missing expression values. Accurate estimation of missing values is an important prerequisite for efficient data analysis as many statistical and machine learning techniques either require a complete dataset or their results are significantly dependent on the quality of such estimates. A limitation of the existing estimation methods for microarray data is that they use no external information but the estimation is based solely on the expression data. We hypothesized that utilizing a priori information on functional similarities available from public databases facilitates the missing value estimation. RESULTS: We investigated whether semantic similarity originating from gene ontology (GO) annotations could improve the selection of relevant genes for missing value estimation. The relative contribution of each information source was automatically estimated from the data using an adaptive weight selection procedure. Our experimental results in yeast cDNA microarray datasets indicated that by considering GO information in the k-nearest neighbor algorithm we can enhance its performance considerably, especially when the number of experimental conditions is small and the percentage of missing values is high. The increase of performance was less evident with a more sophisticated estimation method. We conclude that even a small proportion of annotated genes can provide improvements in data quality significant for the eventual interpretation of the microarray experiments. AVAILABILITY: Java and Matlab codes are available on request from the authors. SUPPLEMENTARY MATERIAL: Available online at http://users.utu.fi/jotatu/GOImpute.html.
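
To make the idea of combining expression distance with GO-based dissimilarity concrete, here is a minimal k-nearest-neighbor imputation sketch. It rests on several assumptions that are not from the paper: the GO dissimilarities are taken as a precomputed table (`go_dissim`, hypothetical), the weight `alpha` is fixed by hand rather than selected adaptively as the abstract describes, and the two distances are mixed linearly without rescaling.

```python
import math


def knn_impute(data, gene, cond, k, go_dissim=None, alpha=0.0):
    """Estimate the missing value data[gene][cond] from the k nearest genes.

    data:      list of expression rows, one per gene; None marks a missing value.
    go_dissim: hypothetical table, go_dissim[g][h] = GO-based dissimilarity.
    alpha:     hypothetical fixed weight of the GO term (0 = expression only).
    """
    target = data[gene]
    candidates = []
    for h, row in enumerate(data):
        if h == gene or row[cond] is None:
            continue
        # Compare only conditions observed in both genes.
        shared = [(x, y) for x, y in zip(target, row)
                  if x is not None and y is not None]
        if not shared:
            continue
        d = math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))
        if go_dissim is not None:
            d = (1 - alpha) * d + alpha * go_dissim[gene][h]
        candidates.append((d, row[cond]))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    if not nearest:
        return None
    # Inverse-distance weighting; the small constant avoids division by zero.
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
```

With `alpha=0` this reduces to plain expression-based kNN imputation, so the effect of the external annotation can be judged by comparing estimates at different `alpha` values on entries that were masked deliberately.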

8.
PrepMS: TOF MS data graphical preprocessing tool (cited 1 time: 0 self-citations, 1 by others)
We introduce a simple-to-use graphical tool that enables researchers to easily prepare time-of-flight mass spectrometry data for analysis. For ease of use, the graphical executable provides default parameter settings, experimentally determined to work well in most situations. These values, if desired, can be changed by the user. PrepMS is a stand-alone application made freely available (open source), and is under the General Public License (GPL). Its graphical user interface, default parameter settings, and display plots allow PrepMS to be used effectively for data preprocessing, peak detection and visual data quality assessment. AVAILABILITY: Stand-alone executable files and Matlab toolbox are available for download at: http://sourceforge.net/projects/prepms

9.

Background

Matlab is one of the most advanced development tools for engineering applications. From our point of view, its most important component is the Image Processing Toolbox, which offers many built-in functions, including mathematical morphology, together with implementations of many artificial neural networks. It is a very popular platform for creating specialized image analysis programs, including in pathology. Using the latest version of the Matlab Builder Java toolbox, it is possible to create software that serves as a remote system for image analysis in pathology over the internet. The internet platform can be built on Java Servlet Pages with a Tomcat server as the servlet container.

Methods

In the presented software implementation, we propose remote image analysis performed by Matlab algorithms. These algorithms can be compiled into an executable jar file with the help of the Matlab Builder Java toolbox. The Matlab function must be declared with a set of input data, an output structure with numerical results, and a Matlab web figure. Any function prepared in this manner can be used as a Java function in Java Servlet Pages (JSP). The graphical user interface that provides the input data and displays the results (also in graphical form) must be implemented in JSP. Additionally, data storage to a database can be implemented within the Matlab algorithm, alongside the image processing, with the help of the Matlab Database Toolbox. The complete JSP page can be run by a Tomcat server.

Results

The proposed tool for remote image analysis was tested on the Computerized Analysis of Medical Images (CAMI) software developed by the author. The user provides the image and case information (diagnosis, staining, image parameters, etc.). When the analysis is initialized, the input data and image are sent to a servlet on Tomcat. When the analysis is done, the client obtains the graphical results as an image with the recognized cells marked, together with the quantitative output. Additionally, the results are stored in a server database. The internet platform was tested on a PC server (Intel Core2 Duo T9600, 2.8 GHz, 4 GB RAM) with 768x576 pixel, 1.28 Mb TIFF images of meningioma tumour (x400, Ki-67/MIB-1). The time consumption was as follows: analysis by CAMI locally on the server took 3.5 seconds; remote analysis took 26 seconds, of which 22 seconds were used for data transfer over the internet connection. With a JPG image (102 Kb), the time was reduced to 14 seconds.

Conclusions

The results confirmed that the designed remote platform can be useful for pathology image analysis. The time consumption depends mainly on the image size and the speed of the internet connection. The presented implementation can be used for many types of analysis with different staining, tissue, morphometry approaches, etc. A significant problem is implementing the JSP page in a multithreaded form that can be used in parallel by many users. The presented platform for image analysis in pathology can be especially useful for small laboratories without their own image analysis system.

10.
CGH-Plotter: MATLAB toolbox for CGH-data analysis (cited 1 time: 0 self-citations, 1 by others)
CGH-Plotter is a MATLAB toolbox with a graphical user interface for the analysis of comparative genomic hybridization (CGH) microarray data. CGH-Plotter provides a tool for rapid visualization of CGH-data according to the locations of the genes along the genome. In addition, the CGH-Plotter identifies regions of amplifications and deletions, using k-means clustering and dynamic programming. The application offers a convenient way to analyze CGH-data and can also be applied for the analysis of cDNA microarray expression data. CGH-Plotter toolbox is platform independent and requires MATLAB 6.1 or higher to operate.

11.
Fuzzy C-means method for clustering microarray data (cited 9 times: 0 self-citations, 9 by others)
MOTIVATION: Clustering analysis of data from DNA microarray hybridization studies is essential for identifying biologically relevant groups of genes. Partitional clustering methods such as K-means or self-organizing maps assign each gene to a single cluster. However, these methods do not provide information about the influence of a given gene for the overall shape of clusters. Here we apply a fuzzy partitioning method, Fuzzy C-means (FCM), to attribute cluster membership values to genes. RESULTS: A major problem in applying the FCM method for clustering microarray data is the choice of the fuzziness parameter m. We show that the commonly used value m = 2 is not appropriate for some data sets, and that optimal values for m vary widely from one data set to another. We propose an empirical method, based on the distribution of distances between genes in a given data set, to determine an adequate value for m. By setting threshold levels for the membership values, genes which are tightly associated to a given cluster can be selected. Using a yeast cell cycle data set as an example, we show that this selection increases the overall biological significance of the genes within the cluster. AVAILABILITY: Supplementary text and Matlab functions are available at http://www-igbmc.u-strasbg.fr/fcm/
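
The role of the fuzziness parameter m is easiest to see in the standard FCM membership formula, where the membership of a point in cluster i is 1 / sum_j (d_i / d_j)^(2/(m-1)). The sketch below evaluates that formula for a single point and fixed cluster centers; it shows only the membership step, not the full iterative algorithm or the paper's empirical method for choosing m, and the function name is an assumption.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy C-means membership of one point in each cluster (requires m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers is split among those centers.
    if any(d == 0 for d in dists):
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
            for d_i in dists]
```

For a point at distance 1 from one center and 3 from another, m = 2 gives memberships of 0.9 and 0.1, while smaller m pushes them toward 1 and 0 and larger m toward equal values. A membership threshold such as the one mentioned in the abstract would then keep only genes whose largest membership exceeds a chosen level.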

12.
In a convenience survey of dairy goats on 9 farms in Tennessee, Kentucky, and Georgia, 54 of 99 females were positive for antibodies to Toxoplasma gondii as measured by the indirect hemagglutination test (IHA). Two of 9 males were also positive. Positive goats were found on all farms. The percentage of positive does increased from 55% to 65% when the sera were titered a second time by modified direct agglutination (MDAT). This difference was not statistically significant, indicating that both IHA and MDAT are reliable epidemiological tools.

13.
SPADS 1.0 (for ‘Spatial and Population Analysis of DNA Sequences’) is a population genetic toolbox for characterizing genetic variability within and among populations from DNA sequences. In view of the drastic increase in genetic information available through sequencing methods, SPADS was specifically designed to deal with multilocus data sets of DNA sequences. It computes several summary statistics from populations or groups of populations, performs input file conversions for other population genetic programs and implements locus‐by‐locus and multilocus versions of two clustering algorithms to study the genetic structure of populations. The toolbox also includes two Matlab and R functions, Gdispal and Gdivpal, to display differentiation and diversity patterns across landscapes. These functions aim to generate interpolating surfaces based on multilocus distance and diversity indices. In the case of multiple loci, such surfaces can represent a useful alternative to the multiple pie-chart maps traditionally used in phylogeography to represent the spatial distribution of genetic diversity. These coloured surfaces can also be used to compare different data sets or different diversity and/or distance measures estimated on the same data set.

14.
We review quantitative methods and software developed to analyze genome-scale, brain-wide spatially-mapped gene-expression data. We expose new methods based on the underlying high-dimensional geometry of voxel space and gene space, and on simulations of the distribution of co-expression networks of a given size. We apply them to the Allen Atlas of the adult mouse brain, and to the co-expression network of a set of genes related to nicotine addiction retrieved from the NicSNP database. The computational methods are implemented in BrainGeneExpressionAnalysis (BGEA), a Matlab toolbox available for download.

15.
Neuronal population codes are increasingly being investigated with multivariate pattern-information analyses. A key challenge is to use measured brain-activity patterns to test computational models of brain information processing. One approach to this problem is representational similarity analysis (RSA), which characterizes a representation in a brain or computational model by the distance matrix of the response patterns elicited by a set of stimuli. The representational distance matrix encapsulates what distinctions between stimuli are emphasized and what distinctions are de-emphasized in the representation. A model is tested by comparing the representational distance matrix it predicts to that of a measured brain region. RSA also enables us to compare representations between stages of processing within a given brain or model, between brain and behavioral data, and between individuals and species. Here, we introduce a Matlab toolbox for RSA. The toolbox supports an analysis approach that is simultaneously data- and hypothesis-driven. It is designed to help integrate a wide range of computational models into the analysis of multichannel brain-activity measurements as provided by modern functional imaging and neuronal recording techniques. Tools for visualization and inference enable the user to relate sets of models to sets of brain regions and to statistically test and compare the models using nonparametric inference methods. The toolbox supports searchlight-based RSA, to continuously map a measured brain volume in search of a neuronal population code with a specific geometry. Finally, we introduce the linear-discriminant t value as a measure of representational discriminability that bridges the gap between linear decoding analyses and RSA. In order to demonstrate the capabilities of the toolbox, we apply it to both simulated and real fMRI data. The key functions are equally applicable to other modalities of brain-activity measurement. 
The toolbox is freely available to the community under an open-source license agreement (http://www.mrc-cbu.cam.ac.uk/methods-and-resources/toolboxes/license/).
This is a PLOS Computational Biology Software Article
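
The central object in the abstract above is the representational distance matrix: one response pattern per stimulus, and one distance per pair of stimuli. A minimal sketch follows, assuming correlation distance (1 minus the Pearson correlation) as the dissimilarity measure; that choice and the function names are assumptions for illustration and do not reproduce the toolbox's API.

```python
import math


def correlation_distance(x, y):
    """1 - Pearson correlation of two equal-length response patterns.

    Raises ZeroDivisionError for a constant pattern, whose correlation
    is undefined.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def rdm(patterns):
    """Representational distance matrix: patterns[i] is the response
    pattern (e.g. across voxels or channels) elicited by stimulus i."""
    n = len(patterns)
    return [[0.0 if i == j else correlation_distance(patterns[i], patterns[j])
             for j in range(n)]
            for i in range(n)]
```

Testing a model then amounts to computing one such matrix from the model's predicted patterns and one from the measured patterns, and comparing their off-diagonal entries.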

16.
Robust PCA and classification in biosciences (cited 7 times: 0 self-citations, 7 by others)
MOTIVATION: Principal components analysis (PCA) is a very popular dimension reduction technique that is widely used as a first step in the analysis of high-dimensional microarray data. However, the classical approach that is based on the mean and the sample covariance matrix of the data is very sensitive to outliers. Also, classification methods based on this covariance matrix do not give good results in the presence of outlying measurements. RESULTS: First, we propose a robust PCA (ROBPCA) method for high-dimensional data. It combines projection-pursuit ideas with robust estimation of low-dimensional data. We also propose a diagnostic plot to display and classify the outliers. This ROBPCA method is applied to several bio-chemical datasets. In one example, we also apply a robust discriminant method on the scores obtained with ROBPCA. We show that this combination of robust methods leads to better classifications than classical PCA and quadratic discriminant analysis. AVAILABILITY: All the programs are part of the Matlab Toolbox for Robust Calibration, available at http://www.wis.kuleuven.ac.be/stat/robust.html.

17.
The mouse model is an important research tool in neurosciences to examine brain function and diseases with genetic perturbation in different brain regions. However, the limited availability of techniques to map activated brain regions under specific experimental manipulations has been a drawback of the mouse model compared to human functional brain mapping. Here, we present a functional brain mapping method for fast and robust in vivo brain mapping of the mouse brain. The method is based on the acquisition of high density electroencephalography (EEG) with a microarray and EEG source estimation to localize the electrophysiological origins. We adapted the Fieldtrip toolbox for the source estimation, taking advantage of its software openness and flexibility in modeling the EEG volume conduction. Three source estimation techniques were compared: distributed source modeling with minimum-norm estimation (MNE), scanning with multiple signal classification (MUSIC), and single-dipole fitting. Known sources to evaluate the performance of the localization methods were provided using optogenetic tools. The accuracy was quantified based on the receiver operating characteristic (ROC) analysis. The mean detection accuracy was high, with false positive rates below 1.3% and 7% at a sensitivity of 90% for the MNE and MUSIC algorithms, respectively. The mean center-to-center distance was less than 1.2 mm with the single-dipole fitting algorithm. Mouse EEG source localization using a microarray provides a reliable method for functional brain mapping in the awake mouse, opening access to cross-species studies with the human brain.

18.
A preliminary study of the biexponential model in high b-value diffusion-weighted imaging (cited 1 time: 0 self-citations, 1 by others)
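Objective: To investigate how diffusion-weighted signal intensity decays using a biexponential analysis model, and to reveal diffusion information about brain tissue. Materials and methods: For each pixel in regions of interest including the lentiform nucleus, internal capsule, frontal white matter, and thalamus, the signal intensities of images at seven b values ranging from 500 s/mm² to 3500 s/mm² were fitted using the lsqcurvefit() function of the Matlab Optimization Toolbox, and the results were compared with those of monoexponential fitting. Results: The biexponential model fitted the signal intensity better than the monoexponential model and yielded three new parameters. Conclusion: The biexponential model fits image signal intensity at high b values better, and the three resulting parameters provide diffusion information about the brain from different perspectives; however, their physiological basis requires further study.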
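
The abstract does not write out the two models it compares. In their commonly used forms, which are assumed here rather than taken from the paper, the monoexponential and biexponential signal decays are as follows; the symbols and the reading of the "three new parameters" as a fast-component fraction and two diffusion coefficients are likewise assumptions.

```latex
% Monoexponential model: a single apparent diffusion coefficient (ADC)
S(b) = S_0 \, e^{-b \cdot \mathrm{ADC}}

% Biexponential model (assumed parameterization): a fast and a slow
% component with coefficients D_fast and D_slow, and fast fraction f
S(b) = S_0 \left[ f \, e^{-b D_{\mathrm{fast}}} + (1 - f) \, e^{-b D_{\mathrm{slow}}} \right],
\qquad 0 \le f \le 1
```

Under this reading, a least-squares routine such as lsqcurvefit() would estimate f, D_fast, and D_slow for each pixel from the signal intensities measured at the seven b values.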

19.
MOTIVATION: Microarrays have been widely used to discover novel disease related genes. Some types of microarray, such as cDNA arrays, usually contain a considerable portion of missing values. When missing value imputation and gene prioritization are sequentially conducted, it is necessary to consider the distribution space of prioritization scores due to the existence of missing values. We propose an ensemble approach to address this issue. A bootstrap procedure enables us to generate a resample multivariate distribution of the prioritization scores and then to obtain the expected prioritization scores. RESULTS: We used a published microarray two-sample data set to illustrate our approach. We focused on the following issues after missing value imputation: (i) concordance of gene prioritization and (ii) control of true and false positives. We compared our approach with the traditional non-ensemble approach to missing value imputation. We also evaluated the performance of non-imputation approach when the theoretical test distribution was available. The results showed that the ensemble imputation approach provided clearly improved performances in the concordance of gene prioritization and the control of true/false positives, especially when sample sizes were about 5-10 per group and missing rates were about 10-20%, which was a common situation for cDNA microarray studies. AVAILABILITY: The Matlab codes are freely available at http://home.gwu.edu/~ylai/research/Missing.

20.
Enhancing scatterplots with smoothed densities (cited 3 times: 0 self-citations, 3 by others)
MOTIVATION: Scatterplots of microarray data generally contain a very large number of dots, making it difficult to get a good impression of their distribution in dense areas. RESULTS: We present a fast and simple algorithm for two-dimensional histogram smoothing, to visually enhance scatterplots. AVAILABILITY: Functions for Matlab and R are available from the corresponding author.
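
The general idea of two-dimensional histogram smoothing can be sketched as two steps: bin the points on a grid, then smooth the counts along each axis. The example below uses a simple [1, 2, 1]/4 binomial kernel with replicated edges as a stand-in; the abstract does not specify the authors' smoother, so this is not their algorithm, and the function names and parameters are hypothetical.

```python
def histogram2d(xs, ys, n_bins, lo, hi):
    """Count points on an n_bins x n_bins grid over [lo, hi) x [lo, hi).

    Returns grid[row][col] with rows indexed by y and columns by x.
    """
    width = (hi - lo) / n_bins
    grid = [[0.0] * n_bins for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        if lo <= x < hi and lo <= y < hi:
            col = min(int((x - lo) / width), n_bins - 1)
            row = min(int((y - lo) / width), n_bins - 1)
            grid[row][col] += 1.0
    return grid


def smooth_1d(v):
    """One pass of a [1, 2, 1]/4 kernel; edge values are replicated,
    which keeps the sum of the vector unchanged."""
    n = len(v)
    return [(v[max(i - 1, 0)] + 2.0 * v[i] + v[min(i + 1, n - 1)]) / 4.0
            for i in range(n)]


def smooth_grid(grid, passes=1):
    """Smooth along x, then along y; more passes give a wider kernel."""
    for _ in range(passes):
        grid = [smooth_1d(row) for row in grid]
        cols = [smooth_1d(list(col)) for col in zip(*grid)]
        grid = [list(row) for row in zip(*cols)]
    return grid
```

The smoothed grid can be drawn as a colour image behind or instead of the raw dots, so that dense regions of the scatterplot remain readable.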
