Similar Documents
20 similar documents retrieved.
1.
Large-scale genome projects require the analysis of large amounts of raw data. This analysis often involves the application of a chain of biology-based programs. Many of these programs are difficult to operate because they are non-integrated, command-line driven, and platform-dependent. The problem is compounded when the number of data files involved is large, making navigation and status-tracking difficult. To demonstrate how this problem can be addressed, we have created a platform-independent Web front end that integrates a set of programs used in a genomic project analyzing gene function by transposon mutagenesis in Saccharomyces cerevisiae. In particular, these programs help define a large number of transposon insertion events within the yeast genome, identifying both the precise site of each transposon insertion and the open reading frames potentially disrupted by the insertion event. Our Web interface facilitates this analysis by performing the following tasks. First, it allows each of the analysis programs to be launched against multiple directories of data files. Second, it allows the user to view, download, and upload files generated by the programs. Third, it indicates which sets of data directories have been processed by each program. Although designed specifically to aid in this project, our interface exemplifies a general approach by which independent software programs may be integrated into an efficient protocol for large-scale genomic data processing.
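The batch-launching and status-tracking workflow described above can be sketched in a few lines of Python; the directory layout, the marker-file convention, and the run_analysis.sh command below are illustrative assumptions only, not the interface of the original project.

#!/usr/bin/env python3
"""Minimal sketch: launch an analysis program over many data directories and
track which directories have already been processed. The directory layout and
the marker-file convention are assumptions for illustration only."""
import subprocess
from pathlib import Path

DATA_ROOT = Path("data")     # hypothetical root holding one subdirectory per dataset
MARKER = ".analysis_done"    # per-directory marker recording completed runs

def process_directory(directory: Path) -> None:
    """Run a (hypothetical) command-line analysis tool on one directory."""
    result = subprocess.run(
        ["./run_analysis.sh", str(directory)],   # placeholder for the real program
        capture_output=True, text=True,
    )
    if result.returncode == 0:
        (directory / MARKER).write_text(result.stdout)   # mark as processed

def main() -> None:
    for directory in sorted(p for p in DATA_ROOT.iterdir() if p.is_dir()):
        if (directory / MARKER).exists():
            print(f"skipping {directory} (already processed)")
            continue
        print(f"processing {directory}")
        process_directory(directory)

if __name__ == "__main__":
    main()

Each directory is processed once and then marked, so re-running the script only picks up directories that have not yet been analyzed.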

2.
3.
Computer programs for the analysis of cellular survival data
Four programs have been written to enable radiobiologists to build a computer database of cellular dose-survival data, calculate cell survival with a correction for cell multiplicity at the time of irradiation, fit various survival models to the data by iteratively weighted least squares, and calculate the ratio of survival levels corresponding to specified doses or the ratio of doses that produce specified survival levels (e.g., the oxygen enhancement ratio or relative biological effectiveness). The programs plot survival curves and data, and they calculate standard errors and confidence intervals for the fitted survival-curve parameters and ratios. The programs calculate survival curves for the linear-quadratic, repair-saturation, single-hit multitarget, linear-multitarget, and repair-misrepair models of cell survival and have been designed to accommodate the addition of other survival models in the future. The programs can be used to compare the accuracy with which different models fit the data, determine whether a difference in fit is statistically significant, and show how the estimated value of a survival-curve parameter, such as the extrapolation number or the final slope, varies with the survival model. These programs also enable the repair of radiation-induced damage to be analyzed in a novel way.
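As a minimal illustration of the kind of fit such programs perform, the linear-quadratic model S(D) = exp(-(alpha*D + beta*D^2)) can be fitted to dose-survival data by weighted nonlinear least squares with scipy; the doses, surviving fractions, and assumed standard errors below are invented for demonstration, and the sketch does not reproduce the multiplicity correction or the other survival models of the original software.

import numpy as np
from scipy.optimize import curve_fit

def lq_surviving_fraction(dose, alpha, beta):
    """Linear-quadratic model: S(D) = exp(-(alpha*D + beta*D**2))."""
    return np.exp(-(alpha * dose + beta * dose**2))

# Illustrative dose (Gy) and surviving-fraction data, not real measurements.
dose = np.array([0.0, 1.0, 2.0, 4.0, 6.0, 8.0])
surv = np.array([1.0, 0.80, 0.55, 0.20, 0.055, 0.012])
sigma = 0.1 * surv          # assumed standard errors used as weights

params, cov = curve_fit(lq_surviving_fraction, dose, surv,
                        p0=[0.1, 0.02], sigma=sigma, absolute_sigma=True)
alpha, beta = params
alpha_se, beta_se = np.sqrt(np.diag(cov))
print(f"alpha = {alpha:.3f} +/- {alpha_se:.3f} per Gy")
print(f"beta  = {beta:.3f} +/- {beta_se:.3f} per Gy^2")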

4.
For the analysis of enzyme kinetics, a variety of programs exists. These programs apply either algebraic or dynamic parameter estimation, which require different approaches to data fitting. The choice of approach and computer program is usually subjective, and it is generally assumed that this choice has no influence on the obtained parameter estimates. However, this assumption has not yet been verified comprehensively. Therefore, in this study, five computer programs for progress curve analysis were compared with respect to accuracy and the minimum amount of data required to obtain accurate parameter estimates. Two of these five programs (MS-Excel, Origin) use algebraic parameter estimation, while three (Encora, ModelMaker, gPROMS) are able to perform dynamic parameter estimation. For this comparison, the industrially important enzyme penicillin amidase (EC 3.5.1.11) was studied, and both experimental and in silico data were used. It was shown that significant differences in the estimated parameter values arise from using different computer programs, especially if the number of data points is low. Deviations between parameter values reported in the literature could therefore simply be caused by the use of different computer programs.
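To make the distinction between the two approaches concrete, dynamic parameter estimation fits the kinetic ODE directly to a progress curve. The sketch below integrates an irreversible Michaelis-Menten model and fits Vmax and Km to a simulated curve with scipy; the rate law, parameter values, and noise level are assumptions for illustration and do not reproduce the penicillin amidase system of the study.

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import curve_fit

def simulate_progress(t, vmax, km, s0=10.0):
    """Substrate concentration over time for irreversible Michaelis-Menten kinetics."""
    def rate(_t, s):
        return [-vmax * s[0] / (km + s[0])]
    sol = solve_ivp(rate, (t[0], t[-1]), [s0], t_eval=t, rtol=1e-8)
    return sol.y[0]

# Simulated "measured" progress curve with a little noise (illustrative only).
t = np.linspace(0, 30, 40)
true_vmax, true_km = 1.0, 2.5
rng = np.random.default_rng(0)
s_obs = simulate_progress(t, true_vmax, true_km) + rng.normal(0, 0.05, t.size)

# Dynamic parameter estimation: fit the ODE solution to the progress curve.
(vmax_hat, km_hat), _ = curve_fit(simulate_progress, t, s_obs,
                                  p0=[0.5, 1.0], bounds=(0, np.inf))
print(f"Vmax = {vmax_hat:.3f}, Km = {km_hat:.3f}")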

5.
Enrolment in external quality assurance programs is part of the accreditation process for medical laboratories in Australia, with the majority of Australian laboratories being enrolled in programs from RCPA Quality Assurance Programs Pty Limited, a company owned by the Royal College of Pathologists of Australasia. An important feature of these programs is that they have been developed with the involvement and contribution of the profession. For example, the Chemical Pathology programs are a joint venture between the company and the Australasian Association of Clinical Biochemists (AACB). Some of the unique features of the programs are the composition of the material, the use of target values, the structure and information in the reports and the use of the internet for data entry and data review. Over the past thirty years, the development of these programs has made a significant contribution to the quality of laboratories in Australia.

6.
Sustainability indicator programs in developing countries are the poor cousin of ecological indicator research. While an enormous number of indicators for monitoring sustainable development exists, few meta-evaluations of these measurements have been conducted in developing countries. Yet researchers developing new programs face the question: how should a monitoring instrument be designed to respond to local challenges? By presenting a qualitative meta-performance evaluation of seven sustainability indicator programs at the municipal level in developing countries of Asia, we identify crucial success factors in this contribution. The research draws on 41 expert interviews in Indonesia, Thailand, China, and India, as well as on program-related documents. In the presented case studies, local contexts are intentionally diverse, so that the results map success factors in different settings. A context-related list of good-practice factors is derived from the interview material via a Qualitative Content Analysis and assessed against the data. We identify crucial strengths and weaknesses of sustainability indicator programs in six dimensions and link the success factors to their contexts. The results include innovative approaches to indicator types, data collection, and data quality control, and a correlation between the anchoring of programs in approved development plans and long-term implementation. The results can provide valuable guidance to users of existing sustainability indicator programs and to planners of new programs.

7.
Simulation software programs continue to evolve to meet the needs of risk analysts. In the past several years, two spreadsheet add-in programs added the capability of fitting distributions to data to their tool kits using classical statistical (i.e., non-Bayesian) methods. Crystal Ball version 4.0 now contains this capability in its standard program (and in Crystal Ball Pro version 4.0), while the BestFit software program is a component of the @RISK Decision Tools Suite that can also be purchased as a stand-alone program. Both programs will automatically fit distributions to continuous data using maximum likelihood estimation and provide goodness-of-fit statistics based on chi-squared, Kolmogorov-Smirnov, and Anderson-Darling tests. BestFit will also fit discrete distributions, and for all distributions it offers the option of optimizing the fit based on the goodness-of-fit statistics. Analysts should be wary of placing too much emphasis on the goodness-of-fit statistics, given their limitations and the fact that only some of them are appropriately corrected for the distribution parameters having been estimated from the same data. These programs dramatically simplify efforts to use maximum likelihood estimation to fit distributions. However, the fact that a program was used to fit distributions should not be viewed as validation that the data have been fitted and interpreted correctly. Both programs rely heavily on the analyst's judgment and will allow analysts to fit inappropriate distributions. Currently, both programs could be improved by adding the ability to perform extensive basic exploratory data analysis and to produce the regression diagnostics needed to satisfy critical analysts or reviewers. Given that Bayesian methods are central to risk analysis, adding the capability of fitting distributions by combining data with prior information would greatly increase the utility of these programs.
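The maximum likelihood fitting and goodness-of-fit testing performed by such packages can be reproduced in outline with scipy; the sketch below fits a lognormal distribution to simulated data and reports a Kolmogorov-Smirnov statistic. As the abstract cautions, the p-value printed here is not corrected for the fact that the parameters were estimated from the same data.

import numpy as np
from scipy import stats

# Simulated data standing in for an analyst's sample (illustrative only).
rng = np.random.default_rng(42)
data = rng.lognormal(mean=1.0, sigma=0.5, size=200)

# Maximum likelihood fit of a candidate distribution.
shape, loc, scale = stats.lognorm.fit(data, floc=0)   # fix location at 0
fitted = stats.lognorm(shape, loc=loc, scale=scale)

# Goodness of fit: Kolmogorov-Smirnov statistic against the fitted CDF.
ks_stat, ks_p = stats.kstest(data, fitted.cdf)
print(f"lognormal fit: shape={shape:.3f}, scale={scale:.3f}")
print(f"KS statistic = {ks_stat:.3f}, uncorrected p-value = {ks_p:.3f}")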

8.
Citizen science and community-based monitoring programs are increasing in number and breadth, generating volumes of scientific data. Many programs are ill-equipped to effectively manage these data. We examined the art and science of multi-scale citizen science support, focusing on issues of integration and flexibility that arise for data management when programs span multiple spatial, temporal, and social scales across many domains. Our objectives were to: (1) briefly review existing citizen science approaches and data management needs; (2) propose a framework for multi-scale citizen science support; (3) develop a cyber-infrastructure to support citizen science program needs; and (4) describe lessons learned. We find that approaches differ in scope, scale, and activities and that the proposed framework situates programs while guiding cyber-infrastructure system development. We built a cyber-infrastructure support system for citizen science programs (www.citsci.org) and show that carefully designed systems can be adept enough to support programs at multiple spatial and temporal scales across many domains when built with a flexible architecture. The advantage of a flexible, yet controlled, cyber-infrastructure system lies in the ability of users with different levels of permission to easily customize the features themselves, while adhering to controlled vocabularies necessary for cross-discipline comparisons and meta-analyses. Program evaluation tied to this framework and integrated into cyber-infrastructure support systems will improve our ability to track effectiveness. We compare existing systems and discuss the importance of standards for interoperability and the challenges associated with system maintenance and long-term support. We conclude by offering a vision of the future of citizen science data management and cyber-infrastructure support.

9.
Computer programs for the analysis of data from techniques frequently used in nucleic acids research are described. In addition to calculating non-linear, least-squares solutions to equations describing these systems, the programs allow for data editing, normalization, plotting and storage, and are flexible and simple to use. Typical applications of the programs are described.

10.
Computer programs have been developed to serve as a method for storing, retrieving, and sorting mouse linkage data. The programs accept and store raw data and reference information for gene linkage; calculate recombination values for each data set and for combined data sets; retrieve, sort, and print out raw data, references, and recombination values; and generate linkage maps.
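For orientation, the core quantity such programs compute from a two-point backcross (the recombination fraction and its binomial standard error) is sketched below; the counts and the pooling scheme are hypothetical and much simpler than a full linkage-analysis package.

import math

def recombination_fraction(n_recombinant: int, n_total: int):
    """Two-point backcross estimate of the recombination fraction and its
    binomial standard error (a simplified illustration, not the original software)."""
    r = n_recombinant / n_total
    se = math.sqrt(r * (1 - r) / n_total)
    return r, se

# Hypothetical counts from several data sets for one gene pair: (recombinants, total progeny).
datasets = [(12, 100), (9, 80), (15, 120)]
rec = sum(d[0] for d in datasets)
tot = sum(d[1] for d in datasets)

r, se = recombination_fraction(rec, tot)
print(f"combined recombination fraction = {r:.3f} +/- {se:.3f} ({rec}/{tot})")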

11.
Programs have been developed for the "Electronika" BZ-34 microcomputer to process molecular DNA-DNA hybridization data together with duplex thermal-stability analysis. Separate programs are provided for analyzing results obtained under different DNA hybridization conditions. These programs are especially useful for the rapid processing of large series of experimental data.

12.
13.
ABSTRACT: BACKGROUND: The MapReduce framework enables scalable processing and analysis of large datasets by distributing the computational load over connected computer nodes, referred to as a cluster. In bioinformatics, MapReduce has already been adopted for various scenarios such as mapping next-generation sequencing data to a reference genome, finding SNPs from short-read data, or matching strings in genotype files. Nevertheless, tasks like installing and maintaining MapReduce on a cluster system, importing data into its distributed file system, or executing MapReduce programs require advanced knowledge of computer science and could thus prevent scientists from using currently available and useful software solutions. RESULTS: Here we present Cloudgene, a freely available platform that improves the usability of MapReduce programs in bioinformatics by providing a graphical user interface for execution, for the import and export of data, and for the reproducibility of workflows on in-house (private clouds) and rented clusters (public clouds). The aim of Cloudgene is to provide a standardized graphical execution environment for currently available and future MapReduce programs, which can all be integrated through its plug-in interface. Since Cloudgene can be executed on private clusters, sensitive datasets can be kept in house at all times and data transfer times are thereby minimized. CONCLUSIONS: Our results show that MapReduce programs can be integrated into Cloudgene with little effort and without adding any computational overhead to existing programs. This platform gives developers the opportunity to focus on the actual implementation task and provides scientists with a platform that hides the complexity of MapReduce. In addition to MapReduce programs, Cloudgene can also be used to launch predefined systems (e.g., Cloud BioLinux, RStudio) in public clouds. Currently, five different bioinformatic programs using MapReduce and two systems are integrated and have been successfully deployed. Cloudgene is freely available at http://cloudgene.uibk.ac.at.
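To illustrate the programming model that Cloudgene wraps, the sketch below expresses a toy job (counting k-mers in a set of reads) as a map phase and a reduce phase using only Python's standard library; the reads and the k-mer length are assumptions, and a real MapReduce job would distribute both phases across the nodes of a Hadoop cluster rather than run them in a single process.

from collections import Counter
from functools import reduce

K = 4  # illustrative k-mer length

def map_phase(read: str) -> Counter:
    """Map: emit (k-mer, count) pairs for one read."""
    return Counter(read[i:i + K] for i in range(len(read) - K + 1))

def reduce_phase(a: Counter, b: Counter) -> Counter:
    """Reduce: merge per-read counts into a global tally."""
    return a + b

# Toy input standing in for short-read data (illustrative only).
reads = ["ACGTACGTGG", "TTACGTACGA", "GGGACGTACG"]

counts = reduce(reduce_phase, map(map_phase, reads), Counter())
print(counts.most_common(5))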

14.
Sequence data handling by computer.
The speed of the new DNA sequencing techniques has created a need for computer programs to handle the data produced. This paper describes simple programs designed specifically for use by people with little or no computer experience. The programs are for use on small computers and provide facilities for storage, editing and analysis of both DNA and amino acid sequences. A magnetic tape containing these programs is available on request.
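Elementary operations of the kind such packages provide, for example reverse complementation and base-composition analysis, are easy to sketch in a modern scripting language; the example below is illustrative only and is unrelated to the original programs.

from collections import Counter

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]

def base_composition(dna: str) -> dict:
    """Count each base, a trivial example of sequence analysis."""
    return dict(Counter(dna.upper()))

# Illustrative sequence, not taken from the paper.
seq = "ATGGCGTACCTGAAGTAA"
print(reverse_complement(seq))
print(base_composition(seq))
print(f"GC content = {100 * (seq.count('G') + seq.count('C')) / len(seq):.1f}%")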

15.
Feng R, Zhou G, Zhang M, Zhang H. Biometrics 2009, 65(2): 584-589.
Summary. Twin studies are essential for assessing disease inheritance. Data generated from twin studies are traditionally analyzed using specialized computational programs. For many researchers, especially those who are new to twin studies, understanding and using those specialized programs can be a daunting task. Given that SAS (Statistical Analysis Software) is the most popular software for statistical analysis, we suggest that SAS procedures may be a helpful alternative for twin data and demonstrate that SAS can produce results similar to those of specialized computational programs. This numerical validation is practically useful, because a natural concern with general statistical software is whether it can handle data generated from special study designs such as twin studies and whether it can test a particular hypothesis. We conclude from extensive simulation that SAS procedures can easily be used as a convenient alternative to specialized programs for twin data analysis.
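As a simplified worked example of the quantity twin analyses target, Falconer's classical decomposition estimates heritability from the monozygotic and dizygotic twin correlations as h^2 ≈ 2(rMZ - rDZ); the sketch below applies it to made-up correlations. This is a rougher method than the SAS procedures and variance-component models discussed in the paper and is shown only to fix ideas.

def falconer_estimates(r_mz: float, r_dz: float):
    """Classical Falconer decomposition from twin correlations:
    heritability (A), shared environment (C), unique environment (E)."""
    a = 2 * (r_mz - r_dz)      # additive genetic variance proportion
    c = 2 * r_dz - r_mz        # shared (common) environment
    e = 1 - r_mz               # non-shared environment plus error
    return a, c, e

# Hypothetical twin correlations for some trait (illustrative only).
r_mz, r_dz = 0.78, 0.45
a, c, e = falconer_estimates(r_mz, r_dz)
print(f"A (heritability) = {a:.2f}, C = {c:.2f}, E = {e:.2f}")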

16.
On the basis of simulated data, this study compares the relative performance of the Bayesian clustering computer programs structure, geneland, geneclust, and a new program named tess. While all four programs can detect population genetic structure from multilocus genotypes, only the last three include simultaneous analysis of geographical data. The programs are compared with respect to their ability to infer the number of populations, to estimate membership probabilities, and to detect genetic discontinuities and clinal variation. The results suggest that combining analyses using tess and structure offers a convenient way to address inference of spatial population structure.

17.
Multiple sequence alignment (MSA) is a crucial first step in the analysis of genomic and proteomic data. Commonly occurring sequence features, such as deletions and insertions, are known to affect the accuracy of MSA programs, but the extent to which alignment accuracy is affected by the positions of insertions and deletions has not been examined independently of other sources of sequence variation. We assessed the performance of 6 popular MSA programs (ClustalW, DIALIGN-T, MAFFT, MUSCLE, PROBCONS, and T-COFFEE) and one experimental program, PRANK, on amino acid sequences that differed only by short regions of deleted residues. The analysis showed that the absence of residues often led to an incorrect placement of gaps in the alignments, even though the sequences were otherwise identical. In data sets containing sequences with partially overlapping deletions, most MSA programs preferentially aligned the gaps vertically at the expense of incorrectly aligning residues in the flanking regions. Of the programs assessed, only DIALIGN-T was able to place overlapping gaps correctly relative to one another, but this was usually context dependent and was observed only in some of the data sets. In data sets containing sequences with non-overlapping deletions, both DIALIGN-T and MAFFT (G-INS-I) were able to align gaps with near-perfect accuracy, but only MAFFT produced the correct alignment consistently. The same was true for data sets that comprised isoforms of alternatively spliced gene products: both DIALIGN-T and MAFFT produced highly accurate alignments, with MAFFT being the more consistent of the 2 programs. Other programs, notably T-COFFEE and ClustalW, were less accurate. For all data sets, alignments produced by different MSA programs differed markedly, indicating that reliance on a single MSA program may give misleading results. It is therefore advisable to use more than one MSA program when dealing with sequences that may contain deletions or insertions, particularly for high-throughput and pipeline applications where manual refinement of each alignment is not practicable.

18.
Portable microcomputer software for nucleotide sequence analysis.
Fristensky B, Lis J, Wu R. Nucleic Acids Research 1982, 10(20): 6451-6463.
The most common types of nucleotide sequence data analysis and handling can be done more conveniently and inexpensively on microcomputers than on large time-sharing systems. We present a package of computer programs for the analysis of DNA and RNA sequence data which overcomes many of the limitations imposed by microcomputers, while offering most of the features of programs commonly available on large computers, including sequence numbering and translation, restriction site and homology searches with dot-matrix plots, nucleotide distribution analysis, and graphic display of data. Most of the programs were written in Standard Pascal (on an Apple II computer) to facilitate portability to other micro-, mini-, and mainframe computers.
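One of the analyses listed, the dot-matrix homology search, can be sketched compactly; the version below prints a text dot-plot for two short invented sequences using a fixed window and match threshold, whereas the original package draws graphical plots.

def dot_plot(seq_a: str, seq_b: str, window: int = 3, min_match: int = 3) -> None:
    """Print a text dot-matrix: '*' where the two windows match sufficiently."""
    for i in range(len(seq_a) - window + 1):
        row = []
        for j in range(len(seq_b) - window + 1):
            matches = sum(a == b for a, b in zip(seq_a[i:i + window],
                                                 seq_b[j:j + window]))
            row.append("*" if matches >= min_match else ".")
        print("".join(row))

# Illustrative sequences, not taken from the paper.
dot_plot("ACGTGCATCGATT", "TTACGTGCATCGA")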

19.
The continuing flood of massive bioinformatics data urgently calls for more research on data compression techniques, in order to relieve server storage pressure and improve the efficiency of network transfer and data analysis. Although many data compression tools have been developed, a detailed, comprehensive comparison of which software and methods should be chosen for compressing massive bioinformatics data is still lacking. In this study, typical nucleic acid and protein sequence databases from GenBank and the widely used bioinformatics software packages Blast and EMBOSS were selected as test cases, and several compression tools were compared comprehensively. The results show that the classic compress utility has very high overall compression efficiency: in addition to an acceptable compression ratio, its compression time is markedly shorter than that of the other tools, even much shorter than that of the parallelized bzip2 (pbzip2) and gzip (pigz), so it can be given first consideration. Although 7-Zip achieves the highest compression ratio, its compression process is very time-consuming, making it suitable for long-term data archiving; files compressed with bzip2, rar, and gzip have somewhat lower compression ratios than 7-Zip but compress relatively quickly. For practical applications, the classic compress utility and the parallelized pbzip2 and pigz are recommended; these three offer a good balance between compression ratio and compression time.
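The comparison described can be reproduced in outline for gzip, bzip2, and xz/LZMA with Python's standard library; the sketch below measures compression ratio and wall-clock time on a synthetic FASTA-like payload. The classic compress utility, the parallel pbzip2 and pigz, 7-Zip, and rar are external programs not covered by these modules, so this is an illustration of the measurement, not a reproduction of the study.

import bz2
import gzip
import lzma
import random
import time

def benchmark(name: str, compress_fn, data: bytes) -> None:
    """Report compression ratio and wall-clock time for one codec."""
    start = time.perf_counter()
    compressed = compress_fn(data)
    elapsed = time.perf_counter() - start
    print(f"{name:4s}  ratio = {len(data) / len(compressed):6.2f}   time = {elapsed:6.3f} s")

# Synthetic FASTA-like payload standing in for a real sequence data set.
random.seed(0)
records = (">seq%d\n%s" % (i, "".join(random.choice("ACGT") for _ in range(70)))
           for i in range(5000))
data = "\n".join(records).encode()

benchmark("gzip", gzip.compress, data)
benchmark("bz2", bz2.compress, data)
benchmark("xz", lzma.compress, data)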

20.
Research productivity assessment is increasingly relevant for the allocation of research funds. On the one hand, this assessment is challenging because it involves both qualitative and quantitative analysis of several characteristics, most of them subjective in nature. On the other hand, current tools and academic social networks make bibliometric data freely available on the web to everyone. Those tools, especially when combined with other data, can create a rich environment from which information on research productivity can be extracted. In this context, our work aims at characterizing the Brazilian Computer Science graduate programs and the relationships among them. We (i) present views of the programs from different perspectives, (ii) rank the programs according to each perspective and a combination of them, (iii) show correlations between assessment metrics, (iv) discuss how programs relate to one another, and (v) infer aspects that boost programs' research productivity. The results indicate that programs with higher insertion in the coauthorship network topology also had higher research productivity between 2004 and 2009.
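The reported link between network position and productivity can be probed with standard tools; the sketch below builds a toy coauthorship graph with networkx and correlates degree centrality with a hypothetical productivity score using Spearman's rank correlation. All program names, edges, and scores are invented.

import networkx as nx
from scipy.stats import spearmanr

# Toy coauthorship graph between (hypothetical) graduate programs.
G = nx.Graph()
G.add_edges_from([("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
                  ("P3", "P4"), ("P4", "P5"), ("P1", "P4")])

# Hypothetical research-productivity scores for the same programs.
productivity = {"P1": 120, "P2": 80, "P3": 150, "P4": 95, "P5": 30}

centrality = nx.degree_centrality(G)
programs = sorted(G.nodes)
rho, p_value = spearmanr([centrality[p] for p in programs],
                         [productivity[p] for p in programs])
print(f"Spearman rho = {rho:.2f} (p = {p_value:.2f})")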
