Similar literature
Found 20 similar documents (search time: 15 ms)
1.
2.
3.

Background  

Randomized, prospective trials involving multi-institutional collaboration have become a central part of clinical and translational research. However, data management and coordination of multi-center studies is a complex process that involves developing systems for data collection and quality control, tracking data queries and resolutions, as well as developing communication procedures. We describe DADOS-Prospective, an open-source Web-based application for collecting and managing prospective data on human subjects for clinical and translational trials. DADOS-Prospective not only permits users to create new clinical research forms (CRF) and supports electronic signatures, but also offers the advantage of containing, in a single environment, raw research data in downloadable spreadsheet format, source documentation and regulatory files stored in PDF format, and audit trails.

4.
  1. With an increasing number of scientific articles published each year, there is a need to synthesize and obtain insights across ever‐growing volumes of literature. Here, we present pyResearchInsights, a novel open‐source automated content analysis package that can be used to analyze scientific abstracts within a natural language processing framework.
  2. The package collects abstracts from scientific repositories, identifies topics of research discussed in these abstracts, and presents interactive concept maps to visualize these research topics. To showcase the utilities of this package, we present two examples, specific to the field of ecology and conservation biology.
  3. First, we demonstrate the end‐to‐end functionality of the package by presenting topics of research discussed in 1,131 abstracts pertaining to birds of the Tropical Andes. Our results suggest that a large proportion of avian research in this biodiversity hotspot pertains to species distributions, climate change, and plant ecology.
  4. Second, we retrieved and analyzed 22,561 abstracts across eight journals in the field of conservation biology to identify twelve global topics of conservation research. Our analysis shows that conservation policy and landscape ecology are focal topics of research. We further examined how these conservation‐associated research topics varied across five biodiversity hotspots.
  5. Lastly, we compared the utilities of this package with existing tools that carry out automated content analysis, and we show that our open‐source package has wider functionality and provides end‐to‐end utilities that seldom exist across other tools.

5.

Background  

Today, data evaluation has become a bottleneck in chromatographic science. Analytical instruments equipped with automated samplers yield large amounts of measurement data, which need to be verified and analyzed. Since nearly every GC/MS instrument vendor offers its own data format and software tools, the consequences are problems with data exchange and a lack of comparability between analytical results. To address this situation, a number of commercial and non-profit software applications have been developed. These applications can import and analyze several data formats, but they lack transparency about the implemented analytical algorithms and/or are restricted to a specific computer platform.

6.
MedlineR: an open source library in R for Medline literature data mining   Total citations: 3, self-citations: 0, citations by others: 3
SUMMARY: We describe an open source library written in the R programming language for Medline literature data mining. This MedlineR library includes programs to query Medline through the NCBI PubMed database; to construct the co-occurrence matrix; and to visualize the network topology of query terms. The open source nature of this library allows users to extend it freely in the statistical programming language of R. To demonstrate its utility, we have built an application to analyze term-association by using only 10 lines of code. We provide MedlineR as a library foundation for bioinformaticians and statisticians to build more sophisticated literature data mining applications. AVAILABILITY: The library is available from http://dbsr.duke.edu/pub/MedlineR.
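
The co-occurrence matrix mentioned in the abstract can be illustrated with a minimal Python sketch. This is not MedlineR code (MedlineR is an R library whose functions are not shown here); the function name `cooccurrence_matrix`, the toy documents, and the placeholder terms are all assumptions made for illustration. Each off-diagonal cell counts the documents that mention both terms, and each diagonal cell counts the documents that mention the term at all.

```python
from itertools import combinations


def cooccurrence_matrix(terms, documents):
    """Count, for every pair of terms, how many documents mention both.

    Matching is a simple case-insensitive substring test, which is an
    assumption of this sketch and not how a real PubMed query works.
    """
    index = {term: i for i, term in enumerate(terms)}
    n = len(terms)
    matrix = [[0] * n for _ in range(n)]
    for text in documents:
        lowered = text.lower()
        present = [t for t in terms if t.lower() in lowered]
        for t in present:
            # Diagonal: number of documents that mention the term.
            matrix[index[t]][index[t]] += 1
        for a, b in combinations(present, 2):
            # Off-diagonal: symmetric co-mention counts.
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return matrix


# Hypothetical toy "abstracts" with placeholder terms; not real Medline records.
documents = [
    "alpha is studied together with beta",
    "beta influences gamma",
    "alpha and beta appear again",
]
terms = ["alpha", "beta", "gamma"]
matrix = cooccurrence_matrix(terms, documents)
for term, row in zip(terms, matrix):
    print(term, row)
```

A matrix of this shape can be read as the weighted adjacency matrix of a term network, which is the kind of structure the abstract describes visualizing.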

7.
Halligan BD, Greene AS. Proteomics, 2011, 11(6): 1058-1063
A major challenge in the field of high-throughput proteomics is the conversion of the large volume of experimental data that is generated into biological knowledge. Typically, proteomics experiments involve the combination and comparison of multiple data sets and the analysis and annotation of these combined results. Although some commercial applications provide some of these functions, there is a need for a free, open source, multifunction tool for advanced proteomics data analysis. We have developed the Visualize program, which allows users to visualize, analyze, and annotate proteomics data; combine data from multiple runs; and quantitate differences between individual runs and combined data sets. Visualize is licensed under GNU GPL and can be downloaded from http://proteomics.mcw.edu/visualize. It is available as compiled client-based executable files for both Windows and Mac OS X platforms as well as Perl source code.

8.
MADE4: an R package for multivariate analysis of gene expression data   Total citations: 2, self-citations: 0, citations by others: 2
SUMMARY: MADE4, microarray ade4, is a software package that facilitates multivariate analysis of microarray gene-expression data. MADE4 accepts a wide variety of gene-expression data formats. MADE4 takes advantage of the extensive multivariate statistical and graphical functions in the R package ade4, extending these for application to microarray data. In addition, MADE4 provides new graphical and visualization tools that aid in interpretation of multivariate analysis of microarray data.

9.
EMDA, an open-source Python library for cryo-EM map and model manipulation, is presented with a specific focus on validation. Several of the library's functionalities are illustrated through examples. The utility of local correlation as a metric for identifying map-model differences and unmodeled regions in maps is demonstrated, along with its use as a metric for map-model validation. The mapping of local correlation to individual atoms, and its use to gain insight into local signal variations, are discussed. EMDA’s likelihood-based map overlay is demonstrated by carrying out a superposition of two domains in two related structures. The overlay is carried out first to bring both maps into the same coordinate frame and then to estimate the relative movement of domains. Finally, the map magnification refinement in EMDA is presented with an example to highlight the importance of adjusting the map magnification in structural comparison studies.

10.

Background

The immunoglobulins (IG) and the T cell receptors (TR) play a key role in antigen recognition during the adaptive immune response. Recent progress in next-generation sequencing technologies has provided an opportunity for deep profiling of T cell receptor repertoires. However, specialised software is required for the rational analysis of the massive data generated by next-generation sequencing.

Results

Here we introduce tcR, a new R package that provides a platform for the advanced analysis of T cell receptor repertoires, including diversity measures, identification of shared T cell receptor sequences, computation of gene usage statistics and other widely used methods. The tool has proven its utility in recent research studies.

Conclusions

tcR is an R package for the advanced analysis of T cell receptor repertoires after the primary extraction of TR sequences from raw sequencing reads. The stable version can be directly installed from The Comprehensive R Archive Network (http://cran.r-project.org/mirrors.html). The source code and development version are available at tcR GitHub (http://imminfo.github.io/tcr/) along with the full documentation and typical usage examples.
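
Two of the analyses named in this abstract, diversity measures and identification of shared receptor sequences, can be sketched in a few lines of Python. This is an illustration only and does not reproduce the tcR API (tcR is an R package); Shannon entropy is used here as one common diversity measure, and the function names and the clonotype strings are hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(clonotype_counts):
    """Shannon entropy (natural log) of a repertoire given clonotype read counts."""
    total = sum(clonotype_counts.values())
    return -sum(
        (count / total) * math.log(count / total)
        for count in clonotype_counts.values()
        if count > 0
    )


def shared_clonotypes(repertoire_a, repertoire_b):
    """Sequences that occur in both repertoires, regardless of abundance."""
    return set(repertoire_a) & set(repertoire_b)


# Hypothetical repertoires: made-up sequence strings mapped to read counts.
sample_a = Counter({"CASSLGQ": 5, "CASSPDR": 5})
sample_b = Counter({"CASSLGQ": 1, "CASRTGE": 9})

diversity_a = shannon_diversity(sample_a)
diversity_b = shannon_diversity(sample_b)
shared = shared_clonotypes(sample_a, sample_b)
print(diversity_a, diversity_b, shared)
```

In this toy case the evenly distributed repertoire `sample_a` has the higher entropy, while `sample_b`, dominated by one clonotype, has the lower one.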

11.

Background  

Cluster analysis is an important technique for the exploratory analysis of biological data. Such data are often high-dimensional and inherently noisy, and contain outliers, which makes clustering challenging. Mixtures are versatile and powerful statistical models that cluster robustly in the presence of noise and have been successfully applied in a wide range of applications.

12.
13.

Background

The goal of DNA barcoding is to develop a species-specific sequence library for all eukaryotes. A 650 bp fragment of the cytochrome c oxidase 1 (CO1) gene has been used successfully for species-level identification in several animal groups. It may be difficult in practice, however, to retrieve a 650 bp fragment from archival specimens (because of DNA degradation) or from environmental samples (where universal primers are needed).

Results

We performed a bioinformatics analysis of all CO1 barcode sequences from GenBank and calculated the probability of obtaining species-specific barcodes for fragments of varied size. This analysis established the potential of much smaller fragments, mini-barcodes, for identifying unknown specimens. We then developed a universal primer set for the amplification of mini-barcodes. We further successfully tested the utility of this primer set on a comprehensive set of taxa from all major eukaryotic groups as well as archival specimens.

Conclusion

In this study we address the important issue of the minimum amount of sequence information required for identifying species in DNA barcoding. We establish a novel approach based on a much shorter barcode sequence and demonstrate its effectiveness in archival specimens. This approach will significantly broaden the application of DNA barcoding in biodiversity studies.
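
The idea of asking how often a fragment of a given length is still species-specific can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' GenBank analysis: it assumes one aligned sequence per species, a fixed fragment start position, and exact-match comparison. The function name, species labels, and sequences are hypothetical.

```python
def species_specific_fraction(barcodes, start, length):
    """Fraction of species whose fragment is not shared with any other species.

    barcodes maps a species label to one aligned sequence (an assumption of
    this sketch); the fragment is the slice [start, start + length).
    """
    owners_by_fragment = {}
    for species, sequence in barcodes.items():
        fragment = sequence[start:start + length]
        owners_by_fragment.setdefault(fragment, set()).add(species)
    unique = sum(1 for owners in owners_by_fragment.values() if len(owners) == 1)
    return unique / len(barcodes)


# Hypothetical toy sequences, far shorter than a real 650 bp CO1 barcode.
barcodes = {
    "species_1": "ACGTACGT",
    "species_2": "ACGTTCGT",
    "species_3": "ACGAACGA",
}

fractions = {
    length: species_specific_fraction(barcodes, 0, length) for length in (3, 4, 5)
}
for length, fraction in fractions.items():
    print(length, fraction)
```

On the toy data the fraction of distinguishable species grows with fragment length, which is the trade-off the abstract examines when comparing mini-barcodes with the full-length fragment.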

14.
Suppression subtractive hybridization (SSH) is a widely used technique for the identification of differentially expressed genes. SSH as well as other types of sequencing projects generate large amounts of anonymous sequences. SSHSuite automates the handling and storage of these sequences and enables identification through similarity searches. SSHSuite also offers analysis tools for the retrieval and comparison of the resulting similarity data. SSHSuite consists of four programs: SSHHandler, SSHOverview, SSHAnalysis, and SSHCompare.

15.
16.

Background  

Biological imaging is an emerging field, covering a wide range of applications in biological and clinical research. However, while machinery for automated experimentation and data acquisition has been developing rapidly in recent years, automated image analysis often introduces a bottleneck in high content screening.

17.
Summary: The Gifa program is designed for processing, displaying and analysing 1D, 2D and 3D NMR data sets. It has been constructed in a modular fashion, based on three independent modules: a set of commands that perform all the basic processing operations such as apodisation functions, a complete set of Fourier Transforms, phasing and baseline correction, peak-picking and line fitting, linear prediction and maximum entropy processing; a set of command language primitives that permit the execution of complex macro commands; and a set of graphic commands for building a complete graphic user interface, allowing the user to interact easily with the program. We have tried to create a versatile program that can be easily extended according to the user's requirements and that suits novice as well as experienced users. The program runs on any UNIX computer, with or without graphic display, in interactive or batch mode.

18.
Substantial time savings in the collection of multidimensional NMR data can be achieved by coupling the evolution of nuclei in the indirect dimensions. In order to save time, the sampling of the indirect dimensions is inherently incomplete. Therefore, many algorithms and sampling schemes have been developed that aim to separate the coevolved frequencies into analyzable data with limited artifacts. This paper extends the use of circulant matrices to describe coupled evolution with convolutions. By understanding the data in terms of convolutions, there is an exact solution to the inversion problem of extracting the orthogonal vectors from the coupled dimensions. Previously, this inversion problem has been solved using peak coordinates extracted from spectra. In contrast, the method described here uses spectra directly. This solution suggests a simple sampling scheme of collecting N orthogonal spectra and N + 1 projections at specific projection angles; however, the theory developed can be extended generally to arbitrary projection angles. The circulant matrix methodology is demonstrated for simulated and real data. Further, an algorithm for separating overlapped signals in the detected dimension is presented. The algorithm involves the forward calculation of the coupled spectra from the orthogonal spectra, followed by back calculation of the orthogonal spectra from the coupled spectra, thus permitting rigorous cross-validation. This algorithm is shown to be robust in that erroneous solutions give rise to large artifacts. Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.
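
The link between circulant matrices and convolutions that this abstract relies on is a standard identity: multiplying a vector by a circulant matrix gives the circular convolution of the vector with the matrix's first column. The Python sketch below checks that identity on small integer vectors. It does not implement the paper's inversion or signal-separation algorithm; all function names and the example vectors are assumptions made for illustration.

```python
def circulant(first_column):
    """Build the circulant matrix whose first column is first_column."""
    n = len(first_column)
    return [[first_column[(i - j) % n] for j in range(n)] for i in range(n)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def circular_convolution(kernel, signal):
    """Circular convolution of two equal-length sequences."""
    n = len(signal)
    return [
        sum(kernel[(i - j) % n] * signal[j] for j in range(n)) for i in range(n)
    ]


# Hypothetical small example vectors.
kernel = [1, 2, 0, 0]
signal = [3, 0, 1, 0]

via_matrix = matvec(circulant(kernel), signal)
via_convolution = circular_convolution(kernel, signal)
print(via_matrix, via_convolution)
```

Because the two results coincide, a problem stated as a convolution can be rewritten as a linear system with a circulant matrix, which is the reformulation the abstract describes.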

19.

Background

This paper is devoted to distance measures for leaf-labelled trees on a free leafset. A leaf-labelled tree is a special type of tree in which only the leaf (terminal) nodes are labelled. This data structure is used in bioinformatics for modelling the evolutionary history of genes and species, and in linguistics for modelling the evolutionary history of languages. Many domain-specific problems arise that need to be solved with the help of tree postprocessing techniques such as distance measures.

Results

Here we introduce the tree edit distance designed for leaf-labelled trees on a free leafset, which turns out to be a metric. It is presented together with the notion of a tree edit consensus tree. We provide a statistical evaluation of the proposed measure, using the R-F, MAST and frequent subsplit based dissimilarity measures as reference measures.

Conclusions

The tree edit distance was proven to be a metric and has the advantage of using different costs for contraction and pruning, so its properties can be tuned depending on the needs of the user. Two of the presented methods have the most interesting properties. E(3,1) is very discriminative (having a wide range of values) and has a very regular distance distribution that is similar in shape to a normal distribution and works well for both similar and non-similar trees. NFC(2,1), on the other hand, is proportional or nearly proportional to the number of mutation operations used, irrespective of their type.
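
For orientation, the R-F (Robinson-Foulds) reference measure named in the Results can be sketched in Python. This is not the tree edit distance introduced in the paper. The sketch uses the rooted, cluster-based form of the Robinson-Foulds distance and assumes that both trees share the same leaf set, whereas the paper addresses trees on a free leafset. The nested-tuple tree encoding and the function names are assumptions made for illustration.

```python
def clusters(tree):
    """Return the non-trivial clusters (leaf sets of internal nodes) of a rooted tree.

    A tree is a nested tuple; anything that is not a tuple is a leaf label.
    """
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        leaf_set = frozenset().union(*(leaves(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    all_leaves = leaves(tree)
    # The root cluster is common to every tree on this leaf set, so drop it.
    found.discard(all_leaves)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of clusters present in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


# Hypothetical four-leaf trees.
tree_a = (("A", "B"), ("C", "D"))
tree_b = (("A", "C"), ("B", "D"))
tree_c = ((("A", "B"), "C"), "D")

distance_ab = robinson_foulds(tree_a, tree_b)
distance_ac = robinson_foulds(tree_a, tree_c)
print(distance_ab, distance_ac)
```

This measure counts every differing cluster equally, in contrast to the tree edit distance described in the abstract, which assigns separate, tunable costs to contraction and pruning.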

20.
SUMMARY: Gene Ontology (GO) annotations have become a major tool for analysis of genome-scale experiments. We have created OntologyTraverser, an R package for GO analysis of gene lists. Our system is a major advance over previous work because (1) the system can be installed as an R package, (2) the system uses Java to instantiate the GO structure and the SJava system to integrate R and Java and (3) the system is also deployed as a publicly available web tool. AVAILABILITY: Our software is academically available through http://franklin.imgen.bcm.tmc.edu/OntologyTraverser/. Both the R package and the web tool are accessible. CONTACT: cashaw@bcm.tmc.edu
