Similar Documents
1.
DNA Data Bank of Japan at work on genome sequence data.
We at the DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) have recently begun receiving, processing and releasing EST and genome sequence data submitted by various Japanese genome projects. The data include those for human, Arabidopsis thaliana, rice, nematode, Synechocystis sp. and Escherichia coli. Since the quantity of data is very large, we organized teams to conduct preliminary discussions with project teams about data submission and handling for release to the public. We also developed a mass submission tool to cope with a large quantity of data. In addition, to provide genome data on the WWW, we developed a genome information system using Java. This system (http://mol.genes.nig.ac.jp/ecoli/) can in theory be used for any genome sequence data. These activities will facilitate processing of large quantities of EST and genome data.

2.

Background

Systems biology has embraced computational modeling in response to the quantitative nature and increasing scale of contemporary data sets. The onslaught of data is accelerating as molecular profiling technology evolves. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) is a community effort to catalyze discussion about the design, application, and assessment of systems biology models through annual reverse-engineering challenges.

Methodology and Principal Findings

We describe our assessments of the four challenges associated with the third DREAM conference which came to be known as the DREAM3 challenges: signaling cascade identification, signaling response prediction, gene expression prediction, and the DREAM3 in silico network challenge. The challenges, based on anonymized data sets, tested participants in network inference and prediction of measurements. Forty teams submitted 413 predicted networks and measurement test sets. Overall, a handful of best-performer teams were identified, while a majority of teams made predictions that were equivalent to random. Counterintuitively, combining the predictions of multiple teams (including the weaker teams) can in some cases improve predictive power beyond that of any single method.

Conclusions

DREAM provides valuable feedback to practitioners of systems biology modeling. Lessons learned from the predictions of the community provide much-needed context for interpreting claims of efficacy of algorithms described in the scientific literature.
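The counterintuitive finding in (2), that pooling predictions from many teams, including weak ones, can beat any single method, can be illustrated with a simple rank-aggregation sketch. The team scores below are invented for illustration and are not DREAM3 data; rank averaging is one common way such community ensembles are built.

```python
# Rank-average ensemble over hypothetical per-edge confidence scores
# from several teams. Edge names and scores are illustrative only.

def rank(scores):
    """Map each item to its rank (1 = highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {item: i + 1 for i, item in enumerate(ordered)}

def ensemble(predictions):
    """Average each item's rank across teams; a lower mean rank
    indicates a stronger consensus prediction."""
    mean_ranks = {}
    for team_scores in predictions:
        for item, r in rank(team_scores).items():
            mean_ranks.setdefault(item, []).append(r)
    return sorted(mean_ranks, key=lambda it: sum(mean_ranks[it]) / len(mean_ranks[it]))

teams = [
    {"A->B": 0.9, "B->C": 0.4, "A->C": 0.2},
    {"A->B": 0.6, "B->C": 0.7, "A->C": 0.1},
    {"A->B": 0.8, "B->C": 0.3, "A->C": 0.5},
]
print(ensemble(teams))  # edges ordered from strongest to weakest consensus
```

Even though the second team ranks A->B below B->C, the consensus still recovers A->B as the top edge, which is the intuition behind the "wisdom of crowds" effect the abstract reports.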

3.
Summary: Dynamic load is a technique which can be used to successfully secure large pedons in duplex profiles with massive clay B horizons. Data show that soil profiles obtained by this method exhibit minimal change in soil physical properties, including structure and compaction. The work reported here is part of that submitted as a requirement for the Ph.D. at the University of New England by the senior author.

4.
The Registry of Standard Biological Parts only accepts genetic parts compatible with the RFC 10 BioBrick format. This combined assembly and submission standard requires that four unique restriction enzyme sites must not occur in the DNA sequence encoding a part. We present evidence that this requirement places a nontrivial burden on iGEM teams developing large and novel parts. We further argue that the emergence of inexpensive DNA synthesis and versatile assembly methods reduces the utility of coupling submission and assembly standards and propose a submission standard that is compatible with current quality control strategies while nearly eliminating sequence constraints on submitted parts.

5.
There is a paucity of data in the literature concerning the validation of the grant application peer review process, which is used to help direct billions of dollars in research funds. Ultimately, this validation will hinge upon empirical data relating the output of funded projects to the predictions implicit in the overall scientific merit scores from the peer review of submitted applications. In an effort to address this need, the American Institute of Biological Sciences (AIBS) conducted a retrospective analysis of peer review data of 2,063 applications submitted to a particular research program and the bibliometric output of the resultant 227 funded projects over an 8-year period. Peer review scores associated with applications were found to be moderately correlated with the total time-adjusted citation output of funded projects, although a high degree of variability existed in the data. Analysis over time revealed that as average annual scores of all applications (both funded and unfunded) submitted to this program improved with time, the average annual citation output per application increased. Citation impact did not correlate with the amount of funds awarded per application or with the total annual programmatic budget. However, the number of funded applications per year was found to correlate well with total annual citation impact, suggesting that improving funding success rates by reducing the size of awards may be an efficient strategy to optimize the scientific impact of research program portfolios. This strategy must be weighed against the need for a balanced research portfolio and the inherent high costs of some areas of research. The relationship observed between peer review scores and bibliometric output lays the groundwork for establishing a model system for future prospective testing of the validity of peer review formats and procedures.

6.
MOTIVATION: The gap between the amount of newly submitted protein data and reliable functional annotation in public databases is growing. Traditional manual annotation by literature curation and sequence analysis tools, without the use of automated annotation systems, is not able to keep up with the ever-increasing quantity of data that is submitted. Automated supplements to manually curated databases such as TrEMBL or GenPept cover raw data but provide only limited annotation. To improve this situation, automatic tools are needed that support manual annotation, automatically increase the amount of reliable information and help to detect inconsistencies in manually generated annotations. RESULTS: A standard data mining algorithm was successfully applied to gain knowledge about the keyword annotation in SWISS-PROT. 11 306 rules were generated; they are provided in a database, can be applied to as-yet unannotated protein sequences, and can be viewed using a web browser. They rely on the taxonomy of the organism in which the protein was found and on signature matches of its sequence. Statistical evaluation of the generated rules by cross-validation suggests that, by applying them to arbitrary proteins, 33% of their keyword annotation can be generated with an error rate of 1.5%. The coverage of the keyword annotation can be increased to 60% by tolerating a higher error rate of 5%. AVAILABILITY: The results of the automatic data mining process can be browsed at http://golgi.ebi.ac.uk:8080/Spearmint/. Source code is available upon request. CONTACT: kretsch@ebi.ac.uk.
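The rule form described in (6), conditions on an organism's taxonomy plus sequence signature matches implying a keyword, together with the coverage/error trade-off controlled by a confidence threshold, can be sketched as follows. The rules, taxa, signature identifiers, and confidence values below are invented for the example; they are not actual mined SWISS-PROT rules.

```python
# Illustrative rule-based keyword annotation in the spirit of the
# SWISS-PROT mining described above. All rule data here is hypothetical.

RULES = [
    # (required taxon, required signature match, keyword, rule confidence)
    ("Bacteria", "PF00005", "ATP-binding", 0.99),
    ("Eukaryota", "PF00069", "Kinase", 0.97),
    ("Bacteria", "PF00072", "Two-component regulatory system", 0.93),
]

def annotate(protein, min_confidence=0.95):
    """Apply every rule whose taxon and signature both match and whose
    confidence clears the threshold. Raising the threshold trades
    coverage for a lower expected error rate, as in the abstract."""
    keywords = set()
    for taxon, signature, keyword, conf in RULES:
        if (conf >= min_confidence
                and taxon == protein["taxon"]
                and signature in protein["signatures"]):
            keywords.add(keyword)
    return keywords

p = {"taxon": "Bacteria", "signatures": {"PF00005", "PF00072"}}
print(annotate(p, min_confidence=0.95))  # strict: high precision, lower coverage
print(annotate(p, min_confidence=0.90))  # lenient: more keywords, more risk
```

Lowering `min_confidence` from 0.95 to 0.90 admits the third rule and adds a keyword, mirroring the 33%-at-1.5%-error versus 60%-at-5%-error trade-off reported above.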

7.
Time stretch imaging offers real-time image acquisition at millions of frames per second and subnanosecond shutter speed, and has enabled detection of rare cancer cells in blood with record throughput and specificity. An unintended consequence of high throughput image acquisition is the massive amount of digital data generated by the instrument. Here we report the first experimental demonstration of real-time optical image compression applied to time stretch imaging. By exploiting the sparsity of the image, we reduce the number of samples and the amount of data generated by the time stretch camera in our proof-of-concept experiments by about three times. Optical data compression addresses the big data predicament in such systems.

8.
General principles and organizational forms of epidemiological surveillance of plague in the USSR, both in seaports and in natural foci, are discussed. On the basis of an analysis of the authors' experience over many years, and taking published data into consideration, the authors recommend a minimal but, in their opinion, effective set of surveillance measures; for the successful realization of these recommendations, it is expedient to establish a special team (or teams) consisting of 10-12 members.

9.
Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data is essential for geoscience studies. However, the task is challenging for geoscientists because processing the massive amount of data is both computing- and data-intensive, in that data analytics requires complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. In this framework, techniques are proposed that leverage cloud computing, MapReduce, and Service-Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers; a MapReduce-based algorithm framework is developed to support parallel processing of geoscience data; and a service-oriented workflow architecture is built to support on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this innovative framework significantly improves the efficiency of big geoscience data analytics by reducing data processing time as well as simplifying analytical procedures for geoscientists.
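The MapReduce pattern that (9) builds its parallel-processing layer on can be shown in miniature. The sketch below simulates the three phases (map, shuffle, reduce) sequentially in plain Python; the record format and observation values are invented, and a real deployment would run the phases across distributed workers over HBase-resident data.

```python
# Toy map/shuffle/reduce pipeline over hypothetical geoscience records.
from collections import defaultdict

def map_phase(record):
    """Emit (region, temperature) key-value pairs from one observation."""
    yield (record["region"], record["temp_c"])

def reduce_phase(region, temps):
    """Aggregate all values for one key: here, the mean temperature."""
    return region, sum(temps) / len(temps)

def run(records):
    shuffled = defaultdict(list)           # shuffle: group values by key
    for rec in records:
        for key, value in map_phase(rec):
            shuffled[key].append(value)
    return dict(reduce_phase(k, v) for k, v in shuffled.items())

obs = [
    {"region": "arctic", "temp_c": -12.0},
    {"region": "arctic", "temp_c": -8.0},
    {"region": "tropics", "temp_c": 27.0},
]
print(run(obs))  # {'arctic': -10.0, 'tropics': 27.0}
```

Because the map phase touches each record independently and the reduce phase touches each key independently, both parallelize naturally, which is what makes the pattern suited to the massive gridded datasets the abstract describes.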

10.

Background  

Technological advances in high-throughput techniques and efficient data acquisition methods have resulted in a massive amount of life science data. The data are stored in numerous databases that have been established over the last decades and are essential resources for scientists today. However, the diversity of the databases and the underlying data models makes it difficult to combine this information when solving complex problems in systems biology. Currently, researchers typically have to browse several, often highly focused, databases to obtain the required information. Hence, there is a pressing need for more efficient systems for integrating, analyzing, and interpreting these data. The standardization and virtual consolidation of the databases is a major challenge; overcoming it would result in unified access to a variety of data sources.

11.
C. S. Bryan, A. F. DiSalvo. Sabouraudia 1979, 17(3):209-212
A patient with chronic granulocytic leukemia developed overwhelming histoplasmosis. During massive fungemia, 59% of peripheral blood neutrophils contained yeast forms. Disseminated intravascular coagulation occurred. Histoplasma capsulatum was isolated not only from the patient's tissues and urine, but also from a serum sample submitted to a reference laboratory for serological testing. The microorganism was demonstrated by specific immunofluorescent staining of peripheral blood films. We suggest that histoplasmosis deserves a definite place on the roster of "opportunistic fungi".

12.

Background  

With the current technological advances in high-throughput biology, the necessity to develop tools that help to analyse the massive amount of data being generated is evident. A powerful method of inspecting large-scale data sets is gene set enrichment analysis (GSEA), and investigation of protein structural features can guide the determination of the function of individual genes. However, a convenient tool that combines these two features to aid in high-throughput data analysis has not yet been developed. In order to fill this niche, we developed a user-friendly, web-based application, PhenoFam.

13.
The implementation of evidence-based treatments to deliver high-quality care is essential to meet the healthcare demands of aging populations. However, the sustainable application of recommended practice is difficult to achieve, and variable outcomes are well recognised. The NHS Institute for Innovation and Improvement Sustainability Model (SM) was designed to help healthcare teams recognise determinants of sustainability and take action to embed new practice in routine care. This article describes a formative evaluation of the application of the SM by the National Institute for Health Research Collaboration for Leadership in Applied Health Research and Care for Northwest London (CLAHRC NWL).

Data from project teams' responses to the SM and formal reviews were used to assess the acceptability of the SM and the extent to which it prompted teams to take action. Projects were classified as 'engaged,' 'partially engaged' and 'non-engaged.' Quarterly survey feedback data were used to explore reasons for variation in engagement. Score patterns were compared against formal review data, and a 'diversity of opinion' measure was derived to assess response variance over time.

Of the 19 teams, six were categorized as 'engaged,' six 'partially engaged,' and seven as 'non-engaged.' Twelve teams found the model acceptable to some extent. Diversity of opinion reduced over time. A minority of teams used the SM consistently to take action to promote sustainability, but for the majority SM use was sporadic. Feedback from some team members indicates difficulty in understanding and applying the model and negative views regarding its usefulness.

The SM is an important attempt to enable teams to systematically consider determinants of sustainability, provide timely data to assess progress, and prompt action to create conditions for sustained practice. Tools such as these need to be tested in healthcare settings to assess strengths and weaknesses, and findings disseminated to aid development.
This study indicates the SM provides a potentially useful approach to measuring teams' views on the likelihood of sustainability and prompting action. Securing engagement of teams with the SM was challenging, and redesign of elements may need to be considered. Capacity building and facilitation appear necessary for teams to effectively deploy the SM.

14.
In the last 10-15 years, many new technologies and approaches have been implemented in pharmaceutical industry research; these include high-throughput screening and combinatorial chemistry, which result in a rapidly growing amount of biological assay and structural data in corporate databases. Efficient use of the data from this growing data mountain is a key success factor: provide as much knowledge as possible as early as possible, and thereby enable research teams to make the best possible decision whenever that decision can be supported by stored data. Here, an approach started several years ago to extract as much information as possible from historical assay data stored in the corporate database is described. It will be shown how important careful preprocessing of the stored data is for enhancing its information content. Different possibilities for accessing and analyzing the preconditioned data are in place; some of these will be described in the examples.

15.
Modern high-throughput biotechnologies such as microarrays and next-generation sequencing produce a massive amount of information for each sample assayed. However, in a typical high-throughput experiment, only a limited amount of data is observed for each individual feature: the classical "large p, small n" problem. The Bayesian hierarchical model, capable of borrowing strength across features within the same dataset, has been recognized as an effective tool for analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical models, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that, in the Big Data era, large amounts of historical data are available and should be taken advantage of. Our strategy presents a new framework to enhance the Bayesian hierarchical model. Through simulation and real data analysis, we demonstrate the superior performance of the proposed strategy. Our new strategy also enables borrowing information across different platforms, which could be extremely useful with the emergence of new technologies and the accumulation of data from different platforms in the Big Data era. Our method has been implemented in the R package "adaptiveHM," which is freely available from https://github.com/benliemory/adaptiveHM.
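The shrinkage and over-correction behavior discussed in (15) can be seen in a minimal empirical-Bayes-style sketch: each feature's mean is blended with the grand mean, weighted by how many replicates the feature has. This is a generic illustration of shrinkage, not the adaptiveHM method itself, and the gene names and expression values are invented.

```python
# Minimal shrinkage sketch: per-feature means are pulled toward the
# overall mean; features with few observations are pulled hardest,
# which is exactly the over-correction risk for genuine outliers.

def shrink(feature_obs, prior_strength=4.0):
    """Shrunken mean = weighted blend of the feature mean and the
    grand mean, with weight n / (n + prior_strength)."""
    all_values = [x for obs in feature_obs.values() for x in obs]
    grand_mean = sum(all_values) / len(all_values)
    shrunk = {}
    for feat, obs in feature_obs.items():
        n = len(obs)
        feat_mean = sum(obs) / n
        w = n / (n + prior_strength)   # more replicates -> less shrinkage
        shrunk[feat] = w * feat_mean + (1 - w) * grand_mean
    return shrunk

data = {
    "gene_a": [1.0, 1.2],            # near the bulk of the data
    "gene_b": [0.8, 1.0, 1.1, 0.9],  # well-replicated, barely moved
    "gene_c": [5.0],                 # outlier with a single replicate
}
est = shrink(data)
# gene_c is pulled far below its observed 5.0 -- borrowing strength
# helps typical features but over-corrects this genuine outlier.
```

The abstract's proposal is, in essence, to use historical and cross-platform data to set the prior more adaptively than a fixed `prior_strength` does, so that features like `gene_c` are not over-corrected.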

16.
With the massive amount of sequence and structural data being produced, new avenues emerge for exploiting the information therein for applications in several fields. Fold distributions can be mapped onto entire genomes to learn about the nature of the protein universe, and many of the interactions between proteins can now be predicted solely on the basis of the genomic context of their genes. Furthermore, by mapping the new incoming data on single nucleotide polymorphisms onto three-dimensional structures of proteins, problems concerning population, medical and evolutionary genetics can be addressed.

17.
After the major achievements of the DNA sequencing projects, an equally important challenge now is to uncover the functional relationships among genes (i.e. gene networks). It has become increasingly clear that computational algorithms are crucial for extracting meaningful information from the massive amount of data generated by high-throughput genome-wide technologies. Here, we summarise how systems identification algorithms, originating from physics and control theory, have been adapted for use in biology. We also explain how experimental perturbations combined with genome-wide measurements are being used to uncover gene networks. Perturbation techniques could pave the way for identifying gene networks in more complex settings such as multifactorial diseases and for improving the efficacy of drug evaluation.

18.
The growing number of bike sharing systems (BSS) in many cities largely facilitates biking for transportation and recreation. Most recent bike sharing systems produce time- and location-specific data, which enables the study of the travel behavior and mobility of each individual. However, despite a rapid growth of interest, studies on massive bike sharing data and the underlying travel patterns are still limited. Few studies have explored and visualized spatiotemporal patterns of bike sharing behavior using flow clustering, or examined station functional profiles based on over-demand patterns. This study investigated spatiotemporal biking patterns in Chicago by analyzing massive BSS data from July to December in 2013 and 2014. The BSS in Chicago gained popularity over this period: about 15.9% more people subscribed to the service. Specifically, we constructed a bike flow similarity graph and used the fastgreedy algorithm to detect spatial communities of biking flows. Using the proposed methods, we discovered distinct travel patterns on weekdays and weekends, as well as different travel trends for customers and subscribers, from the noisy mass of data. In addition, we examined the temporal demand for bikes and docks using a hierarchical clustering method; the results demonstrate the modeled over-demand patterns in Chicago. This study offers better knowledge of biking flow patterns, which is difficult to obtain using traditional methods. Given the trend of increasing popularity of BSS and data openness in different cities, the methods used in this study can be extended to examine biking patterns and BSS functionality elsewhere.
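The hierarchical clustering of station demand profiles used in (18) can be sketched with a toy single-linkage agglomerative procedure: start with each station as its own cluster and repeatedly merge the two closest clusters. The station names and hourly demand vectors below are invented; the study itself clustered real Chicago BSS demand data.

```python
# Toy single-linkage agglomerative clustering of hypothetical hourly
# demand profiles, in the spirit of the station over-demand analysis.

def dist(a, b):
    """Euclidean distance between two demand profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cluster(profiles, n_clusters):
    """Greedily merge the two closest clusters (single linkage)
    until only n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return [sorted(c) for c in clusters]

stations = {
    "loop_1":    [2, 9, 3, 8],  # commuter peaks (morning/evening)
    "loop_2":    [1, 8, 2, 9],
    "lakefront": [5, 3, 6, 2],  # midday recreational demand
}
print(cluster(stations, 2))  # → [['loop_1', 'loop_2'], ['lakefront']]
```

The two commuter-profile stations merge first because their demand curves are closest, leaving the recreational station in its own cluster, which is the kind of functional station profiling the abstract describes.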

19.
Miller I, Eberini I, Gianazza E. Proteomics 2008, 8(23-24):5053-5073
This compilation recounts the efforts made to characterize the proteomes of lung tissues in health and disease and to recognize proteomic patterns of diseased states in patients' biological fluids/secretions and lavage fluids. A massive amount of primary data has not yet led to the identification of diagnostic proteomic signatures. The variability of proteomic findings associated with lung diseases suggests that a useful diagnostic index may eventually result only from the composite predictive values of a large panel of protein markers.

20.
Genomic and proteomic analyses generate a massive amount of data that requires specific bioinformatic tools for its management and interpretation. GARBAN II, developed from the previous GARBAN platform, provides an integrated framework to simultaneously analyse and compare multiple datasets from DNA microarrays and proteomic studies. The general architecture, gene classification and comparison, and graphical representation have been redesigned to ensure user-friendliness and to improve the capabilities and efficiency of the system. Additionally, GARBAN II has been extended with new applications that display networks of coexpressed genes and integrate access to BioRag and MotifScanner, so as to facilitate holistic analysis of users' data.
