Similar Documents
A total of 20 similar documents were retrieved.
1.
谢琛静  徐斯翀  潘琦  周莉  孙祖越 《生物磁学》2014,(14):2763-2768
The establishment and management of archives plays a very important role in both non-clinical drug safety evaluation and scientific research. A literature search shows that reviews of archive management in GLP facilities are scarce, and no article has systematically and comprehensively compared it with the management of scientific research archives. This article introduces the relevant regulations and practical experience of records and archive management in detail from three aspects: the physical facilities of the archive repository, archive management standards, and issues requiring attention, and then compares GLP archive management standards with scientific research archive management standards. Based on the Interim Provisions on the Management of Scientific and Technological Research Archives issued by the State Archives Administration of China, seven years of GLP archive management experience, and several on-site inspections for State Food and Drug Administration (SFDA) certification, we summarize the similarities and differences between the two in 16 aspects: functional implementation, physical facilities, temperature and humidity requirements, archive protection, development of SOPs, qualifications of archive managers, responsibilities of the personnel involved, scope of archiving, archiving formats, receipt and review of records, archiving deadlines, retention periods, rules for borrowing and returning materials, standards for written records, entry and exit logs, and preservation of electronic files, highlighting the characteristics and key points of GLP archive management. This in-depth and comprehensive comparison shows that GLP archive management is more explicit, specific, detailed, and operable.

3.
High-throughput sequencing assays are now routinely used to study different aspects of genome organization. As decreasing costs and widespread availability of sequencing enable more laboratories to use sequencing assays in their research projects, the number of samples and replicates in these experiments can quickly grow to several dozens of samples and thus require standardized annotation, storage and management of preprocessing steps. As a part of the STATegra project, we have developed an Experiment Management System (EMS) for high-throughput omics data that supports different types of sequencing-based assays, such as RNA-seq, ChIP-seq and Methyl-seq, as well as proteomics and metabolomics data. The STATegra EMS provides metadata annotation of experimental design, samples and processing pipelines, as well as storage of different types of data files, from raw data to ready-to-use measurements. The system has been developed to provide research laboratories with a freely available, integrated system that offers a simple and effective way for experiment annotation and tracking of analysis procedures.
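The kind of annotation the STATegra EMS captures can be pictured with a minimal sketch. The field names and tool versions below are hypothetical, chosen only to show how design, sample, pipeline, and file metadata might be grouped for one RNA-seq sample; they are not the actual EMS schema.

```python
import json

# Hypothetical annotation record for one RNA-seq sample; field names and
# tool versions are illustrative, not the actual STATegra EMS schema.
record = {
    "experiment": {"design": "time-course", "assay": "RNA-seq", "replicates": 3},
    "sample": {"id": "S01", "organism": "Mus musculus", "timepoint_h": 24},
    "pipeline": [
        {"step": "trimming", "tool": "Trimmomatic", "version": "0.39"},
        {"step": "alignment", "tool": "STAR", "version": "2.7.10a"},
        {"step": "quantification", "tool": "htseq-count", "version": "2.0.2"},
    ],
    "files": {"raw": "S01_R1.fastq.gz", "counts": "S01.counts.tsv"},
}

# Serializing the record keeps experimental design, processing steps and
# data files tracked together, which is the point of an EMS.
print(json.dumps(record, indent=2))
```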

4.
Summary: As random shotgun metagenomic projects proliferate and become the dominant source of publicly available sequence data, procedures for the best practices in their execution and analysis become increasingly important. Based on our experience at the Joint Genome Institute, we describe the chain of decisions accompanying a metagenomic project from the viewpoint of the bioinformatic analysis step by step. We guide the reader through a standard workflow for a metagenomic project beginning with presequencing considerations such as community composition and sequence data type that will greatly influence downstream analyses. We proceed with recommendations for sampling and data generation including sample and metadata collection, community profiling, construction of shotgun libraries, and sequencing strategies. We then discuss the application of generic sequence processing steps (read preprocessing, assembly, and gene prediction and annotation) to metagenomic data sets in contrast to genome projects. Different types of data analyses particular to metagenomes are then presented, including binning, dominant population analysis, and gene-centric analysis. Finally, data management issues are presented and discussed. We hope that this review will assist bioinformaticians and biologists in making better-informed decisions on their journey during a metagenomic project.
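One of the metagenome-specific analyses listed above, binning, can be illustrated with a deliberately simple sketch: grouping assembled contigs by GC content. Real binners combine k-mer composition with read coverage; the contigs and thresholds here are made up purely for demonstration.

```python
# Toy composition-based binning: group contigs by GC content. Real
# metagenomic binners combine composition with coverage; the sequences
# and thresholds below are made up for demonstration.
def gc_content(seq: str) -> float:
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

contigs = {
    "contig_1": "ATATATTTAAATATATATTA",  # low GC
    "contig_2": "GCGCGGCCGCGGGCGCCGCG",  # high GC
    "contig_3": "ATGCGATCGTAGCTAGCTAA",  # intermediate GC
}

bins = {"low_gc": [], "mid_gc": [], "high_gc": []}
for name, seq in contigs.items():
    gc = gc_content(seq)
    if gc < 0.35:
        bins["low_gc"].append(name)
    elif gc > 0.55:
        bins["high_gc"].append(name)
    else:
        bins["mid_gc"].append(name)

print(bins)  # {'low_gc': ['contig_1'], 'mid_gc': ['contig_3'], 'high_gc': ['contig_2']}
```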

5.
Associating phenotypic traits and quantitative trait loci (QTL) to causative regions of the underlying genome is a key goal in agricultural research. InterStoreDB is a suite of integrated databases designed to assist in this process. The individual databases are species independent and generic in design, providing access to curated datasets relating to plant populations, phenotypic traits, genetic maps, marker loci and QTL, with links to functional gene annotation and genomic sequence data. Each component database provides access to associated metadata, including data provenance and parameters used in analyses, thus providing users with information to evaluate the relative worth of any associations identified. The databases include CropStoreDB, for management of population, genetic map, QTL and trait measurement data, SeqStoreDB for sequence-related data and AlignStoreDB, which stores sequence alignment information and allows navigation between genetic and genomic datasets. Genetic maps are visualized and compared using the CMAP tool, and functional annotation from sequenced genomes is provided via an EnsEMBL-based genome browser. This framework facilitates navigation of the multiple biological domains involved in genetics and genomics research in a transparent manner within a single portal. We demonstrate the value of InterStoreDB as a tool for Brassica research. InterStoreDB is available from: http://www.interstoredb.org

6.
Progress made in applying agent systems to molecular computational biology is reviewed and strategies by which to exploit agent technology to greater advantage are investigated. Communities of software agents could play an important role in helping genome scientists design reagents for future research. The advent of genome sequencing in cattle and swine increases the complexity of data analysis required to conduct research in livestock genomics. Databases are always expanding and semantic differences among data are common. Agent platforms have been developed to deal with generic issues such as agent communication, life cycle management and advertisement of services (white and yellow pages). This frees computational biologists from the drudgery of having to re-invent the wheel on these common chores, giving them more time to focus on biology and bioinformatics. Agent platforms that comply with the Foundation for Intelligent Physical Agents (FIPA) standards are able to interoperate. In other words, agents developed on different platforms can communicate and cooperate with one another if domain-specific higher-level communication protocol details are agreed upon between different agent developers. Many software agent platforms are peer-to-peer, which means that even if some of the agents and data repositories are temporarily unavailable, a subset of the goals of the system can still be met. Past use of software agents in bioinformatics indicates that an agent approach should prove fruitful. Examination of current problems in bioinformatics indicates that existing agent platforms should be adaptable to novel situations.

7.

Background  

New "next generation" DNA sequencing technologies offer individual researchers the ability to rapidly generate large amounts of genome sequence data at dramatically reduced costs. As a result, a need has arisen for new software tools for storage, management and analysis of genome sequence data. Although bioinformatic tools are available for the analysis and management of genome sequences, limitations still remain. For example, restrictions on the submission of data and use of these tools may be imposed, thereby making them unsuitable for sequencing projects that need to remain in-house or proprietary during their initial stages. Furthermore, the availability and use of next generation sequencing in industrial, governmental and academic environments requires biologist to have access to computational support for the curation and analysis of the data generated; however, this type of support is not always immediately available.  相似文献   

8.
The decision to use 10% neutral buffered formalin fixed, paraffin embedded (FFPE) archival pathology material may be dictated by the cancer research question or analytical technique, or may be governed by national ethical, legal and social implications (ELSI), biobank, and sample availability and access policy. Biobanked samples of common tumors are likely to be available, but not all samples will be annotated with treatment and outcomes data and this may limit their application. Tumors that are rare or very small exist mostly in FFPE pathology archives. Pathology departments worldwide contain millions of FFPE archival samples, but there are challenges to availability. Pathology departments lack resources for retrieving materials for research or for having pathologists select precise areas in paraffin blocks, a critical quality control step. When samples must be sourced from several pathology departments, different fixation and tissue processing approaches create variability in quality. Researchers must decide what sample quality and quality tolerance fit their specific purpose and whether sample enrichment is required. Recent publications report variable success with techniques modified to examine all common species of molecular targets in FFPE samples. Rigorous quality management may be particularly important in sample preparation for next generation sequencing and for optimizing the quality of extracted proteins for proteomics studies. Unpredictable failures, including unpublished ones, likely are related to pre-analytical factors, unstable molecular targets, biological and clinical sampling factors associated with specific tissue types or suboptimal quality management of pathology archives. Reproducible results depend on adherence to pre-analytical phase standards for molecular in vitro diagnostic analyses for DNA, RNA and in particular, extracted proteins. With continuing adaptations of techniques for application to FFPE, the potential to acquire much larger numbers of FFPE samples and the greater convenience of using FFPE in assays for precision medicine, the choice of material in the future will become increasingly biased toward FFPE samples from pathology archives. Recognition that FFPE samples may harbor greater variation in quality than frozen samples for several reasons, including variations in fixation and tissue processing, requires that FFPE results be validated, provided a cohort of frozen tissue samples is available.

9.
张源笙  夏琳  桑健  李漫  刘琳  李萌伟  牛广艺  曹佳宝  滕徐菲  周晴  章张 《遗传》2018,40(11):1039-1043
Multi-omics data for life and health are an important foundation for life science research and the development of biomedical technology. However, China lacks platforms for managing and sharing biological data, which not only fails to meet the growing research needs of biomedicine and related disciplines in China, but also severely constrains the integration, sharing, and translational use of China's biological big data. In view of this, the Beijing Institute of Genomics, Chinese Academy of Sciences, established the BIG Data Center (BIGD) in early 2016 to build a big-data management platform and a multi-omics data resource system centered on national population health and important strategic biological resources. This article focuses on BIGD's life and health big-data resource system, which mainly comprises an archive for raw omics data, a genome database, a genome variation database, a gene expression database, a DNA methylation database, a bioinformatics tools library, and a life-science wiki knowledge base. BIGD provides services for the submission, integration, and sharing of biological big data, laying an important foundation for promoting life science data management in China and for building a national bioinformatics center.

10.
Data sharing by scientists: practices and perceptions

Background

Scientific research in the 21st century is more data intensive and collaborative than in the past. It is important to study the data practices of researchers – data accessibility, discovery, re-use, preservation and, particularly, data sharing. Data sharing is a valuable part of the scientific method allowing for verification of results and extending research from prior results.

Methodology/Principal Findings

A total of 1329 scientists participated in this survey exploring current data sharing practices and perceptions of the barriers and enablers of data sharing. Scientists do not make their data electronically available to others for various reasons, including insufficient time and lack of funding. Most respondents are satisfied with their current processes for the initial and short-term parts of the data or research lifecycle (collecting their research data; searching for, describing or cataloging, analyzing, and short-term storage of their data) but are not satisfied with long-term data preservation. Many organizations do not provide support to their researchers for data management in either the short or the long term. If certain conditions are met (such as formal citation and sharing of reprints), respondents agree they are willing to share their data. There are also significant differences in data management practices and approaches based on primary funding agency, subject discipline, age, work focus, and world region.

Conclusions/Significance

Barriers to effective data sharing and preservation are deeply rooted in the practices and culture of the research process as well as the researchers themselves. New mandates for data management plans from NSF and other federal agencies and worldwide attention to the need to share and preserve data could lead to changes. Large-scale programs, such as the NSF-sponsored DataNET (including projects like DataONE), will both bring attention and resources to the issue and make it easier for scientists to apply sound data management principles.

11.
Comparative analysis of tree-ring gray values and tree-ring densities and their responses to climatic factors
By comparing the characteristic parameters, chronology curves, and correlation coefficients in the full-, high-, and low-frequency domains of five tree-ring gray-value chronologies of Schrenk spruce from the Aikendaban sampling site in the Gongnaisi region of Xinjiang with their four corresponding density chronologies, we found that variations in mean earlywood and mean latewood gray values track variations in mean earlywood and mean latewood densities reasonably well, whereas variations in maximum and minimum ring gray values reflect minimum and maximum ring densities poorly. Correlation analysis with meteorological records from this region shows that the mean maximum temperature from May to August of the current year correlates best with the mean ring gray-value chronology and has a clear tree-physiological interpretation, with a maximum single correlation coefficient of -0.542 (P < 0.0001, n = 51). These results demonstrate the potential of tree-ring gray values for studying climate change over historical periods and lay a foundation for future gray-value-based reconstructions of historical climate in this region.
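The reported value of -0.542 is a Pearson correlation between a gray-value chronology and a temperature series. A minimal sketch of that computation, using short made-up series in place of the actual 51-year chronologies:

```python
import statistics

# Made-up stand-ins for a mean ring gray-value chronology and the
# corresponding May-August mean maximum temperature series (the study
# itself used n = 51 years).
gray = [0.52, 0.48, 0.61, 0.55, 0.43, 0.58, 0.50, 0.47]
temp = [24.1, 25.3, 22.0, 23.5, 26.2, 22.8, 24.6, 25.0]

def pearson_r(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# A negative r mirrors the study's finding: warmer May-August maxima
# coincide with lower mean ring gray values.
print(round(pearson_r(gray, temp), 3))
```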

12.
Genome data are becoming increasingly important for modern medicine. As the rate of increase in DNA sequencing outstrips the rate of increase in disk storage capacity, the storage and transfer of large genome data sets are becoming important concerns for biomedical researchers. We propose a two-pass lossless genome compression algorithm, which highlights the synthesis of complementary contextual models, to improve compression performance. The proposed framework can handle genome compression with and without reference sequences, and demonstrated performance advantages over the best existing algorithms. The method for reference-free compression led to bit rates of 1.720 and 1.838 bits per base for bacteria and yeast, which were approximately 3.7% and 2.6% better than the state-of-the-art algorithms. Regarding performance with reference, we tested on the first Korean personal genome sequence data set, and our proposed method demonstrated a 189-fold compression rate, reducing the raw file size from 2986.8 MB to 15.8 MB at a decompression cost comparable to existing algorithms. DNAcompact is freely available at https://sourceforge.net/projects/dnacompact/ for research purposes.
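The headline figures can be sanity-checked with simple arithmetic: the 189-fold rate is the ratio of raw to compressed size, and bits per base is compressed size in bits divided by sequence length. The MB values below come from the abstract; the base count is a hypothetical stand-in, since the abstract does not give one.

```python
# Sanity check of the reported compression figures.
raw_mb, compressed_mb = 2986.8, 15.8
print(f"compression factor: {raw_mb / compressed_mb:.0f}x")  # ~189x

# Bits per base = compressed bits / number of bases; the base count here
# is a hypothetical stand-in, not a value from the paper.
compressed_bits = compressed_mb * 1024 * 1024 * 8
n_bases = 3.0e9
print(f"bits per base: {compressed_bits / n_bases:.3f}")
```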

13.
《Genomics》2019,111(3):441-449
The Mongolian gerbil (Meriones unguiculatus) is a member of the rodent family that displays several features not found in mice or rats, including sensory specializations and social patterns more similar to those in humans. These features have made gerbils a valuable animal for research studies of auditory and visual processing, brain development, learning and memory, and neurological disorders. Here, we report the whole gerbil annotated genome sequence, and identify important similarities and differences to the human and mouse genomes. We further analyze the chromosomal structure of eight genes with high relevance for controlling neural signaling and demonstrate a high degree of homology between these genes in mouse and gerbil. This homology increases the likelihood that individual genes can be rapidly identified in gerbil and used for genetic manipulations. The availability of the gerbil genome provides a foundation for advancing our knowledge towards understanding evolution, behavior and neural function in mammals.
Accession number: The Whole Genome Shotgun sequence data from this project has been deposited at DDBJ/ENA/GenBank under the accession NHTI00000000. The version described in this paper is version NHTI01000000. The fragment reads and mate pair reads have been deposited in the Sequence Read Archive under BioSample accession SAMN06897401.

14.

Background  

Whole genome shotgun sequencing produces increasingly higher coverage of a genome with random sequence reads. Progressive whole genome assembly and eventual finishing sequencing is a process that typically takes several years for large eukaryotic genomes. In the interim, all sequence reads of public sequencing projects are made available in repositories such as the NCBI Trace Archive. For a particular locus, sequencing coverage may be high enough early on to produce a reliable local genome assembly. We have developed software, Tracembler, that facilitates in silico chromosome walking by recursively assembling reads of a selected species from the NCBI Trace Archive starting with reads that significantly match sequence seeds supplied by the user.
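Tracembler's core idea, recursive recruitment of reads that match a growing set of seeds, can be sketched abstractly. The read list and the naive shared-substring test below are toy stand-ins for querying the NCBI Trace Archive and for real alignment-based matching.

```python
# Toy sketch of recursive read recruitment ("in silico chromosome
# walking"): start from a seed, pull in reads that overlap anything
# recruited so far, and repeat until nothing new matches. The shared
# 8-mer check is a naive stand-in for a real aligner.
MIN_OVERLAP = 8

def overlaps(a: str, b: str, k: int = MIN_OVERLAP) -> bool:
    return any(a[i:i + k] in b for i in range(len(a) - k + 1))

reads = [  # stand-in for a trace repository
    "AAGGCCTTAAGGCTTA",
    "TTAAGGCTTACCGGATTA",
    "CCGGATTACATTGGCC",
    "TTTTTGGGGGCCCCCA",  # unrelated read, never recruited
]

recruited = set()
frontier = ["AAGGCCTTAAGG"]  # user-supplied seed
while frontier:
    query = frontier.pop()
    for read in reads:
        if read not in recruited and overlaps(query, read):
            recruited.add(read)
            frontier.append(read)  # recurse: new reads become queries

print(len(recruited), "reads recruited")  # 3
```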

15.
The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function and even anthropogenic activities such as the widespread use of antimicrobials. However, these data consist of genomes assembled with different tools and levels of quality checking, and of large volumes of completely unprocessed raw sequence data. In both cases, considerable computational effort is required before biological questions can be addressed. Here, we assembled and characterised 661,405 bacterial genomes retrieved from the European Nucleotide Archive (ENA) in November of 2018 using a uniform standardised approach. Of these, 311,006 did not previously have an assembly. We produced a searchable COmpact Bit-sliced Signature (COBS) index, facilitating the easy interrogation of the entire dataset for a specific sequence (e.g., gene, mutation, or plasmid). Additional MinHash and pp-sketch indices support genome-wide comparisons and estimations of genomic distance. Combined, this resource will allow data to be easily subset and searched, phylogenetic relationships between genomes to be quickly elucidated, and hypotheses rapidly generated and tested. We believe that this combination of uniform processing and variety of search/filter functionalities will make this a resource of very wide utility. In terms of diversity within the data, a breakdown of the 639,981 high-quality genomes emphasised the uneven species composition of the ENA/public databases, with just 20 of the total 2,336 species making up 90% of the genomes. The overrepresented species tend to be acute/common human pathogens, aligning with research priorities at different levels from individual interests to funding bodies and national and global public health agencies.

This study presents the first uniformly assembled, comprehensively described and searchable dataset of 661,405 bacterial genomes; this resource will empower more scientists to harness the multitude of data in public sequencing archives, but also reveals the biased composition of these archives, with 90% of the data originating from just 20 species.
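The MinHash index mentioned above is what makes fast genome-to-genome distance estimates possible at this scale. A self-contained illustration of the underlying idea, estimating Jaccard similarity from the smallest hash values of two k-mer sets (tools like Mash and pp-sketch add canonical k-mers, larger k, and a Jaccard-to-distance conversion):

```python
import hashlib

# Minimal MinHash: estimate Jaccard similarity of two k-mer sets by
# comparing their bottom-s hash sketches. Sequences are toy examples.
K, SKETCH_SIZE = 5, 20

def kmers(seq: str, k: int = K) -> set:
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def h(kmer: str) -> int:
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")

def sketch(seq: str) -> set:
    return set(sorted(h(km) for km in kmers(seq))[:SKETCH_SIZE])

g1 = "ACGTACGGTTACGATTACCGGATCGATTACG"
g2 = "ACGTACGGTTACGATTACCGGATCGATAACG"  # one substitution vs g1

s1, s2 = sketch(g1), sketch(g2)
bottom_union = set(sorted(s1 | s2)[:SKETCH_SIZE])
jaccard_est = len(bottom_union & s1 & s2) / len(bottom_union)
print(f"estimated Jaccard similarity: {jaccard_est:.2f}")
```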

16.
The Genome Warehouse (GWH) is a public repository housing genome assembly data for a wide range of species and delivering a series of web services for genome data submission, storage, release, and sharing. As one of the core resources in the National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB; https://ngdc.cncb.ac.cn), GWH accepts both full and partial (chloroplast, mitochondrion, and plasmid) genome sequences with different assembly levels, as well as an update of existing genome assemblies. For each assembly, GWH collects detailed genome-related metadata of biological project, biological sample, and genome assembly, in addition to genome sequence and annotation. To archive high-quality genome sequences and annotations, GWH is equipped with a uniform and standardized procedure for quality control. Besides basic browse and search functionalities, all released genome sequences and annotations can be visualized with JBrowse. By May 21, 2021, GWH has received 19,124 direct submissions covering a diversity of 1108 species and has released 8772 of them. Collectively, GWH serves as an important resource for genome-scale data management and provides free and publicly accessible data to support research activities throughout the world. GWH is publicly accessible at https://ngdc.cncb.ac.cn/gwh.

17.

Background  

The Trace Archive is a repository for the raw, unanalyzed data generated by large-scale genome sequencing projects. The existence of this data offers scientists the possibility of discovering additional genomic sequences beyond those originally sequenced. In particular, if the source DNA for a sequencing project came from a species that was colonized by another organism, then the project may yield substantial amounts of genomic DNA, including near-complete genomes, from the symbiotic or parasitic organism.

18.
Research data management (RDM) requires standards, policies, and guidelines. Findable, accessible, interoperable, and reusable (FAIR) data management is critical for sustainable research. Collaborative approaches to managing FAIR-structured data are therefore becoming increasingly important for long-term, sustainable RDM, yet they are applied rather hesitantly in bioengineering. One reason may be the interdisciplinary character of the field: bioengineering, as the application of principles of biology with the tools of process engineering, often has to satisfy several sets of criteria at once. RDM is further complicated by the fact that researchers from different scientific institutions must each meet the criteria of their home institution, which can lead to additional conflicts. Centrally provided general repositories that implement a collaborative approach and enable structured data storage from the outset can address this. In a biotechnology research network with over 20 tandem projects, it was demonstrated how FAIR RDM can be implemented through such a collaborative approach and the use of a common data structure, and the network's experience highlighted the importance of structure within a repository for keeping research data available throughout the entire data lifecycle.

19.
Data management systems are fast becoming required components in many biology laboratories as the role of computer-based information grows. Although the need for data management systems is on the rise, their inherent complexities can deter the full and routine use of their computational capabilities. The significant undertaking to implement a capable production system can be reduced in part by adapting an established data management system. In such a way, we are leveraging the Genomics Unified Schema (GUS) developed at the Computational Biology and Informatics Laboratory at the University of Pennsylvania as a foundation for managing and analysing DNA sequence data in centromere research projects around Arabidopsis thaliana and related species. Because GUS provides a core schema that includes support for genome sequences, mRNA and its expression, and annotated chromosomes, it is ideal for synthesising a variety of parameters to analyse these repetitive and highly dynamic portions of the genome. Despite this, production-strength data management frameworks are complex, requiring dedicated efforts to adapt and maintain. The work reported in this article addresses one component of such an effort, namely the pivotal task of marshalling data from various sources into GUS. In order to harness GUS for our project, and motivated by efficiency needs, we developed a structured framework for transferring data into GUS from outside sources. This technology is embodied in a GUS object-layer processor, XMLGUS. XMLGUS facilitates incorporating data into GUS by (i) formulating an XML interface that includes relational database key constraint definitions, (ii) regularising traversal through that XML, (iii) realising automatic processing of the XML with database key constraints and (iv) allowing for special processing of input data within the framework for automated processing. The application of XMLGUS to production pipeline processing for a sequencing project and inputting the Arabidopsis genome into GUS is discussed. XMLGUS is available from the Flora website (http://flora.ittc.ku.edu/).
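The loading task XMLGUS addresses, turning XML records into relational rows while honoring key constraints, can be shown with a minimal sketch. The XML layout and the two-table schema below are hypothetical stand-ins, not the GUS schema or XMLGUS's actual interface.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML input; the real GUS schema and XMLGUS interface differ.
doc = """
<submission>
  <sequence id="seq1" organism="Arabidopsis thaliana"/>
  <feature seq="seq1" type="centromere_repeat" start="100" end="2500"/>
</submission>
"""

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # enforce key constraints
con.execute("CREATE TABLE sequence (id TEXT PRIMARY KEY, organism TEXT)")
con.execute("CREATE TABLE feature (seq_id TEXT REFERENCES sequence(id), "
            "type TEXT, start_pos INTEGER, end_pos INTEGER)")

root = ET.fromstring(doc)
for el in root.iter("sequence"):
    con.execute("INSERT INTO sequence VALUES (?, ?)",
                (el.get("id"), el.get("organism")))
for el in root.iter("feature"):
    # This insert fails if the referenced sequence row is missing, which
    # is how key-constraint definitions guard automated loading.
    con.execute("INSERT INTO feature VALUES (?, ?, ?, ?)",
                (el.get("seq"), el.get("type"),
                 int(el.get("start")), int(el.get("end"))))

print(con.execute("SELECT * FROM feature").fetchall())
```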

20.
《TARGETS》2002,1(4):139-146
The pharmaceutical industry is facing the challenge of managing the exponential increase in volume, diversity and complexity of data generated by high-throughput technologies such as genome sequencing, gene-expression profiling, protein-expression profiling, metabolic profiling and high-throughput screening. These novel ‘genomics’ technologies are expected to reshape the approach of life science companies to research. Unfortunately, in many cases genomics technologies have been used uncritically, and some preliminary results have been disappointing. The lack of standardized data validation and quality assurance processes is recognized as one of the major hurdles for successfully implementing genomics technologies. This is particularly important for industrialized drug discovery processes, because more and more key conclusions and far-reaching decisions in the pharmaceutical industry are based on data that is generated automatically. Therefore, automated, specialized quality-control systems that can spot erroneous data that might obscure important biological effects are needed urgently. In this article, special emphasis is placed on DNA microarray technologies, a key genomics technology that suffers from severe problems with data quality. A generic, automatable data-quality-assurance workflow is discussed that will ultimately improve the quality of the drug candidates and, at the same time, reduce overall drug-development costs.
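An automated quality-control step of the kind argued for here can start very simply: flag arrays whose summary statistics are outliers within a batch before any downstream decision is made. A minimal sketch with made-up intensity values (production microarray QC examines many more diagnostics, such as spatial artifacts and spike-in controls):

```python
import statistics

# Made-up per-array mean intensities for one hybridization batch.
mean_intensity = {
    "array_01": 1040.2, "array_02": 998.7, "array_03": 1012.5,
    "array_04": 1025.9, "array_05": 431.8,  # suspiciously dim array
    "array_06": 1003.3,
}

values = list(mean_intensity.values())
mu = statistics.fmean(values)
sigma = statistics.stdev(values)

# Flag arrays more than 2 standard deviations from the batch mean so that
# erroneous data is caught before it drives automated decisions.
flagged = [name for name, v in mean_intensity.items() if abs(v - mu) > 2 * sigma]
print("flagged for review:", flagged)  # ['array_05']
```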
