Similar literature
20 similar documents found (search time: 31 ms)
1.
The analysis of multiple-indicator dilution curves to estimate the rates of transport of ions and substrates across the sarcolemma of myocardial cells requires the formulation of models for the blood-interstitial fluid-cell exchanges. The fitting of models to the sets of experimental data is dependent on acquiring a large enough data set, in one physiological state, that there is at least as much information in the data as there are unknown model parameters to be determined. Inasmuch as data are necessarily noisy, redundancy of data and overdetermination of the unknowns are highly desirable. Sensitivity functions are useful in demonstrating which portions of the data relate to which unknown parameters. They are also useful in adjusting model parameters to fit the model to the data, and therefore in parameter evaluation.

2.
Systematic data in the form of collections data are useful in biodiversity studies in many ways, most importantly because they serve as the only direct evidence of species distributions. However, collecting bias has been demonstrated for most areas of the world and has led some to propose methods that circumvent the need for collections data. New methods that model collections data in combination with abiotic data and predict potential total species distribution are examined using 25,111 records representing 5,123 species of plants and animals from Guyana; some methods use the reduced number of 320 species. These modeled species distributions are evaluated and potential high-priority biodiversity sites are selected based on the concept of irreplaceability, a measure of uniqueness. The major impediments to using collections data are the lack of data that are available in a useful format and the reluctance of most systematists to become involved in biodiversity and conservation research.

3.
Plant-macrofossil analysis is being increasingly used in Quaternary science, particularly palaeoecology and vegetation history. Although the techniques of macrofossil analysis are well-tried and relatively simple, the resulting data consisting of qualitative binary presences and absences, ordinal classes, and quantitative counts are not simple from the viewpoint of numerical data-analysis. This essay reviews the nature of macrofossil data and discusses the problem of zero and non-zero values. Problems in the presentation of macrofossil data are outlined and possible solutions are discussed. The handling of such data is discussed in terms of data summarisation, data analysis, and data interpretation. Newly developed numerical methods that take account of the mixed nature and the stratigraphical ordering of macrofossil data are outlined, such as (distance-based) multivariate regression trees, canonical analysis of principal coordinates, principal curves, cascade multivariate regression trees, and RLQ analysis. These and other techniques outlined have the potential to help exploit the full potential of macrofossil stratigraphical data in Quaternary palaeoecology.

4.
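Gene expression microarrays and nucleic acid sequence data play an important role in cancer research. Gene expression microarrays are widely used in medical research; their main advantages are sensitivity, speed, and low cost, while their drawback is that they can only be used to study known genes and cannot support the discovery of new genes or the study of variants. Nucleic acid sequence data have a clear advantage in these respects. Overall, both play a major role in cancer research. With the continuing development of precision medicine, in-depth study of these high-throughput data can help deepen understanding of the molecular mechanisms of cancer and thereby accelerate progress toward individualized treatment.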

5.
Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data is essential for geoscience studies. However, the tasks are challenging for geoscientists because processing the massive amount of data is both computing and data intensive in that data analytics requires complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. In this framework, techniques are proposed by leveraging cloud computing, MapReduce, and Service Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers. A MapReduce-based algorithm framework is developed to support parallel processing of geoscience data. A service-oriented workflow architecture is built for supporting on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this innovative framework significantly improves the efficiency of big geoscience data analytics by reducing the data processing time as well as simplifying data analytical procedures for geoscientists.

6.
Challenges and opportunities in proteomics data analysis   Cited by: 2 (self-citations: 0, citations by others: 2)
Accurate, consistent, and transparent data processing and analysis are integral and critical parts of proteomics workflows in general and for biomarker discovery in particular. Definition of common standards for data representation and analysis and the creation of data repositories are essential to compare, exchange, and share data within the community. Current issues in data processing, analysis, and validation are discussed together with opportunities for improving the process in the future and for defining alternative workflows.

7.
There is a concerted global effort to digitize biodiversity occurrence data from herbarium and museum collections that together offer an unparalleled archive of life on Earth over the past few centuries. The Global Biodiversity Information Facility provides the largest single gateway to these data. Since 2004 it has provided a single point of access to specimen data from databases of biological surveys and collections. Biologists now have rapid access to more than 120 million observations, for use in many biological analyses. We investigate the quality and coverage of data digitally available, from the perspective of a biologist seeking distribution data for spatial analysis on a global scale. We present an example of automatic verification of geographic data using distributions from the International Legume Database and Information Service to test empirically issues of geographic coverage and accuracy. There are over half a million records covering 31% of all Legume species, and 84% of these records pass geographic validation. These data are not yet a global biodiversity resource for all species, or all countries. A user will encounter many biases and gaps in these data which should be understood before data are used or analyzed. The data are notably deficient in many of the world's biodiversity hotspots. The deficiencies in data coverage can be resolved by an increased application of resources to digitize and publish data throughout these most diverse regions. But in the push to provide ever more data online, we should not forget that consistent data quality is of paramount importance if the data are to be useful in capturing a meaningful picture of life on Earth.

8.
Incorrect statistical methods are often used for the analysis of ordinal response data. Such data are frequently summarized into mean scores for comparisons, a fallacious practice because ordinal data are inherently not equidistant. The ubiquitous Pearson chi-square test is invalid because it ignores the ranking of ordinal data. Although some of the non-parametric statistical methods take into account the ordering of ordinal data, these methods do not accommodate statistical adjustment of confounding or assessment of effect modification, two overriding analytic goals in virtually all etiologic inference in biology and medicine. The cumulative logit model is eminently suitable for the analysis of ordinal response data. This multivariate method not only considers the ranked order inherent in ordinal response data, but it also allows adjustment of confounding and assessment of effect modification based on modest sample size. A non-technical account of the cumulative logit model is given and its applications are illustrated by two research examples. The SAS programs for the data analysis of the research examples are available from the author.
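
As an illustration of the cumulative logit model named in this abstract, the following minimal Python sketch turns a set of ordered thresholds and a linear predictor into category probabilities. It is not the author's SAS program. The function name `cumulative_logit_probs`, the parameter names, and the sign convention logit P(Y <= j) = threshold_j - eta are assumptions made for this example; some software uses the opposite sign for the predictor.

```python
import math


def cumulative_logit_probs(thresholds, eta):
    """Category probabilities under a cumulative logit (proportional odds) model.

    Assumed convention: logit P(Y <= j) = thresholds[j] - eta for the first
    K-1 of K ordered categories, where eta is the linear predictor x'beta.
    """
    if any(b <= a for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities P(Y <= j); the last category closes at 1.
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds]
    cumulative.append(1.0)
    # Differences of adjacent cumulative probabilities give P(Y = j).
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Three ordered categories, symmetric thresholds, baseline predictor of zero.
baseline = cumulative_logit_probs([-1.0, 1.0], eta=0.0)
# A larger predictor shifts probability toward the highest category.
shifted = cumulative_logit_probs([-1.0, 1.0], eta=2.0)
```

Because the model works on cumulative probabilities, the category order enters directly, which is the property the abstract contrasts with mean scores and the Pearson chi-square test.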

9.
Research on practices to share and reuse data will inform the design of infrastructure to support data collection, management, and discovery in the long tail of science and technology. These are research domains in which data tend to be local in character, minimally structured, and minimally documented. We report on a ten-year study of the Center for Embedded Network Sensing (CENS), a National Science Foundation Science and Technology Center. We found that CENS researchers are willing to share their data, but few are asked to do so, and in only a few domain areas do their funders or journals require them to deposit data. Few repositories exist to accept data in CENS research areas. Data sharing tends to occur only through interpersonal exchanges. CENS researchers obtain data from repositories, and occasionally from registries and individuals, to provide context, calibration, or other forms of background for their studies. Neither CENS researchers nor those who request access to CENS data appear to use external data for primary research questions or for replication of studies. CENS researchers are willing to share data if they receive credit and retain first rights to publish their results. Practices of releasing, sharing, and reusing of data in CENS reaffirm the gift culture of scholarship, in which goods are bartered between trusted colleagues rather than treated as commodities.

10.
The completion of the Arabidopsis genome and the large collections of other plant sequences generated in recent years have sparked extensive functional genomics efforts. However, the utilization of this data is inefficient, as data sources are distributed and heterogeneous and efforts at data integration are lagging behind. PlaNet aims to overcome the limitations of individual efforts as well as the limitations of heterogeneous, independent data collections. PlaNet is a distributed effort among European bioinformatics groups and plant molecular biologists to establish a comprehensive integrated database in a collaborative network. Objectives are the implementation of infrastructure and data sources to capture plant genomic information into a comprehensive, integrated platform. This will facilitate the systematic exploration of Arabidopsis and other plants. New methods for data exchange, database integration and access are being developed to create a highly integrated, federated data resource for research. The connection between the individual resources is realized with BioMOBY. BioMOBY provides an architecture for the discovery and distribution of biological data through web services. While knowledge is centralized, data is maintained at its primary source without a need for warehousing. To standardize nomenclature and data representation, ontologies and generic data models are defined in interaction with the relevant communities. Minimal data models should make it simple to allow broad integration, while inheritance allows detail and depth to be added to more complex data objects without losing integration. To allow expert annotation and keep databases curated, local and remote annotation interfaces are provided. Easy and direct access to all data is key to the project.

11.
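With the rapid development of biological sequencing technology, massive amounts of biological data have accumulated. Biological data resources are the core and source of biological analysis, research, and application; to ensure the correctness, usability, and security of the data, standardized management of biological data resources is both important and urgent. This paper reviews progress in the development of biological data standards in China and abroad. At present there is no overall plan for biological data either in China or abroad: biological data semantics show extensive incompatibility, data formats vary widely, and unified standards are lacking for the collection, processing, storage, and sharing of biological data. Standardization of biological data is at an early stage worldwide, but experts in many countries are working on standards development. Finally, the paper discusses standards development in turn from the perspectives of biological data terminology; the collection, processing, and exchange of biological data resources; storage; biological database construction; and ethical norms for biological data, in the hope of providing a reference and basis for formulating biological data standards.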

12.
13.
Proposed standard for image cytometry data files   Cited by: 1 (self-citations: 0, citations by others: 1)
P Dean  L Mascio  D Ow  D Sudar  J Mullikin 《Cytometry》1990,11(5):561-569
A number of different types of computers running a variety of operating systems are presently used for the collection and analysis of image cytometry data. In order to facilitate the development of sharable data analysis programs, to allow for the transport of image cytometry data from one installation to another, and to provide a uniform and controlled means for including textual information in data files, this document describes a data storage format that is proposed as a standard for use in image cytometry. In this standard, data from an image measurement are stored in a minimum of two files. One file is written in ASCII to include information about the way the image data are written and optionally, information about the sample, experiment, equipment, etc. The image data are written separately into a binary file. This standard is proposed with the intention that it will be used internationally for the storage and handling of biomedical image cytometry data. The method of data storage described in this paper is similar to those methods published in American Association of Physicists in Medicine (AAPM) Report Number 10 and in ACR-NEMA Standards Publication Number 300-1985.
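
The two-file layout described in this abstract (an ASCII description file plus a separate binary image file) can be sketched in Python as follows. This is only an illustration of the general idea. The keyword names (`width`, `height`, `bits_per_pixel`, `byte_order`, `sample`), the `.hdr` and `.img` extensions, and the 16-bit little-endian pixel encoding are all hypothetical and are not taken from the proposed standard.

```python
import struct
from pathlib import Path


def write_image_pair(stem, pixels, width, height, sample=""):
    """Write an ASCII header file and a separate binary pixel file.

    All keyword names and file extensions here are hypothetical.
    """
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    header_path = Path(f"{stem}.hdr")
    data_path = Path(f"{stem}.img")
    header_lines = [
        f"width={width}",
        f"height={height}",
        "bits_per_pixel=16",   # assumed encoding for this sketch
        "byte_order=little",   # assumed encoding for this sketch
        f"sample={sample}",    # optional free-text information
    ]
    header_path.write_text("\n".join(header_lines) + "\n", encoding="ascii")
    # Pixels are stored row by row as unsigned 16-bit little-endian integers.
    data_path.write_bytes(struct.pack(f"<{len(pixels)}H", *pixels))
    return header_path, data_path


def read_image_pair(header_path, data_path):
    """Read the header first, then use it to interpret the binary file."""
    text = Path(header_path).read_text(encoding="ascii")
    fields = dict(line.split("=", 1) for line in text.splitlines() if line)
    count = int(fields["width"]) * int(fields["height"])
    pixels = list(struct.unpack(f"<{count}H", Path(data_path).read_bytes()))
    return fields, pixels
```

Keeping the textual description separate from the raw pixels means a reader on any platform can inspect the header with a text editor before deciding how to decode the binary file, which matches the portability goal stated in the abstract.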

14.
Although most statistical methods for the analysis of longitudinal data have focused on retrospective models of association, new advances in mobile health data have presented opportunities for predicting future health status by leveraging an individual's behavioral history alongside data from similar patients. Methods that incorporate both individual-level and sample-level effects are critical to using these data to their full predictive capacity. Neural networks are powerful tools for prediction, but many assume input observations are independent even when they are clustered or correlated in some way, such as in longitudinal data. Generalized linear mixed models (GLMM) provide a flexible framework for modeling longitudinal data but have poor predictive power particularly when the data are highly nonlinear. We propose a generalized neural network mixed model that replaces the linear fixed effect in a GLMM with the output of a feed-forward neural network. The model simultaneously accounts for the correlation structure and complex nonlinear relationship between input variables and outcomes, and it utilizes the predictive power of neural networks. We apply this approach to predict depression and anxiety levels of schizophrenic patients using longitudinal data collected from passive smartphone sensor data.

15.
Functional understanding of signaling pathways requires detailed information about the constituent molecules and their interactions. Simulations of signaling pathways therefore build upon a great deal of data from various sources. We first survey electronic data resources for cell signaling modeling and then based on the type of data representation the data sources are broadly classified into five groups. None of the data sources surveyed provide all required data in a ready-to-be-modeled fashion. We then put forward a "wish list" for the desired attributes for an ideal modeling centric database. Finally, we close with perspectives on how electronic data sources for cell signaling modeling have developed. We suggest that future directions in such data sources are largely model-driven and are hinged on interoperability of data sources.

16.
The cancer classification problem is one of the most challenging problems in bioinformatics. The data provided by the Netherlands Cancer Institute consist of 295 breast cancer patients: 101 patients with distant metastases and 194 patients without distant metastases. Combinations of feature sets based on kernel methods are investigated for classifying patients with or without distant metastases. A single data set is compared with three data integration strategies and with weighted data integration strategies based on kernel methods. The Least Squares Support Vector Machine (LS-SVM) is chosen as the classifier because it can handle very high dimensional features, for instance, microarray data. The experimental results show that the performance of weighted late integration and that of using only microarray data are almost the same. The data integration strategy is not always better than using a single data set in this case. The performance of classification depends strongly on the features that are used to represent the object.

17.
This study makes use of three sources of data, morphology and two chloroplast DNA sequences, ndhF and rbcL, to resolve relationships in Gesneriaceae. Cladograms from each of the three data sets separately are not topologically congruent. Statistical indices suggest that each data set is congruent with the ndhF data although rbcL and morphology are themselves incongruent. Consensus methods provide no resolution of taxonomic relationships when trees from the different data sets are combined. Combining data sets generally results in cladograms that are more fully resolved than each of the data sets analyzed separately and support for the clades increases based on higher decay index and bootstrap values. These results indicate that there is a phylogenetic signal common to each of the data sets; however, the noise (errors due to homoplasy, mis-scoring, etc.) unique to each data source masks this signal. In combining the data, the evidence for the common evolutionary history in each data set overcomes the noise and is apparent in the resulting trees.

18.
Previous phylogenetic analyses of caecilian neuroanatomical data yield results that are difficult to reconcile with those based upon more traditional morphological and molecular data. A review of the literature reveals problems in both the analyses and the data upon which the analyses were based. Revision of the neuroanatomical data resolves some, but not all, of these problems and yields a data set that, based on comparative measures of data quality, appears to represent some improvement over previous treatments. An extended data set of more traditional primarily morphological data is developed to facilitate the evaluation of caecilian relationships and the quality and utility of neuroanatomical and more traditional data. Separate and combined analyses of the neuroanatomical and traditional data produce a variety of results dependent upon character weighting, with little congruence among the results of the separate analyses and little support for relationships among the ‘higher’ caecilians with the combined data. Randomization tests indicate that: (1) there is significantly less incompatibility within each data set than that expected by chance alone; (2) the between-data-set incompatibility is significantly greater than that expected for random partitions of characters so the two data sets are significantly heterogeneous; (3) the neuroanatomical data appear generally of lower quality than the traditional data; (4) the neuroanatomical data are more compatible with the traditional data than are phylogenetically uninformative data. The lower quality of the neuroanatomical data may reflect small sample sizes. In addition, a subset of the neuroanatomical characters supports an unconventional grouping of all those caecilians with the most rudimentary eyes, which may reflect concerted homoplasy.
Although the neuroanatomical data may be of lower quality than the traditional data, their compatibility with the traditional data suggests that they cannot be dismissed as phylogenetically meaningless. Conclusions on caecilian relationships are constrained by the conflict between the neuroanatomical and traditional data, the sensitivity of the combined analyses to weighting schemes, and by the limited support for the majority of groups in the majority of the analyses. Those hypotheses that are well supported are uncontroversial, although some have not been tested previously by numerical phylogenetic analyses. However, the data do not justify an hypothesis of ‘higher’ caecilian phylogeny that is both well resolved and well supported.

19.
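Common types of lifetime data and methods for constructing life tables   Cited by: 1 (self-citations: 0, citations by others: 1)
The life table is a useful tool for describing the mortality process of a population. This paper introduces four common types of lifetime data (complete lifetime data, right-censored data, left-censored data, and interval data), their characteristics, and the corresponding methods of analysis: the life-table method, the product-limit estimate, and the Turnbull estimate. The application characteristics of the life-table method and the product-limit estimate are compared, and a special type of lifetime data, truncated data, is also briefly introduced.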
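
As an illustration of the product-limit estimate mentioned in this abstract, the following minimal Python sketch computes a survival curve from right-censored lifetimes. It is not taken from the paper. The function name `product_limit` and its argument names are assumptions for this example, and it follows the usual convention that individuals censored at a given time are still counted as at risk for deaths at that same time.

```python
def product_limit(times, observed):
    """Product-limit (Kaplan-Meier) survival estimate for right-censored data.

    times[i] is the last time individual i was seen; observed[i] is True if
    death was observed at that time and False if the record is right-censored.
    Returns a list of (time, survival) pairs, one per distinct death time.
    """
    if len(times) != len(observed):
        raise ValueError("times and observed must have the same length")
    records = sorted(zip(times, observed))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = 0
        leaving = 0
        # Group all records sharing the same time.
        while i < len(records) and records[i][0] == t:
            if records[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored individuals both leave the risk set after t.
        at_risk -= leaving
    return curve


# Five individuals; the second record at time 3 and the one at time 8 are censored.
curve = product_limit([2, 3, 3, 5, 8], [True, True, False, True, False])
```

Censored records never lower the curve themselves; they only shrink the risk set used at later death times.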

20.
Data requirements and data sources for biodiversity priority area selection   Cited by: 9 (self-citations: 0, citations by others: 9)
The data needed to prioritize areas for biodiversity protection are records of biodiversity features (species, species assemblages, environmental classes) for each candidate area. Prioritizing areas means comparing candidate areas, so the data used to make such comparisons should be comparable in quality and quantity. Potential sources of suitable data include museums, herbariums and natural resource management agencies. Issues of data precision, accuracy and sampling bias in data sets from such sources are discussed and methods for treating data to minimize bias are reviewed.
