Similar literature
1.
The reliable estimation of animal location and its associated error is fundamental to animal ecology. There are many existing techniques for handling location error, but these are often ad hoc or are used in isolation from each other. In this study we present a Bayesian framework for determining location that uses all the data available, is flexible to all tagging techniques, and provides location estimates with built-in measures of uncertainty. Bayesian methods allow the contributions of multiple data sources to be decomposed into manageable components. We illustrate with two examples for two different location methods: satellite tracking and light level geo-location. We show that many of the problems with uncertainty involved are reduced and quantified by our approach. This approach can use any available information, such as existing knowledge of the animal's potential range, light levels or direct location estimates, auxiliary data, and movement models. The approach provides a substantial contribution to the handling of uncertainty in archival tag and satellite tracking data using readily available tools.
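As a rough illustration of how a Bayesian treatment combines several data sources, the sketch below fuses independent Gaussian estimates of a single coordinate by precision weighting. It is a toy conjugate case under assumed Gaussian errors, not the framework described in the abstract above; the function name `fuse_gaussian` and the example numbers are hypothetical.

```python
def fuse_gaussian(estimates):
    """Fuse independent Gaussian estimates given as (mean, sd) pairs.

    Assumption: each source reports the same coordinate with Gaussian error.
    The posterior precision is the sum of the individual precisions, and the
    posterior mean is the precision-weighted average of the means.
    """
    precision = sum(1.0 / sd ** 2 for _, sd in estimates)
    mean = sum(m / sd ** 2 for m, sd in estimates) / precision
    return mean, precision ** -0.5


# Hypothetical inputs: prior knowledge of the range and one direct tag fix.
prior_range = (10.0, 2.0)   # (mean latitude, sd)
tag_fix = (14.0, 2.0)
fused_mean, fused_sd = fuse_gaussian([prior_range, tag_fix])
print(fused_mean, fused_sd)
```

The fused standard deviation is smaller than either input, which is one simple sense in which combining all available data both reduces and quantifies uncertainty.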

2.
This paper considers the clustering problem of physical step count data recorded on wearable devices. Clustering step data gives an insight into an individual's activity status and further provides the groundwork for health‐related policies. However, classical methods, such as K‐means clustering and hierarchical clustering, are not suitable for step count data that are typically high‐dimensional and zero‐inflated. This paper presents a new clustering method for step data based on a novel combination of ensemble clustering and binning. We first construct multiple sets of binned data by changing the size and starting position of the bin, and then merge the clustering results from the binned data using a voting method. The advantage of binning, as a critical component, is that it substantially reduces the dimension of the original data while preserving the essential characteristics of the data. As a result, combining clustering results from multiple binned data sets can provide an improved clustering result that reflects both local and global structures of the data. Simulation studies and real data analysis were carried out to evaluate the empirical performance of the proposed method and demonstrate its general utility.
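The binning-then-voting idea in the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not name the base clusterer or the exact voting rule, so a tiny deterministic 2-means and a majority vote on pairwise co-membership are used as stand-ins, and all function names and data are hypothetical.

```python
from itertools import combinations


def bin_series(series, width, start):
    """Sum counts in consecutive bins of `width`, beginning at index `start`.
    Values before `start` are dropped."""
    return [sum(series[i:i + width]) for i in range(start, len(series), width)]


def two_means(rows, iters=20):
    """Tiny deterministic 2-means (stand-in base clusterer).
    The rows with the smallest and largest totals seed the two centers."""
    order = sorted(range(len(rows)), key=lambda i: sum(rows[i]))
    centers = [list(rows[order[0]]), list(rows[order[-1]])]
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [
            min((0, 1), key=lambda k: sum((a - b) ** 2 for a, b in zip(r, centers[k])))
            for r in rows
        ]
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def ensemble_cluster(data, settings):
    """Cluster each binned version, then merge pairs grouped together in most runs."""
    n = len(data)
    votes = {pair: 0 for pair in combinations(range(n), 2)}
    for width, start in settings:
        labels = two_means([bin_series(s, width, start) for s in data])
        for i, j in votes:
            votes[(i, j)] += labels[i] == labels[j]

    parent = list(range(n))  # union-find over series indices

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for (i, j), v in votes.items():
        if v * 2 > len(settings):  # majority of binnings agree
            parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


# Hypothetical zero-inflated step counts for four people (8 time slots each).
steps = [
    [0, 0, 50, 60, 0, 0, 0, 0],
    [0, 0, 55, 65, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 40, 70, 0],
    [0, 0, 0, 0, 0, 45, 60, 0],
]
groups = ensemble_cluster(steps, settings=[(2, 0), (4, 0), (3, 1)])
print(groups)
```

Each `(width, start)` pair gives a different low-dimensional view of the same series; the vote keeps only the groupings that survive most of those views.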

3.
Advances in molecular “omics” technologies have motivated new methodologies for the integration of multiple sources of high-content biomedical data. However, most statistical methods for integrating multiple data matrices only consider data shared vertically (one cohort on multiple platforms) or horizontally (different cohorts on a single platform). This is limiting for data that take the form of bidimensionally linked matrices (eg, multiple cohorts measured on multiple platforms), which are increasingly common in large-scale biomedical studies. In this paper, we propose bidimensional integrative factorization (BIDIFAC) for integrative dimension reduction and signal approximation of bidimensionally linked data matrices. Our method factorizes data into (a) globally shared, (b) row-shared, (c) column-shared, and (d) single-matrix structural components, facilitating the investigation of shared and unique patterns of variability. For estimation, we use a penalized objective function that extends the nuclear norm penalization for a single matrix. As an alternative to the complicated rank selection problem, we use results from random matrix theory to choose tuning parameters. We apply our method to integrate two genomics platforms (messenger RNA and microRNA expression) across two sample cohorts (tumor samples and normal tissue samples) using the breast cancer data from the Cancer Genome Atlas. We provide R code for fitting BIDIFAC, imputing missing values, and generating simulated data.

4.

Background

Analysis of data from multiple sources has the potential to enhance knowledge discovery by capturing underlying structures, which are, otherwise, difficult to extract. Fusing data from multiple sources has already proved useful in many applications in social network analysis, signal processing and bioinformatics. However, data fusion is challenging since data from multiple sources are often (i) heterogeneous (i.e., in the form of higher-order tensors and matrices), (ii) incomplete, and (iii) have both shared and unshared components. In order to address these challenges, in this paper, we introduce a novel unsupervised data fusion model based on joint factorization of matrices and higher-order tensors.

Results

While the traditional formulation of coupled matrix and tensor factorizations modeling only shared factors fails to capture the underlying structures in the presence of both shared and unshared factors, the proposed data fusion model has the potential to automatically reveal shared and unshared components through modeling constraints. Using numerical experiments, we demonstrate the effectiveness of the proposed approach in terms of identifying shared and unshared components. Furthermore, we measure a set of mixtures with known chemical composition using both LC-MS (Liquid Chromatography - Mass Spectrometry) and NMR (Nuclear Magnetic Resonance) and demonstrate that the structure-revealing data fusion model can (i) successfully capture the chemicals in the mixtures and extract the relative concentrations of the chemicals accurately, (ii) provide promising results in terms of identifying shared and unshared chemicals, and (iii) reveal the relevant patterns in LC-MS by coupling with the diffusion NMR data.

Conclusions

We have proposed a structure-revealing data fusion model that can jointly analyze heterogeneous, incomplete data sets with shared and unshared components and demonstrated its promising performance as well as potential limitations on both simulated and real data.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-239) contains supplementary material, which is available to authorized users.

5.
Hepatitis B is an infectious disease caused by the hepatitis B virus (HBV). In recent years, the HBV-DNA level has received more clinical attention than other serological markers because it provides more detailed information. Unfortunately, the common clinical method for detecting the HBV-DNA level is limited because it takes hours. This study combined infrared spectroscopy with machine learning to investigate the feasibility of using near-infrared (NIR) and mid-infrared (MIR) spectra for rapid detection of the HBV-DNA level. Based on the partial least squares-discriminant analysis (PLS-DA) modeling method, the optimal NIR and MIR models and traditional data fusion models were constructed, respectively. Considering the unequal weight between interval and point data in machine learning, an interval-point data fusion method was compared with other traditional data fusion methods. The results illustrate that interval-point data fusion of NIR and MIR spectra combined with PLS-DA modeling can provide a rapid method for HBV-DNA level detection.

6.
The recent increase in data accuracy from high resolution accelerometers offers substantial potential for improved understanding and prediction of animal movements. However, current approaches used for analysing these multivariable datasets typically require existing knowledge of the behaviors of the animals to inform the behavioral classification process. These methods are thus not well‐suited for the many cases where limited knowledge of the different behaviors performed exists. Here, we introduce the use of an unsupervised learning algorithm. To illustrate the method's capability, we analyse data collected using a combination of GPS and accelerometers on two seabird species: razorbills (Alca torda) and common guillemots (Uria aalge). We applied the unsupervised learning algorithm Expectation Maximization to characterize latent behavioral states both above and below water at both individual and group level. The application of this flexible approach yielded significant new insights into the foraging strategies of the two study species, both above and below the surface of the water. In addition to general behavioral modes such as flying, floating, as well as descending and ascending phases within the water column, this approach allowed an exploration of previously unstudied and important behaviors such as searching and prey chasing/capture events. We propose that this unsupervised learning approach provides an ideal tool for the systematic analysis of such complex multivariable movement data that are increasingly being obtained with accelerometer tags across species. In particular, we recommend its application in cases where we have limited current knowledge of the behaviors performed and existing supervised learning approaches may have limited utility.
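To make the Expectation Maximization step concrete, here is a minimal sketch that fits a two-component Gaussian mixture to a one-dimensional signal and assigns each sample to its more probable latent state. The study above works with multivariable accelerometer and GPS data and does not specify its model in this abstract, so the univariate mixture, the function name `em_two_gaussians`, and the sample values are all illustrative assumptions.

```python
import math


def em_two_gaussians(xs, iters=100):
    """Fit a 2-component 1-D Gaussian mixture with EM (illustrative only)."""
    mu = [min(xs), max(xs)]                      # deterministic initial means
    mean_all = sum(xs) / len(xs)
    var = [sum((x - mean_all) ** 2 for x in xs) / len(xs)] * 2
    w = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each state for each sample.
        resp = []
        for x in xs:
            dens = [
                w[k] * math.exp(-0.5 * (x - mu[k]) ** 2 / var[k])
                / math.sqrt(2 * math.pi * var[k])
                for k in (0, 1)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk,
                1e-6,  # variance floor to avoid degenerate components
            )
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return mu, labels


# Hypothetical movement amplitudes: four low-activity and four high-activity samples.
amplitudes = [0.1, 0.2, 0.15, 0.05, 2.0, 2.1, 1.9, 2.2]
means, states = em_two_gaussians(amplitudes)
print(means, states)
```

No behavior labels are supplied; the two states emerge from the data alone, which is what makes the approach usable when prior knowledge of behaviors is limited.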

7.
Multisensor data fusion (MDF) is an emerging technology to fuse data from multiple sensors in order to make a more accurate estimation of the environment through measurement and detection. Applications of MDF span a wide spectrum of military and civilian areas. With the rapid evolution of computers and the proliferation of micro-mechanical/electrical systems sensors, the utilization of MDF is being popularized in research and applications. This paper focuses on the application of MDF for high quality data analysis and processing in measurement and instrumentation. A practical, general data fusion scheme was established on the basis of feature extraction and merging of data from multiple sensors. This scheme integrates artificial neural networks for high performance pattern recognition. A number of successful applications in areas of NDI (Non-Destructive Inspection) corrosion detection, food quality and safety characterization, and precision agriculture are described and discussed in order to motivate new applications in these or other areas. This paper gives an overall picture of using the MDF method to increase the accuracy of data analysis and processing in measurement and instrumentation in different areas of application.

8.
A screening level human health risk assessment (HHRA) was applied to evaluate the human health implications of consuming selenium found in fish tissues collected downstream of coal mines in southeastern British Columbia, Canada. The study evaluated the potential for adverse human health effects associated with selenium, and considered known and potential benefits of selenium and fish ingestion. The results indicated that risks of selenosis due to consumption of selenium-contaminated fish in the region are negligible. Conclusions were strengthened by consideration of the potential benefits of selenium to human health, including: selenium essentiality for maintenance of good health; potential cancer prevention properties due to its role as an antioxidant; potential benefits for cardiovascular health; and other positive health benefits. The findings indicated that some aspects of the traditional framework for HHRA (e.g., application of safety factors to “err on the side of safety”) are inappropriate for the assessment of selenium-contaminated fish. Due to both deficiency and toxicity in the selenium dose-response relationship, application of compounding conservatism in risk assessment may lead to recommended intakes of fish that are contrary to the public health interest. The need for balancing risk types, for incorporating positive responses in risk assessments, and the linkage to the precautionary principle, are discussed.

9.
Hyphal fusion is involved in the formation of an interconnected colony in filamentous fungi, and it is the first process in sexual/parasexual reproduction. However, it was difficult to evaluate hyphal fusion efficiency due to the low frequency in Aspergillus oryzae in spite of its industrial significance. Here, we established a method to quantitatively evaluate the hyphal fusion ability of A. oryzae with mixed culture of two different auxotrophic strains, where the ratio of heterokaryotic conidia growing without the auxotrophic requirements reflects the hyphal fusion efficiency. By employing this method, it was demonstrated that AoSO and AoFus3 are required for hyphal fusion, and that hyphal fusion efficiency of A. oryzae was increased by depleting nitrogen source, including large amounts of carbon source, and adjusting pH to 7.0.

10.
Transgenic plants are potentially safe and inexpensive vehicles to produce and mucosally deliver protective antigens. However, the application of this technology is limited by the poor response of the immune system to non-particulate, subunit vaccines. Co-delivery of therapeutic proteins with carrier proteins could increase the effectiveness of the antigen. This paper reports the ability of transgenic Arabidopsis thaliana plants to produce a fusion protein consisting of the B subunit of the Escherichia coli heat-labile enterotoxin and a 6 kDa tuberculosis antigen, the early secretory antigenic target ESAT-6. Both components of the fusion protein were detected using GM1-ganglioside-dependent enzyme-linked immunosorbent assay. This suggested the fusion protein retained both its native antigenicity and the ability to form pentamers.
Abbreviations: ELISA, enzyme-linked immunosorbent assay; ESAT-6, early secretory antigenic target (6 kDa); ETEC, enterotoxigenic Escherichia coli; LTB, B subunit of E. coli heat-labile enterotoxin.
Communicated by W.A. Parrott

11.
《MABS-AUSTIN》2013,5(3):456-460
Stefan R. Schmidt consolidates the hugely diverse field of fusion proteins and their application in the creation of biopharmaceuticals. The text is replete with case studies and clinical data that inform and intrigue the reader as to the myriad possibilities available when considering the creation of a fusion protein. This valuable text will serve the novice as a broad introduction or the seasoned professional as a thorough review of the state of the art. The first marketed therapeutic recombinant protein was human insulin (Humulin® R). Its approval in 1982 was followed by other such products, including erythropoietin (EPO), interferon (IFN), and tissue plasminogen activator (tPA). Since the 1980s, the number and general availability of recombinant products that replace natural proteins harvested from animal or human sources has increased considerably. Following the initial success, researchers started de novo designs of therapeutic proteins that do not occur in nature. The first of these new drugs to be approved was etanercept (Enbrel®), a fusion protein containing a section of the tumor necrosis factor (TNF) receptor fused to the Fc portion of human IgG1.

12.
We have previously developed a method to purify recombinant proteins, termed inverse transition cycling (ITC), that eliminates the need for column chromatography. ITC exploits the inverse solubility phase transition of an elastin‐like polypeptide (ELP) that is fused to a protein of interest. In ITC, a recombinant ELP fusion protein is cycled through its phase transition, resulting in separation of the ELP fusion protein from other Escherichia coli contaminants. Herein, we examine the role of the position of the ELP in the fusion protein on the expression levels and yields of purified protein for four recombinant ELP fusion proteins. Placing the ELP at the C‐terminus of the target protein (protein‐ELP) results in a higher expression level for the four ELP fusion proteins, which also translates to a greater yield of purified protein. The position of the fusion protein also has a significant impact on its specific activity, as ELP‐protein constructs have a lower specific activity than protein‐ELP constructs for three out of the four proteins. Our results show no difference in mRNA levels between protein‐ELP and ELP‐protein fusion constructs. Instead, we suggest two possible explanations for these results: first, the translational efficiency of mRNA may differ between the fusion protein in the two orientations and second, the lower level of protein expression and lower specific activity is consistent with a scenario in which placement of the ELP at the N‐terminus of the fusion protein increases the fraction of misfolded and less active conformers, which are also preferentially degraded compared to fusion proteins in which the ELP is present at the C‐terminal end of the protein.

13.
Research for three decades and major recent advances have provided crucial insights into how neurotransmitters are released by Ca2+‐triggered synaptic vesicle exocytosis, leading to reconstitution of basic steps that underlie Ca2+‐dependent membrane fusion and yielding a model that assigns defined functions for central components of the release machinery. The soluble N‐ethyl maleimide sensitive factor attachment protein receptors (SNAREs) syntaxin‐1, SNAP‐25, and synaptobrevin‐2 form a tight SNARE complex that brings the vesicle and plasma membranes together and is key for membrane fusion. N‐ethyl maleimide sensitive factor (NSF) and soluble NSF attachment proteins (SNAPs) disassemble the SNARE complex to recycle the SNAREs for another round of fusion. Munc18‐1 and Munc13‐1 orchestrate SNARE complex formation in an NSF‐SNAP‐resistant manner by a mechanism whereby Munc18‐1 binds to synaptobrevin and to a self‐inhibited “closed” conformation of syntaxin‐1, thus forming a template to assemble the SNARE complex, and Munc13‐1 facilitates assembly by bridging the vesicle and plasma membranes and catalyzing opening of syntaxin‐1. Synaptotagmin‐1 functions as the major Ca2+ sensor that triggers release by binding to membrane phospholipids and to the SNAREs, in a tight interplay with complexins that accelerates membrane fusion. Many of these proteins act as both inhibitors and activators of exocytosis, which is critical for the exquisite regulation of neurotransmitter release. It is still unclear how the actions of these various proteins and multiple other components that control release are integrated and, in particular, how they induce membrane fusion, but it can be expected that these fundamental questions can be answered in the near future, building on the extensive knowledge already available.

14.
陈元鹏  任佳  王力 《生态学报》2019,39(23):8789-8797
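This paper reviews the background to the implementation of ecological protection and restoration projects for mountains, rivers, forests, farmland, lakes, and grasslands. In view of the wide coverage and technical difficulty of monitoring and supervising such projects, it emphasizes the importance and necessity of remote sensing monitoring based on multi-source remote sensing data. Monitoring methods for ecological protection and restoration project areas based on multi-source remote sensing data are reviewed in terms of the selection of monitoring indicators, extraction of ground-object information from remote sensing, multi-source remote sensing data fusion, and dynamic change detection. These methods include ground-object information extraction from medium- and high-spatial-resolution remote sensing data, nonlinear mixed-pixel analysis incorporating machine learning, and spatiotemporal fusion based on mixed-pixel analysis. After summarizing the strengths and limitations in terms of technology and work progress, the paper proposes to: continuously optimize the monitoring indicators for territorial-space ecological protection and restoration in light of practical work; fully exploit the potential of algorithms for remote sensing data analysis so as to improve the accuracy of ground-object information extraction and mixed-pixel analysis; strengthen research on spatiotemporal fusion algorithms and change detection methods, and promote their practical application; and, using the "pilot projects for ecological protection and restoration of mountains, rivers, forests, farmland, lakes, and grasslands" as a platform, establish a stable operating mechanism for remote sensing monitoring of territorial-space ecological protection and restoration, strengthen scientific and technological innovation, and develop technical standards to guide the work.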

15.
Liu M  Taylor JM  Belin TR 《Biometrics》2000,56(4):1157-1163
This paper outlines a multiple imputation method for handling missing data in designed longitudinal studies. A random coefficients model is developed to accommodate incomplete multivariate continuous longitudinal data. Multivariate repeated measures are jointly modeled; specifically, an i.i.d. normal model is assumed for time-independent variables and a hierarchical random coefficients model is assumed for time-dependent variables in a regression model conditional on the time-independent variables and time, with heterogeneous error variances across variables and time points. Gibbs sampling is used to draw model parameters and for imputations of missing observations. An application to data from a study of startle reactions illustrates the model. A simulation study compares the multiple imputation procedure to the weighting approach of Robins, Rotnitzky, and Zhao (1995, Journal of the American Statistical Association 90, 106-121) that can be used to address similar data structures.

16.
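Ecosystem assessment based on ecosystem service functions is an important foundation for identifying ecological and environmental problems, carrying out ecosystem restoration and biodiversity conservation, and establishing ecological compensation mechanisms; it is also an important part of safeguarding national ecological security and advancing the construction of ecological civilization. Ecosystem assessment involves many aspects of an ecosystem and must be supported by multi-element, multi-type, and multi-scale ecosystem observation data. Ground observation data and remote sensing data are the two major data sources for ecosystem assessment, but their use is often hampered by inconsistent observation standards, incomplete coverage of observed elements, insufficient temporal continuity, and scale mismatch, which add considerable uncertainty to ecosystem assessment. How to fuse observation data at different scales to quantify ecosystem service functions is the key to accurate ecosystem assessment. To this end, starting from the observation scale, this paper describes the characteristics and existing problems of ground observation data, near-surface remote sensing data, airborne remote sensing data, and satellite remote sensing data, and reviews the common methods for fusing these data sources. Taking several key ecological parameters (productivity, carbon sequestration capacity, and biodiversity) as examples, it introduces the multi-source data fusion system of the project "Ecosystem assessment technology based on multi-source data fusion and its application." Finally, it summarizes the multi-source data fusion system for ecosystem assessment and points out future research directions.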

17.
Feature extraction is one of the most important and effective methods for reducing dimensionality in data mining, particularly with the emergence of high-dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes to build a pattern classifier with reduced complexity and improved generalization capabilities. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based feature extraction and set-based feature extraction, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are evaluated on an individual basis, generally without considering the inter-relationships between features, while set-based feature extraction evaluates features based on their role in a feature set by taking into account dependency between features. Like learning methods, feature extraction faces a generalization problem, namely robustness. However, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select features from the sample set, the proposed approach is able to improve feature extraction performance. The new approach is tested on gene expression datasets including the Colon cancer, CNS, DLBCL, and Leukemia data. The testing results show that the performance of this algorithm is better than that of existing solutions.
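One simple way to picture multi-algorithm fusion is to merge the gene rankings produced by several selectors into a single consensus ranking. The abstract above does not state which fusion rule is used, so the mean-rank (Borda-style) rule below is a stand-in assumption, and the function name `fuse_rankings` and the gene identifiers are hypothetical.

```python
def fuse_rankings(rankings, top_k):
    """Fuse several rankings of the same features by mean position.

    Assumption: every ranking lists the same features, best first.
    Ties in mean position are broken alphabetically for determinism.
    """
    features = rankings[0]
    mean_rank = {
        f: sum(r.index(f) for r in rankings) / len(rankings) for f in features
    }
    return sorted(features, key=lambda f: (mean_rank[f], f))[:top_k]


# Hypothetical outputs of three different feature selection algorithms.
rankings = [
    ["g1", "g2", "g3", "g4"],
    ["g2", "g1", "g4", "g3"],
    ["g2", "g3", "g1", "g4"],
]
selected = fuse_rankings(rankings, top_k=2)
print(selected)
```

A gene that only one selector favors is pushed down by the others, which is the intuition behind expecting a fused selection to be more robust than any single algorithm.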

18.
In mammalian cell culture producing therapeutic proteins, one of the important challenges is the use of several complex raw materials whose compositional variability is relatively high and whose influence on cell culture is poorly understood. Under these circumstances, application of spectroscopic techniques combined with chemometrics can provide fast, simple, and non‐destructive ways to evaluate raw material quality, leading to more consistent cell culture performance. In this study, a comprehensive data fusion strategy of combining multiple spectroscopic techniques is investigated for the prediction of raw material quality in mammalian cell culture. To achieve this purpose, four different spectroscopic techniques (near‐infrared, Raman, 2D fluorescence, and X‐ray fluorescence spectra) were employed for comprehensive characterization of soy hydrolysates, which are commonly used as supplements in culture media. First, the different spectra were compared separately in terms of their prediction capability. Then, ensemble partial least squares (EPLS) was employed to combine all of these spectral datasets in order to produce a more accurate estimation of raw material properties, and was compared with other data fusion techniques. The results showed that data fusion models based on EPLS always exhibit the best prediction accuracy among all the models, including individual spectroscopic methods, demonstrating the synergetic effects of data fusion in characterizing the raw material quality. Biotechnol. Bioeng. 2012; 109: 2819–2828. © 2012 Wiley Periodicals, Inc.

19.
Delivery and expression of multiple genes is an important requirement in a range of applications such as the engineering of synthetic signaling pathways and the induction of pluripotent stem cells. However, conventional approaches are often inefficient, nonstoichiometric and may limit the maximum number of genes that can be simultaneously expressed. We here describe a versatile approach for multiple gene delivery using a single expression vector by mimicking the protein expression strategy of RNA viruses. This was accomplished by first expressing the genes together with TEV protease as a single fusion protein, then proteolytically self-cleaving the fusion protein into functional components. To demonstrate this method in E. coli cells, we analyzed the translation products using SDS-PAGE and showed that the fusion protein was efficiently cleaved into its components, which can then be purified individually or as a binding complex. To demonstrate this method in mammalian cells, we designed a differential localization scheme and used live cell imaging to observe the distinctive subcellular targeting of the processed products. We also showed that the stoichiometry of the processed products was consistent and corresponded with the frequency of appearance of their genes on the expression vector. In summary, the efficient expression and separation of up to three genes was achieved in both E. coli and mammalian cells using a single TEV protease self-processing vector.

20.
Acidification inside membrane compartments is a common feature of all eukaryotic cells. The acidic milieu is involved in many physiological processes including secretion, protein processing, and others. However, its cellular relevance has not been well established beyond the results of in vitro studies involving cultured cell systems. In the last decade, human and mouse genetics have revealed that the acidification machinery is implicated in multiple pathophysiological disorders, and thus our understanding of the physiological consequences of defective acidification in multicellular organisms has improved. In invertebrates including Drosophila and nematodes, mutations of V-ATPase were found to lead to the development of rather unexpected phenotypes. Studies have suggested that V-ATPase may be involved in membrane fusion and vesicle formation, important processes for membrane trafficking, and have further implied its involvement in cell–cell fusion. This rather novel idea arose from the phenotypes associated with genetic disorders involving V-ATPase genes in various genetic model systems. In this article, we focus on and review the non-classical function, beyond proton pumping, of the vacuolar-type ATPase in exo/endocytic systems.
