Similar literature
 Found 20 similar documents; search took 500 ms
1.
Recent years have seen an exponential increase in the amount of data available in all sciences and application domains. Macroecology is part of this “Big Data” trend, with a strong rise in the volume of data that we are using for our research. Here, we summarize the most recent developments in macroecology in the age of Big Data that were presented at the 2018 annual meeting of the Specialist Group Macroecology of the Ecological Society of Germany, Austria and Switzerland (GfÖ). Supported by computational advances, macroecology has been a rapidly developing field over recent years. Our meeting highlighted important avenues for further progress in terms of standardized data collection, data integration, method development and process integration. In particular, we focus on (a) important data gaps and new initiatives to close them, for example through space- and airborne sensors, (b) how various data sources and types can be integrated, (c) how uncertainty can be assessed in data-driven analyses and (d) how Big Data and machine learning approaches have opened new ways of investigating processes rather than simply describing patterns. We discuss how Big Data opens up new opportunities, but also poses new challenges to macroecological research. In the future, it will be essential to carefully assess data quality, the reproducibility of data compilation and analytical methods, and the communication of uncertainties. Major progress in the field will depend on the definition of data standards and workflows for macroecology, such that scientific quality and integrity are guaranteed, and collaboration in research projects is made easier.

2.
Qianxing Mo  Faming Liang 《Biometrics》2010,66(4):1284-1294
Summary: ChIP‐chip experiments are procedures that combine chromatin immunoprecipitation (ChIP) and DNA microarray (chip) technology to study a variety of biological problems, including protein–DNA interaction, histone modification, and DNA methylation. The most important feature of ChIP‐chip data is that the intensity measurements of probes are spatially correlated because the DNA fragments are hybridized to neighboring probes in the experiments. We propose a simple but powerful Bayesian hierarchical approach to ChIP‐chip data through an Ising model with high‐order interactions. The proposed method naturally takes into account the intrinsic spatial structure of the data and can be used to analyze data from multiple platforms with different genomic resolutions. The model parameters are estimated using the Gibbs sampler. The proposed method is illustrated using two publicly available data sets from Affymetrix and Agilent platforms, and compared with three alternative Bayesian methods, namely the Bayesian hierarchical model, the hierarchical gamma mixture model, and the Tilemap hidden Markov model. The numerical results indicate that the proposed method performs as well as the other three methods for the data from Affymetrix tiling arrays, but significantly outperforms the other three methods for the data from Agilent promoter arrays. In addition, we find that the proposed method has better operating characteristics in terms of sensitivities and false discovery rates under various scenarios.
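As a rough illustration of the idea in this abstract, the sketch below runs a Gibbs sampler over a one-dimensional Ising chain of latent probe states with Gaussian intensities. It is a simplified stand-in, not the authors' model: it uses only nearest-neighbor (first-order) interactions rather than high-order ones, and it holds the parameters fixed instead of estimating them. The function name `gibbs_ising_chain` and all parameter names and default values (`mu_bg`, `mu_sig`, `sigma`, `beta`) are assumptions made for this example.

```python
import math
import random


def gibbs_ising_chain(y, mu_bg=0.0, mu_sig=2.0, sigma=1.0, beta=0.8,
                      n_iter=200, burn_in=50, seed=0):
    """Estimate P(probe i is enriched | y) on a 1-D Ising chain.

    Hypothetical simplification: latent states are spins in {-1, +1},
    neighbors interact with strength beta, and intensities are Gaussian
    with mean mu_bg (state -1) or mu_sig (state +1). All parameters are
    fixed here rather than sampled.
    """
    rng = random.Random(seed)
    n = len(y)
    # Initialize each spin by thresholding its intensity at the midpoint.
    x = [1 if v > (mu_bg + mu_sig) / 2 else -1 for v in y]
    counts = [0] * n
    for it in range(n_iter):
        for i in range(n):
            # Sum of neighboring spins (chain ends have one neighbor).
            nb = (x[i - 1] if i > 0 else 0) + (x[i + 1] if i < n - 1 else 0)
            # Log-odds of x_i = +1 versus x_i = -1 given neighbors and y_i.
            log_odds = 2 * beta * nb + (
                (y[i] - mu_bg) ** 2 - (y[i] - mu_sig) ** 2
            ) / (2 * sigma ** 2)
            log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow
            p_plus = 1.0 / (1.0 + math.exp(-log_odds))
            x[i] = 1 if rng.random() < p_plus else -1
        if it >= burn_in:
            for i in range(n):
                counts[i] += 1 if x[i] == 1 else 0
    return [c / (n_iter - burn_in) for c in counts]
```

Calling `gibbs_ising_chain([0.1, -0.2, 0.0, 2.1, 1.9, 2.3, 0.2, -0.1])` returns one posterior probability per probe; the spatial coupling `beta` pulls neighboring probes toward the same state, which is the property the abstract emphasizes.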

3.
4.
Yujie Zhao  Rui Tang  Yeting Du  Ying Yuan 《Biometrics》2023,79(2):1459-1471
In the era of targeted therapies and immunotherapies, the traditional drug development paradigm of testing one drug at a time in one indication has become increasingly inefficient. Motivated by a real-world application, we propose a master-protocol–based Bayesian platform trial design with mixed endpoints (PDME) to simultaneously evaluate multiple drugs in multiple indications, where different subsets of efficacy measures (e.g., objective response and landmark progression-free survival) may be used by different indications as single or multiple endpoints. We propose a Bayesian hierarchical model to accommodate mixed endpoints and reflect the trial structure of indications that are nested within treatments. We develop a two-stage approach that first clusters the indications into homogeneous subgroups and then applies the Bayesian hierarchical model to each subgroup to achieve precision information borrowing. Patients are enrolled in a group-sequential way and adaptively assigned to treatments according to their efficacy estimates. At each interim analysis, the posterior probabilities that the treatment effect exceeds prespecified clinically relevant thresholds are used to drop ineffective treatments and “graduate” effective treatments. Simulations show that the PDME design has desirable operating characteristics compared to existing methods.

5.
We propose a method, frequency extracted hierarchical decomposition (FEHD), for studying multivariate time series that identifies linear combinations of its components that possess a causally hierarchical structure: the method orders the components so that those at the “top” of the hierarchy drive those below. The method shares many of the features of the “hierarchical decomposition” method of Repucci et al. (Annals of Biomedical Engineering, 29, 1135–1149, 2001) but makes a crucial advance: the proposed method is capable of determining this causal hierarchy over arbitrarily specified frequency bands. Additionally, a novel minimization strategy is used to generate the decomposition, resulting in an increase in stability and reliability and an improvement in the sensitivity to model parameters. We demonstrate the utility of the method by applying it both to artificial time series constructed to have specific causal graphs and to the EEG of healthy volunteers and patient subjects who are recovering from a severe brain injury.

6.

Background

Lately, biomarker discovery has become one of the most significant research issues in the biomedical field. Owing to the presence of high-throughput technologies, genomic data, such as microarray data and RNA-seq, have become widely available. Many kinds of feature selection techniques have been applied to retrieve significant biomarkers from these kinds of data. However, they tend to be noisy with high-dimensional features and consist of a small number of samples; thus, conventional feature selection approaches might be problematic in terms of reproducibility.

Results

In this article, we propose a stable feature selection method for high-dimensional datasets. We apply an ensemble L1-norm support vector machine to efficiently reduce irrelevant features, considering the stability of features. We define the stability score for each feature by aggregating the ensemble results, and utilize backward feature elimination on a purified feature set based on this score; therefore, it is possible to acquire an optimal set of features for performance without the need to set a specific threshold. The proposed methodology is evaluated by classifying the binary stage of renal clear cell carcinoma with RNA-seq data.

Conclusion

A comparison with established algorithms, i.e., a fast correlation-based filter, random forest, and an ensemble version of an L2-norm support vector machine-based recursive feature elimination, enabled us to prove the superior performance of our method in terms of classification as well as stability in general. It is also shown that the proposed approach performs moderately on high-dimensional datasets consisting of a very large number of features and a smaller number of samples. The proposed approach is expected to be applicable to many other research efforts aimed at biomarker discovery.
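The notion of a per-feature stability score aggregated over an ensemble can be sketched as a selection frequency across bootstrap resamples. This is only an illustration under stated assumptions: the abstract's base learner is an L1-norm support vector machine, whereas the sketch swaps in a much simpler selector (top-k features by absolute difference of class means) so that it runs with the standard library alone. The function name `stability_scores` and its parameters are hypothetical.

```python
import random


def stability_scores(X, y, k=2, n_rounds=50, seed=0):
    """Fraction of bootstrap rounds in which each feature is selected.

    Assumption: the per-round selector is a placeholder (top-k absolute
    difference of class means), not the L1-norm SVM of the abstract.
    X is a list of feature rows; y holds binary labels 0/1.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    hits = [0] * d
    done = 0
    for _ in range(n_rounds):
        # Bootstrap resample of the sample indices.
        idx = [rng.randrange(n) for _ in range(n)]
        pos = [X[i] for i in idx if y[i] == 1]
        neg = [X[i] for i in idx if y[i] == 0]
        if not pos or not neg:
            continue  # skip resamples that lost a class
        score = [
            abs(sum(r[j] for r in pos) / len(pos)
                - sum(r[j] for r in neg) / len(neg))
            for j in range(d)
        ]
        top = sorted(range(d), key=lambda j: score[j], reverse=True)[:k]
        for j in top:
            hits[j] += 1
        done += 1
    return [h / done for h in hits] if done else [0.0] * d
```

Features with a score near 1 are selected in almost every resample; ranking by this score and eliminating features from the bottom is one way to avoid fixing a selection threshold in advance.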

7.
Data in distributed systems are often replicated into different storage elements in order to facilitate their access. This allows optimizing execution time and bandwidth consumption, ensures load balancing and increases data availability and quality of service. Several replication strategies have then been proposed in the literature. In this work, a new evaluation metric for replication strategies is introduced and experimentally evaluated. This metric, called SAvE, serves to tackle a key feature, although neglected in the literature, which is the ability of a replication strategy to exploit the most available sites in the system. The design of such a metric requires an accurate determination of the availability degree of each site. A new measurement of site availability, denoted SA, is then designed to be integrated into SAvE while overcoming the drawbacks experienced by existing measurements. Extensive experiments are performed using the OptorSim simulator to show the accuracy and the effectiveness of our contributions.

8.
Document similarity has important real-life applications such as finding duplicate web sites and identifying plagiarism. While basic techniques such as k-similarity algorithms have long been known, the overwhelming amount of data being collected, as in big data settings, calls for novel algorithms that find highly similar documents in a reasonably short amount of time. In particular, pairwise comparison of documents’ features, a key operation in calculating document similarity, necessitates prohibitively high storage and computation power. In this paper, we propose a new filtering technique that decreases the number of comparisons between the query set and the search set to find highly similar documents. The proposed filtering technique utilizes Z-order prefix, based on the cosine similarity measure, in which only the most important features are used first to find highly similar documents. We propose a three-phase approach, whose phases are near-duplicate detection, common important terms, and join. We utilize the Hadoop distributed file system and the MapReduce parallel programming model to scale our techniques to big data settings. Our experimental results on real data show that the proposed method performs better than previous work in the literature in terms of the number of joins, and therefore, speed.
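The general idea of filtering candidate pairs by their most important features before computing full cosine similarity can be sketched as follows. This is a generic prefix filter written for illustration, not the Z-order prefix scheme of the abstract, and it runs on a single machine rather than on Hadoop/MapReduce. It is a heuristic: pairs that share no top-weighted term are never compared, so some similar pairs could be missed. The names `cosine`, `similar_pairs`, `threshold`, and `prefix_len` are assumptions.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def similar_pairs(docs, threshold=0.8, prefix_len=2):
    """Return {(id1, id2): similarity} for pairs at or above threshold.

    Only pairs sharing at least one of their prefix_len highest-weight
    terms are compared (a heuristic filter, assumed for illustration).
    """
    # Inverted index from each prefix term to the documents using it.
    index = defaultdict(set)
    for doc_id, vec in docs.items():
        prefix = sorted(vec, key=vec.get, reverse=True)[:prefix_len]
        for term in prefix:
            index[term].add(doc_id)
    # Candidate pairs are documents that co-occur under some prefix term.
    candidates = set()
    for ids in index.values():
        ordered = sorted(ids)
        for i in range(len(ordered)):
            for j in range(i + 1, len(ordered)):
                candidates.add((ordered[i], ordered[j]))
    result = {}
    for pair in sorted(candidates):
        sim = cosine(docs[pair[0]], docs[pair[1]])
        if sim >= threshold:
            result[pair] = sim
    return result
```

With three documents where two share their dominant terms and the third shares none, only one candidate pair is generated, so one full cosine computation replaces three; this reduction in comparisons is the effect the abstract measures as "number of joins".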

9.
10.
11.
12.
This review summarizes available data on the problem of taxonomic and evolutionary differentiation in the “araneus” groups of species of the genus Sorex (Eulipotyphla, Mammalia). Report 2 describes the hierarchical structuring, population system, and interracial hybrid zones in the common shrew (Sorex araneus).

13.
Rapid improvements in mass spectrometry sensitivity and mass accuracy combined with improved liquid chromatography separation technologies allow acquisition of high-throughput metabolomics data, providing an excellent opportunity to understand biological processes. While spectral deconvolution software can identify discrete masses and their associated isotopes and adducts, the utility of metabolomic approaches for many statistical analyses such as identifying differentially abundant ions depends heavily on data quality and robustness, especially the accuracy of aligning features across multiple biological replicates. We have developed a novel algorithm for feature alignment using density maximization. Instead of a greedy iterative, hence local, merging strategy, which has been widely used in the literature and in commercial applications, we apply a global merging strategy to improve alignment quality. Using both simulated and real data, we demonstrate that our new algorithm provides high map (e.g. chromatogram) coverage, which is critically important for non-targeted comparative metabolite profiling of highly replicated biological datasets.

14.
15.
The skull of the polydolopimorphian marsupialiform Epidolops ameghinoi is described in detail for the first time, based on a single well-preserved cranium and associated left and right dentaries plus additional craniodental fragments, all from the early Eocene (53–50 million years old) Itaboraí fauna in southeastern Brazil. Notable craniodental features of E. ameghinoi include absence of a masseteric process, very small maxillopalatine fenestrae, a prominent pterygoid fossa enclosed laterally by a prominent ectopterygoid crest, an absent or tiny transverse canal foramen, a simple, planar glenoid fossa, and a postglenoid foramen that is immediately posterior to the postglenoid process. Most strikingly, the floor of the hypotympanic sinus was apparently unossified, a feature found in several stem marsupials but absent in all known crown marsupials. “Type II” marsupialiform petrosals previously described from Itaboraí plausibly belong to E. ameghinoi; in published phylogenetic analyses, these petrosals fell outside (crown-clade) Marsupialia. “IMG VII” tarsals previously referred to E. ameghinoi do not share obvious synapomorphies with any crown marsupial clade, nor do they resemble those of the only other putative polydolopimorphians represented by tarsal remains, namely the argyrolagids. Most studies have placed Polydolopimorphia within Marsupialia, related to either Paucituberculata, or to Microbiotheria and Diprotodontia. However, diprotodonty almost certainly evolved independently in polydolopimorphians, paucituberculatans and diprotodontians, and Epidolops does not share obvious synapomorphies with any marsupial order. Epidolops is dentally specialized, but several morphological features appear to be more plesiomorphic than any crown marsupial. It seems likely that Epidolops falls outside Marsupialia, as do morphologically similar forms such as Bonapartherium and polydolopids.
Argyrolagids differ markedly in their known morphology from Epidolops but share some potential apomorphies with paucituberculatans. It is proposed that Polydolopimorphia as currently recognised is polyphyletic, and that argyrolagids (and possibly other taxa currently included in Argyrolagoidea, such as groeberiids and patagoniids) are members of Paucituberculata. This hypothesis is supported by Bayesian non-clock phylogenetic analyses of a total evidence matrix comprising DNA sequence data from five nuclear protein-coding genes, indels, retroposon insertions, and morphological characters: Epidolops falls outside Marsupialia, whereas argyrolagids form a clade with the paucituberculatans Caenolestes and Palaeothentes, regardless of whether the Type II petrosals and IMG VII tarsals are used to score characters for Epidolops or not. There is no clear evidence for the presence of crown marsupials at Itaboraí, and it is possible that the origin and early evolution of Marsupialia was restricted to the “Austral Kingdom” (southern South America, Antarctica, and Australia).

16.
17.
This is a review of Patrick Meier’s 2015 book, Digital Humanitarians: How Big Data Is Changing the Face of Humanitarian Response. The book explores the role of technologies such as high-resolution satellite imagery, online social media, drones, and artificial intelligence in humanitarian responses during disasters such as the 2010 Haiti earthquake. In this analysis, the book is examined using a humanitarian health ethics perspective.

18.
Hyphal anastomosis testing and molecular methods have been the primary criteria employed to understand the evolutionary and taxonomic relationships of the soil-borne fungal plant pathogen Rhizoctonia solani species complex. In this study, a metabolomics-based approach for characterizing and identifying isolates of R. solani using gas chromatography/mass spectrometry (GC/MS) metabolite profiling and footprinting was developed. Multivariate and hierarchical cluster analyses of GC/MS data provided resolution of isolates belonging to anastomosis groups (AGs) 1–6, 9, and 10 of R. solani. Clustering of R. solani AG-3 isolates, based on host origin, was also observed and attributed to metabolite-biomarkers belonging to amino, carboxylic and fatty acids. The chemotaxonomic approach using metabolomics is a high-throughput methodology that complements existing molecular approaches for the taxonomic investigation of Rhizoctonia isolates and monitoring of fungal metabolism.

19.
Peach belongs to the genus Prunus, which includes Prunus persica and its related species P. mira, P. davidiana, P. kansuensis, and P. ferganensis. Of these, P. ferganensis has been classified variously as a species, subspecies, or geographical population of P. persica. To explore the genetic difference between P. ferganensis and P. persica, high-throughput sequencing was applied to peach accessions belonging to different species. First, low-depth sequencing data of peach accessions belonging to four categories revealed that the similarity between P. ferganensis and P. persica was comparable to that between P. persica accessions from different geographical populations. Then, to further detect genomic variation in P. ferganensis, the P. ferganensis accession “Xinjiang Pan Tao 1” and the P. persica accession “Xia Miao 1” were sequenced at high depth, and the sequence reads were assembled. The results showed that the collinearity of “Xinjiang Pan Tao 1” with the reference genome “Lovell” was higher than that of “Xia Miao 1” with “Lovell”. Additionally, the number of genetic variants, including single nucleotide polymorphisms (SNPs), structural variations (SVs), and the specific genes annotated from unmapped sequence, was higher in “Xia Miao 1” than in “Xinjiang Pan Tao 1”. The data showed that the genetic distance between “Xinjiang Pan Tao 1” (P. ferganensis) and the reference genome, which belongs to P. persica, was smaller than that between “Xia Miao 1” (P. persica) and the reference. These results, together with phylogenetic tree and structure analyses, confirmed that P. ferganensis should be considered a geographic population of P. persica rather than a subspecies or a distinct species. Furthermore, gene ontology analysis was performed on the genes containing large-effect variants to understand the phenotypic difference between the two accessions.
The results revealed that the gene function pathways affected by SVs, but not those affected by SNPs and insertion-deletions, differed markedly between the two peach accessions.

20.
Bayesian hierarchical models have been applied in clinical trials to allow for information sharing across subgroups. Traditional Bayesian hierarchical models do not have subgroup classifications; thus, information is shared across all subgroups. When the difference between subgroups is large, it suggests that the subgroups belong to different clusters. In that case, placing all subgroups in one pool and borrowing information across all subgroups can result in substantial bias for the subgroups with strong borrowing, or a lack of efficiency gain with weak borrowing. To resolve this difficulty, we propose a hierarchical Bayesian classification and information sharing (BaCIS) model for the design of multigroup phase II clinical trials with binary outcomes. We introduce subgroup classification into the hierarchical model. Subgroups are classified into two clusters on the basis of their outcomes mimicking the hypothesis testing framework. Subsequently, information sharing takes place within subgroups in the same cluster, rather than across all subgroups. This method can be applied to the design and analysis of multigroup clinical trials with binary outcomes. Compared to the traditional hierarchical models, better operating characteristics are obtained with the BaCIS model under various scenarios.
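The two-step logic described here (classify subgroups into two clusters, then share information only within a cluster) can be sketched with a deliberately crude stand-in. This is not the BaCIS model: classification below uses a Monte Carlo estimate of the posterior probability that a subgroup's response rate exceeds a reference rate under independent Beta(1, 1) priors, and borrowing is done by adding a discounted share of same-cluster peers' outcomes to each subgroup's Beta posterior. The function name `classify_and_borrow`, the reference rate `p0`, and the discount `weight` are all hypothetical.

```python
import random


def classify_and_borrow(successes, sizes, p0=0.3, weight=0.5,
                        n_draws=4000, seed=0):
    """Classify subgroups, then shrink each toward its own cluster only.

    Assumptions (not from the abstract): Beta(1, 1) priors, a cluster
    cut at P(rate > p0) > 0.5, and fixed-discount borrowing from peers.
    Returns (cluster labels, posterior-mean response rates).
    """
    rng = random.Random(seed)
    clusters = []
    for s, n in zip(successes, sizes):
        # Monte Carlo estimate of P(response rate > p0 | data).
        above = sum(
            rng.betavariate(1 + s, 1 + n - s) > p0 for _ in range(n_draws)
        )
        clusters.append(1 if above / n_draws > 0.5 else 0)
    estimates = []
    for i, (s, n) in enumerate(zip(successes, sizes)):
        peers = [j for j, c in enumerate(clusters)
                 if c == clusters[i] and j != i]
        # Discounted successes and failures borrowed from same-cluster peers.
        extra_s = weight * sum(successes[j] for j in peers)
        extra_f = weight * sum(sizes[j] - successes[j] for j in peers)
        a = 1 + s + extra_s
        b = 1 + (n - s) + extra_f
        estimates.append(a / (a + b))
    return clusters, estimates
```

Because borrowing is restricted to peers in the same cluster, a low-response subgroup is not pulled upward by high-response subgroups, which is the bias the abstract attributes to pooling all subgroups together.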
