Similar literature
20 similar articles found
1.
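Experience and strategies of biodiversity data integration in Taiwan   Total citations: 3, self-citations: 1, citations by others: 2
The integration of Taiwan's biodiversity databases began in 2001, the year in which the digital archives program, the biodiversity promotion plan, and Taiwan's participation in the Global Biodiversity Information Facility (GBIF) were all launched. In 2002 Academia Sinica began building the Taiwan species checklist database (TaiBNET), and TaiBIF, the Taiwan portal of GBIF, was established in 2004 to integrate Taiwan's biodiversity data and connect it with the international community. The methods and formats adopted all follow the exchange standards developed by GBIF, so that Taiwan's data can be integrated and also exchanged internationally in a timely manner. Although TaiBNET and TaiBIF have overcome intellectual property barriers and can collect and integrate the data accumulated year by year by the sub-projects of the digital archives program, data produced across departments and outside the digital archives program remain difficult to integrate and share because of the parochialism of individual institutions and individuals; this is especially true of raw ecological distribution data. In 2008, therefore, a cross-agency committee was set up at Academia Sinica to formulate feasible policies for data collection, integration, and public release, and to require the competent authorities to write these policies into their commissioned contracts. No one denies the importance of database integration, but under the current evaluation system most researchers are unwilling to take on the academic service work of building databases, and the manpower and funding available are increasingly scarce, making sustainable operation difficult. The attention and support of the relevant authorities are urgently needed. Although TaiBIF's achievements over the past six years fall short of the ideal, the experience and lessons gained may still serve as a reference for others.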

2.
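Marine biodiversity is very high, yet it suffers heavily from human destruction and disturbance. The largest online open-access database containing point-location data is currently the Ocean Biogeographic Information System (OBIS), with about 37 million records for roughly 120,000 species; another large database, the World Register of Marine Species (WoRMS), has compiled taxonomic information for 220,000 marine species worldwide. Apart from these, only three databases are devoted to a single, mainly marine, taxonomic group: FishBase, AlgaeBase, and Hexacorallians of the World. Global species databases that span multiple taxa and both land and sea are far more numerous. Those that can be queried mainly for taxonomy or species accounts include the Encyclopedia of Life (EOL), the Catalogue of Life (CoL), the Integrated Taxonomic Information System (ITIS), Wikispecies, ETI Bioinformatics, the Barcode of Life (BOL), GenBank, the Biodiversity Heritage Library (BHL), SeaLifeBase, the Marine Species Identification Portal, and the FAO Fisheries and Aquaculture Fact Sheets. The Global Biodiversity Information Facility (GBIF), Discover Life, and AquaMaps focus instead on ecological distribution data; they can draw geographic distribution maps and offer downloads, and can even use predefined models to predict species distributions after the parameter values of environmental factors such as water temperature and salinity are changed. The ocean sub-pages of Google Earth and National Geographic, as well as websites of official bodies or non-governmental organizations such as ReefBase, mostly focus on education and advocacy for marine conservation and offer an all-encompassing range of information and materials. Even more convenient for users, many of these websites and databases are already cross-linked and can be queried from one another. In addition, the search engines Google Images and Google Scholar, reached through direct links provided by marine biodiversity databases, are of great help in supplying species images and academic papers, giving users rich and diverse information. For conservation purposes, biodiversity databases should not only be integrated and openly shared; people should also be encouraged to use them. This paper takes a lecture given by Rainer Froese in Paris as an example to show how marine biodiversity data can be used to predict the effects of climate change on fish distribution. Finally, it describes the current status of marine biodiversity databases in mainland China and Taiwan, cross-strait cooperation, and how to connect with the international community.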

3.
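Wang Jun, Zhao Chao. Biodiversity Science, 2022, 30(12): 22128-225
Fungus-feeding phlaeothripid thrips are an important component of the soil fauna and have considerable research value in fields such as biodiversity conservation and utilization, plant protection, and zoogeography. In China, however, their taxonomy and species diversity remain insufficiently studied, and the causes of their large-scale distribution patterns are unclear. Based on extensive field collecting across most regions of China and examination of specimens held by several research institutions in China and abroad, this paper compiles a species checklist and geographic distribution information for fungus-feeding phlaeothripid thrips in China, summarizes the current status and brief history of their taxonomic study, analyzes the distribution pattern of their species diversity, and discusses the causes of that pattern. At present 237 species are recorded from China, comprising 156 species in 39 genera of Phlaeothripinae and 81 species in 22 genera of Idolothripinae. Bamboosiella, Psalidothrips, Apelaunothrips, and Holothrips are the more species-rich genera, each with more than 10 species. Seventy-three species are endemic to China. Guangdong, Taiwan, Hainan, and Yunnan are the most species-rich provinces, each with more than 60 species; all have tropical and subtropical climates suitable for these thrips. Relative abundance analysis shows that fungus-feeding phlaeothripid thrips are a common soil-fauna group in the forest litter layer of tropical and subtropical regions. Temperature, precipitation, and food are the main factors limiting their distribution. These results enrich research on soil biodiversity and provide data to support studies of large-scale spatial patterns of diversity in this group.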

4.
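Specimen distribution and species richness of seed plants above the tree line in the Hengduan Mountains   Total citations: 2, self-citations: 0, citations by others: 2
Plant specimens are an important source of information on the spatial distribution of plants and the main data for estimating species richness. This paper compiled specimen records from relevant databases and herbaria to analyze the current state of collecting and the species richness of seed plants above the tree line in the Hengduan Mountains. The area above the tree line (4,100–5,500 m) was divided into 14 elevational bands of 100 m, and the elevation of each specimen was assigned to the corresponding band. In total, 8,316 specimen records were compiled, documenting 1,820 species of seed plants, of which 655 are endemic to the Hengduan Mountains. The specimens are very unevenly distributed among species: species represented by only 1–2 specimens are the most numerous, totaling 974 species (53.5%). The total number of specimens, the mean number of specimens per species, and species richness in each band all decline with increasing elevation, but species rarefaction curves do not describe the distribution of species richness along the elevational gradient well. More plot surveys and specimen collecting are therefore needed to accumulate data for estimating species richness.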
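The banding step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the record format (species name, elevation in metres), the function names, and the choice to fold the 5,500 m upper limit into the top band are all assumptions made for the example.

```python
LOWER, UPPER, WIDTH = 4100, 5500, 100   # metres, taken from the abstract
N_BANDS = (UPPER - LOWER) // WIDTH      # 14 bands of 100 m


def band_index(elevation_m):
    """Return the 0-based band for an elevation, or None if out of range."""
    if not LOWER <= elevation_m <= UPPER:
        return None
    # Assumption: the upper limit itself is counted in the top band.
    return min(int((elevation_m - LOWER) // WIDTH), N_BANDS - 1)


def summarize(records):
    """Count specimens and distinct species per band.

    records: iterable of (species_name, elevation_m) pairs (hypothetical format).
    """
    specimens = [0] * N_BANDS
    species = [set() for _ in range(N_BANDS)]
    for name, elevation_m in records:
        i = band_index(elevation_m)
        if i is None:
            continue  # skip records outside the study range
        specimens[i] += 1
        species[i].add(name)
    richness = [len(s) for s in species]
    return specimens, richness
```

Dividing each band's specimen count by its richness gives the mean number of specimens per species, the third quantity the abstract tracks along the elevational gradient.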

5.
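A regional tree of life is a reconstruction of the tree of life for all species in a region, and it has become a research hotspot in the life sciences over the past 10 years. The tree of life reflects the phylogenetic relationships and evolutionary information among species and can link the evolutionary and ecological factors involved in the formation and development of a biota, making it an effective means of revealing the origin and evolution of floras. This paper summarizes the application of regional trees of life in floristic research from three perspectives: (1) in the temporal dimension, estimating divergence times and evolutionary rates of clades in the tree of life to reflect the evolutionary history of a flora and reveal its temporal divergence pattern; (2) in the spatial dimension, combining phylogenetic information with species distribution data to reveal the spatial pattern of biodiversity within a flora and, on that basis, to carry out floristic regionalization; (3) integrating biogeographic information with climatic and environmental data to analyze how biological groups in a flora respond to paleogeographic events and climate change, thereby revealing the ecological, geographic, and historical factors that shaped existing biodiversity patterns. In addition, we discuss the relationship between regional trees of life and the global tree of life, point out that bias in time estimates caused by incomplete taxon sampling is an issue requiring attention in regional tree of life studies, and recommend integrated big-data analyses of biodiversity hotspots at different scales.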

6.
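The calls of anuran amphibians are usually species-specific, and knowledge of their call characteristics is a prerequisite for using bioacoustics in species diversity surveys and species monitoring. This paper compiles call data for 43 anuran species (belonging to 26 genera in 7 families) recorded in the field with high-fidelity recording equipment between 2012 and 2020, together with the corresponding recording metadata. After noise reduction of the audio files, we provide a call-feature dataset consisting of waveforms and spectrograms of 61 calls. The dataset presents a variety of time-domain and frequency-domain information on the calls, such as single-note or multi-note structure, number of notes, note duration, inter-note interval, call duration, dominant frequency, fundamental frequency, and harmonics, providing data to support acoustic research, species diversity surveys, and call monitoring of anurans in China.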

7.
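Measurement of species' multidimensional niche breadth   Total citations: 50, self-citations: 6, citations by others: 50
Classical measures of niche breadth are based on the proportional distribution of a species across the resource states of a single niche dimension and are difficult to apply to measuring breadth in a multidimensional niche space. After partitioning the N-dimensional niche space into compartments, this paper defines a species' niche breadth as the goodness of fit between the species' distribution across compartments and the frequency distribution of samples across compartments. From the minimum discrimination information statistic, it derives a niche breadth measure that can be based either on species distribution proportions or on measured values, and illustrates it with data on species and soil factors from the Cryptocarya community in the Dinghushan Nature Reserve, South China. The results indicate that dominant tree species have relatively ...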

8.
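Xu Zhoufeng, Liu Ende, Chen Jiahui. Guihaia, 2022, 42(Z1): 164-179
Biotracks is a citizen-science application for nature observation that is now widely used in scientific surveys and nature observation projects. This paper uses Biotracks' specimen collection projects to connect field-collected data with the herbarium's digital collection system, so that information users record on their phones can be used in the digitization of herbarium specimens. This approach not only improves the transcription efficiency of digital specimens but also fundamentally changes how data are integrated across the whole specimen collection workflow, so that every stage from collecting to accessioning becomes substantially more efficient. The new collection model also naturally merges field photographs of a specimen with the digital specimen, so that information that traditional specimens could hardly convey, such as color, behavior, three-dimensional structure, and environment, can be presented to researchers again through the digital specimen. This extends the content of traditional specimens in terms of information dimensions and, combined with citizen science, may in future also extend the temporal and spatial scope of identification and discussion of collection specimens. Moreover, the potential that citizen science has shown in solving herbarium problems offers a new perspective for re-examining the value of herbaria in their field.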

9.
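Heilongjiang Province lies at the northeastern tip of China. Its administrative subdivisions are numerous and complex, and its vegetation falls into three major regions: cold-temperate coniferous forest, temperate mixed coniferous and broad-leaved forest, and temperate steppe. This checklist draws mainly on works such as Catalogue of Life China (2021 edition), Flora of Northeast China, Key to the Plants of Northeast China, Flora of Heilongjiang Province, Woody Flora of Heilongjiang Province, Atlas of Plant Distribution in Northeast China, and Herbaceous Flora of Northeast China; on papers published in recent years; on specimen data from the National Specimen Information Infrastructure (NSII), the Global Biodiversity Information Facility (GBIF), the Herbarium of Northeast Forestry University (NEFI), the Herbarium of the College of Life Sciences, Northeast Agricultural University (NEAU), and the Northeast China Herbarium of the Institute of Applied Ecology, Chinese Academy of Sciences (IFP); and on recently published literature. Because "absence" is harder to establish than "presence", when compiling the data we did not readily delete species already recorded in major works unless there was fairly reliable evidence, which may make the number of species in this dataset slightly high. Species distributions in this checklist are resolved to county level. The checklist records 2,276 species (including subspecies and varieties) of wild vascular plants in 651 genera and 132 families in Heilongjiang, including 2,122 native species (subspecies, varieties) with voucher specimens, 44 widely invasive species, and 154 species without voucher specimens but with fairly reliable records. These comprise 17 species of lycophytes in 6 genera and 2 families, 81 species of ferns in 34 genera and 16 families, 20 species of gymnosperms in 6 genera and 3 families, and 2,158 species of angiosperms in 605 genera and 111 families. Thirty-nine species in 25 genera and 22 families are nationally protected key plants. Among the wild vascular plants of Heilongjiang, Asteraceae (258 species in 67 genera), Poaceae (187 species in 61 genera), Cyperaceae (174 species in 14 genera), Ranunculaceae (124 species in 18 genera), and Rosaceae (112 species in 24 genera) contain the most species. The statistics show large differences in species numbers among counties: of the 131 county-level administrative units, about one-sixth have more than 1,000 species, and county-level distribution data are severely uneven. In this checklist the province-level list is relatively reliable, whereas the quality of the county-level distribution data still needs great improvement.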

10.
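Advances in biodiversity informatics   Total citations: 4, self-citations: 0, citations by others: 4
Biodiversity informatics is a rapidly developing new discipline that brings modern information technology into research on biodiversity and related fields. Its progress in digitizing basic biodiversity data, developing modelling tools and other software, integrating data, and building biodiversity information networks at global, regional, and national scales points to a future of free and open sharing of biodiversity data and information worldwide, and of people acting together to observe, survey, and monitor biodiversity in the field. A large amount of digitized basic biodiversity information, including species inventories, herbarium and museum specimens, multimedia images, and research literature, can already be searched and used over the Internet. Most notable are several successful international projects, such as Species 2000, the Global Biodiversity Information Facility, the Barcode of Life, and the Encyclopedia of Life. Their success lies not only in publishing large amounts of basic information and data: through cooperation with Biodiversity Information Standards (TDWG), they have also promoted the adoption of important biodiversity information standards such as Darwin Core and the establishment of regional and national biodiversity information nodes, laying an important foundation for future global sharing and exchange of biodiversity information. Building on digitized information, researchers have also developed data-mining and modelling tools for specific research fields, such as MAXENT, a tool for predicting geographic distributions from digitized specimens, and LifeDesk, for managing taxonomic expert knowledge. The development of citizen science shows how the public and science enthusiasts can participate widely in Internet-based biodiversity informatics research. The prospects for biodiversity informatics are therefore broad: it will build a solid information foundation for achieving global conservation strategy targets, responding to the biodiversity crisis, and managing and using biodiversity resources under global climate change.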

11.
The increasing abundance of large-scale, high-throughput datasets for many closely related organisms provides opportunities for comparative analysis via the simultaneous biclustering of datasets from multiple species. These analyses require a reformulation of how to organize multi-species datasets and visualize the results of comparative genomics data analyses. Recently, we developed a method, multi-species cMonkey, which integrates heterogeneous high-throughput datatypes from multiple species to identify conserved regulatory modules. Here we present an integrated data visualization system, built upon the Gaggle, enabling exploration of our method's results (available at http://meatwad.bio.nyu.edu/cmmr.html). The system can also be used to explore other comparative genomics datasets and outputs from other data analysis procedures, such as results from other multiple-species clustering programs or from independent clustering of different single-species datasets. We provide an example use of our system for two bacteria, Escherichia coli and Salmonella Typhimurium. We illustrate the use of our system by exploring conserved biclusters involved in nitrogen metabolism, uncovering a putative function for yjjI, a currently uncharacterized gene that we predict to be involved in nitrogen assimilation.

12.
CellDepot, containing over 270 datasets from 8 species and many tissues, serves as an integrated web application that empowers scientists to explore single-cell RNA-seq (scRNA-seq) datasets and compare datasets across studies through a user-friendly interface with advanced visualization and analytical capabilities. To begin with, it provides an efficient data management system through which users can upload single-cell datasets and query the database by multiple attributes such as species and cell type. In addition, the graphical multi-logic, multi-condition query builder and convenient filtering tool, backed by a MySQL database, allow users to quickly find datasets of interest and compare the expression of gene(s) across them. Moreover, by embedding the cellxgene VIP tool, CellDepot enables fast, interactive, and scalable exploration of individual datasets to gain more refined insights such as cell composition, gene expression profiles, and differentially expressed genes among cell types, leveraging more than 20 frequently applied plotting functions and high-level analysis methods in single-cell research. In summary, the web portal, available at http://celldepot.bxgenomics.com, promotes large-scale single-cell data sharing, facilitates meta-analysis and visualization, and encourages scientists to contribute to the single-cell community in a tractable and collaborative way. Finally, CellDepot is released as open-source software under the MIT license to motivate crowd contribution, broad adoption, and local deployment for private datasets.

13.
SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a sample-to-sample similarity measure from expression data observed for heterogeneous samples, is presented here. SIMLR can be effectively used to perform tasks such as dimension reduction, clustering, and visualization of heterogeneous populations of samples. SIMLR was benchmarked against state-of-the-art methods for these three tasks on several public datasets, showing it to be scalable and capable of greatly improving clustering performance, as well as providing valuable insights by making the data more interpretable via better visualization. SIMLR is available on GitHub at https://github.com/BatzoglouLabSU/SIMLR in both R and MATLAB implementations. It is also available as an R package on http://bioconductor.org.

14.
Single-cell RNA sequencing enables us to characterize cellular heterogeneity at single-cell resolution with the help of cell type identification algorithms. However, the noise inherent in single-cell RNA-sequencing data severely disturbs the accuracy of cell clustering, marker identification and visualization. We propose that clustering based on feature density profiles can distinguish informative features from noise. We named this strategy ‘entropy subspace’ separation and designed a cell clustering algorithm called ENtropy subspace separation-based Clustering for nOise REduction (ENCORE) by integrating the ‘entropy subspace’ separation strategy with a consensus clustering method. We demonstrate that ENCORE achieves superior cell clustering and generates high-resolution visualization across 12 standard datasets. More importantly, ENCORE enables identification of group markers with biological significance from a hard-to-separate dataset. With the advantages of effective feature selection, improved clustering, accurate marker identification and high-resolution visualization, we present ENCORE to the community as an important tool for scRNA-seq data analysis to study cellular heterogeneity and discover group markers.

15.
Accurate knowledge of species’ habitat associations is important for conservation planning and policy. Assessing habitat associations is a vital precursor to selecting appropriate indicator species for prioritising sites for conservation or assessing trends in habitat quality. However, much existing knowledge is based on qualitative expert opinion or local scale studies, and may not remain accurate across different spatial scales or geographic locations. Data from biological recording schemes have the potential to provide objective measures of habitat association, with the ability to account for spatial variation. We used data on 50 British butterfly species as a test case to investigate the correspondence of data-derived measures of habitat association with expert opinion, from two different butterfly recording schemes. One scheme collected large quantities of occurrence data (c. 3 million records) and the other, lower quantities of standardised monitoring data (c. 1400 sites). We used general linear mixed effects models to derive scores of association with broad-leaf woodland for both datasets and compared them with scores canvassed from experts. Scores derived from occurrence and abundance data both showed strongly positive correlations with expert opinion. However, only for occurrence data did these fall within the range of correlations between experts. Data-derived scores showed regional spatial variation in the strength of butterfly associations with broad-leaf woodland, with a significant latitudinal trend in 26% of species. Sub-sampling of the data suggested a mean sample size of 5000 occurrence records per species to gain an accurate estimation of habitat association, although habitat specialists are likely to be readily detected using several hundred records. Occurrence data from recording schemes can thus provide easily obtained, objective, quantitative measures of habitat association.

16.
Traditional k-means and most k-means variants are still computationally expensive for large datasets, such as microarray data, which combine many data points with a large dimension size d. In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k. The problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this work, we develop a novel k-means algorithm, which is simple but more efficient than the traditional k-means and the recent enhanced k-means. Our new algorithm is based on the recently established relationship between principal component analysis and k-means clustering. We provide a correctness proof for this algorithm. Results obtained from testing the algorithm on three biological datasets and six non-biological datasets (three of them real, the other three simulated) also indicate that our algorithm is empirically faster than other known k-means algorithms. We assessed the quality of our algorithm's clusters against the clusters of a known structure using the Hubert-Arabie Adjusted Rand index (ARI_HA). We found that when k is close to d, the quality is good (ARI_HA > 0.8) and when k is not close to d, the quality of our new k-means algorithm is excellent (ARI_HA > 0.9). In this paper, the emphasis is on reducing the time requirement of the k-means algorithm and on its application to microarray data, motivated by the desire to create a tool for clustering and malaria research. However, the new clustering algorithm can be used for other clustering needs as long as an appropriate measure of distance between the centroids and the members is used. This has been demonstrated in this work on the six non-biological datasets.
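The objective stated in this abstract (choose k centers that minimize the mean squared distance from each point to its nearest center) can be made concrete with a short sketch. The code below is the textbook Lloyd iteration, i.e. the "traditional k-means" baseline, and not the PCA-based algorithm the authors propose; the function names and the rule of leaving an empty cluster's center unchanged are assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def mean_sq_distance(points, centers):
    """The k-means objective: mean squared distance to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)


def lloyd(points, centers, max_iter=100):
    """Alternate assignment and centroid updates until the centers stop moving."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Centroid = coordinate-wise mean of the cluster members.
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(old)  # assumption: keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)
```

Each iteration costs on the order of n·k·d distance operations, which is the expense the abstract refers to when both n and d are large.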

17.
It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach in terms of hierarchical clustering. Multiple imputation (MI) was used in four ways: multiple agglomerative hierarchical clustering, a normal distribution model, a normal regression model, and predictive mean matching. The latter three models used both Bayesian analysis and non-Bayesian analysis, while the first approach used a clustering procedure with randomly selected attributes and assigned real values from the nearest neighbour to the one with missing observations. Different proportions of data entries in six complete datasets were randomly selected to be missing and the MI methods were compared based on the efficiency and accuracy of estimating those values. The results indicated that the models using Bayesian analysis had slightly higher estimation accuracy than those using non-Bayesian analysis but were more time-consuming. However, the novel approach of multiple agglomerative hierarchical clustering demonstrated the best overall performance.

18.
In this paper we have addressed the problem of analysing Next Generation Sequencing samples with an expected large biodiversity content. We analysed several well-known 16S rRNA datasets from experimental samples, including both long and short sequences, in numbers of tens of thousands, in addition to carefully crafted synthetic datasets containing more than 7000 OTUs. From this data analysis several patterns were identified and used to develop new guidelines for experimentation in conditions of high biodiversity. We analysed the suitability of different clustering packages for these types of situations, the problem of even sampling, and the relative effectiveness of the Chao1 and ACE estimators as well as their effect on sampling size for a variety of population distributions. As regards practical analysis procedures, we advocated an approach that retains as much high-quality experimental data as possible. By carefully applying selection rules combining the taxonomic assignment with clustering strategies, we derived a set of recommendations for ultra-sequencing data analysis at high biodiversity levels.
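For readers unfamiliar with the Chao1 estimator mentioned in this abstract, the sketch below computes it from a list of per-OTU read counts. The abstract does not say which variant the authors used, so this is an assumption: the classic form S_obs + F1^2 / (2·F2) when doubletons are present, and the bias-corrected form S_obs + F1·(F1 − 1) / 2 when there are none, where F1 and F2 are the numbers of OTUs seen exactly once and exactly twice. The function name and input format are hypothetical.

```python
def chao1(abundances):
    """Chao1 richness estimate from per-OTU counts (zeros are ignored)."""
    counts = [n for n in abundances if n > 0]
    s_obs = len(counts)                       # observed number of OTUs
    f1 = sum(1 for n in counts if n == 1)     # singletons
    f2 = sum(1 for n in counts if n == 2)     # doubletons
    if f2 > 0:
        # Classic Chao1.
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form, used here as a fallback when F2 = 0.
    return s_obs + f1 * (f1 - 1) / 2
```

Because the estimate depends only on the rarest OTUs, sequencing errors that inflate the singleton count can inflate it sharply, which is one reason estimator choice matters in high-diversity samples.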

19.
Large-scale studies are needed to increase our understanding of how large-scale conservation threats, such as climate change and deforestation, are impacting diverse tropical ecosystems. These types of studies rely fundamentally on access to extensive and representative datasets (i.e., “big data”). In this study, I assess the availability of plant species occurrence records through the Global Biodiversity Information Facility (GBIF) and the distribution of networked vegetation census plots in tropical South America. I analyze how the amount of available data has changed through time and the consequent changes in taxonomic, spatial, habitat, and climatic representativeness. I show that there are large and growing amounts of data available for tropical South America. Specifically, there are almost 2,000,000 unique geo-referenced collection records representing more than 50,000 species of plants in tropical South America and over 1,500 census plots. However, there is still a gaping “data void” such that many species and many habitats remain so poorly represented in either of the databases as to be functionally invisible for most studies. It is important that we support efforts to increase the availability of data, and the representativeness of these data, so that we can better predict and mitigate the impacts of anthropogenic disturbances.

20.
The number of large-scale experimental datasets generated from high-throughput technologies has grown rapidly. Biological knowledge resources such as the Gene Ontology Annotation (GOA) database, which provides high-quality functional annotation to proteins within the UniProt Knowledgebase, can play an important role in the analysis of such data. The integration of GOA with analytical tools has proved to aid the clustering, annotation and biological interpretation of such large expression datasets. GOA is also useful in the development and validation of automated annotation tools, in particular text-mining systems. The increasing interest in GOA highlights the great potential of this freely available resource to assist both the biological research and bioinformatics communities.
