Similar literature
 20 similar articles found
1.
Complementarity-based reserve selection algorithms efficiently prioritize sites for biodiversity conservation, but they are data-intensive and most regions lack accurate distribution maps for the majority of species. We explored implications of basing conservation planning decisions on incomplete and biased data using occurrence records of the plant family Proteaceae in South Africa. Treating this high-quality database as 'complete', we introduced three realistic sampling biases characteristic of biodiversity databases: a detectability sampling bias and two forms of roads sampling bias. We then compared reserve networks constructed using complete, biased, and randomly sampled data. All forms of biased sampling performed worse than both the complete data set and equal-effort random sampling. Biased sampling failed to detect a median of 1-5% of species, and resulted in reserve networks that were 9-17% larger than those designed with complete data. Spatial congruence and the correlation of irreplaceability scores between reserve networks selected with biased and complete data were low. Thus, reserve networks based on biased data require more area to protect fewer species and identify different locations than those selected with randomly sampled or complete data.
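As a rough illustration of the complementarity principle named in this abstract, a minimal greedy sketch follows. It is not the algorithm used in the study: the function `greedy_reserve_selection`, the site names, and the species sets are all hypothetical. At each step it selects the site that adds the most species not yet represented in the network.

```python
def greedy_reserve_selection(site_species):
    """Pick sites greedily until every recorded species is covered.

    site_species: dict mapping a site name to the set of species recorded there.
    Returns the list of chosen sites in selection order.
    """
    remaining = set().union(*site_species.values())
    chosen = []
    while remaining:
        # Complementarity: score a site by the species it adds beyond those
        # already covered. Iterating over sorted names makes ties deterministic.
        best = max(sorted(site_species), key=lambda s: len(site_species[s] & remaining))
        gained = site_species[best] & remaining
        if not gained:
            break
        chosen.append(best)
        remaining -= gained
    return chosen


# Hypothetical occurrence records (illustrative only, not data from the study).
complete = {
    "A": {"sp1", "sp2", "sp3"},
    "B": {"sp3", "sp4"},
    "C": {"sp4", "sp5"},
    "D": {"sp1", "sp5"},
}
# Same sites, but the survey failed to record sp3 at site A.
biased = dict(complete, A={"sp1", "sp2"})

network_complete = greedy_reserve_selection(complete)  # ['A', 'C']
network_biased = greedy_reserve_selection(biased)      # ['A', 'B', 'C']
```

In this toy case a single missed record enlarges the selected network from two sites to three, which is the kind of effect the abstract reports; the size of the effect here says nothing about real data.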

2.
3.
Taxon sampling may be critically important for phylogenetic accuracy because adding taxa can help to subdivide misleading long branches. Although the idea that added taxa can break up long branches was exemplified by a study of "incomplete" fossil taxa, the issue of taxon completeness (i.e., proportion of missing data) has been largely ignored in most subsequent discussions of taxon sampling and long-branch attraction. In this article, I use simulations to test the ability of incomplete taxa to subdivide long branches and improve phylogenetic accuracy in situations of potential long-branch attraction. The results show that for most methods and conditions examined, adding taxa that are only 50% complete may provide similar benefits to adding the same number of complete taxa (suggesting that the advantages of increased taxon sampling may be obtained with less data than previously considered). For parsimony, taxa that are less complete (5% to 25% complete) may often have limited ability to rescue analyses from long-branch attraction. In contrast, highly incomplete taxa can be surprisingly beneficial when using model-based methods. The results also suggest the importance of model-based methods in phylogenetic analyses that combine molecular and fossil data.

4.
Missing data are a great concern in longitudinal studies, because few subjects will have complete data and missingness could be an indicator of an adverse outcome. Analyses that exclude potentially informative observations due to missing data can be inefficient or biased. To assess the extent of these problems in the context of genetic analyses, we compared case-wise deletion to two multiple imputation methods available in the popular SAS package, the propensity score and regression methods. For both the real and simulated data sets, the propensity score and regression methods produced results similar to case-wise deletion. However, for the simulated data, the estimates of heritability for case-wise deletion and the two multiple imputation methods were much lower than for the complete data. This suggests that if missingness patterns are correlated within families, then imputation methods that do not allow this correlation can yield biased results.

5.
Mitochondrial genomes are useful tools for inferring evolutionary history. However, many taxa are poorly represented by available data. Thus, to further understand the phylogenetic potential of complete mitochondrial genome sequence data in Annelida (segmented worms), we examined the complete mitochondrial sequence for Clymenella torquata (Maldanidae) and an estimated 80% of the sequence of Riftia pachyptila (Siboglinidae). These genomes have remarkably similar gene orders to previously published annelid genomes, suggesting that gene order is conserved across annelids. This result is interesting, given the high variation seen in the closely related Mollusca and Brachiopoda. Phylogenetic analyses of DNA sequence, amino acid sequence, and gene order all support the recent hypothesis that Sipuncula and Annelida are closely related. Our findings suggest that gene order data is of limited utility in annelids but that sequence data holds promise. Additionally, these genomes show AT bias (approximately 66%) and codon usage biases but have a typical gene complement for bilaterian mitochondrial genomes.

6.
A new method for analyzing steady-state enzyme kinetic data is presented. The technique, which is based on the numerical differentiation of the complete reaction curve, has several advantages over initial velocity and integrated Michaelis-Menten equation methods. The differentiated data are fit to the differential equation describing the appropriate kinetic scheme. This approach is particularly valuable in cases of strong competitive product inhibition and of changing concentrations of active enzyme. The method assumes a reversible reaction and is applicable to a very wide variety of steady-state kinetic schemes. A particular advantage of this approach over integrated methods is that it is independent of [S0] and hence of errors in [S0]. The combination of complete progress curve and computer analysis makes this approach very efficient with respect to both time and materials. Running on an IBM PC XT or equivalent microcomputer with an 8087 coprocessor, the analyses are very fast, with the whole process usually finishing in a minute or two. The utility of the technique is demonstrated by application to both simulated and real data. We show that the differentiation of the progress curve for the ribonuclease-catalyzed hydrolysis of 2',3'-cyclic cytidine monophosphate reveals strong product inhibition by 3'-CMP, and this product inhibition accounts for the large discrepancies reported in the literature for the value of Km for this substrate. The method was also applied to determine the rate of reactivation of beta-lactamase which had been reversibly inactivated by cloxacillin. Since large numbers of data points are required for the numerical differentiation, the method has become practical only with the advent of computer-acquired data systems.
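The idea of differentiating a progress curve and fitting the rates can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' procedure: it uses a plain Michaelis-Menten scheme without product inhibition, simulated data, central differences, and a linearized (Hanes-Woolf) least-squares fit in place of whatever fitting method the paper used. All function names and parameter values are hypothetical.

```python
def simulate_progress_curve(s0, vmax, km, dt, steps):
    """Integrate dS/dt = -vmax*S/(km+S) with small Euler steps.

    Returns (times, substrate) as two lists of equal length.
    """
    times, substrate = [0.0], [s0]
    s = s0
    for i in range(1, steps + 1):
        s -= dt * vmax * s / (km + s)
        times.append(i * dt)
        substrate.append(s)
    return times, substrate


def central_difference_rates(times, substrate):
    """Estimate v = -dS/dt at interior points; returns (S, v) pairs."""
    points = []
    for i in range(1, len(times) - 1):
        v = -(substrate[i + 1] - substrate[i - 1]) / (times[i + 1] - times[i - 1])
        points.append((substrate[i], v))
    return points


def fit_michaelis_menten(points):
    """Least-squares fit of the Hanes-Woolf line S/v = S/vmax + km/vmax.

    Returns (vmax, km).
    """
    xs = [s for s, v in points]
    ys = [s / v for s, v in points]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    return vmax, intercept * vmax


# Simulated curve with assumed "true" values vmax = 1.0 and km = 2.0.
times, substrate = simulate_progress_curve(s0=10.0, vmax=1.0, km=2.0, dt=0.01, steps=2000)
vmax_est, km_est = fit_michaelis_menten(central_difference_rates(times, substrate))
```

Because every fitted point is an (S, v) pair taken from the curve itself, the fit does not need the starting concentration as an input, which matches the abstract's remark about independence from [S0]. With noisy real data the differentiation step would need smoothing, which this sketch omits.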

7.
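Lu Congcong  Liu Qian  Huang Xiaolei 《Biodiversity Science (生物多样性)》2022,30(7):22204-216
Complete mitochondrial genomes have been widely used in studies of molecular evolution, genomics, and phylogenetics. Aphids are an important group of agricultural and forestry pests, but publicly reported complete aphid mitochondrial genomes remain very limited, so obtaining more genomic data is valuable for related research. This paper reports the sequences, detailed annotation, gene structure maps, and codon usage of the complete mitochondrial genomes of three aphid species: Greenidea ficicola, Aphis aurantia, and Mindarus keteleerifoliae. The dataset can support work on insect phylogenetic relationships, patterns of population differentiation, and pest control.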

8.
Surveys of genetic variation in natural populations represent a valuable and often irreplaceable resource. It may be desirable to reanalyze data as new methods are developed for comparisons with other populations or for comparisons with the same populations at different times. We evaluated existing mechanisms of data preservation in a survey of 627 published surveys of mitochondrial DNA variation in animals and found that over half of the datasets (56%) contained insufficient information for reanalysis. In many cases, publication of complete data would not have added excessively to the length of the publication. Because at present, publications represent the main archive of population genetic data, we offer recommendations for how the essential data from mtDNA surveys can be presented in a form that is complete and concise.

9.
Supermatrices are often characterized by a large amount of missing data. One possible approach to minimize such missing data is to create composite taxa. These taxa are formed by sampling sequences from different species in order to obtain a composite sequence that includes a maximum number of genes. Although this approach is increasingly used, its accuracy has rarely been tested and some authors prefer to analyze incomplete supermatrices by coding unavailable sequences as missing. To further validate the composite taxon approach, it was applied to complete mitochondrial matrices of 102 mammal species representing 93 families with varying amounts of missing data. On average, missing data and composite matrices showed similar congruence to model trees obtained from the complete sequence matrix. As expected, the level of congruence to model trees decreased as missing data increased, with both approaches. We conclude that the composite taxon approach is worth considering in a phylogenomic context since it performs well and reduces computing time when compared to missing data matrices.
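The composite taxon idea described in this abstract can be sketched as follows. This is an illustrative assumption about how such a row might be assembled, not the procedure from the paper; `build_composite_taxon`, `missing_fraction`, the species names, and the toy sequences are hypothetical.

```python
def build_composite_taxon(species_genes, gene_order):
    """Combine several species of one higher taxon into a single composite row.

    species_genes: dict mapping species name -> dict of gene name -> sequence
                   (a gene absent from the inner dict counts as missing data).
    gene_order:    list of gene names defining the supermatrix columns.
    Returns (composite, sources): gene -> sequence or None, and
    gene -> donor species or None.
    """
    composite, sources = {}, {}
    for gene in gene_order:
        composite[gene], sources[gene] = None, None
        # Take the first species (in name order) that has this gene.
        for species in sorted(species_genes):
            seq = species_genes[species].get(gene)
            if seq:
                composite[gene], sources[gene] = seq, species
                break
    return composite, sources


def missing_fraction(row, gene_order):
    """Share of supermatrix columns with no sequence in this row."""
    return sum(1 for g in gene_order if not row.get(g)) / len(gene_order)


# Toy data for two species of one hypothetical family.
genes = ["cytb", "cox1", "nd1", "atp6"]
family = {
    "species_a": {"cytb": "ATGACC", "cox1": "ATGTTC"},
    "species_b": {"nd1": "ATGCCA", "cox1": "ATGTTT"},
}
composite, sources = build_composite_taxon(family, genes)
# composite covers cytb and cox1 (from species_a) and nd1 (from species_b);
# atp6 stays missing because neither species has it.
```

Here each species alone is missing half of the columns, while the composite row is missing a quarter. The sketch ignores the question the abstract actually tests, namely whether mixing donors distorts the inferred tree.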

10.
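Objective: To obtain the mitochondrial genome sequence of the Chinese hamster and provide molecular data for mitochondrial disease models. Methods: With reference to the mitochondrial genome sequences of closely related species, 27 pairs of specific primers were designed, and the complete mitochondrial genome sequence of the Chinese hamster was obtained by TD-PCR and sequencing; the features of the genome and the location of each gene were analyzed. The mitochondrial genome sequences of 5 other rodent species published in GenBank were also used to explore the phylogenetic relationships among rodent families. Results: The Chinese hamster mitochondrial genome is 16 283 bp long, with a base composition of 33.53% A, 30.50% T, 12.98% G, and 22.80% C, and comprises 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes, and 1 non-coding control region. The Chinese hamster is most closely related to the golden hamster. Conclusion: The length and position of each gene in the Chinese hamster mitochondrial genome are similar to those of typical rodents, and its protein-coding regions and rRNA genes are highly homologous to those of other rodents, indicating that the mitochondrial genome is evolutionarily highly conserved. The molecular phylogenetic tree of the 5 animals is consistent with their traditional taxonomic positions.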
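The base composition reported in this abstract (33.53% A, 30.50% T, 12.98% G, 22.80% C) can be summarized with a few lines of arithmetic. The sketch below is only a worked check on the published figures; the helper `base_composition` and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter


def base_composition(sequence):
    """Return the percentage of each base (A, C, G, T) in a DNA string."""
    counts = Counter(sequence.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


# Percentages as given in the abstract.
reported = {"A": 33.53, "T": 30.50, "G": 12.98, "C": 22.80}

at_content = round(reported["A"] + reported["T"], 2)  # 64.03
gc_content = round(reported["G"] + reported["C"], 2)  # 35.78
total = round(sum(reported.values()), 2)              # 99.81

# The same summary computed from a sequence (toy 10-base example).
toy = base_composition("ATGATTAACC")
```

The reported values give an A+T content of 64.03% and sum to 99.81% rather than 100%; the abstract does not explain the remainder.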

11.
The complete mitochondrial genome of Tupaia belangeri, a representative of the eutherian order Scandentia, was determined and compared with full-length mitochondrial sequences of other eutherian orders described to date. The complete mitochondrial genome is 16,754 nt in length, with no obvious deviation from the general organization of the mammalian mitochondrial genome. Thus, features such as start codon usage, incomplete stop codons, and overlapping coding regions, as well as the presence of tandem repeats in the control region, are within the range of mammalian mitochondrial (mt) DNA variation. To address the question of a possible close phylogenetic relationship between primates and Tupaia, the evolutionary affinities among primates, Tupaia and bats as representatives of the Archonta superorder, ferungulates, guinea pigs, armadillos, rats, mice, and hedgehogs were examined on the basis of the complete mitochondrial DNA sequences. The opossum sequence was used as an outgroup. The trees, estimated from 12 concatenated genes encoded on the mitochondrial H-strand, add further molecular evidence against an Archonta monophyly. With the new data described in this paper, most of both the mitochondrial and the nuclear data point away from Scandentia as the closest extant relatives to primates. Instead, the complete mitochondrial data support a clustering of Scandentia with Lagomorpha connecting to the branch leading to ferungulates. This closer phylogenetic relationship of Tupaia to rabbits than to primates first received support from several analyses of nuclear and partial mitochondrial DNA data sets. Given that short sequences are of limited use in determining deep mammalian relationships, the partial mitochondrial data available to date support this hypothesis only tentatively. Our complete mitochondrial genome data therefore add considerably more evidence in support of this hypothesis.

12.
From basepairs to birdsongs: phylogenetic data in the age of genomics   Total citations: 4, self-citations: 0, citations by others: 4
Given the quantity of molecular data now available, including complete genomes for some organisms, one can ask whether there is a need for any data beyond complete genomic sequences for phylogenetic analysis. One reason to look beyond the genome is that not all character information is encoded in organismal genomes. We propose a hierarchy of characters that ranges from biologically transmitted but nongenomically encoded characters, such as bird songs, to characters that are genomically encoded. All of these characters can retain historical information and are potentially useful for phylogenetic analysis. In addition, a number of phenotypic levels that are expressions of the genome can be identified. The question whether it is worth including any of these levels if all of the underlying sequence data have been collected arises, since issues of redundancy occur. Utilization of phenotypic levels that are ultimately based on sequences may facilitate reconstructing homologies that are not evident from sequence data alone. We propose the use of simultaneous analysis of sequence data and as many levels of phenotypic characters as possible to take advantage of homology information that may be more easily recovered from the latter. A method that eliminates redundancy to the degree that it can be detected is proposed.

13.
Consider the problem of making inference about the initial relative infection rate of a stochastic epidemic model. A relatively complete analysis of infectious disease data is possible when it is assumed that the latent and infectious periods are non-random. Here two related martingale-based techniques are used to derive estimates and associated standard errors for the initial relative infection rate. The first technique requires complete information on the epidemic, the second only the total number of people who were infected and the population size. Explicit expressions for the estimates are obtained. The estimates of the parameter and its associated standard error are easily computed and compare well with results of other methods in an application to smallpox data. Asymptotic efficiency differences between the two martingale techniques are considered.

14.
Unprecedented global surveillance of viruses will result in massive sequence data sets that require new statistical methods. These data sets press the limits of Bayesian phylogenetics as the high-dimensional parameters that comprise a phylogenetic tree increase the already sizable computational burden of these techniques. This burden often results in partitioning the data set, for example, by gene, and inferring the evolutionary dynamics of each partition independently, a compromise that results in stratified analyses that depend only on data within a given partition. However, parameter estimates inferred from these stratified models are likely strongly correlated, considering they rely on data from a single data set. To overcome this shortfall, we exploit the existing Monte Carlo realizations from stratified Bayesian analyses to efficiently estimate a nonparametric hierarchical wavelet-based model and learn about the time-varying parameters of effective population size that reflect levels of genetic diversity across all partitions simultaneously. Our methods are applied to complete genome influenza A sequences that span 13 years. We find that broad peaks and trends, as opposed to seasonal spikes, in the effective population size history distinguish individual segments from the complete genome. We also address hypotheses regarding intersegment dynamics within a formal statistical framework that accounts for correlation between segment-specific parameters.

15.
The complete sequence of the mitochondrial genome of the giant tiger prawn, Penaeus monodon (Arthropoda, Crustacea, Malacostraca), is presented. The gene content and gene order are identical to those observed in Drosophila yakuba. The overall AT composition is lower than that observed in the known insect mitochondrial genomes, but higher than that observed in the other two crustaceans for which complete mitochondrial sequence is available. Analysis of the effect of nucleotide bias on codon composition across the Arthropoda reveals a trend with the crustaceans represented showing the lowest proportion of AT-rich codons in mitochondrial protein genes. Phylogenetic analysis among arthropods using concatenated protein-coding sequences provides further support for the possibility that Crustacea are paraphyletic. Furthermore, in contrast to data from the nuclear gene EF1alpha, the first complete sequence of a malacostracan mitochondrial genome supports the possibility that Malacostraca are more closely related to Insecta than to Branchiopoda.

16.
A common problem in clinical trials is the missing data that occurs when patients do not complete the study and drop out without further measurements. Missing data cause the usual statistical analysis of complete or all available data to be subject to bias. There are no universally applicable methods for handling missing data. We recommend the following: (1) Report reasons for dropouts and proportions for each treatment group; (2) Conduct sensitivity analyses to encompass different scenarios of assumptions and discuss consistency or discrepancy among them; (3) Pay attention to minimize the chance of dropouts at the design stage and during trial monitoring; (4) Collect post-dropout data on the primary endpoints, if at all possible; and (5) Consider the dropout event itself an important endpoint in studies with many.

17.

Objectives

Participants with complete accelerometer data often represent a low proportion of the total sample and, in some cases, may be distinguishable from participants with incomplete data. Because traditional reliability methods characterize the consistency of complete data, little is known about reliability properties for an entire sample. This study employed Generalizability theory to report an index of reliability characterizing complete (7 days) and observable (1 to 7 days) accelerometer data.

Design

Cross-sectional.

Methods

Accelerometer data from the Study of Early Child Care and Youth Development were analyzed in this study. Missing value analyses were conducted to describe the pattern and mechanism of missing data. Generalizability coefficients were derived from variance components to report reliability parameters for complete data and also for the entire observable sample. Analyses were conducted separately by age (9, 11, 12, and 15 yrs) and daily wear time criteria (6, 8, 10, and 12 hrs).

Results

Participants with complete data were limited (<34%) and, most often, data were not considered to be missing completely at random. Across conditions, reliability coefficients for complete data were between 0.74 and 0.87. Relatively lower reliability properties were found across all observable data, ranging from 0.52 to 0.67. Sample variability increased with longer wear time criteria, but decreased with advanced age.

Conclusions

A reliability coefficient that includes all participants, not just those with complete data, provides a global perspective of reliability that could be used to further understand group level associations between activity and health outcomes.
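For readers unfamiliar with how a generalizability coefficient is derived from variance components, a minimal sketch follows. It assumes the simplest one-facet design (persons crossed with days), in which the coefficient is the person variance divided by the person variance plus the residual variance averaged over the number of days. The study's actual design and estimates may differ, and the variance components used here are invented for illustration.

```python
def generalizability_coefficient(var_person, var_residual, n_days):
    """Relative G coefficient for a persons-by-days design.

    var_person:   variance attributable to stable differences between people.
    var_residual: person-by-day interaction confounded with error.
    n_days:       number of monitored days averaged per person.
    """
    if n_days < 1:
        raise ValueError("n_days must be at least 1")
    return var_person / (var_person + var_residual / n_days)


# Hypothetical variance components (not values from the study).
g_one_day = generalizability_coefficient(var_person=1.0, var_residual=1.0, n_days=1)
g_seven_days = generalizability_coefficient(var_person=1.0, var_residual=1.0, n_days=7)
```

With equal person and residual variance, one day of data gives a coefficient of 0.5 and seven days gives 0.875, which shows why a sample that includes participants with only a few valid days tends to have lower reliability than the complete-data subset.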

18.
Satten GA  Carroll RJ 《Biometrics》2000,56(2):384-388
We consider methods for analyzing categorical regression models when some covariates (Z) are completely observed but other covariates (X) are missing for some subjects. When data on X are missing at random (i.e., when the probability that X is observed does not depend on the value of X itself), we present a likelihood approach for the observed data that allows the same nuisance parameters to be eliminated in a conditional analysis as when data are complete. An example of a matched case-control study is used to demonstrate our approach.

19.
Sequence-based species identification relies on the extent and integrity of sequence data available in online databases such as GenBank. When identifying species from a sample of unknown origin, partial DNA sequences obtained from the sample are aligned against existing sequences in databases. When the sequence from the matching species is not present in the database, high-scoring alignments with closely related sequences might produce unreliable results on species identity. For species identification in mammals, the cytochrome b (cyt b) gene has been identified to be highly informative; thus, large amounts of reference sequence data from the cyt b gene are much needed. To enhance availability of cyt b gene sequence data on a large number of mammalian species in GenBank and other such publicly accessible online databases, we identified a primer pair for complete cyt b gene sequencing in mammals. Using this primer pair, we successfully PCR amplified and sequenced the complete cyt b gene from 40 of 44 mammalian species representing 10 orders of mammals. We submitted 40 complete, correctly annotated, cyt b protein coding sequences to GenBank. To our knowledge, this is the first single primer pair to amplify the complete cyt b gene in a broad range of mammalian species. This primer pair can be used for the addition of new cyt b gene sequences and to enhance data available on species represented in GenBank. The availability of novel and complete gene sequences as high-quality reference data can improve the reliability of sequence-based species identification.

20.
《Journal of Physiology》2013,107(5):369-398
An important property of visual systems is to be simultaneously both selective to specific patterns found in the sensory input and invariant to possible variations. Selectivity and invariance (tolerance) are opposing requirements. It has been suggested that they could be joined by iterating a sequence of elementary selectivity and tolerance computations. It is, however, unknown what should be selected or tolerated at each level of the hierarchy. We approach this issue by learning the computations from natural images. We propose and estimate a probabilistic model of natural images that consists of three processing layers. Two natural image data sets are considered: image patches, and complete visual scenes downsampled to the size of small patches. For both data sets, we find that in the first two layers, simple and complex cell-like computations are performed. In the third layer, we mainly find selectivity to longer contours; for patch data, we further find some selectivity to texture, while for the downsampled complete scenes, some selectivity to curvature is observed.
