Similar literature
1.
The abundance of different SSU rRNA (“16S”) gene sequences in environmental samples is widely used in studies of microbial ecology as a measure of microbial community structure and diversity. However, the genomic copy number of the 16S gene varies greatly – from one in many species to up to 15 in some bacteria and to hundreds in some microbial eukaryotes. As a result of this variation the relative abundance of 16S genes in environmental samples can be attributed both to variation in the relative abundance of different organisms, and to variation in genomic 16S copy number among those organisms. Despite this fact, many studies assume that the abundance of 16S gene sequences is a surrogate measure of the relative abundance of the organisms containing those sequences. Here we present a method that uses data on sequences and genomic copy number of 16S genes along with phylogenetic placement and ancestral state estimation to estimate organismal abundances from environmental DNA sequence data. We use theory and simulations to demonstrate that 16S genomic copy number can be accurately estimated from the short reads typically obtained from high-throughput environmental sequencing of the 16S gene, and that organismal abundances in microbial communities are more strongly correlated with estimated abundances obtained from our method than with gene abundances. We re-analyze several published empirical data sets and demonstrate that the use of gene abundance versus estimated organismal abundance can lead to different inferences about community diversity and structure and the identity of the dominant taxa in microbial communities. Our approach will allow microbial ecologists to make more accurate inferences about microbial diversity and abundance based on 16S sequence data.  相似文献   
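
The correction this abstract describes, dividing 16S gene counts by genomic copy number, can be sketched as follows. This is a minimal illustration, not the authors' method: the taxon names, read counts, and copy numbers are invented, and the phylogenetic placement and ancestral state estimation that would supply the copy-number estimates are omitted.

```python
def organism_abundance(gene_counts, copy_number):
    """Convert 16S gene counts to relative organismal abundances.

    gene_counts: dict taxon -> number of 16S reads (hypothetical data)
    copy_number: dict taxon -> 16S copies per genome (assumed already estimated)
    """
    # Dividing by copy number turns gene counts into genome-equivalent counts.
    cells = {t: gene_counts[t] / copy_number[t] for t in gene_counts}
    total = sum(cells.values())
    # Normalize so the abundances sum to 1.
    return {t: c / total for t, c in cells.items()}


# Two taxa with equal gene counts but a tenfold difference in copy number.
gene_counts = {"taxon_A": 100, "taxon_B": 100}
copy_number = {"taxon_A": 1, "taxon_B": 10}
abund = organism_abundance(gene_counts, copy_number)
```

In this made-up case the two taxa look equally abundant by gene count, yet taxon_A accounts for 100/110 of the genome equivalents once copy number is taken into account.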

2.
3.
Exome sequencing constitutes an important technology for the study of human hereditary diseases and cancer. However, the ability of this approach to identify copy number alterations in primary tumor samples has not been fully addressed. Here we show that somatic copy number alterations can be reliably estimated using exome sequencing data through a strategy that we have termed exome2cnv. Using data from 86 paired normal and primary tumor samples, we identified losses and gains of complete chromosomes or large genomic regions, as well as smaller regions affecting a minimum of one gene. Comparison with high-resolution comparative genomic hybridization (CGH) arrays revealed a high sensitivity and a low number of false positives in the copy number estimation between both approaches. We explore the main factors affecting sensitivity and false positives with real data, and provide a side by side comparison with CGH arrays. Together, these results underscore the utility of exome sequencing to study cancer samples by allowing not only the identification of substitutions and indels, but also the accurate estimation of copy number alterations.  相似文献   

4.
Discovery of lineage-specific somatic copy number variation (CNV) in mammals has led to debate over whether CNVs are mutations that propagate disease or whether they are a normal, and even essential, aspect of cell biology. We show that 1,000N polyploid trophoblast giant cells (TGCs) of the mouse placenta contain 47 regions, totaling 138 Megabases, where genomic copies are underrepresented (UR). UR domains originate from a subset of late-replicating heterochromatic regions containing gene deserts and genes involved in cell adhesion and neurogenesis. While lineage-specific CNVs have been identified in mammalian cells, classically in the immune system where V(D)J recombination occurs, we demonstrate that CNVs form during gestation in the placenta by an underreplication mechanism, not by recombination nor deletion. Our results reveal that large scale CNVs are a normal feature of the mammalian placental genome, which are regulated systematically during embryogenesis and are propagated by a mechanism of underreplication.  相似文献   

5.
Whole genome sequencing of matched tumor-normal sample pairs is becoming routine in cancer research. However, analysis of somatic copy-number changes from sequencing data is still challenging because of insufficient sequencing coverage, unknown tumor sample purity and subclonal heterogeneity. Here we describe a computational framework, named SomatiCA, which explicitly accounts for tumor purity and subclonality in the analysis of somatic copy-number profiles. Taking read depths (RD) and lesser allele frequencies (LAF) as input, SomatiCA will output 1) admixture rate for each tumor sample, 2) somatic allelic copy-number for each genomic segment, 3) fraction of tumor cells with subclonal change in each somatic copy number aberration (SCNA), and 4) a list of substantial genomic aberration events including gain, loss and LOH. SomatiCA is available as a Bioconductor R package at http://www.bioconductor.org/packages/2.13/bioc/html/SomatiCA.html.  相似文献   

6.
The abundance, diversity and composition of bacterial and archaeal communities in a freshwater iron-rich microbial mat were investigated using culture-dependent and culture-independent methods. The sampling site is a mixing zone where ferrous-iron-rich fluids encounter oxygen-rich environments. Quantitative PCR analysis shows that Bacteria dominated the mat community (>99% of the total cell numbers). Phylotypes related to iron-oxidizers in Gallionellaceae, methano/methylotrophs in Methylophilaceae and Methylococcaceae, sulfide-oxidizers in Sulfuricurvum and an uncultured clone group, called Terrestrial group I or the 1068 group, in the Epsilonproteobacteria were detected in the clone library from the original sample and/or the enrichment cultures. This result suggests that these members may play a role in Fe, S and C cycling in the mixing zone. Although Archaea were minor constituents numerically, phylogenetic analysis indicates that unique and diverse yet-uncultivated Archaea are present in the iron-rich mat. The phylotypes of these yet-uncultivated Archaea belong to environmental clone groups that have been recovered from other mixing zones in terrestrial and marine environments, and some of our phylotypes have significantly low similarity (80% or lower) with the archaeal clones reported previously. Our results provide further insights into the bacterial and archaeal communities in a microaerobic iron-rich freshwater environment in mixing zones.  相似文献   

7.
8.
Population sizing from still aerial pictures is of wide applicability in ecological and social sciences. The problem is long-standing because current automatic detection and counting algorithms are known to fail in most cases, and exhaustive manual counting is tedious, slow, difficult to verify and unfeasible for large populations. An alternative is to multiply population density by some reference area but, unfortunately, sampling details, handling of edge effects, etc., are seldom described. For the first time we address the problem using principles of geometric sampling. These principles are old and solid, but largely unknown outside the areas of three-dimensional microscopy and stereology. Here we adapt them to estimate the size of any population of individuals lying on an essentially planar area, e.g. people, animals, trees on a savanna, etc. The proposed design is unbiased irrespective of population size, pattern, perspective artifacts, etc. The implementation is very simple: it is based on the random superimposition of coarse quadrat grids. An objective error assessment is also often lacking; for that purpose the quadrat counts are often assumed to be independent. We demonstrate that this approach can perform very poorly, and we propose (and check via Monte Carlo resampling) a new theoretical error prediction formula. As for efficiency, counting about 50 (100) individuals in 20 quadrats can yield relative standard errors of about 8% (5%) in typical cases. This fact effectively breaks the barrier hitherto imposed by the current lack of automatic face detection algorithms, because semiautomatic sampling and manual counting become an attractive option.
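
The quadrat-grid idea in this abstract can be sketched as a count scaled by the reciprocal of the sampling fraction. This is a hedged illustration under stated assumptions, not the paper's implementation: square quadrats of side `side` are assumed to repeat on a square grid with period `spacing`, the grid is shifted by a uniform random `offset`, and the function name and the point coordinates are hypothetical. The paper's error prediction formula is not reproduced.

```python
import random


def estimate_population(points, spacing, side, offset):
    """Estimate population size from counts inside a periodic quadrat grid.

    points:  list of (x, y) positions of individuals (hypothetical data)
    spacing: period of the grid in both directions
    side:    side length of each square quadrat (side <= spacing)
    offset:  (ox, oy) shift of the grid, drawn uniformly from [0, spacing)
    """
    ox, oy = offset
    count = 0
    for x, y in points:
        # A point is sampled if it falls in the quadrat of its grid cell.
        if (x - ox) % spacing < side and (y - oy) % spacing < side:
            count += 1
    # Quadrats cover (side / spacing)^2 of the plane, so scale the count up.
    sampling_fraction = (side / spacing) ** 2
    return count / sampling_fraction


# A uniform random offset is what makes the design unbiased on average.
rng = random.Random(0)
random_offset = (rng.uniform(0, 2.0), rng.uniform(0, 2.0))
points = [(0.5, 0.5), (1.5, 0.5), (2.5, 2.5)]
estimate = estimate_population(points, 2.0, 1.0, random_offset)
```

A single estimate can be far from the true count, as with any sampling design; the unbiasedness refers to the average over random grid positions.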

9.
Copy number variations (CNVs) are one of the main contributors to genetic diversity in animals and are broadly distributed in the genomes of swine. Investigating the performance and evolutionary impacts of pig CNVs requires comprehensive knowledge of their structure and function within and between breeds. In the current study, 4 different programs (i.e., GADA, PennCNV, QuantiSNP, and cnvPartition) were used to analyze Porcine SNP60 genotyping data of 585 pigs from one Large White × Minzhu intercross population to detect copy number variant regions (CNVRs). Overlapping CNVRs called by at least 2 programs were used to construct a powerful and comprehensive CNVR map, which contained 249 CNVRs (i.e., 70 gains, 43 losses, and 136 gains/losses) and covered 26.22% of the regions in the swine genome. Ten CNVRs, representing different predicted statuses, were selected for validation via quantitative real-time PCR (QPCR); 9/10 CNVRs (i.e., 90%) were validated. When traced back to the F0 generation, 58 events were identified only in Minzhu F0 parents and 2 events were identified only in Large White F0 parents. A series of CNVR function analyses were performed. The functions of some CNVRs were predicted, and several interesting CNVRs for meat quality traits and hematological parameters were obtained. A comprehensive genome-wide CNV map with a lower rate of false calls was constructed for Large White and Minzhu pig genomes in this study. Our results may provide an important basis for determining the relationship between CNVRs and important qualitative and quantitative traits. In addition, they can help to further understand genetic processes in pigs.

10.
The problem of estimating M, the number of classes in a population, is formulated as an occupancy problem in which N items are drawn from M urns. Under the assumption of a uniform distribution for the number of classes in the population, the sufficient statistic for M, which is the number of distinct classes observed, does not depend upon the number of repetitions in the sample. Point and interval estimates of M are developed using maximum likelihood and the method of moments. Both techniques give rise to the same basic equation which requires a simple iterative solution. These same techniques are used in the more general situation in which the classes can be further subdivided according to type.
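
The iterative solution mentioned in this abstract can be illustrated with the standard occupancy identity: if each of N items falls into one of M equally likely classes, the expected number of distinct classes observed is M(1 - (1 - 1/M)^N). The sketch below equates that expectation to the observed count d and solves for M by bisection. Whether this is exactly the equation derived in the paper is an assumption, and the function names are hypothetical.

```python
def expected_distinct(m, n):
    """Expected number of distinct classes when n items fall into m equal classes."""
    return m * (1.0 - (1.0 - 1.0 / m) ** n)


def estimate_classes(n_items, n_distinct, rel_tol=1e-9):
    """Solve expected_distinct(M, n_items) = n_distinct for M by bisection."""
    if not 1 <= n_distinct <= n_items:
        raise ValueError("need 1 <= n_distinct <= n_items")
    if n_distinct == n_items:
        # No repeats observed: the equation has no finite solution.
        return float("inf")
    lo = hi = float(n_distinct)
    # expected_distinct increases with M toward n_items, so doubling must succeed.
    while expected_distinct(hi, n_items) < n_distinct:
        hi *= 2.0
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if expected_distinct(mid, n_items) < n_distinct:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When every sampled item belongs to a different class the estimate is infinite, which matches the intuition that a sample without repeats carries no upper bound on M.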

11.
12.
J. D. Boeke, D. J. Eichinger, G. Natsoulis. Genetics, 1991, 129(4): 1043-1052
Haploid yeast strains bearing approximately double the normal number of Ty1 elements have been constructed using marked GAL/Ty1 fusion plasmids. The strains maintain their high transposon copy number and overall genome structure in the absence of selection. The strains bearing extra Ty1 copies are surprisingly similar phenotypically to the parental strain. The results suggest that the limit to transposon copy number, if any, has not been reached. When these strains are crossed by wild-type strains (i.e., bearing the normal complement of Ty1 elements) or by strains of opposite mating type also bearing excess Ty1 elements, normal to very slightly reduced spore viability is observed, indicating that increasing the extent of transposon homology scattered around the genome does not result in significant increases in frequency of ectopic reciprocal recombination. The results suggest that yeast cells have evolved mechanisms for coping with excess transposon copies in the genome.

13.
Although copy number variation (CNV) has recently received much attention as a form of structural variation within the human genome, knowledge is still inadequate on fundamental CNV characteristics such as occurrence rate, genomic distribution and ethnic differentiation. In the present study, we used the Affymetrix GeneChip® Mapping 500K Array to discover and characterize CNVs in the human genome and to study ethnic differences of CNVs between Caucasians and Asians. A total of 3019 CNVs, including 2381 CNVs on autosomes and 638 CNVs on the X chromosome, were identified from 985 Caucasian and 692 Asian individuals, with a mean length of 296 kb. Among these CNVs, 190 had frequencies greater than 1% in at least one ethnic group, and 109 showed significant ethnic differences in frequencies (p<0.01). After merging overlapping CNVs, 1135 copy number variation regions (CNVRs), covering approximately 439 Mb (14.3%) of the human genome, were obtained. Our findings of ethnic differentiation of CNVs, along with the newly constructed CNV genomic map, extend our knowledge on the structural variation in the human genome and may furnish a basis for understanding the genomic differentiation of complex traits across ethnic groups.
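
Merging overlapping CNVs into CNVRs, as this abstract describes, amounts to an interval merge per chromosome. The sketch below assumes that any overlap, including a shared endpoint, is enough to merge two calls; the abstract does not state the study's actual overlap criterion, and the coordinates and function name are hypothetical.

```python
def merge_cnvs(cnvs):
    """Merge overlapping CNV calls into copy number variation regions.

    cnvs: list of (chrom, start, end) tuples with start <= end
    Returns a sorted list of merged (chrom, start, end) regions.
    """
    regions = []
    # Sorting by chromosome, then start, lets one pass catch every overlap.
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps the current region: extend its end if needed.
            last = regions[-1]
            regions[-1] = (chrom, last[1], max(last[2], end))
        else:
            regions.append((chrom, start, end))
    return regions


calls = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 400, 500),
    ("chr2", 150, 250),
]
cnvrs = merge_cnvs(calls)
```

Here the first two chr1 calls collapse into one region, while the third chr1 call and the chr2 call stay separate.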

14.
15.
Copy number variants (CNVs) contribute to human genetic and phenotypic diversity. However, the distribution of larger CNVs in the general population remains largely unexplored. We identify large variants in ~2500 individuals by using Illumina SNP data, with an emphasis on “hotspots” prone to recurrent mutations. We find variants larger than 500 kb in 5%–10% of individuals and variants greater than 1 Mb in 1%–2%. In contrast to previous studies, we find limited evidence for stratification of CNVs in geographically distinct human populations. Importantly, our sample size permits a robust distinction between truly rare and polymorphic but low-frequency copy number variation. We find that a significant fraction of individual CNVs larger than 100 kb are rare and that both gene density and size are strongly anticorrelated with allele frequency. Thus, although large CNVs commonly exist in normal individuals, which suggests that size alone cannot be used as a predictor of pathogenicity, such variation is generally deleterious. Considering these observations, we combine our data with published CNVs from more than 12,000 individuals contrasting control and neurological disease collections. This analysis identifies known disease loci and highlights additional CNVs (e.g., 3q29, 16p12, and 15q25.2) for further investigation. This study provides one of the first analyses of large, rare (0.1%–1%) CNVs in the general population, with insights relevant to future analyses of genetic disease.

16.
Since the food-borne pathogen Listeria monocytogenes is common in dairy farm environments, it is likely that phages infecting this bacterium (“listeriaphages”) are abundant on dairy farms. To better understand the ecology and diversity of listeriaphages on dairy farms and to develop a diverse phage collection for further studies, silage samples collected on two dairy farms were screened for L. monocytogenes and listeriaphages. While only 4.5% of silage samples tested positive for L. monocytogenes, 47.8% of samples were positive for listeriaphages, containing up to >1.5 × 10^4 PFU/g. Host range characterization of the 114 phage isolates obtained, with a reference set of 13 L. monocytogenes strains representing the nine major serotypes and four lineages, revealed considerable host range diversity; phage isolates were classified into nine lysis groups. While one serotype 3c strain was not lysed by any phage isolates, serotype 4 strains were highly susceptible to phages and were lysed by 63.2 to 88.6% of phages tested. Overall, 12.3% of phage isolates showed a narrow host range (lysing 1 to 5 strains), while 28.9% of phages showed a broad host range (lysing ≥11 strains). Genome sizes of the phage isolates were estimated to range from approximately 26 to 140 kb. The extensive host range and genomic diversity of phages observed here suggest an important role of phages in the ecology of L. monocytogenes on dairy farms. In addition, the phage collection developed here has the potential to facilitate further development of phage-based biocontrol strategies (e.g., in silage) and other phage-based tools.

17.
In this paper, we consider the problem of estimating the size N of a finite and closed population, using data obtained from capture-recapture experiments. By defining an appropriate model, we investigate the maximum of the likelihood, of the profile likelihood and of an orthogonal adjusted profile likelihood (COX and REID, 1987) function. We show that they all may present infinity as the maximum likelihood estimator of N. This seems to be a characteristic of the likelihood approach in this problem. Further, we present a Bayesian approach with minimum prior information as a way of countering this difficulty. Exact analytical expressions for the posterior modes are also obtained.

18.
A global spectrum of CNVs is required to catalog variation and to provide a high-resolution view of the dynamics of genome organization and human migration. In this study, we performed genome-wide genotyping using high-resolution arrays and identified 44,109 CNVs from 1,715 genomes across 12 populations. The study unraveled the force of independent evolutionary dynamics on genome-organizational plasticity across populations. We demonstrated the use of CNVs as a tool to study human migration and identified a second major settlement establishing new migration routes in addition to existing ones.

19.
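Sludge collected from the intertidal zone of Tianjin was subjected to heat shock, and a mixed microbial consortium was enriched under anaerobic conditions for hydrogen production experiments. The consortium was inoculated into media containing glucose, sucrose, lactose, starch, or peptone, and the hydrogen yield and community composition were determined for each substrate. Hydrogen yield was highest with sucrose as the substrate, at (787±24) mL/L; with glucose, lactose, and starch the yields were (530±20), (46±5), and (455±35) mL/L, respectively; the consortium could not use peptone as a substrate for hydrogen production. Denaturing gradient gel electrophoresis (DGGE) was used to analyze the composition of the hydrogen-producing consortium under each substrate. DGGE separation of the 16S rRNA genes showed that no dominant bacteria formed under peptone culture, whereas with the other substrates the dominant bacterium was Clostridium sp.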

20.
ABSTRACT: We characterised 9 strains selected from primary isolates referable to Paramoeba/Neoparamoeba spp. Based on ultrastructural study, 5 strains isolated from fish (amoebic gill disease [AGD]-affected Atlantic salmon and dead southern bluefin tuna), 1 strain from netting of a floating sea cage and 3 strains isolated from invertebrates (sea urchins and crab) were assigned to the genus Neoparamoeba Page, 1987. Phylogenetic analyses based on SSU rDNA sequences revealed affiliations of newly introduced and previously analysed Neoparamoeba strains. Three strains from the invertebrates and 2 out of 3 strains from gills of southern bluefin tunas were members of the N. branchiphila clade, while the remaining, fish-isolated strains, as well as the fish cage strain, clustered within the clade of N. pemaquidensis. These findings and previous reports point to the possibility that N. pemaquidensis and N. branchiphila can affect both fish and invertebrates. A new potential fish host, southern bluefin tuna, was included in the list of farmed fish endangered by N. branchiphila. The sequence of P. eilhardi (Culture Collection of Algae and Protozoa [CCAP] strain 1560/2) appeared in all analyses among sequences of strain representatives of Neoparamoeba species, in a position well supported by bootstrap value, Bremer index and Bayesian posterior probability. Our research shows that isolation of additional strains from invertebrates and further analyses of relations between molecular data and morphological characters of the genera Paramoeba and Neoparamoeba are required. This complexity needs to be considered when attempting to define molecular markers for identification of Paramoeba/Neoparamoeba species in tissues of fish and invertebrates.  相似文献   
