Similar literature
 20 similar articles found (search time: 10 ms)
1.
A generally applicable technique is described that permits easy identification and isolation of heterokaryons a few hours after fusion. It is based on the labelling of living cells with different fluorochromes, which, at appropriate concentrations, do not affect viability or gene expression. Both fluorochromes are relatively stable and do not cross-contaminate unlabelled cells. The technique has great potential in studies on gene regulation in somatic cell hybrids, since heterofluorescent hybrids between any type of cells can be isolated directly from large populations of monofluorescent parental cells by using a cell sorter equipped with a single laser. Thus the technique avoids the need for genetically marked parental cells for selection.

2.
3.

Background  

Horizontal gene transfer (HGT) is considered a strong evolutionary force that substantially shapes the content of microbial genomes. What distinguishes HGT from gene genesis, duplications or mutations is its speed, which enables rapid adaptation to changing environmental demands. For a precise characterization, algorithms are needed that identify transfer events with high reliability. Frequently, the transferred pieces of DNA have a considerable length, comprise several genes and are called genomic islands (GIs) or, more specifically, pathogenicity or symbiotic islands.

4.
5.

Background  

Although many genomic features have been used in the prediction of protein-protein interactions (PPIs), frequently only one is used in a computational method. Having realized the limited predictive power of a single genomic feature, investigators are now moving toward integration. So far, there have been few integration studies for PPI prediction; one failed to yield appreciable improvement in prediction and the others did not conduct a performance comparison. It remains unclear whether an integration of multiple genomic features can improve PPI prediction and, if it can, how to integrate these features.

6.
RNA secondary structure prediction is a classical problem in bioinformatics. The most efficient approach to this problem is based on comparative analysis. In this approach, algorithms use a multiple alignment of the RNA sequences to find a common RNA structure. This paper describes a new algorithm for this task that does not require a predefined multiple alignment. The main idea of the algorithm is a MEME-like iterative search for an abstract profile on different levels. On the first level, the algorithm searches for common blocks in the RNA sequences and creates a chain of these blocks. In the next step, the algorithm refines the chain of common blocks. In the last stage, the algorithm searches for sets of common helices that have consistent locations relative to the common blocks. The algorithm was tested on sets of tRNA with a subset of junk sequences and on RFN riboswitches. The algorithm is implemented as a web server (http://bioinf.fbb.msu.ru/RNAAlign/).

7.
Human adenovirus species D (HAdV-D), which is composed of clinically and epidemiologically important pathogens worldwide, contains more taxonomic “types” than any other species of the genus Mastadenovirus, although the mechanisms accounting for the high level of diversity remain to be disclosed. Recent studies of known and new types of HAdV-D have indicated that intertypic recombination between distant types contributes to the increasing diversity of the species. However, such findings raise the question as to how homologous recombination events occur between diversified types since homologous recombination is suppressed as nucleotide sequences diverge. In order to address this question, we investigated the distribution of the recombination boundaries in comparison with the landscape of intergenomic sequence conservation assessed according to the synonymous substitution rate (dS). The results revealed that specific genomic segments are conserved between even the most distantly related genomes; we call these segments “universally conserved segments” (UCSs). These findings suggest that UCSs facilitate homologous recombination, resulting in intergenomic segmental exchanges of UCS-flanking genomic regions as recombination modules. With the aid of such a mechanism, the haploid genomes of HAdV-Ds may have been reshuffled, resulting in chimeric genomes out of diversified repertoires in the HAdV-D population analogous to the MHC region reshuffled via crossing over in vertebrates. In addition, some HAdVs with chimeric genomes may have had the opportunity to avoid host immune responses thereby causing epidemics.

8.
Genomic aberrations recurrent in a particular cancer type can be important prognostic markers for tumor progression. Typically in early tumorigenesis, cells incur a breakdown of the DNA replication machinery that results in an accumulation of genomic aberrations in the form of duplications, deletions, translocations, and other genomic alterations. Microarray methods allow for finer mapping of these aberrations than has previously been possible; however, data processing and analysis methods have not taken full advantage of this higher resolution. Attention has primarily been given to analysis on the single sample level, where multiple adjacent probes are necessarily used as replicates for the local region containing their target sequences. However, regions of concordant aberration can be short enough to be detected by only one, or very few, array elements. We describe a method called Multiple Sample Analysis for assessing the significance of concordant genomic aberrations across multiple experiments that does not require a priori definition of aberration calls for each sample. If there are multiple samples representing a class, then by exploiting the replication across samples our method can detect concordant aberrations at much higher resolution than can be derived from current single sample approaches. Additionally, this method provides a meaningful approach to addressing population-based questions such as determining important regions for a cancer subtype of interest or determining regions of copy number variation in a population. Multiple Sample Analysis also provides high-resolution single sample aberration calls in the regions of significant concordance.
The approach is demonstrated on a dataset representing a challenging but important resource: breast tumors that have been formalin-fixed, paraffin-embedded, archived, and subsequently UV-laser capture microdissected and hybridized to two-channel BAC arrays using an amplification protocol. We demonstrate the accurate detection on simulated data, and on real datasets involving known regions of aberration within subtypes of breast cancer at a resolution consistent with that of the array. Similarly, we apply our method to previously published datasets, including a 250K SNP array, and verify known results as well as detect novel regions of concordant aberration. The algorithm has been fully implemented and tested and is freely available as a Java application at http://www.cbil.upenn.edu/MSA.

9.
We present a simple and highly accurate computational method for operon prediction, based on intergenic distances and functional relationships between the protein products of contiguous genes, as defined by the STRING database (Jensen,L.J., Kuhn,M., Stark,M., Chaffron,S., Creevey,C., Muller,J., Doerks,T., Julien,P., Roth,A., Simonovic,M. et al. (2009) STRING 8–a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res., 37, D412–D416). These two parameters were used to train a neural network on a subset of experimentally characterized Escherichia coli and Bacillus subtilis operons. Our predictive model was successfully tested on the set of experimentally defined operons in E. coli and B. subtilis, with accuracies of 94.6 and 93.3%, respectively. As far as we know, these are the highest accuracies ever obtained for predicting bacterial operons. Furthermore, in order to evaluate the predictive accuracy of our model when using one organism's data set for the training procedure and a different organism's data set for testing, we repeated the E. coli operon prediction analysis using a neural network trained with B. subtilis data, and a B. subtilis analysis using a neural network trained with E. coli data. Even for these cases, the accuracies reached with our method were high, 91.5 and 93%, respectively. These results show the potential use of our method for accurately predicting the operons of any other organism. Our operon predictions for fully-sequenced genomes are available at http://operons.ibt.unam.mx/OperonPredictor/.
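
The first of the two input features named above, the intergenic distance between contiguous genes, can be illustrated with a short sketch. This is not the authors' code: the `Gene` record, the 1-based inclusive coordinates, the rule of skipping gene pairs on opposite strands, and the example gene names are all assumptions made for illustration. The neural network and the STRING-derived functional scores are not modelled here.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # '+' or '-'


def intergenic_distances(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Return (left gene, right gene, distance in bp) for adjacent same-strand pairs."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if left.strand != right.strand:
            # Assumption: genes on opposite strands are not operon candidates.
            continue
        # Number of bases strictly between the two genes; negative means overlap.
        pairs.append((left.name, right.name, right.start - left.end - 1))
    return pairs


genes = [
    Gene("geneA", 100, 400, "+"),
    Gene("geneB", 410, 900, "+"),
    Gene("geneC", 1500, 2000, "-"),
    Gene("geneD", 1990, 2500, "-"),
]
print(intergenic_distances(genes))
```

Short or negative distances between same-strand neighbours are the kind of signal such a feature is meant to capture; how the distance is combined with the functional score is left to the trained model.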

10.

Background

Using haplotype blocks as predictors rather than individual single nucleotide polymorphisms (SNPs) may improve genomic predictions, since haplotypes are in stronger linkage disequilibrium with the quantitative trait loci than are individual SNPs. It has also been hypothesized that an appropriate selection of a subset of haplotype blocks can result in similar or better predictive ability than when using the whole set of haplotype blocks. This study investigated genomic prediction using a set of haplotype blocks that contained the SNPs with large effects estimated from an individual SNP prediction model. We analyzed protein yield, fertility and mastitis of Nordic Holstein cattle, and used high-density markers (about 770k SNPs). To reach an optimum number of haplotype variables for genomic prediction, predictions were performed using subsets of haplotype blocks that contained a range of 1000 to 50 000 main SNPs.

Results

The use of haplotype blocks improved the prediction reliabilities, even when selection focused on only a group of haplotype blocks. In this case, the use of haplotype blocks that contained the 20 000 to 50 000 SNPs with the highest effect was sufficient to outperform the model that used all individual SNPs as predictors (up to 1.3 % improvement in prediction reliability for mastitis, compared to the individual SNP approach), and the achieved reliabilities were similar to those using all haplotype blocks available in the genome data (from 0.6 % lower to 0.8 % higher reliability).

Conclusions

Haplotype blocks used as predictors can improve the reliability of genomic prediction compared to the individual SNP model. Furthermore, the use of a subset of haplotype blocks that contains the main SNP effects from genomic data could be a feasible approach to genomic prediction in dairy cattle, given an increase in density of genotype data available. The predictive ability of the models that use a subset of haplotype blocks was similar to that obtained using either all haplotype blocks or all individual SNPs, with the benefit of having a much lower computational demand.
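
The block-selection step described in this abstract (keep only the haplotype blocks that contain the SNPs with the largest estimated effects) might be sketched as follows. The function name, the dictionary-based inputs and the toy identifiers are hypothetical, and ranking by absolute effect size is an assumption; the abstract does not say how blocks were built or how effects were ranked.

```python
from typing import Dict, List


def select_blocks(snp_effects: Dict[str, float],
                  blocks: Dict[str, List[str]],
                  n_top: int) -> List[str]:
    """Return the ids of blocks containing at least one of the n_top largest-effect SNPs."""
    # Rank SNPs by absolute estimated effect (assumed ranking criterion).
    ranked = sorted(snp_effects, key=lambda s: abs(snp_effects[s]), reverse=True)
    top = set(ranked[:n_top])
    # Keep a block if any of its SNPs is among the top set; input order is preserved.
    return [block_id for block_id, snps in blocks.items() if top.intersection(snps)]


effects = {"s1": 0.1, "s2": -0.9, "s3": 0.05, "s4": 0.4}
blocks = {"b1": ["s1", "s2"], "b2": ["s3"], "b3": ["s4"]}
print(select_blocks(effects, blocks, n_top=2))
```

Varying `n_top` corresponds to the range of 1000 to 50 000 main SNPs explored in the study; the selected blocks would then be passed to whatever prediction model is in use.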

11.
In this study, we report a modified procedure for extraction of high-quality genomic DNA that is rapid, simple, biologically nonhazardous, and generally applicable to pathogenic bacteria. Bacterial cells were pretreated with 70% ethanol prior to enzymatic digestion with lysozyme. Exposure of bacterial cells to 70% ethanol sterilized the cultures, making the process biologically safe, and increased the susceptibility of the cells to lysozyme-induced lysis. Consistently high yields of genomic DNA (mean yield, 0.5-2.5 mg/ml) were obtained from 465 isolates representing over 30 clinically important bacterial species. The genomic DNA obtained was suitable for further analysis, including bacterial fingerprinting techniques such as restriction endonuclease analysis, Southern hybridization, and repetitive PCR. A generally applicable procedure for extraction of high-quality and high-quantity genomic DNA would be immensely beneficial for laboratories engaged in molecular surveillance of nosocomial and community-based outbreaks.

12.
A high-density consensus map of A and B wheat genomes   Total citations: 1, self-citations: 0, citations by others: 1
A durum wheat consensus linkage map was developed by combining segregation data from six mapping populations. All of the crosses were derived from durum wheat cultivars, except for one accession of T. ssp. dicoccoides. The consensus map was composed of 1,898 loci arranged into 27 linkage groups covering all 14 chromosomes. The length of the integrated map and the average marker distance were 3,058.6 and 1.6 cM, respectively. The order of the loci was generally in agreement with the individual maps and with previously published maps. When the consensus map was aligned to the deletion bin map, 493 markers were assigned to specific bins. Segregation distortion was found across many durum wheat chromosomes, with a higher frequency for the B genome. This high-density consensus map allowed the genome to be scanned for chromosomal rearrangements that occurred during wheat evolution. Translocations and inversions already known in the literature were confirmed, and new putative rearrangements are proposed. The consensus map described here provides more complete coverage of the durum wheat genome than previously developed maps. It also represents a step forward in durum wheat genomics and an essential tool for further research and studies on the evolution of the wheat genome.

13.
Xiong ZY, Tan GX, He GY, He GC, Song YC. Cell Research, 2006, 16(3): 260-266
The genomic structures of Oryza sativa (A genome) and O. meyeriana (G genome) were comparatively studied using bicolor genomic in situ hybridization (GISH). GISH was clearly able to discriminate between the chromosomes of O. sativa and O. meyeriana in the interspecific F1 hybrids without blocking DNA, and co-hybridization was hardly detected. The average mitotic chromosome length of O. meyeriana was found to be 1.69 times that of O. sativa. A comparison of 4',6-diamidino-2-phenylindole staining showed that the chromosomes of O. meyeriana were more extensively labelled, suggesting that the G genome is amplified with more repetitive sequences than the A genome. In interphase nuclei, 9-12 chromocenters were normally detected and nearly all the chromocenters constituted the G genome-specific DNA. More and larger chromocenters formed by chromatin compaction corresponding to the G genome were detected in the hybrid compared with its parents. During pachytene of the F1 hybrid, most chromosomes of A and G did not synapse with each other, except for 1-2 chromosomes paired at the end of their arms. At meiotic metaphase I, three types of chromosomal associations, i.e. O. sativa-O. sativa (A-A), O. sativa-O. meyeriana (A-G) and O. meyeriana-O. meyeriana (G-G), were observed in the F1 hybrid. The A-G chromosome pairing configurations included bivalents and trivalents. The results provide a foundation for studying the genome organization and evolution of O. meyeriana.

14.

Background

Genomic prediction of breeding values from dense single nucleotide polymorphism (SNP) genotypes is used for livestock and crop breeding, and can also be used to predict disease risk in humans. For some traits, the most accurate genomic predictions are achieved with non-linear estimates of SNP effects from Bayesian methods that treat SNP effects as random effects from a heavy-tailed prior distribution. These Bayesian methods are usually implemented via Markov chain Monte Carlo (MCMC) schemes to sample from the posterior distribution of SNP effects, which is computationally expensive. Our aim was to develop an efficient expectation–maximisation algorithm (emBayesR) that gives estimates of SNP effects and accuracies of genomic prediction similar to those of the MCMC implementation of BayesR (a Bayesian method for genomic prediction), but with greatly reduced computation time.

Methods

emBayesR is an approximate EM algorithm that retains the BayesR model assumption with SNP effects sampled from a mixture of normal distributions with increasing variance. emBayesR differs from other proposed non-MCMC implementations of Bayesian methods for genomic prediction in that it estimates the effect of each SNP while allowing for the error associated with estimation of all other SNP effects. emBayesR was compared to BayesR using simulated data, and real dairy cattle data with 632 003 SNPs genotyped, to determine if the MCMC and the expectation-maximisation approaches give similar accuracies of genomic prediction.

Results

We were able to demonstrate that allowing for the error associated with estimation of other SNP effects when estimating the effect of each SNP in emBayesR improved the accuracy of genomic prediction over emBayesR without including this error correction, with both simulated and real data. When averaged over nine dairy traits, the accuracy of genomic prediction with emBayesR was only 0.5% lower than that from BayesR. However, emBayesR reduced computing time up to 8-fold compared to BayesR.

Conclusions

The emBayesR algorithm described here achieved similar accuracies of genomic prediction to BayesR for a range of simulated and real 630 K dairy SNP data. emBayesR needs less computing time than BayesR, which will allow it to be applied to larger datasets.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-014-0082-4) contains supplementary material, which is available to authorized users.
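
The model assumption named in the Methods of this abstract, SNP effects drawn from a mixture of normal distributions with increasing variance, can be written in a generic form. This is only a sketch of what such a prior looks like in general: the symbols (the effect of SNP j, the number of components K, the mixing proportions and the component variances) are notation assumed here, and their values are not given in the abstract.

```latex
\beta_j \sim \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(0, \sigma_k^2\right),
\qquad \sum_{k=1}^{K} \pi_k = 1,
\qquad \sigma_1^2 < \sigma_2^2 < \dots < \sigma_K^2 .
```

Under a prior of this shape, most SNPs can fall in low-variance components while a few are allowed large effects, which is what makes the resulting estimates non-linear in the data.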

15.
We recently constructed a 7000-rad porcine whole-genome radiation hybrid (RH) panel with the primary objective of integrating linkage maps of microsatellites with evolutionarily conserved genes into one ordered map. In order to evaluate the resolution of this RH panel, we have now constructed a radiation hybrid map of the Chromosome (Chr) 15q2.3-q2.6 region containing the RN gene. This gene has large effects on glycogen content in muscle and meat quality. Ten microsatellites covering a region of 55 centiMorgans and eight genes (AE3, FN1, IGFBP5, INHA, IRS1, PAX3, TNP1, and VIL1) were placed on the Sscr15 RH map. All the genes, except IRS1, were mapped on the RH map between microsatellites located in 15q2.5. The relative order of AE3 and INHA was inverted on the porcine physical map in comparison with the mouse linkage map. The order of other genes already mapped in the mouse (FN1, IGFBP5, TNP1, VIL1, INHA/AE3, and PAX3) was identical in pigs. We found no clear difference between the gene order on pig Chr 15 and human Chr 2q. Received: 4 November 1998 / Accepted: 8 February 1999

16.
17.

Background  

Molecular phylogenetics and phylogenomics have greatly revised and enriched fungal systematics in the last two decades. Most of the analyses have been performed by comparing single or multiple orthologous gene regions. Sequence alignment has always been an essential element in tree construction. These alignment-based methods (called the standard methods hereafter) need independent verification in order to put the fungal Tree of Life (TOL) on a secure footing. The ever-increasing number of sequenced fungal genomes and the recent success of our newly proposed alignment-free composition vector tree (CVTree, see Methods) approach have made this verification feasible.
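
To give a feel for what an alignment-free, composition-based comparison involves, the sketch below represents each sequence by its normalized k-mer frequencies and compares two sequences by cosine distance. It is a simplified illustration of the general idea only, not the CVTree procedure, whose details are not given in this abstract; the function names, the default `k` and the choice of plain cosine distance are assumptions.

```python
from collections import Counter
from math import sqrt
from typing import Dict


def composition_vector(seq: str, k: int = 3) -> Dict[str, float]:
    """Return the relative frequency of every overlapping k-mer in seq."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Return 1 - cosine similarity between two sparse k-mer vectors."""
    dot = sum(value * v.get(kmer, 0.0) for kmer, value in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return 1.0 - dot / (norm_u * norm_v)


a = composition_vector("MKVLAAGIVGMKVLAAGIVG", k=3)
b = composition_vector("MKVLSAGIVGMKVLSAGLVG", k=3)
print(round(cosine_distance(a, b), 3))
```

A matrix of such pairwise distances can be passed to any distance-based tree-building method; no sequence alignment is needed at any point, which is what allows an independent check on alignment-based trees.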

18.

Background

Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information.

Methods

A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (π) of the haplotype covariates had zero effect, i.e. a Bayesian mixture method.

Results

About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some cases, decreased the bias of prediction. With the Bayesian method, accuracy of prediction was less sensitive to parameter π when fitting haplotypes compared to fitting markers.

Conclusions

Use of haplotypes based on genealogy can slightly increase the accuracy of genomic prediction. Improved methods to cluster the haplotypes constructed from local genealogy could lead to additional gains in accuracy.

19.
Prokaryote gene annotation is complicated by large numbers of short open reading frames (ORFs) that arise naturally from genetic code design. Historically, many hypothetical ORFs have been annotated as genes in microbes, usually with an arbitrary length threshold (e.g. greater than 100 codons). Given the use of such thresholds, what is the extent of genuine undiscovered short genes in the current sampling of prokaryote genomes? To assess rigorously the potential under-annotation of short ORFs with homology, we exhaustively compared the polyORFome, comprising all possible ORFs in 64 prokaryotes (53 bacteria and 11 archaea) plus budding yeast, to itself and to all known proteins. The novelty of our analysis is that, firstly, sequence comparisons to and between both annotated and un-annotated ORFs are considered, and secondly, a two-step disabled-homology filter is applied to set aside putative pseudogenes and spurious ORFs. We find that un-annotated homologous short ORFs (uhORFs) correspond to a small but non-negligible fraction of the annotated prokaryote proteomes (0.5-3.8%, depending on selection criteria). Moreover, the disabled-homology filter indicates that about a third of uhORFs correspond to putative pseudogenes or spurious ORFs. Our analysis shows that the use of annotation length thresholds is unnecessary, as there are manageable numbers of short ORF homologies conserved (without disablements) across microbial genomes. Data on uhORFs are available from http://pseudogene.org/polyo
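
The effect of a length threshold on ORF calling can be seen in a minimal sketch. The code below is illustrative only and is not the pipeline used in the study: it scans the forward strand in three reading frames, treats ATG as the only start codon, and reports ORFs as 0-based half-open intervals that include the stop codon. The function name and the `min_codons` parameter are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 100) -> List[Tuple[int, int]]:
    """Return (start, end) of forward-strand ORFs with at least min_codons codons."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # codons before the stop, start included
                if n_codons >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


example = "CCATGAAACCCTGAGG"
print(find_orfs(example, min_codons=3))   # the short ORF is kept
print(find_orfs(example, min_codons=100)) # the same ORF is discarded by the threshold
```

With the conventional threshold of 100 codons, every short ORF is dropped regardless of whether it has homologs, which is the under-annotation the abstract sets out to quantify.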

20.
We have found certain conserved motifs and secondary structural patterns present in the vicinity of interior domain boundary points (dbps) by a data-driven approach without any a priori constraint on the type and number of such features, and without any requirement of sequence homology. We have used these motifs and patterns to rerank the solutions obtained by the well-known domain guess by size (DGS) algorithm. We predict, overall, five solutions. Our method [domain boundary prediction using conserved patterns (DPCP)] improves the average accuracy of the overall (i.e., top five) solutions of DGS from 71.74 to 82.88 % in the case of two-continuous-domain proteins, and from 21.38 to 80.56 % for two-discontinuous-domain proteins. Considering only the top solution, the gains in accuracy are from 0 to 72.74 % for two-continuous-domain proteins with chain lengths up to 300 residues, and from 0 to 62.85 % for those with up to 400 residues. In the case of discontinuous domains, top_min solutions (the minimum number of solutions required for predicting all dbps of a protein) of DPCP improve the average accuracy of DGS prediction from 12.5 to 76.3 % in proteins with chain lengths up to 300 residues, and from 13.33 to 70.84 % for proteins with up to 400 residues. In our validation experiments, the performance of DPCP was also found to be superior to that of domain identification from secondary structure element alignment (DomSSEA), the best method reported so far for efficient prediction of domain boundaries using predicted secondary structure. The average accuracies of the topmost solution of DomSSEA are 61 and 52 % for proteins with up to 300 and 400 residues, respectively, in the case of continuous domains; the corresponding accuracies for the discontinuous case are 28 and 21 %.
