Similar articles
1.
A theory of an early stage of genome evolution by combinatorial fusion of circular DNA units is suggested, based on protein sequence fossil evidence. The evidence includes a preference of protein sequence lengths for certain sizes: multiples of 123 aa for eukaryotes and multiples of 152 aa for prokaryotes. At the DNA level these sizes correspond to 350–450 base pairs, the known optimal range for DNA ring closure. The methionine residues repeatedly appear along the sequences with the same period of about 120 aa (in eukaryotes), presumably marking the sites of insertion of the early genes, which were rings of protein-coding DNA. The absence of torsional constraint in this DNA results in a very sharp estimate of the helical periodicity of the early DNA, indistinguishable from the experimental mean value for extant DNA. According to the combinatorial fusion theory, based on the above evidence, in the pregenomic, prerecombinational stage the genes and the noncoding sequences existed in the form of autonomously replicating DNA rings of close to standard size, randomly segregating between dividing cells, as modern plasmids do. In the recombinational early genomic stage the rings started to fuse, forming larger DNA molecules consisting of several unit genes connected in various combinations and forming long protein-coding sequences (combinatorial fusion). This process, which involved, perhaps, noncoding sequences as well, eventually resulted in the formation of large genomes. The dispersed circular DNA (or, rather, evolutionarily advanced derivatives thereof) may still exist in the form of various mobile DNA elements.

2.
W Michalek  G Künzel  A Graner 《Génome》1999,42(5):849-853
The "Igri/Franka" (I/F) map ranks among the most comprehensive genetic linkage maps of barley (Hordeum vulgare), containing a large number of markers derived from cDNA and genomic PstI clones. Forty-three cDNA clones and 259 genomic clones were at least partially sequenced and compared with the major databases of protein and nucleic acid sequences. Of the cDNA clones, 53% show significant similarity to known sequences in protein databases. A comparison of sequences from genomic clones to nucleic acid sequence databases revealed similarities for 9% of the clones. For cDNA sequences analyzed the same way, significant similarities were observed for 35% of the clones. These results show that genomic PstI clones, although containing genes at a significant frequency, represent an inappropriate source for efficient, systematic gene identification in barley. Sequence information obtained in the context of the present study provides a resource for the conversion of these markers into sequence-tagged site (STS) markers and their use in PCR assays.

3.

Background

Large yellow croaker (Larimichthys crocea) is an important commercial fish in China and East Asia. The annual production of the species from the aquaculture industry is about 90 thousand tons. In spite of its economic importance, genetic studies of economic traits and genomic selection in the species are hindered by the lack of genomic resources. Specifically, a whole-genome physical map of large yellow croaker is still missing. The traditional BAC-based fingerprint method is extremely time- and labour-consuming. Here we report the first genome map construction using the high-throughput whole-genome mapping technique by nanochannel arrays in the BioNano Genomics Irys system.

Results

For an optimal marker density of ~10 per 100 kb, the nicking endonuclease Nt.BspQ1 was chosen for the genome map generation. A total of 645,305 DNA molecules with a total length of ~112 Gb were labelled and detected, covering more than 160X of the large yellow croaker genome. Employing the IrysView package and signature patterns in raw DNA molecules, a whole-genome map of large yellow croaker was assembled into 686 maps with a total length of 727 Mb, which was consistent with the estimated genome size. The N50 length of the whole-genome map, including 126 maps, was up to 1.7 Mb. The excellent hybrid alignment with the large yellow croaker draft genome validated the consensus genome map assembly and highlighted a promising application of whole-genome mapping to draft genome sequence super-scaffolding. The genome map data of large yellow croaker are accessible at lycgenomics.jmu.edu.cn/pm.

Conclusion

Using the state-of-the-art whole-genome mapping technique in the Irys system, the first whole-genome map for large yellow croaker has been constructed, which should greatly facilitate the ongoing genomic and evolutionary studies of the species. To our knowledge, this is the first public report on genome map construction by whole-genome mapping for aquatic organisms. Our study demonstrates a promising application of whole-genome mapping to genome map construction for other non-model organisms in a fast and reliable manner.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1871-z) contains supplementary material, which is available to authorized users.
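The N50 figure quoted in the Results of entry 3 (1.7 Mb, reached with 126 maps) can be illustrated with a short sketch. The function below uses the usual definition of N50, the length of the map at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly. The map lengths in the example are invented for illustration and are not taken from the croaker data.

```python
def n50(lengths):
    """Return (N50, number of maps needed to reach half the total length)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical map lengths in kb (not from the article).
toy_maps_kb = [2500, 1700, 1200, 800, 400, 200]
n50_kb, maps_needed = n50(toy_maps_kb)
print(n50_kb, maps_needed)
```

Under this definition, the second returned value plays the role of the "126 maps" reported alongside the N50 in the abstract.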

4.
5.
Recently a number of computational approaches have been developed for the prediction of protein–protein interactions. Complete genome sequencing projects have provided the vast amount of information needed for these analyses. These methods utilize the structural, genomic, and biological context of proteins and genes in complete genomes to predict protein interaction networks and functional linkages between proteins. Given that experimental techniques remain expensive, time-consuming, and labor-intensive, these methods represent an important advance in proteomics. Some of these approaches utilize sequence data alone to predict interactions, while others combine multiple computational and experimental datasets to accurately build protein interaction maps for complete genomes. These methods represent a complementary approach to current high-throughput projects whose aim is to delineate protein interaction maps in complete genomes. We will describe a number of computational protocols for protein interaction prediction based on the structural, genomic, and biological context of proteins in complete genomes, and detail methods for protein interaction network visualization and analysis.

6.
PepLine is fully automated software that maps MS/MS fragmentation spectra of tryptic peptides to genomic DNA sequences. The approach is based on Peptide Sequence Tags (PSTs) obtained from partial interpretation of QTOF MS/MS spectra (first module). PSTs are then mapped on the six-frame translations of genomic sequences (second module), giving hits. Hits are then clustered to detect potential coding regions (third module). Our work aimed at optimizing the algorithms of each component to allow the whole pipeline to proceed in a fully automated manner using raw nucleic acid sequences (i.e., genomes that have not been "reduced" to a database of ORFs or putative exon sequences). The whole pipeline was tested on controlled MS/MS spectra sets from standard proteins and from Arabidopsis thaliana envelope chloroplast samples. Our results demonstrate that PepLine competed with protein database searching software and was fast enough to potentially tackle large data sets and/or large genomes. We also illustrate the potential of this approach for the detection of the intron/exon structure of genes.
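The second module described in entry 6, mapping tags onto the six-frame translations of a genomic sequence, can be sketched in a few lines. This is not PepLine's code: it simplifies a PST to a plain residue string (ignoring any mass information a real tag would carry), uses the standard genetic code, and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    # Unknown codons (e.g. containing N) become "X"; a trailing partial codon is dropped.
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Map (strand, offset) to the protein translation of that reading frame."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def map_tag(tag, dna):
    """Return (strand, frame offset, residue index) for every exact tag match."""
    hits = []
    for (strand, offset), protein in six_frames(dna).items():
        start = protein.find(tag)
        while start != -1:
            hits.append((strand, offset, start))
            start = protein.find(tag, start + 1)
    return hits


print(map_tag("AKF", "ATGGCCAAGTTT"))
```

Clustering nearby hits into candidate coding regions, the third module in the abstract, would operate on the list that `map_tag` returns.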

7.
We use the extensive published information describing the genome of Escherichia coli and new restriction map alignment software to align DNA sequence, genetic, and physical maps. The software treats restriction maps as strings analogous to DNA or protein sequences, except that two values, enzyme name and DNA base address, are associated with each position on the string. The resulting alignments reveal a nearly linear relationship between the physical and genetic maps of the E. coli chromosome. Physical map comparisons with the 1976, 1980, and 1983 genetic maps demonstrate a better fit with the more recent maps. The results of these alignments are genomic kilobase coordinates, orientation, and rank of the alignment that best fits the genetic data. A statistical measure based on extreme value distribution is applied to the alignments. Additional computer analyses allow us to estimate the accuracy of the published E. coli genomic restriction map, simulate rearrangements of the bacterial chromosome, and search for repetitive DNA. The procedures we used are general enough to be applicable to other genome mapping projects.

8.
The University of California Santa Cruz (UCSC) Genome Bioinformatics website consists of a suite of free, open-source, on-line tools that can be used to browse, analyze, and query genomic data. These tools are available to anyone who has an Internet browser and an interest in genomics. The website provides a quick and easy-to-use visual display of genomic data. It places annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information. Many of the annotation tracks are submitted by scientists worldwide; the others are computed by the UCSC Genome Bioinformatics group from publicly available sequence data. It also allows users to upload and display their own experimental results or annotation sets by creating a custom track. The suite of tools, downloadable data files, and links to documentation and other information can be found at http://genome.ucsc.edu/.

9.
10.
11.
Gene identification in genomic DNA from eukaryotes is complicated by the vast combinatorial possibilities of potential exon assemblies. If the gene encodes a protein that is closely related to known proteins, gene identification is aided by matching similarity of potential translation products to those target proteins. The genomic DNA and protein sequences can be aligned directly by scoring the implied residues of in-frame nucleotide triplets against the protein residues in conventional ways, while allowing for long gaps in the alignment corresponding to introns in the genomic DNA. We describe a novel method for such spliced alignment. The method derives an optimal alignment based on scoring for both sequence similarity of the predicted gene product to the protein sequence and intrinsic splice site strength of the predicted introns. Application of the method to a representative set of 50 known genes from Arabidopsis thaliana showed significant improvement in prediction accuracy compared to previous spliced alignment methods. The method is also more accurate than ab initio gene prediction methods, provided sufficiently close target proteins are available. In view of the fast growth of public sequence repositories, we argue that close targets will be available for the majority of novel genes, making spliced alignment an excellent practical tool for high-throughput automated genome annotation.

12.
Homing endonucleases have great potential as tools for targeted gene therapy and gene correction, but identifying variants of these enzymes capable of cleaving specific DNA targets of interest is necessary before the widespread use of such technologies is possible. We identified homologues of the LAGLIDADG homing endonuclease I-AniI and their putative target insertion sites by BLAST searches followed by examination of the sequences of the flanking genomic regions. Amino acid substitutions in these homologues that were located close to the target site DNA, and thus potentially conferring differences in target specificity, were grafted onto the I-AniI scaffold. Many of these grafts exhibited novel and unexpected specificities. These findings show that the information present in genomic data can be exploited for endonuclease specificity redesign.

13.
Annotation features from the 1.9-fold whole-genome shotgun (WGS) sequences of domestic cat have been organized into an interactive web application, Genome Annotation Resource Fields (GARFIELD) (http://lgd.abcc.ncifcrf.gov) at the Laboratory of Genomic Diversity and Advanced Biomedical Computing Center (ABCC) at The National Cancer Institute (NCI). The GARFIELD browser allows the user to view annotations on a per chromosome basis with unplaced contigs provided on placeholder chromosomes. Various tracks on the browser allow display of annotations. A Genes track on the browser includes 20 285 regions that align to genes annotated in other mammalian genomes: Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, Bos taurus, and Canis familiaris. Also available are tracks that display the contigs that make up the chromosomes and representations of their GC content and repetitive elements as detected using RepeatMasker (http://www.repeatmasker.org). Data from the browser can be downloaded in FASTA and GFF format, and users can upload their own data to the display. The Felis catus sequences and their chromosome assignments and additional annotations incorporate data analyzed and produced by a multicenter collaboration between NCI, ABCC, Agencourt Biosciences Corporation, Broad Institute of Harvard and Massachusetts Institute of Technology, National Human Genome Research Institute, National Center for Biotechnology Information, and Texas A&M.

14.
A wavelet transform of the DNA "walk" constructed from a genomic sequence offers a direct visualization of short and long-range patterns in nucleotide sequences. We study sequences that encode diverse biological functions, taken from a variety of genomes. Pattern irregularities in the transform are frequently associated with sequences of biological interest. Exonic regions, for example, visualize differently under wavelet analysis than introns, and ribosomal RNA regions display distinct universal signatures. DNA walk wavelet analysis can provide a sensitive and rapid assessment of the putative biological significance of genomic DNA.
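The DNA "walk" in entry 14 can be illustrated with a minimal sketch. The abstract does not state which stepping rule or which wavelet the authors used, so both choices below are assumptions: a purine/pyrimidine rule (A or G steps up, C or T steps down) and a crude Haar-style difference between adjacent block means, standing in for a full wavelet transform.

```python
def dna_walk(seq):
    """Cumulative walk: +1 for purines (A, G), -1 for pyrimidines (C, T).

    The stepping rule is an assumption; other mappings are possible.
    """
    step = {"A": 1, "G": 1, "C": -1, "T": -1}
    walk, position = [], 0
    for base in seq.upper():
        position += step.get(base, 0)  # ambiguous bases leave the walk flat
        walk.append(position)
    return walk


def haar_detail(signal, scale):
    """Difference of means between the two halves of each window of width 2 * scale."""
    details = []
    for start in range(0, len(signal) - 2 * scale + 1, 2 * scale):
        left = signal[start:start + scale]
        right = signal[start + scale:start + 2 * scale]
        details.append(sum(right) / scale - sum(left) / scale)
    return details


walk = dna_walk("AAGGCCTT")
print(walk)
print(haar_detail(walk, 2))
```

Repeating `haar_detail` over several values of `scale` gives a rough multi-scale picture of where the walk changes direction, which is the kind of pattern irregularity the abstract refers to.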

15.
MOTIVATION: InFiRe, Insertion Finder via Restriction digest, is a novel software tool that allows for the computational identification of transposon insertion sites in known bacterial genome sequences after transposon mutagenesis experiments. The approach is based on the fact that restriction endonuclease digestions of bacterial DNA yield a unique pattern of DNA fragments with defined sizes. Transposon insertion changes the size of the hosting DNA fragment by a known number of base pairs. The exact size of this fragment can be determined by Southern blot hybridization. Subsequently, the position of insertion can be identified with computational analysis. The outlined method provides a solid basis for the establishment of a new high-throughput technology. AVAILABILITY AND IMPLEMENTATION: The software is freely available on our web server at www.infire.tu-bs.de. The algorithm was implemented in the statistical programming language R. For the most flexible use, InFiRe is provided in two different versions. A web interface offers convenient use in a web browser. In addition, the software and source code are freely available for download as R packages on our website. CONTACT: m.steinert@tu-bs.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
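The reasoning in entry 15 (a digest gives fragments of known size, and an insertion lengthens one of them by a known amount) can be sketched as follows. This is not InFiRe's algorithm, which is implemented in R. It is a simplified illustration that treats the genome as linear, assumes the transposon carries no site for the enzyme, and uses invented cut sites and sizes; all names are hypothetical.

```python
def digest(cut_sites, genome_length):
    """Return (start, end) fragments of a linear sequence cut at the given positions."""
    bounds = [0] + sorted(cut_sites) + [genome_length]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def candidate_fragments(cut_sites, genome_length, observed_size,
                        transposon_length, tolerance):
    """Fragments whose native size matches the observed size minus the transposon."""
    expected_native = observed_size - transposon_length
    return [
        (a, b)
        for a, b in digest(cut_sites, genome_length)
        if abs((b - a) - expected_native) <= tolerance
    ]


# Hypothetical digest: a 5000 bp band seen on the blot, 2000 bp transposon.
sites = [1000, 4000, 4500, 9000]
print(candidate_fragments(sites, 12000, observed_size=5000,
                          transposon_length=2000, tolerance=100))
```

In this toy case two fragments remain as candidates. Intersecting the candidate intervals from digests with different enzymes is one plausible way to narrow the location further.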

16.
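The aim was to sequence and analyze the genome of VP3, a typing phage of Vibrio cholerae, and to provide a basis for studying the principle behind the typing method for the two classes of El Tor Vibrio cholerae strains. A random whole-genome library of phage VP3 was constructed by the shotgun method; the sequences were assembled into a minimal set of contigs, gaps were filled by primer walking, and the complete VP3 genome sequence was obtained after assembly. Phage DNA fragments were randomly amplified by PCR and verified by restriction digestion; possible open reading frames (ORFs) were predicted; a phylogenetic tree analysis of the DNA polymerase genes of VP3 and related phages was carried out to help determine the classification of VP3; and the activity of some of the predicted promoter regions was analyzed with a reporter gene. The complete VP3 genome is circular double-stranded DNA, 39,504 bp in length, and the restriction digestion results were consistent with the sequence. Forty-nine ORFs were identified and the products of 27 ORFs were annotated, of which 20 gene products are homologous to those of T7-like phages, including RNA polymerase (RNAP), proteins involved in DNA replication, capsid proteins, tail tube and tail fiber proteins, and DNA packaging proteins. Phylogenetic analysis of the DNA polymerase (DNAP) showed that VP3 is homologous to T7-like phages. Ten predicted promoter sequences were cloned into the lacZ fusion plasmid pRS1274, and all showed promoter activity when tested. The genome sequence of VP3 was determined and analyzed; the genome structure and the phylogenetic analysis suggest that VP3 belongs to the T7 phage family.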

17.
DNA binding proteins find their cognate sequences within genomic DNA through recognition of specific chemical and structural features. Here, we demonstrate that high-resolution DNase I cleavage profiles can provide detailed information about the shape and chemical modification status of genomic DNA. Analyzing millions of DNA-backbone hydrolysis events on naked genomic DNA, we show that the intrinsic rate of cleavage by DNase I closely tracks the width of the minor groove. Integration of these DNase I cleavage data with bisulfite sequencing data for the same cell type genome reveals that the cleavage directly adjacent to CpG dinucleotides is enhanced at least eight-fold by cytosine methylation. We show that this phenomenon is attributable to methylation-induced narrowing of the minor groove. Furthermore, we demonstrate that it enables simultaneous mapping of DNase I hypersensitivity and regional DNA methylation levels using dense in vivo cleavage data. Taken together, our results suggest a general mechanism through which CpG methylation can modulate protein–DNA interaction strength via the remodeling of DNA shape.

18.

Background

Epigenome-wide association scans (EWAS) are an increasingly powerful and widely-used approach to assess the role of epigenetic variation in human complex traits. However, this rapidly emerging field lacks dedicated visualisation tools that can display features specific to epigenetic datasets.

Result

We developed coMET, an R package and online tool for visualisation of EWAS results in a genomic region of interest. coMET generates a regional plot of epigenetic-phenotype association results and the estimated DNA methylation correlation between CpG sites (co-methylation), with further options to visualise genomic annotations based on ENCODE data, gene tracks, reference CpG-sites, and user-defined features. The tool can be used to display phenotype association signals and correlation patterns of microarray or sequencing-based DNA methylation data, such as Illumina Infinium 450k, WGBS, or MeDIP-seq, as well as other types of genomic data, such as gene expression profiles. The software is available as a user-friendly online tool from http://epigen.kcl.ac.uk/comet and as an R Bioconductor package. Source code, examples, and full documentation are also available from GitHub.

Conclusion

Our new software allows visualisation of EWAS results with functional genomic annotations and with estimation of co-methylation patterns. coMET is available to a wide audience as an online tool and R package, and can be a valuable resource to interpret results in the fast growing field of epigenetics. The software is designed for epigenetic data, but can also be applied to genomic and functional genomic datasets in any species.

19.
《Epigenetics》2013,8(2):159-163
Abnormalities in DNA methylation of CpG islands that play a role in gene regulation affect gene expression and hence play a role in disease, including cancer. Bisulfite-based DNA methylation analysis methods such as methylation-specific PCR (MSP) and bisulfite sequencing (BiSeq) are most commonly used to study gene-specific DNA methylation. Assessing specificity and visualizing the position of PCR primers in their genomic context is a laborious and tedious task, primarily due to the sequence changes induced during the bisulfite conversion. For this purpose, we developed methGraph, a web application for easy, fast and flexible visualization and accurate in silico quality evaluation of PCR-based methylation assays. The visualization process starts by submitting PCR primer sequences for specificity assessment and mapping on the genome using the BiSearch ePCR primer-search algorithm. The next step comprises the selection of relevant UCSC genome annotation tracks for display in the final graph. A custom track showing all individual CpG dinucleotides, representing their distribution in the CpG island is also provided. Finally, methGraph creates a BED file that is automatically uploaded to the UCSC genome browser, after which the resulting image files are extracted and made available for visualization and download. The generated high-quality figures can easily be customized and exported for use in publications or presentations. methGraph is available at http://mellfire.ugent.be/methgraph/.
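Two ideas from entry 19 lend themselves to a small illustration: the sequence change caused by bisulfite conversion, and a BED-style track listing individual CpG dinucleotides. The sketch below is not methGraph's code. It handles the top strand only, and the function names, the `CpG` feature label, and the example coordinates are assumptions; the BED lines use the usual 0-based, half-open coordinates.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """In silico conversion: unmethylated C becomes T, methylated C is kept.

    methylated_positions holds 0-based indices of protected cytosines.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq.upper())
    )


def cpg_positions(seq):
    """0-based start index of every CG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def cpg_bed_lines(chrom, offset, seq):
    """Tab-separated BED lines (chrom, start, end, name), one per CpG."""
    return [f"{chrom}\t{offset + i}\t{offset + i + 2}\tCpG" for i in cpg_positions(seq)]


region = "ACGTCCGA"  # hypothetical sequence starting at chr1:100 (0-based)
print(bisulfite_convert(region))
print(bisulfite_convert(region, methylated_positions={1, 5}))
print("\n".join(cpg_bed_lines("chr1", 100, region)))
```

Comparing the two converted strings shows why primers must be designed against the converted sequence: every unmethylated cytosine, inside or outside a CpG, is read as thymine after conversion.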

20.
"Minghui 63" is the restorer line for a number of the most important commercial rice hybrids varieties in China. To facilitate long-term commitment in genetic analysis and molecular cloning of the superior genes in the genome of "Minghui 63", the authors have constructed a largeinsert genomic DNA library using the bacterial artificial chromosome (BAC) cloning vector (pBe- loBAC 11). Size fractionated Hind m digest of genomic DNA was ligated to the BAC vector, and the ligation mixture was used to transform the bacterial strain DH10B. A total of over 26 000 clones were obtained with the average insert size of about 150 kb, ranging from 90 to 240 kb. These clones thus represent 9 x rice haploid genome equivalents. The library is now being used for physical mapping of several genomic regions for map-based gene cloning.  相似文献   
