Similar literature
Found 20 similar articles (search time: 0 ms)
1.
In the present study, 3217 UniGene sequences of Neurospora crassa downloaded from the National Center for Biotechnology Information (NCBI) were mined for microsatellites, or simple sequence repeats (SSRs). A total of 287 SSRs were detected in the 4187.86 kb of sequence mined, a density of 1 SSR per 14.6 kb; only 250 sequences (7.8%) contained SSRs. Depending on the repeat unit, the length of SSRs ranged from 14 to 17 bp for mono-, 14 to 48 bp for di-, 18 to 90 bp for tri-, 24 to 48 bp for tetra-, 30 bp for penta- and 42 to 48 bp for hexa-nucleotide repeats. Tri-nucleotide repeats were the most frequent repeat type (88.8%), followed by di-nucleotide repeats (5.9%). A bioinformatics approach was also used to design primer pairs for the identified SSRs; primers were found for only 239 sequences, and this part still needs experimental validation. Annotation of the SSR-containing sequences was also carried out.
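The density figure quoted in this abstract can be re-derived, and the kind of scan it describes can be illustrated, with a minimal sketch. The minimum repeat counts in `MIN_REPEATS` and both function names are assumptions made for illustration; the abstract does not state the search criteria or the tool that was used.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed, not from the study).
MIN_REPEATS = {1: 14, 2: 7, 3: 6, 4: 6, 5: 6, 6: 7}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for unit, n in sorted(min_repeats.items()):
        # One motif of `unit` bases followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" seen as a di-nucleotide).
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return hits


def density_kb_per_ssr(total_kb, n_ssrs):
    """Average number of kilobases of sequence per detected SSR."""
    return total_kb / n_ssrs


example_hits = find_ssrs("GG" + "AT" * 8 + "CC")
reported_density = round(density_kb_per_ssr(4187.86, 287), 1)
reported_fraction = round(100 * 250 / 3217, 1)
```

With the numbers from the abstract, `reported_density` and `reported_fraction` reproduce the stated 14.6 kb per SSR and 7.8% of sequences. The regular expression only finds perfect repeats; imperfect repeats would need a different approach.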

2.
An in-silico analysis of simple sequence repeats (SSRs) in 30 species of tobamoviruses was done. SSRs (mono- to hexa-nucleotide) were present at varying frequencies across species. Compound microsatellites, primarily of variant motifs, accounted for up to 11.43% of the SSRs. Motif duplications were observed for A, T, AT, and ACA repeats. (AG)–(TC) was the most prevalent SSR-couple. SSRs were differentially localized in the coding region, with ~54% on the 128 kDa protein while 20.37% were exclusive to the 186 kDa protein. Characterization of such variations is important for elucidating the origin, sequence variations, and structure of these widely used, but incompletely understood sequences.

3.
Simple sequence repeats (SSRs), or microsatellites, are known to be ubiquitous across all kingdoms of life, including viruses. However, imperfections in simple sequence repeats have so far been analyzed only in the genomes of human, Escherichia coli and Human Immunodeficiency virus, and compound microsatellites in plant viral genomes have yet to be studied. Potyviruses severely affect crop plant growth and reduce economic yield in diverse cropping systems worldwide. Hence, we analyze the nature and distribution of compound microsatellites present in the complete genomes of 45 potyvirus species. The results indicate that compound microsatellites accounted for 0% to 15.15% of all microsatellites and have low complexity compared with those of prokaryotic genomes. Overall, 14% of compound microsatellites were of similar motifs, and such motif duplications were observed for CA, TA and AG repeats. Among all 45 potyvirus genomes analyzed, the SSR couple (AG)-x-(AC) was found to be the most abundant. Hence it is apparent that, in contrast to eukaryotes, the majority of compound microsatellites in potyviruses are composed of variant motifs. We also highlight the relative frequency of different classes of compound microsatellites as well as their patterns of distribution, and correlate these with the biology of potyviruses. Further characterization of such variation is important for elucidating the origin, mutational processes, and structure of these widely used, but incompletely understood sequences.

4.
The draft sequence of several complete protozoan genomes is now available and genome projects are ongoing for a number of other species. Different strategies are being implemented to identify and annotate protein coding and RNA genes in these genomes, as well as study their genomic architecture. Since the genomes vary greatly in size, GC-content, nucleotide composition, and degree of repetitiveness, genome structure is often a factor in choosing the methodology utilised for annotation. In addition, the approach taken is dictated, to a greater or lesser extent, by the particular reasons for carrying out genome-wide analyses and the level of funding available for projects. Nevertheless, these projects have provided a plethora of material that will aid in understanding the biology and evolution of these parasites, as well as identifying new targets that can be used to design urgently required drug treatments for the diseases they cause.

5.

Background

The giant panda (Ailuropoda melanoleuca) is a critically endangered species endemic to China. Microsatellites are among the most popular molecular markers and have proven effective for estimating population size, paternity testing, and assessing genetic diversity in critically endangered species. The availability of the complete giant panda genome sequence provides the opportunity to carry out genome-wide scans for all types of microsatellite markers, which opens the way for the analysis and development of microsatellites in the giant panda.

Results

By in silico mining of the whole genome sequence of the giant panda, we identified microsatellites and analyzed their frequency and distribution in different genomic regions. Based on our search criteria, a repertoire of 855,058 SSRs was detected, with mono-nucleotides being the most abundant. SSRs were found in all genomic regions and were more abundant in non-coding regions than in coding regions. A total of 160 primer pairs were designed from selected tetranucleotide microsatellite sequences to screen for polymorphic microsatellites. Fifty-one novel polymorphic tetranucleotide microsatellite loci were discovered by genotyping blood DNA from 22 captive giant pandas. Finally, a total of 15 markers, which showed good polymorphism, stability, and repeatability in faecal samples, were used to establish a novel microsatellite marker system for the giant panda. A genotyping database for Chengdu captive giant pandas (n = 57) was set up using this standardized system. In addition, as applications of this marker system, a universal individual identification method was established and genetic diversity was analysed.

Conclusion

Microsatellite abundance and diversity were characterized in the giant panda genome. A total of 154,677 tetranucleotide microsatellites were identified, and 15 of them were found to be polymorphic and stable loci. The individual identification method and the genetic diversity analysis method in this study provide material for the future study of the giant panda.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1268-z) contains supplementary material, which is available to authorized users.

6.
Microsatellites (SSRs) are widely used in cereal research, and their use in marker assisted breeding has increased the speed and efficiency of germplasm improvement. Central to the application of SSRs for many purposes are methodologies enabling the low-cost acquisition of large quantities of genetic information for gene and genotype identification. In this study, multiplex-ready PCR was evaluated in barley and bread wheat as an approach for rapid and more automated SSR genotyping on a fluorescence-based DNA fragment analyzer. Multiplex-ready PCR is a method that allows SSR genotyping to be performed using a standardized protocol. The method enables flexible fluorescence labeling of SSRs, generates a relatively constant amount of PCR product for each marker, and has a high amenability to multiplex PCR (the simultaneous amplification of several SSRs in the same reaction). A high (92%) compatibility of published SSRs with multiplex-ready PCR is demonstrated, and the usefulness of the method for large scale genotyping is shown by its application for whole genome marker assisted breeding in barley. A database of more than 2,800 barley and wheat SSRs, and a suite of bio-informatic tools were developed to support the deployment of multiplex-ready PCR for various genetic applications, and are accessible at . Multiplex-ready PCR is broadly applicable to cereal genomics research and marker assisted breeding, and should be transferable to similar analyses of any animal or plant species.

7.
Grass carp, Ctenopharyngodon idellus (Valenciennes, 1844), is an economically important species widely cultured in the world, but its genome research resources are largely lacking. The objectives of this study were to construct normalized cDNA libraries for efficient EST analysis, to generate ESTs from these libraries, and to identify EST-related molecular markers such as microsatellites and single nucleotide polymorphisms (SNPs) for genetic analysis of this species. A total of 6,269 ESTs were generated representing 4,815 unique sequences, from which 105 putative microsatellites and 5,228 SNPs were identified. These genome resources provide the material basis for future genetic and functional analyses in this species.

8.
The availability of genomic sequences of many organisms has opened new challenges, particularly in genome analysis. Sequence extraction is a vital step, and many tools have been developed for it. These tools are publicly available but have limitations with respect to sequence extraction, the length of the sequence that can be extracted, organism specificity and the lack of a user-friendly interface. We have developed a Java-based software package with three modules that can be used independently or sequentially. The tool efficiently extracts sequences from large datasets in a few simple steps. It can extract multiple sequences of any desired length from the genome of any organism. The results were cross-checked against published data.

Availability

URL 1: http://ww3.comsats.edu.pk/bio/ResearchProjects.aspx
URL 2: http://ww3.comsats.edu.pk/bio/SequenceManeuverer.aspx

9.

Background

With the development of several new technologies using synthetic biology, it is possible to engineer genetically intractable organisms including Mycoplasma mycoides subspecies capri (Mmc), by cloning the intact bacterial genome in yeast, using the host yeast’s genetic tools to modify the cloned genome, and subsequently transplanting the modified genome into a recipient cell to obtain mutant cells encoded by the modified genome. The recently described tandem repeat coupled with endonuclease cleavage (TREC) method has been successfully used to generate seamless deletions and point mutations in the mycoplasma genome using the yeast DNA repair machinery. However, attempts to knock in genes have in some cases encountered a high background of transformation due to maintenance of unwanted circularization of the transforming DNA, which contains possible autonomously replicating sequence (ARS) activity. To overcome this issue, we incorporated a split marker system into the TREC method, enabling seamless gene knock-in with high efficiency. The modified method is called TREC-assisted gene knock-in (TREC-IN). Since a gene to be knocked in is delivered by a truncated non-functional marker, the background caused by an incomplete integration is essentially eliminated.

Results

In this paper, we demonstrate applications of the TREC-IN method in gene complementation and genome minimization studies in Mmc. In the first example, the Mmc dnaA gene was seamlessly replaced by an orthologous gene, which shares a high degree of identity at the nucleotide level with the original Mmc gene, with high efficiency and low background. In the minimization example, we replaced an essential gene back into the genome that was present in the middle of a cluster of non-essential genes, while deleting the non-essential gene cluster, again with low backgrounds of transformation and high efficiency.

Conclusion

Although we have demonstrated the feasibility of TREC-IN in gene complementation and genome minimization studies in Mmc, the applicability of TREC-IN ranges widely. This method proves to be a valuable genetic tool that can be extended for genomic engineering in other genetically intractable organisms, where it may be implemented in elucidating specific metabolic pathways and in rational vaccine design.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1180) contains supplementary material, which is available to authorized users.

10.
We offer a guide to de novo genome assembly using sequence data generated by the Illumina platform for biologists working with fungi or other organisms whose genomes are less than 100 Mb in size. The guide requires no familiarity with sequencing assembly technology or associated computer programs. It defines commonly used terms in genome sequencing and assembly; provides examples of assembling short-read genome sequence data for four strains of the fungus Grosmannia clavigera using four assembly programs; gives examples of protocols and software; and presents a commented flowchart that extends from DNA preparation for submission to a sequencing center, through to processing and assembly of the raw sequence reads using freely available operating systems and software.

11.
Chung BI  Lee KH  Shin KS  Kim WC  Kwon DN  You RN  Lee YK  Cho K  Cho DH 《Genomics》2011,98(5):381-389
Repetitive elements (REs) constitute a substantial portion of the genomes of human and other species; however, the RE profiles (type, density, and arrangement) within the individual genomes have not been fully characterized. In this study, we developed an RE analysis tool, called REMiner, for a chromosome-wide investigation into the occurrence of individual REs and arrangement of clusters of REs, and REMiner's functional features were examined using the human chromosome Y. The algorithm implemented by REMiner focused on unbiased mining of REs in large chromosomes and a data interface within a viewer. The data from the chromosome demonstrated that REMiner is an efficient tool in regard to its capacity for a large query size and the availability of a high-resolution viewer, featuring instant retrieval of alignment data and control of magnification and identity ratio. The chromosome-wide survey identified a diverse population of ordered RE arrangements, which may participate in genome biology.

12.
Simple sequence repeats (SSRs), or microsatellites, are special DNA/RNA sequences with a repeated unit of 1–6 bp. The genomes of Herpesvirales have many repeating structures, which makes them an excellent system for studying the evolution and roles of microsatellites and compound microsatellites in viruses. Therefore, 56 genomes of Herpesvirales were selected, and the occurrence, composition and complexity of different repeats were investigated. A total of 63,939 microsatellites and 5825 compound microsatellites were extracted from the 56 genomes. GC content was found to have a significant, strong correlation with both the count of microsatellites (CM) and the count of compound microsatellites (CCM). However, genome size has a moderate correlation only with CM and almost no correlation with CCM. Compound microsatellites are clearly more numerous in genic regions than in intergenic regions. In general, the number of compound microsatellites decreases as complexity (C, the count of individual microsatellites forming part of a compound microsatellite) increases, and the complexity hardly exceeds C = 4. When C ≥ 10, the vast majority of compound microsatellites lie in intergenic regions. The distributions of SSRs tend to be organism-specific rather than host-specific in herpesvirus genomes. The diversity of microsatellites and compound microsatellites may be helpful for a better understanding of viral genetic diversity, genotyping, and evolutionary biology in herpesvirus genomes.
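The notion of complexity used in this abstract (the number of individual microsatellites in one compound microsatellite) can be sketched as a simple grouping step. The function name, the half-open `(start, end)` coordinate convention and the maximum gap `dmax=10` are assumptions for illustration; the abstract does not give the distance threshold that was applied.

```python
def group_compound(ssrs, dmax=10):
    """Group SSR intervals into compound microsatellites.

    ssrs: iterable of (start, end) half-open coordinates of individual SSRs.
    Neighbouring SSRs separated by at most dmax bp are placed in one group.
    Only groups with two or more members (compound microsatellites) are returned.
    """
    groups, current = [], []
    for start, end in sorted(ssrs):
        if current and start - current[-1][1] <= dmax:
            current.append((start, end))
        else:
            if len(current) > 1:
                groups.append(current)
            current = [(start, end)]
    if len(current) > 1:
        groups.append(current)
    return groups


# Hypothetical coordinates, not taken from any genome in the study.
example = [(0, 12), (15, 30), (100, 112), (118, 130), (135, 150), (400, 420)]
compounds = group_compound(example)
complexities = [len(g) for g in compounds]  # C for each compound microsatellite
```

Here the isolated SSR at 400 to 420 is not reported, because a single SSR is not a compound microsatellite.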

13.
14.
A set of 20 wheat microsatellite markers was used with 55 elite wheat genotypes to examine their utility (1) in detecting DNA polymorphism, (2) in identifying genotypes and (3) in estimating genetic diversity among wheat genotypes. The 55 elite genotypes of wheat used in this study originated in 29 countries representing six continents. A total of 155 alleles were detected at 21 loci using the above microsatellite primer pairs (only 1 primer amplified 2 loci; all other primers amplified 1 locus each). Of the 20 primers amplifying 21 loci, 17 primers and their corresponding 18 loci were assigned to 13 different chromosomes (6 chromosomes of the A genome, 5 chromosomes of the B genome and 2 chromosomes of the D genome). The number of alleles per locus ranged from 1 to 13, with an average of 7.4 alleles per locus. The values of average polymorphic information content (PIC) and the marker index (MI) for these markers were estimated to be 0.71 and 0.70, respectively. The (GT)n microsatellites were found to be the most polymorphic. The genetic similarity (GS) coefficient for all possible 1485 pairs of genotypes ranged from 0.05 to 0.88 with an average of 0.23. The dendrogram, prepared on the basis of similarity matrix using the UPGMA algorithm, delineated the above genotypes into two major clusters (I and II), each with two subclusters (Ia, Ib and IIa, IIb). One of these subclusters (Ib) consisted of a solitary genotype (E3111) from Portugal, so that it was unique and diverse with respect to all other genotypes belonging to cluster I and placed in subcluster Ia. Using a set of only 12 primer pairs, we were able to distinguish a maximum of 48 of the above 55 wheat genotypes. The results demonstrate the utility of microsatellite markers for detecting polymorphism leading to genotype identification and for estimating genetic diversity. Received: 15 May 1999 / Accepted: 27 July 1999
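Two of the figures in this abstract follow directly from the counts it reports, and the PIC statistic can be illustrated with a short sketch. The abstract does not say which PIC formula was used; the function below assumes the Botstein form, while the simpler 1 − Σp² form is also widely used in SSR studies. The function name and the allele frequencies in the example are hypothetical.

```python
from math import comb


def pic_botstein(freqs):
    """Polymorphic information content for one locus (Botstein form, assumed).

    freqs: allele frequencies at the locus, summing to 1.
    """
    homo = sum(p * p for p in freqs)
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(len(freqs))
                for j in range(i + 1, len(freqs)))
    return 1 - homo - cross


n_pairs = comb(55, 2)              # pairwise comparisons among 55 genotypes
mean_alleles = round(155 / 21, 1)  # alleles per locus from the reported totals
pic_two_equal = pic_botstein([0.5, 0.5])  # hypothetical two-allele locus
```

`n_pairs` and `mean_alleles` reproduce the 1485 genotype pairs and the average of 7.4 alleles per locus stated in the abstract.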

15.
Copley RR  Doerks T  Letunic I  Bork P 《FEBS letters》2002,513(1):129-134
Domains present one of the most useful levels at which to understand protein function, and domain family-based analysis has had a profound impact on the study of individual proteins. Protein domain discovery has been progressing steadily over the past 30 years. What are the realistically achievable goals of sequence-based domain analysis, and how far off are they for the sequences encoded in eukaryotic genomes? Here we address some of the issues involved in better coverage of sequence-based domain annotation, and the integration of these results within the wider context of genomes, structures and function.

16.
In this article we describe and demonstrate the versatility of a computer program, GENOME MAPPING, that uses interactive graphics and runs on an IRIS workstation. The program helps to visualize as well as analyse global and local patterns of genomic DNA sequences. It was developed keeping in mind the requirements of the human genome sequencing programme, which requires rapid analysis of the data. Using GENOME MAPPING one can discern signature patterns of different kinds of sequences and analyse such patterns for repetitive as well as rare sequence strings. Further, one can visualize the extent of global homology between different genomic sequences. An application of our method to the published yeast mitochondrial genome data shows similar sequence organizations in the entire sequence and in smaller subsequences.

17.
We introduce and analyse a simple probabilistic model of genome evolution. It is based on three fundamental evolutionary events: gene loss, duplication and accumulated change. This is motivated by previous works which consisted in fitting the available genomic data into what are called paralog distributions. This formalism is described by a system of an infinite number of linear equations. We show that this system generates a semigroup of linear operators on the space l¹. We prove that the size distribution of paralogous gene families in a genome converges to the equilibrium as time goes to infinity. Moreover, we show that when the probabilities of gene removal and duplication are close to each other, the resulting distribution is close to a logarithmic distribution. Some empirical results for yeast genomes are presented.
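As a point of reference for the limiting case mentioned in this abstract, the sketch below evaluates the standard logarithmic (log-series) distribution. It is an assumption that this is the distribution the authors mean; the function name and the parameter `theta` are hypothetical, and the abstract does not say how such a parameter would relate to the gene loss and duplication probabilities.

```python
import math


def log_series_pmf(k, theta):
    """Probability of family size k under the standard logarithmic distribution.

    P(K = k) = -theta**k / (k * ln(1 - theta)) for k = 1, 2, ... and 0 < theta < 1.
    """
    if k < 1 or not 0 < theta < 1:
        raise ValueError("require k >= 1 and 0 < theta < 1")
    return -theta ** k / (k * math.log(1 - theta))


# Sanity check: the probabilities over family sizes sum to (numerically) 1.
total_mass = sum(log_series_pmf(k, 0.5) for k in range(1, 201))
```

Under this form, small families dominate and the probability of large families falls off slightly faster than geometrically.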

18.
The yellow catfish Pelteobagrus fulvidraco is a freshwater fish species. Due to overfishing and pollution of freshwater ecosystems, the wild stocks of this fish have declined substantially. We isolated and characterized 12 polymorphic microsatellites of this species. The number of alleles at the 12 microsatellite loci ranged from four to eight, with an average of 6.6/locus. The average observed heterozygosity was 0.72, whereas the expected heterozygosity ranged from 0.60 to 0.86 (average: 0.80). All 12 microsatellites conformed to Hardy–Weinberg equilibrium and were in linkage equilibrium. These 12 novel microsatellites could facilitate studies of the genetic diversity and population structure of the yellow catfish and supply information needed for its conservation.
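Observed and expected heterozygosity, as reported in this abstract, are commonly computed per locus from diploid genotypes. The following is a minimal sketch under that assumption; the function name and the example genotypes are hypothetical, and the abstract does not state which estimator or software was used.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Ho, He) for one locus.

    genotypes: list of (allele_a, allele_b) pairs, one per diploid individual.
    Ho is the fraction of heterozygous individuals.
    He is 1 minus the sum of squared allele frequencies (no sample-size correction).
    """
    n = len(genotypes)
    ho = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    he = 1 - sum((c / total) ** 2 for c in counts.values())
    return ho, he


# Hypothetical genotypes at a two-allele locus.
ho, he = heterozygosity([(1, 1), (1, 2), (2, 2), (1, 2)])
```

For small samples, an unbiased estimate is often obtained by multiplying He by 2n / (2n − 1), where n is the number of individuals.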

19.
20.
Glycosyltransferases comprise highly divergent groups of enzymes, which play a central role in the synthesis of complex glycans. Because the repertoire of glycosyltransferases in the genome determines the range of synthesizable glycans, and because an increasing amount of genome sequence data is now available, it is essential to examine these enzymes across organisms to explore possible structures and functions of the glycoconjugates. In this study, we systematically investigated 36 eukaryotic genomes and obtained 3426 glycosyltransferase homologs for biosynthesis of major glycans, classified into 53 families based on sequence similarity. The families were further grouped into six functional categories based on the biosynthetic pathways, which revealed characteristic patterns among organism groups in the degree of conservation and in the number of paralogs. The results also revealed a strong correlation between the number of glycosyltransferases and the number of coding genes in each genome. We then predicted the ability to synthesize major glycan structures including N-glycan precursors and GPI-anchors in each organism from the combination of the glycosyltransferase families. This indicates that not only parasitic protists but also some algae are likely to synthesize smaller structures than the structures known to be conserved among a wide range of eukaryotes. Finally, we discuss the functions of two large families, sialyltransferases and β4-glycosyltransferases, by performing finer classifications into subfamilies. Our findings suggest that universality and diversity of glycans originate from two types of evolution of glycosyltransferase families, namely conserved families with few paralogs and diverged families with many paralogs.
