Similar articles
 20 similar articles found (search time: 15 ms)
1.
2.
3.
4.
5.
6.
7.
Gigabase-scale genome assemblies are now feasible using short-read sequencing technology, bringing the cost of such projects below the million-dollar mark.

8.
9.
The Filoviridae family comprises the Ebola and Marburg viruses, which are known to cause lethal hemorrhagic fever. However, no effective antiviral therapy or licensed vaccine is currently available for these human pathogens. The envelope glycoprotein (GP) of Ebola virus, which mediates entry into target cells, is cytotoxic, and this effect maps to a highly glycosylated mucin-like region in the surface subunit of GP (GP1). However, the mechanism underlying this cytotoxic property of GP is unknown. To gain insight into the basis of this GP-induced cytotoxicity, HEK293T cells were transiently transfected with full-length and mucin-deleted (Δmucin) Ebola GP plasmids, and GP localization was examined relative to the nucleus, endoplasmic reticulum (ER), Golgi, and early and late endosomes using deconvolution fluorescence microscopy. Full-length Ebola GP was observed to accumulate in the ER. In contrast, GPΔmucin was uniformly expressed throughout the cell and did not localize to the ER. The Ebola major matrix protein VP40 was also co-expressed with GP to investigate its influence on GP localization. Co-expression of GP and VP40 did not alter GP localization to the ER. In addition, when VP40 was co-expressed with the nucleoprotein (NP), it localized to the plasma membrane, while NP accumulated in distinct cytoplasmic structures lined with vimentin. These latter structures are consistent with aggresomes and may serve as assembly sites for filoviral nucleocapsids. Collectively, these data suggest that full-length GP, but not GPΔmucin, accumulates in the ER in close proximity to the nuclear membrane, which may underlie its cytotoxic property.

10.
Exploring the plant transcriptome through phylogenetic profiling   (Cited by: 5 total; 0 self-citations; 5 by others)
Publicly available protein sequences represent only a small fraction of the full catalog of genes encoded by the genomes of different plants, such as green algae, mosses, gymnosperms, and angiosperms. By contrast, an enormous number of expressed sequence tags (ESTs) exist for a wide variety of plant species, representing a substantial part of all transcribed plant genes. Integrating protein and EST sequences in comparative and evolutionary analyses is not straightforward because of the heterogeneous nature of the two types of sequence data. By combining information from publicly available EST and protein sequences for 32 different plant species, we identified more than 250,000 plant proteins organized in more than 12,000 gene families. Approximately 60% of the proteins are absent from current sequence databases but provide important new information about plant gene families. Analysis of the distribution of gene families over different plant species through phylogenetic profiling yields insights into plant gene evolution and identifies species- and lineage-specific gene families, orphan genes, and conserved core genes across the green plant lineage. We counted a similar number of gene families, approximately 9,500, in monocotyledonous and eudicotyledonous plants and found strong evidence for the existence of at least 33,700 genes in rice (Oryza sativa). Interestingly, the larger number of genes in rice compared to Arabidopsis (Arabidopsis thaliana) can partially be explained by a larger number of species-specific single-copy genes and species-specific gene families. In addition, a majority of large gene families, typically containing more than 50 genes, are bigger in rice than in Arabidopsis, whereas the opposite seems true for small gene families.
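
As a rough illustration of the phylogenetic profiling idea mentioned in the abstract above, the following Python sketch turns each gene family into a presence/absence vector over a list of species and then labels it as core, species-specific, or something in between. The function names, the category labels, and the toy families are all hypothetical and chosen only for illustration; the abstract does not describe the authors' actual implementation or thresholds.

```python
from typing import Dict, List, Set, Tuple


def build_profiles(
    family_species: Dict[str, Set[str]], species: List[str]
) -> Dict[str, Tuple[int, ...]]:
    """Map each gene family to a 0/1 presence vector ordered like `species`."""
    return {
        family: tuple(int(sp in present) for sp in species)
        for family, present in family_species.items()
    }


def classify(profile: Tuple[int, ...]) -> str:
    """Label a profile by how many species contain the family.

    The three labels are an assumption made for this sketch.
    """
    n_present = sum(profile)
    if n_present == len(profile):
        return "core"
    if n_present == 1:
        return "species-specific"
    return "partial"


if __name__ == "__main__":
    # Hypothetical toy input: which species each family was found in.
    species = ["rice", "Arabidopsis", "moss"]
    families = {
        "fam_A": {"rice", "Arabidopsis", "moss"},
        "fam_B": {"rice"},
        "fam_C": {"rice", "Arabidopsis"},
    }
    for name, profile in build_profiles(families, species).items():
        print(name, profile, classify(profile))
```

Families that share the same profile can then be grouped, which is one simple way to pull out candidate lineage-specific sets.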

11.
We describe a high-throughput cDNA sequencing pipeline (http://www.hgsc.bcm.tmc.edu/projects/cdna) built in response to the emerging need for rapid sequencing of large cDNA collections. Using this strategy, cDNA inserts are purified and joined through concatenation into large molecules. These 'pseudo-BACs' are subjected to random shotgun sequencing, whereby the majority of cDNA inserts in the pool are sequenced. Using this concatenation cDNA sequencing platform, we have contributed more than 13,000 full-length cDNA sequences from human and mouse to the Mammalian Gene Collection (MGC).

12.
13.
Current protocols for DNA methylation analysis are either labor intensive or limited to the measurement of only one or two CpG positions. Pyrosequencing is a real-time sequencing technology that can overcome these limitations and be used as an epigenotype-mapping tool. Initial experiments demonstrated reliable quantification of the degree of DNA methylation when 2-6 CpGs were analyzed. We sought to improve the sequencing protocol so as to analyze as many CpGs as possible in a single sequencing run. By using an improved enzyme mix and adding single-stranded DNA-binding protein to the reaction, we obtained reproducible results for as many as 10 successive CpGs in a single sequencing reaction spanning up to 75 nucleotides. A minimum of 10 ng of bisulfite-treated DNA is necessary to obtain good reproducibility and avoid preferential amplification. We applied the assay to the analysis of DNA methylation patterns in four CpG islands in the vicinity of the IGF2 and H19 genes. This allowed accurate and quantitative de novo sequencing of the methylation state of each CpG, showed reproducible variations of methylation state in contiguous CpGs, and proved to be a useful adjunct to current technologies.
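
To make the quantification step in the abstract above concrete: after bisulfite treatment, an unmethylated cytosine is read as T while a methylated cytosine is still read as C, so the degree of methylation at a CpG is commonly estimated as the C signal divided by the combined C and T signal. The Python sketch below applies that ratio to a series of CpG positions. The function names and the example signal values are hypothetical, and this is not necessarily the exact calculation used by the authors or by any particular instrument software.

```python
from typing import List, Tuple


def methylation_percent(c_signal: float, t_signal: float) -> float:
    """Estimate percent methylation at one CpG as 100 * C / (C + T)."""
    if c_signal < 0 or t_signal < 0:
        raise ValueError("signals must be non-negative")
    total = c_signal + t_signal
    if total == 0:
        raise ValueError("no signal at this position")
    return 100.0 * c_signal / total


def methylation_profile(signals: List[Tuple[float, float]]) -> List[float]:
    """Apply the estimate to successive CpGs given (C, T) signal pairs."""
    return [methylation_percent(c, t) for c, t in signals]


if __name__ == "__main__":
    # Hypothetical (C, T) signals for three contiguous CpGs.
    example = [(30.0, 70.0), (55.0, 45.0), (10.0, 90.0)]
    for index, value in enumerate(methylation_profile(example), start=1):
        print(f"CpG {index}: {value:.1f}% methylated")
```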

14.
15.

Background

The tremendous output of massively parallel sequencing technologies requires automated, robust, and scalable sample preparation methods to fully exploit the new sequencing capacity.

Methodology

In this study, a method for automated library preparation of RNA prior to massively parallel sequencing is presented. The automated protocol uses precipitation onto carboxylic acid paramagnetic beads for purification and size selection of both RNA and DNA. The automated sample preparation was compared to the standard manual sample preparation.

Conclusion/Significance

The automated procedure was used to generate libraries for gene expression profiling on the Illumina HiSeq 2000 platform, with a capacity of 12 samples per preparation and significantly improved throughput compared to the standard manual preparation. The data analysis shows consistent gene expression profiles, in terms of sensitivity and quantification of gene expression, between the two library preparation methods.

16.
17.

Background

RNA-seq has spurred important gene fusion discoveries in a number of different cancers, including lung, prostate, breast, brain, thyroid and bladder carcinomas. Gene fusion discovery can potentially lead to the development of novel treatments that target the underlying genetic abnormalities.

Results

In this study, we provide a comprehensive view of the gene fusion landscape in 185 glioblastoma multiforme (GBM) patients from two independent cohorts. Fusions occur in approximately 30-50% of GBM patient samples. In the Ivy Center cohort of 24 patients, 33% of samples harbored fusions that were validated by qPCR and Sanger sequencing. We were able to identify high-confidence gene fusions from RNA-seq data in 53% of the samples in a TCGA cohort of 161 patients. We identified 13 cases (8%) with fusions retaining a tyrosine kinase domain in the TCGA cohort and one case in the Ivy Center cohort. Ours is the first study to describe recurrent fusions involving non-coding genes. Genomic locations 7p11 and 12q14-15 harbor the majority of the fusions. Fusions on 7p11 are formed in the focally amplified EGFR locus, whereas 12q14-15 fusions are formed by complex genomic rearrangements. All the fusions detected in this study can be further visualized and analyzed using our website: http://ivygap.swedish.org/fusions.

Conclusions

Our study highlights the prevalence of gene fusions as one of the major genomic abnormalities in GBM. The majority of the fusions are private fusions, and a minority of these recur with low frequency. A small subset of patients with fusions of receptor tyrosine kinases can benefit from existing FDA-approved drugs and drugs available in various clinical trials. Because clinically relevant fusions are rare, RNA-seq of GBM patient samples will be a vital tool for identifying patient-specific fusions that can drive personalized therapy.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-14-818) contains supplementary material, which is available to authorized users.

18.
19.
20.
Second-generation sequencing is increasingly being used in combination with genome-enrichment techniques to amplify a large number of loci in many individuals for the purpose of population genetic and phylogeographic analysis. Compiling all the necessary tools to analyse these data is complex and time-consuming. Here, we assemble a set of programs and pipe them together with Perl, enabling research laboratories without a dedicated bioinformatician to utilize second-generation sequencing. User input is a folder of second-generation sequencing reads sorted by individual (in FASTA format), and pipeline output is a folder of multi-FASTA files that correspond to loci (with two alleles called per individual). Additional output includes a summary file of the number of individuals per locus, observed and expected heterozygosity for each locus, distribution of multiple hits, and summary statistics (θ, Tajima's D, etc.). This user-friendly, open-source pipeline, which requires no a priori reference genome because it constructs its own, allows the user to set various parameters (e.g. minimum coverage) in the dependent programs (CAP3, BWA, SAMtools and VarScan) and facilitates evaluation of the nature and quality of data collected prior to analysis in software packages.
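
The per-locus observed and expected heterozygosity mentioned in the abstract above can be sketched as follows, assuming two alleles called per individual as the abstract states. Observed heterozygosity is the fraction of individuals whose two alleles differ, and expected heterozygosity is computed here as one minus the sum of squared allele frequencies, without a small-sample correction. The pipeline itself is written in Perl; this Python version, its function name, and the toy genotypes are hypothetical and may not match the pipeline's actual calculation.

```python
from collections import Counter
from typing import List, Tuple


def heterozygosity(genotypes: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (observed, expected) heterozygosity for one locus.

    `genotypes` holds one (allele_1, allele_2) pair per individual.
    Expected heterozygosity is 1 - sum(p_i ** 2), uncorrected.
    """
    n_individuals = len(genotypes)
    if n_individuals == 0:
        raise ValueError("at least one genotype is required")

    # Observed: share of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n_individuals

    # Expected: based on allele frequencies pooled over all individuals.
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    n_alleles = 2 * n_individuals
    expected = 1.0 - sum((c / n_alleles) ** 2 for c in allele_counts.values())
    return observed, expected


if __name__ == "__main__":
    # Hypothetical alleles at one locus for four individuals.
    locus = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "G")]
    ho, he = heterozygosity(locus)
    print(f"Ho = {ho:.2f}, He = {he:.2f}")
```

A large gap between the two values at many loci can point to problems such as paralogous loci being collapsed during assembly, which is one reason a summary file of this kind is useful before downstream analysis.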
