Similar literature
20 similar articles found
1.

Background  

Many cutting-edge microarray analysis tools and algorithms, including the commonly used limma and affy packages in Bioconductor, require sophisticated knowledge of mathematics, statistics and computing to implement. Commercially available software can provide a user-friendly interface, but at considerable cost. To facilitate the use of these tools for microarray data analysis on an open platform, we developed an online microarray data analysis platform, WebArray, for bench biologists to utilize these tools to explore data from single/dual color microarray experiments.

2.
Genome sequencing projects are either based on whole genome shotgun (WGS) or on a BAC-by-BAC strategy. Although WGS is in most cases the preferred choice, sometimes the BAC-by-BAC approach may be better because it requires a much simpler assembly process. Furthermore, when the study is limited to specific regions of the genome, the WGS would require an unjustified effort, making the BAC-by-BAC the only feasible strategy. In this paper we describe an informatics pipeline called PABS (Platform Assisted BAC-by-BAC Sequencing) that we developed to provide a tool to optimize the BAC-by-BAC sequencing strategy. PABS has two main functions: (i) PABS-Select, to choose suitable overlapping clones; and (ii) PABS-Validate, to verify whether a BAC under analysis is actually overlapping the neighboring BAC.

3.

Background  

It is difficult to accurately interpret chromosomal correspondences such as true orthology and paralogy due to significant divergence of genomes from a common ancestor. Analyses are particularly problematic among lineages that have repeatedly experienced whole genome duplication (WGD) events. To compare multiple "subgenomes" derived from genome duplications, we need to relax the traditional requirements of "one-to-one" syntenic matchings of genomic regions in order to reflect "one-to-many" or more generally "many-to-many" matchings. However, this relaxation may result in the identification of synteny blocks that are derived from ancient shared WGDs that are not of interest. For many downstream analyses, we need to eliminate weak, low-scoring alignments from pairwise genome comparisons. Our goal is to objectively select a subset of synteny blocks whose total scores are maximized while respecting the duplication history of the genomes in comparison. We call this "quota-based" screening of synteny blocks in order to appropriately fill a quota of syntenic relationships within one genome or between two genomes having WGD events.
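The quota idea described above can be illustrated with a small sketch. This is not the authors' method: the abstract describes maximizing the total score, whereas the code below uses a simple greedy heuristic that does not guarantee an optimum. The function name `quota_filter`, the block representation, and the quota parameters are all assumptions made for illustration.

```python
from collections import defaultdict


def quota_filter(blocks, quota_a, quota_b):
    """Greedy quota screening of synteny blocks (illustrative heuristic only).

    blocks:  list of (score, genes_a, genes_b); genes_a / genes_b are lists of
             gene identifiers the block covers in genome A / genome B.
    quota_a: maximum number of kept blocks allowed to cover any gene of A.
    quota_b: maximum number of kept blocks allowed to cover any gene of B.
    """
    used_a = defaultdict(int)  # how many kept blocks cover each gene of A
    used_b = defaultdict(int)  # how many kept blocks cover each gene of B
    kept = []
    # Visit blocks from highest to lowest score.
    for score, genes_a, genes_b in sorted(blocks, key=lambda b: b[0], reverse=True):
        # Reject the block if any gene it covers has already filled its quota.
        if any(used_a[g] >= quota_a for g in genes_a):
            continue
        if any(used_b[g] >= quota_b for g in genes_b):
            continue
        for g in genes_a:
            used_a[g] += 1
        for g in genes_b:
            used_b[g] += 1
        kept.append((score, genes_a, genes_b))
    return kept
```

Under these assumptions, a 2:1 quota (for example, a genome A region allowed two matches because genome B had one extra duplication) would be expressed as `quota_filter(blocks, quota_a=2, quota_b=1)`.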

4.

Background

Extant genomes share regions where genes have the same order and orientation, which are thought to arise from the conservation of an ancestral order of genes during evolution. Such regions of so-called conserved synteny, or synteny blocks, must be precisely identified and quantified, as a prerequisite to better understand the evolutionary history of genomes.

Results

Here we describe PhylDiag, software that identifies statistically significant synteny blocks in pairwise comparisons of eukaryote genomes. Compared to previous methods, PhylDiag uses gene trees to define gene homologies, thus allowing gene deletions to be considered as events that may break the synteny. PhylDiag also accounts for gene orientations, blocks of tandem duplicates and lineage-specific de novo gene births. Starting from two genomes and the corresponding gene trees, PhylDiag returns synteny blocks with gaps less than or equal to the maximum gap parameter gapmax. This parameter is theoretically estimated and, together with a utility to graphically display results, contributes to making PhylDiag a user-friendly method. In addition, putative synteny blocks are subject to a statistical validation to verify that they are unlikely to be due to a random combination of genes.

Conclusions

We benchmark several known metrics to measure 2D-distances in a matrix of homologies and we compare PhylDiag to i-ADHoRe 3.0 on real and simulated data. We show that PhylDiag correctly identifies small synteny blocks even with insertions, deletions, incorrect annotations or micro-inversions. Finally, PhylDiag allowed us to identify the most relevant distance metric for 2D-distance calculation between homologies.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-268) contains supplementary material, which is available to authorized users.

5.
The COVID-19 pandemic is shifting teaching to an online setting all over the world. The Galaxy framework facilitates the online learning process and makes it accessible by providing a library of high-quality community-curated training materials, enabling easy access to data and tools, and facilitating the sharing of achievements and progress between students and instructors. By combining Galaxy with robust communication channels, effective instruction can be designed inclusively, regardless of the students’ environments.

6.
7.
Whole-genome comparisons are highly informative regarding genome evolution and can reveal the conservation of genome organization and gene content, gene regulatory elements, and presence of species-specific genes. Initial comparative genome analyses of the human malaria parasite Plasmodium falciparum and rodent malaria parasites (RMPs) revealed a core set of 4,500 Plasmodium orthologs located in the highly syntenic central regions of the chromosomes that sharply defined the boundaries of the variable subtelomeric regions. We used composite RMP contigs, based on partial DNA sequences of three RMPs, to generate a whole-genome synteny map of P. falciparum and the RMPs. The core regions of the 14 chromosomes of P. falciparum and the RMPs are organized in 36 synteny blocks, representing groups of genes that have been stably inherited since these malaria species diverged, but whose relative organization has altered as a result of a predicted minimum of 15 recombination events. P. falciparum-specific genes and gene families are found in the variable subtelomeric regions (575 genes), at synteny breakpoints (42 genes), and as intrasyntenic indels (126 genes). Of the 168 non-subtelomeric P. falciparum genes, including two newly discovered gene families, 68% are predicted to be exported to the surface of the blood stage parasite or infected erythrocyte. Chromosomal rearrangements are implicated in the generation and dispersal of P. falciparum-specific gene families, including one encoding receptor-associated protein kinases. The data show that both synteny breakpoints and intrasyntenic indels can be foci for species-specific genes with a predicted role in host-parasite interactions and suggest that, besides rearrangements in the subtelomeric regions, chromosomal rearrangements may also be involved in the generation of species-specific gene families. 
A majority of these genes are expressed in blood stages, suggesting that the vertebrate host exerts a greater selective pressure than the mosquito vector, resulting in the acquisition of diversity.

8.
Citizen science has grown rapidly in popularity in recent years due to its potential to educate and engage the public while providing a means to address a myriad of scientific questions. However, the rise in popularity of citizen science has also been accompanied by concerns about the quality of data emerging from citizen science research projects. We assessed data quality in the online citizen scientist platform Chimp&See, which hosts camera trap videos of chimpanzees (Pan troglodytes) and other species across Equatorial Africa. In particular, we compared detection and identification of individual chimpanzees by citizen scientists with that of experts with years of experience studying those chimpanzees. We found that citizen scientists typically detected the same number of individual chimpanzees as experts, but assigned far fewer identifications (IDs) to those individuals. Those IDs assigned, however, were nearly always in agreement with the IDs provided by experts. We applied the data sets of citizen scientists and experts by constructing social networks from each. We found that both social networks were relatively robust and shared a similar structure, as well as having positively correlated individual network positions. Our findings demonstrate that, although citizen scientists produced a smaller data set based on fewer confirmed IDs, the data strongly reflect expert classifications and can be used for meaningful assessments of group structure and dynamics. This approach expands opportunities for social research and conservation monitoring in great apes and many other individually identifiable species.

9.
FaBox is a collection of simple and intuitive web services that enable biologists and medical researchers to quickly perform typical tasks with sequence data. The services make it easy to extract, edit, and replace sequence headers and to join or divide data sets based on header information. Other services include collapsing a set of sequences into haplotypes and automated formatting of input files for a number of population genetics programs, such as arlequin, tcs and mrbayes. The toolbox is expected to grow on the basis of requests for particular services and converters in the future. FaBox is freely available at http://www.birc.au.dk/fabox.
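As an illustration of what collapsing a set of sequences into haplotypes involves, here is a minimal sketch. It is not FaBox's implementation; the function names and the rule used (sequences are the same haplotype only if they are identical after upper-casing) are assumptions.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def collapse_haplotypes(records):
    """Group headers by identical (case-insensitive) sequence.

    Returns a dict mapping each distinct sequence to the list of headers
    that carry it, in order of first appearance.
    """
    groups = {}
    for header, seq in records:
        groups.setdefault(seq.upper(), []).append(header)
    return groups
```

A real tool would also have to decide how to treat gaps, ambiguity codes and sequences of unequal length; this sketch treats any difference as a distinct haplotype.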

10.
11.
MOTIVATION: The recent efforts of various sequence projects to sequence deeply into various phylogenies provide great resources for comparative sequence analysis. A generic and portable tool is essential for scientists to visualize and analyze sequence comparisons.

RESULTS: We have developed SynBrowse, a synteny browser for visualizing and analyzing genome alignments both within and between species. It is intended to help scientists study macrosynteny, microsynteny and homologous genes between sequences. It can also aid with the identification of uncharacterized genes, putative regulatory elements and novel structural features of a species. SynBrowse is a GBrowse (the Generic Genome Browser) family software tool that runs on top of the open source BioPerl modules. It consists of two components: a web-based front end and a set of relational database back ends. Each database stores pre-computed alignments from a focus sequence to reference sequences in addition to the genome annotations of the focus sequence. The user interface lets end users select a key comparative alignment type and search for syntenic blocks between two sequences and zoom in to view the relationships among the corresponding genome annotations in detail. SynBrowse is portable with simple installation, flexible configuration, convenient data input and easy integration with other components of a model organism system.

AVAILABILITY: The software is available at http://www.gmod.org

CONTACT: vbrendel@iastate.edu

12.
13.

Background  

Blueberry is a member of the Ericaceae family, which also includes closely related cranberry and more distantly related rhododendron, azalea, and mountain laurel. Blueberry is a major berry crop in the United States, and one that has great nutritional and economic value. Extreme low temperatures, however, reduce crop yield and cause major losses to US farmers. A better understanding of the genes and biochemical pathways that are up- or down-regulated during cold acclimation is needed to produce blueberry cultivars with enhanced cold hardiness. To that end, the blueberry genomics database (BBDG) was developed. Along with the analysis tools and web-based query interfaces, the database serves both the broader Ericaceae research community and the blueberry research community specifically by making available ESTs and gene expression data in searchable formats and in elucidating the underlying mechanisms of cold acclimation and freeze tolerance in blueberry.

14.

Background  

Many online resources for the life sciences have been developed and introduced in peer-reviewed papers recently, ranging from databases and web applications to data-analysis software. Some have been introduced in special journal issues or websites with a search function, but others remain scattered throughout the Internet and in the published literature. The searchable resources on these sites are collected and maintained manually and are therefore of higher quality than automatically updated sites, but also require more time and effort.

15.
Ribosomal RNA (rRNA)-targeted oligonucleotide probes are widely used for culture-independent identification of microorganisms in environmental and clinical samples. ProbeBase is a comprehensive database containing more than 700 published rRNA-targeted oligonucleotide probe sequences (status August 2002) with supporting bibliographic and biological annotation that can be accessed through the internet at http://www.probebase.net. Each oligonucleotide probe entry contains information on target organisms, target molecule (small- or large-subunit rRNA) and position, G+C content, predicted melting temperature, molecular weight, necessity of competitor probes, and the reference that originally described the oligonucleotide probe, including a link to the respective abstract at PubMed. In addition, probes successfully used for fluorescence in situ hybridization (FISH) are highlighted and the recommended hybridization conditions are listed. ProbeBase also offers difference alignments for 16S rRNA-targeted probes by using the probe match tool of the ARB software and the latest small-subunit rRNA ARB database (release June 2002). The option to directly submit probe sequences to the probe match tool of the Ribosomal Database Project II (RDP-II) further allows one to extract supplementary information on probe specificities. The two main features of probeBase, 'search probeBase' and 'find probe set', help researchers to find suitable, published oligonucleotide probes for microorganisms of interest or for rRNA gene sequences submitted by the user. Furthermore, the 'search target site' option provides guidance for the development of new FISH probes.
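Two of the per-probe properties listed above, G+C content and a predicted melting temperature, are simple to compute. The sketch below is illustrative only: the abstract does not say which melting-temperature formula probeBase uses, so the code assumes the basic Wallace rule for short oligonucleotides, Tm = 2(A+T) + 4(G+C) in degrees Celsius. The function names and the example sequence are hypothetical.

```python
def gc_content(probe):
    """Return the G+C content of an oligonucleotide as a percentage."""
    seq = probe.upper()
    if not seq:
        raise ValueError("probe sequence is empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(probe):
    """Estimate melting temperature (deg C) with the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). This is a rough rule of thumb for short
    oligonucleotides and an assumption here, not probeBase's documented method.
    U is counted with T so that RNA-style sequences are also accepted.
    """
    seq = probe.upper()
    at = seq.count("A") + seq.count("T") + seq.count("U")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 10-mer used purely as an example input.
example = "ACGTACGTGG"
print(gc_content(example), wallace_tm(example))
```

The Wallace rule ignores salt and formamide concentration, both of which matter for hybridization conditions, so it should be read only as a first approximation.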

16.
Whole-genome comparisons provide insight into genome evolution by informing on gene repertoires, gene gains/losses, and genome organization. Most of our knowledge about eukaryotic genome evolution is derived from studies of multicellular model organisms. The eukaryotic phylum Apicomplexa contains obligate intracellular protist parasites responsible for a wide range of human and veterinary diseases (e.g., malaria, toxoplasmosis, and theileriosis). We have developed an in silico protein-encoding gene based pipeline to investigate synteny across 12 apicomplexan species from six genera. Genome rearrangement between lineages is extensive. Syntenic regions (conserved gene content and order) are rare between lineages and appear to be totally absent across the phylum, with no group of three genes found on the same chromosome and in the same order within 25 kb up- and downstream of any orthologous genes. Conserved synteny between major lineages is limited to small regions in Plasmodium and Theileria/Babesia species, and within these conserved regions, there are a number of proteins putatively targeted to organelles. The observed overall lack of synteny is surprising considering the divergence times and the apparent absence of transposable elements (TEs) within any of the species examined. TEs are ubiquitous in all other groups of eukaryotes studied to date and have been shown to be involved in genomic rearrangements. It appears that there are different criteria governing genome evolution within the Apicomplexa relative to other well-studied unicellular and multicellular eukaryotes.

17.
MetaboNexus is an interactive metabolomics data analysis platform that integrates pre-processing of raw peak data with in-depth statistical analysis and metabolite identity search. It is designed to work as a desktop application hence uploading large files to web servers is not required. This could speed up the data analysis process because server queries or queues are avoided, while ensuring security of confidential clinical data on a local computer. With MetaboNexus, users can progressively start from data pre-processing, multi- and univariate analysis to metabolite identity search of significant molecular features, thereby seamlessly integrating critical steps for metabolite biomarker discovery. Data exploration can be first performed using principal components analysis, while prediction and variable importance can be calculated using partial least squares-discriminant analysis and Random Forest. After identifying putative features from multi- and univariate analyses (e.g. t test, ANOVA, Mann–Whitney U test and Kruskal–Wallis test), users can seamlessly determine the molecular identity of these putative features. To assist users in data interpretation, MetaboNexus also automatically generates graphical outputs, such as score plots, diagnostic plots, boxplots, receiver operating characteristic plots and heatmaps. The metabolite search function will match the mass spectrometric peak data to three major metabolite repositories, namely HMDB, MassBank and METLIN, using a comprehensive range of molecular adducts. Biological pathways can also be searched within MetaboNexus. MetaboNexus is available with installation guide and tutorial at http://www.sph.nus.edu.sg/index.php/research-services/research-centres/ceohr/metabonexus, and is meant for the Windows Operating System, XP and onwards (preferably on 64-bit). 
In summary, MetaboNexus is a desktop-based platform that seamlessly integrates the entire data analytical workflow and further provides the putative identities of mass spectrometric data peaks by matching them to databases.

18.
The increasing volume of ChIP-chip and ChIP-seq data being generated creates a challenge for standard, integrative and reproducible bioinformatics data analysis platforms. We developed a web-based application called Cistrome, based on the Galaxy open source framework. In addition to the standard Galaxy functions, Cistrome has 29 ChIP-chip- and ChIP-seq-specific tools in three major categories, from preliminary peak calling and correlation analyses to downstream genome feature association, gene expression analyses, and motif discovery. Cistrome is available at http://cistrome.org/ap/.

19.
Wang Hao, Xi Qilemuge, Liang Pengfei, Zheng Lei, Hong Yan, Zuo Yongchun. Amino Acids, 2021, 53(2): 239-251

Enzymes have been proven to play considerable roles in disease diagnosis and biological functions. Extracting features that truly reflect the intrinsic properties of a protein is the most critical step in the automatic identification of enzymes. Although many feature extraction methods have been proposed, some challenges remain. In this study, we developed a predictor called IHEC_RAAC, which can identify whether a protein is a human enzyme and distinguish the function of the human enzyme. To improve the feature representation ability, protein sequences were encoded by a new feature vector called ‘reduced amino acid cluster’. We calculated 673 amino acid reduction alphabets to determine the optimal feature representation scheme. The tenfold cross-validation test showed that IHEC_RAAC identified human enzymes with an accuracy of 74.66% and further discriminated the human enzyme classes with an accuracy of 54.78%, which were 2.06% and 8.68% higher than the state-of-the-art predictors, respectively. Additionally, the results from the independent dataset indicated that IHEC_RAAC can effectively predict human enzymes and human enzyme classes to further provide guidance for protein research. A user-friendly web server, IHEC_RAAC, is freely accessible at http://bioinfor.imu.edu.cn/ihecraac.
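The general idea of encoding a protein with a reduced amino acid alphabet can be sketched as follows. The six-cluster grouping below is a hypothetical example chosen for illustration; it is not one of the 673 alphabets evaluated in the study, and the composition feature shown is only one simple way to turn a reduced sequence into a vector.

```python
from collections import Counter

# Hypothetical reduction of the 20 standard amino acids into 6 clusters.
# Each cluster is represented by its first letter.
CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]


def build_mapping(clusters):
    """Map every amino acid to the representative letter of its cluster."""
    return {aa: cluster[0] for cluster in clusters for aa in cluster}


def reduce_sequence(seq, mapping):
    """Rewrite a protein sequence in the reduced alphabet.

    Residues that are not in the mapping (e.g. X) are skipped.
    """
    return "".join(mapping[aa] for aa in seq.upper() if aa in mapping)


def composition(reduced, clusters):
    """Return the fraction of residues in each cluster, in cluster order."""
    counts = Counter(reduced)
    total = len(reduced) or 1  # avoid division by zero for empty input
    return [counts[cluster[0]] / total for cluster in clusters]


mapping = build_mapping(CLUSTERS)
reduced = reduce_sequence("MKVLDE", mapping)
print(reduced, composition(reduced, CLUSTERS))
```

With 6 clusters the composition vector has 6 entries instead of 20, which is the kind of dimensionality reduction a reduced alphabet is meant to provide; dipeptide or longer k-mer counts over the reduced sequence would follow the same pattern.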


20.
Investigation of physiological mechanisms at a cellular level often requires production of high-quality antibodies, frequently using synthetic peptides as immunogens. Here we describe a new, web-based software tool called NHLBI-AbDesigner that allows the user to visualize the information needed to choose optimal peptide sequences for peptide-directed antibody production (http://helixweb.nih.gov/AbDesigner/). The choice of an immunizing peptide is generally based on a need to optimize immunogenicity, antibody specificity, multispecies conservation, and robustness in the face of posttranslational modifications (PTMs). AbDesigner displays information relevant to these criteria as follows: 1) "Immunogenicity Score," based on hydropathy and secondary structure prediction; 2) "Uniqueness Score," a predictor of specificity of an antibody against all proteins expressed in the same species; 3) "Conservation Score," a predictor of ability of the antibody to recognize orthologs in other animal species; and 4) "Protein Features" that show structural domains, variable regions, and annotated PTMs that may affect antibody performance. AbDesigner displays the information online in an interactive graphical user interface, which allows the user to recognize the trade-offs that exist for alternative synthetic peptide choices and to choose the one that is best for a proposed application. Several examples of the use of AbDesigner for the display of such trade-offs are presented, including production of a new antibody to Slc9a3. We also used the program in large-scale mode to create a database listing the 15-amino acid peptides with the highest Immunogenicity Scores for all known proteins in five animal species, one plant species (Arabidopsis thaliana), and Saccharomyces cerevisiae.
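To make the hydropathy component of such a score concrete, the sketch below scans a protein with a sliding window and reports the most hydrophilic peptide. It is not AbDesigner's Immunogenicity Score, which the abstract says also uses secondary structure prediction and whose formula is not given here. The code assumes the Kyte-Doolittle hydropathy scale and a plain window average; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_scores(protein, width=15):
    """Return (start, peptide, mean hydropathy) for every window.

    Raises KeyError if the sequence contains a non-standard residue.
    """
    seq = protein.upper()
    scores = []
    for start in range(len(seq) - width + 1):
        peptide = seq[start:start + width]
        mean = sum(KD[aa] for aa in peptide) / width
        scores.append((start, peptide, mean))
    return scores


def most_hydrophilic(protein, width=15):
    """Return the window with the lowest mean hydropathy.

    Assumption for this sketch: surface-exposed, hydrophilic stretches are
    treated as better peptide candidates.
    """
    return min(window_scores(protein, width), key=lambda item: item[2])
```

The default width of 15 mirrors the 15-amino acid peptides mentioned above; a fuller treatment would combine this value with the uniqueness and conservation criteria rather than rank on hydropathy alone.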
