Similar literature
Found 20 similar documents (search time: 62 ms)
1.
Inference of population genetic structure is of great importance to ecology and evolutionary biology. The program STRUCTURE has been widely used to infer population genetic structure. However, previous studies demonstrated that uneven sampling often leads to wrong inferences on hierarchical structure. The most widely used ΔK method tends to identify the uppermost hierarchy of population structure. Recently, four alternative statistics (MedMedK, MedMeaK, MaxMedK and MaxMeaK) were proposed, which appear to be more accurate than the previously used methods for both even and uneven sampling data. However, the lack of easy‐to‐use software limits the use of these new estimators. Here, we developed StructureSelector, a user‐friendly web‐based program that calculates the four alternative statistics together with the commonly used Ln Pr(X|K) and ΔK statistics. StructureSelector accepts the result files of STRUCTURE, ADMIXTURE or fastSTRUCTURE as input. It reports the “best” K for each estimator, and the results are available as HTML or tab‐separated tables. The program can also generate graphical representations for specific K, which can be easily downloaded from the server. The software is freely available at http://lmme.qdio.ac.cn/StructureSelector/.
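
The ΔK statistic named in this abstract can be sketched as follows. This is a minimal illustration, not StructureSelector's code: it assumes the common formulation in which ΔK is the absolute second-order difference of the mean Ln Pr(X|K) across replicate runs, divided by the standard deviation of Ln Pr(X|K) at K. The function name `delta_k` and the log-likelihood values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps each K to the list of Ln Pr(X|K) values from replicate runs.
    """
    result = {}
    for k, runs in lnp_by_k.items():
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # delta K is undefined at the smallest and largest K
        spread = stdev(runs)
        if spread == 0:
            continue  # avoid division by zero when all replicates agree
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(runs) + mean(lnp_by_k[k - 1])
        )
        result[k] = second_diff / spread
    return result


# Hypothetical replicate log-likelihoods for K = 1..4.
example = {
    1: [-5000, -5002, -4998],
    2: [-4500, -4501, -4499],
    3: [-4480, -4470, -4490],
    4: [-4475, -4465, -4485],
}
scores = delta_k(example)
best_k = max(scores, key=scores.get)
```

In this made-up data the likelihood rises sharply from K = 1 to K = 2 and then flattens, so the sketch selects K = 2. The smallest and largest K never receive a ΔK value, which is one reason the abstract pairs ΔK with other estimators.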

2.
The MSE (where MSE is low energy (MS) and elevated energy (E) mode of acquisition) acquisition method commercialized by Waters on its Q‐TOF instruments is regarded as a unique data‐independent fragmentation approach that improves the accuracy and dynamic range of label‐free proteomic quantitation. Due to its special format, MSE acquisition files cannot be independently analyzed with the most widely used open‐source proteomic software, which is specialized for processing data‐dependent acquisition files. In this study, we established a workflow integrating Skyline, a popular and versatile peptide‐centric quantitation program, and a statistical tool, DiffProt, to fulfill MSE‐based proteomic quantitation. Comparison with the vendor software package for analyzing targeted phosphopeptides and global proteomic datasets reveals distinct advantages of Skyline in MSE data mining, including sensitive peak detection, flexible peptide filtering, and a transparent step‐by‐step workflow. Moreover, we developed a new procedure such that Skyline MS1 filtering was extended to small molecule quantitation for the first time. This new utility of Skyline was examined in a protein–ligand interaction experiment to identify multiple chemical compounds specifically bound to NDM‐1 (where NDM is New Delhi metallo‐β‐lactamase 1), an antibiotic‐resistance target. Further improvement of the current weaknesses in Skyline MS1 filtering is expected to enhance the reliability of this powerful program in full scan‐based quantitation of both peptides and small molecules.

3.
Tandem MS (MS2) quantification using the series of N‐ and C‐terminal fragment ion pairs generated from isobaric‐labelled peptides was recently considered an accurate strategy in quantitative proteomics. However, the presence of multiplexed terminal fragment ions in MS2 spectra may reduce the efficiency of peptide identification, resulting in lower identification scores or even incorrect assignments. To address this issue, we developed a quantitative software tool, denoted isobaric tandem MS quantification (ITMSQ), to improve isobaric MS2 quantification based on N‐ and C‐terminal fragment ion pairs. A spectrum splitting module was designed to separate the MS2 spectra from different samples, increasing the accuracy of both identification and quantification. ITMSQ offers a convenient interface through which parameters can be changed along with the labelling method, and the result files and all of the intermediate files can be exported. We analysed HeLa samples labelled by in vivo terminal amino acid labelling and found that the numbers of quantified proteins and peptides increased by 13.64% and 27.52%, respectively, after spectrum splitting. In conclusion, ITMSQ provides an accurate and reliable quantitative solution for isobaric MS2 quantitative methods based on N‐ and C‐terminal fragment ion pairs.

4.
The mzQuantML data standard was designed to capture the output of quantitative software in proteomics, to support submissions to public repositories, development of visualization software and pipeline/modular approaches. The standard is designed around a common core that can be extended to support particular types of technique through the release of semantic rules that are checked by validation software. The first release of mzQuantML supported four quantitative proteomics techniques via four sets of semantic rules: (i) intensity‐based (MS1) label free, (ii) MS1 label‐based (such as SILAC or N15), (iii) MS2 tag‐based (iTRAQ or tandem mass tags), and (iv) spectral counting. We present an update to mzQuantML for supporting SRM techniques. The update includes representing the quantitative measurements, and associated meta‐data, for SRM transitions, the mechanism for inferring peptide‐level or protein‐level quantitative values, and support for both label‐based and label‐free SRM protocols, through the creation of semantic rules and controlled vocabulary terms. We have updated the specification document for mzQuantML (version 1.0.1) and the mzQuantML validator to ensure that consistent files are produced by different exporters. We also report the capabilities for production of mzQuantML files from popular SRM software packages, such as Skyline and Anubis.

5.
We present SequenceMatrix, software that is designed to facilitate the assembly and analysis of multi‐gene datasets. Genes are concatenated by dragging and dropping FASTA, NEXUS, or TNT files with aligned sequences into the program window. A multi‐gene dataset is concatenated and displayed in a spreadsheet; each sequence is represented by a cell that provides information on sequence length, number of indels, the number of ambiguous bases (“Ns”), and the availability of codon information. Alternatively, GenBank numbers for the sequences can be displayed and exported. Matrices with hundreds of genes and taxa can be concatenated within minutes and exported in TNT, NEXUS, or PHYLIP formats, preserving both character set and codon information for TNT and NEXUS files. SequenceMatrix also creates taxon sets listing taxa with a minimum number of characters or gene fragments, which helps assess preliminary datasets. Entire taxa, whole gene fragments, or individual sequences for a particular gene and species can be excluded from export. Data matrices can be re‐split into their component genes and the gene fragments can be exported as individual gene files. SequenceMatrix also includes two tools that help to identify sequences that may have been compromised through laboratory contamination or data management error. One tool lists identical or near‐identical sequences within genes, while the other compares the pairwise distance pattern of one gene against the pattern for all remaining genes combined. SequenceMatrix is Java‐based and compatible with the Microsoft Windows, Apple MacOS X and Linux operating systems. The software is freely available from http://code.google.com/p/sequencematrix/. © The Willi Hennig Society 2010.

6.
The Cochran–Armitage (CA) linear trend test for proportions is often used for genotype‐based analysis of candidate gene association. Depending on the underlying genetic mode of inheritance, the use of model‐specific scores maximises the power. Commonly, the underlying genetic model, i.e. additive, dominant or recessive mode of inheritance, is a priori unknown. Association studies are commonly analysed using permutation tests, where both inference and identification of the underlying mode of inheritance are important. Especially interesting are tests for case–control studies, defined by a maximum over a series of standardised CA tests, because such a procedure has power under all three genetic models. We reformulate the test problem and propose a conditional maximum test of scores‐specific linear‐by‐linear association tests. For maximum‐type, sum and quadratic test statistics the asymptotic expectation and covariance can be derived in a closed form and the limiting distribution is known. Both the limiting distribution and approximations of the exact conditional distribution can easily be computed using standard software packages. In addition to these technical advances, we extend the area of application to stratified designs, studies involving more than two groups and the simultaneous analysis of multiple loci by means of multiplicity‐adjusted p‐values for the underlying multiple CA trend tests. The new test is applied to reanalyse a study investigating genetic components of different subtypes of psoriasis. A new and flexible inference tool for association studies is available both theoretically and practically, since existing software packages can easily be used to implement the suggested test procedures.
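
The idea of taking a maximum over standardised CA tests with model‐specific scores can be illustrated with a short sketch. It assumes the textbook asymptotic form of the CA statistic and the conventional score vectors (0, 1, 2), (0, 1, 1) and (0, 0, 1) for the additive, dominant and recessive models; it does not implement the conditional reference distribution or the multiplicity adjustment proposed in the abstract, so it yields the statistic only, not a p‐value. The function names and the genotype counts are hypothetical.

```python
from math import sqrt

# Conventional genotype scores, with genotypes ordered aa, Aa, AA (an assumption).
SCORES = {
    "additive": (0, 1, 2),
    "dominant": (0, 1, 1),
    "recessive": (0, 0, 1),
}


def ca_z(cases, controls, scores):
    """Standardised Cochran-Armitage trend statistic for one score vector."""
    totals = [r + s for r, s in zip(cases, controls)]
    n_all = sum(totals)
    n_cases = sum(cases)
    n_controls = n_all - n_cases
    sx_cases = sum(x * r for x, r in zip(scores, cases))
    sx_all = sum(x * n for x, n in zip(scores, totals))
    sxx_all = sum(x * x * n for x, n in zip(scores, totals))
    # Trend statistic T and its variance under the null hypothesis.
    t = n_all * sx_cases - n_cases * sx_all
    var_t = (n_cases * n_controls / n_all) * (n_all * sxx_all - sx_all ** 2)
    return t / sqrt(var_t)


def max_ca(cases, controls):
    """Return (model with the largest |Z|, all Z values keyed by model)."""
    z = {model: ca_z(cases, controls, x) for model, x in SCORES.items()}
    best = max(z, key=lambda model: abs(z[model]))
    return best, z


# Hypothetical genotype counts (aa, Aa, AA) for cases and controls.
best_model, z_values = max_ca([10, 20, 30], [30, 20, 10])
```

With these counts the case proportion rises evenly across the three genotypes, so the additive scores give the largest statistic. Judging significance of the maximum requires its joint distribution across the three correlated tests, which is the part the abstract addresses and this sketch omits.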

7.
Failing to open computer files that describe image data is not the most frustrating experience that the user of a computer can suffer, but it is high on the list of possible aggravations. To ameliorate this, the structure of uncompressed image data files is described here. The various ways in which information that describes a picture can be recorded are related, and a primary distinction between raster or bitmap based and vector or object based image data files is drawn. Bitmap based image data files are the more useful of the two formats for recording complicated images such as digital light micrographs, whereas object based files are better for recording illustrations and cartoons. Computer software for opening a very large variety of different formats of digital image data is recommended, and if these fail, ways are described for opening bitmap based digital image data files whose format is unknown.

9.
Here, we describe a single micro‐CT scan with a spatial resolution of 10 μm of a 10‐day‐old adult male Schistocerca gregaria (Forskål) (Orthoptera: Acrididae) and we compare our tracheal volume (VT) determination with published work on the subject. We also illustrate the feasibility of performing non‐invasive ‘virtual dissection’ on insects after performing micro‐CT. These post‐processing steps can be performed using freely downloadable 3‐D software. Finally, the value of producing stereo‐lithography (STL) files that can be viewed or used to print out 3‐D models as teaching aids is discussed.

10.
Pulsed Q dissociation enables combining LTQ ion trap instruments with isobaric peptide tagging. Unfortunately, this combination lacks a technique which accurately reports protein abundance ratios and is implemented in a freely available, flexible software pipeline. We developed and implemented a technique assigning collective reporter ion intensity‐based weights to each peptide abundance ratio and calculating a protein's weighted average abundance ratio and p‐value. Using an iTRAQ‐labeled standard mixture, we compared our technique's performance to the commercial software MASCOT, finding that it performed better than MASCOT's nonweighted averaging and median peptide ratio techniques, and equal to its weighted averaging technique. We also compared performance of the LTQ‐Orbitrap plus our technique to 4800 MALDI TOF/TOF plus Protein Pilot, by analyzing an iTRAQ‐labeled stem cell lysate. We found highly correlated protein abundance ratios, indicating that the LTQ‐Orbitrap plus our technique yields results comparable to the current standard. We implemented our technique in a freely available, automated software pipeline, called LTQ‐iQuant, which is mzXML‐compatible; supports iTRAQ 4‐plex and 8‐plex LTQ data; and can be modified for, and have its weights trained on, a user's LTQ instrument and other isobaric peptide tagging methods. LTQ‐iQuant should make LTQ instruments and isobaric peptide tagging accessible to more proteomic researchers.
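
A protein‐level weighted average of peptide abundance ratios, as described in this abstract, might look like the sketch below. It is an illustration under stated assumptions, not LTQ‐iQuant's method: the weights are supplied as plain numbers (the abstract derives them from reporter ion intensities, but does not say how), averaging is done in log2 space so that ratios of 2 and 0.5 cancel out, and the p‐value calculation is omitted. The function name `weighted_protein_ratio` is hypothetical.

```python
from math import log2


def weighted_protein_ratio(peptides):
    """Combine peptide ratios into one protein ratio.

    peptides is a list of (ratio, weight) pairs; ratios must be positive and
    the weights must be non-negative with a positive sum.
    """
    total_weight = sum(weight for _, weight in peptides)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    if any(ratio <= 0 for ratio, _ in peptides):
        raise ValueError("abundance ratios must be positive")
    # Weighted mean of the log2 ratios, then back-transform to a ratio.
    mean_log = sum(weight * log2(ratio) for ratio, weight in peptides) / total_weight
    return 2 ** mean_log


# Hypothetical peptides: the first has a stronger reporter signal, so it
# receives three times the weight of the second.
protein_ratio = weighted_protein_ratio([(4.0, 3.0), (1.0, 1.0)])
```

Giving low‐intensity peptides smaller weights reduces the influence of noisy reporter ion measurements on the protein‐level estimate, which is the motivation the abstract gives for weighting.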

11.
macroeco is a Python package that supports the analysis of empirical macroecological patterns and the comparison of these patterns to theoretical predictions. Here we describe the use of macroeco and the various functions that it contains. We also highlight a unique high‐level interface included with the package, MacroecoDesktop, that allows non‐programmers to access the functionality of macroeco. MacroecoDesktop takes simple text‐based metadata and parameter files as inputs and generates both tabular and graphical outputs, supporting users in creating reproducible workflows that follow the principles of simplicity, provenance, and automation. Both macroeco and MacroecoDesktop provide case studies for developers of analytically‐focused scientific software packages who wish to better support the reproducible use of their tools.

12.
The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R. We developed the mzTab file format for MS-based proteomics and metabolomics results to meet this need. mzTab is intended as a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files can contain protein, peptide, and small molecule identifications together with experimental metadata and basic quantitative information. The format is not intended to store the complete experimental evidence but provides mechanisms to report results at different levels of detail. These range from a simple summary of the final results to a representation of the results including the experimental design. This format is ideally suited to make MS-based proteomics and metabolomics results available to a wider biological community outside the field of MS. Several software tools for proteomics and metabolomics have already adapted the format as an output format. The comprehensive mzTab specification document and extensive additional documentation can be found online.
Mass spectrometry (MS) has become a major analysis tool in the life sciences (1). It is currently used in different modes for several “omics” approaches, proteomics and metabolomics being the most prominent. In both disciplines, one major burden in the exchange, communication, and large-scale (re-) analysis of MS-based data is the significant number of software pipelines and, consequently, heterogeneous file formats used to process, analyze, and store these experimental results, including both identification and quantification data. Publication guidelines from scientific journals and funding agencies' requirements for public data availability have led to an increasing amount of MS-based proteomics and metabolomics data being submitted to public repositories, such as those of the ProteomeXchange consortium (2) or, in the case of metabolomics, the resources from the nascent COSMOS (Coordination of Standards in Metabolomics) initiative (3).
In the past few years, the Human Proteome Organization Proteomics Standards Initiative (PSI) has developed several vendor-neutral standard data formats to overcome the representation heterogeneity. The Human Proteome Organization PSI promotes the usage of three XML file formats to fully report the data coming from MS-based proteomics experiments (including related metadata): mzML (4) to store the “primary” MS data (the spectra and chromatograms), mzIdentML (5) to report peptide identifications and inferred protein identifications, and mzQuantML (6) to store quantitative information associated with these results.
Even though the existence of the PSI standard data formats represents a huge step forward, these formats cannot address all use cases related to proteomics and metabolomics data exchange and sharing equally well. During the development of mzML, mzIdentML, and mzQuantML, the main focus lay on providing an exact and comprehensive representation of the gathered results. All three formats can be used within analysis pipelines and as interchange formats between independent analysis tools. It is thus vital that these formats be capable of storing the full data and analysis that led to the results. Therefore, all three formats result in relatively complex schemas, a clear necessity for adequate representation of the complexity found in MS-based data.
An inevitable drawback of this approach is that data consumers can find it difficult to quickly retrieve the required information. Several application programming interfaces (APIs) have been developed to simplify software development based on these formats (7–9), but profound proteomics and bioinformatics knowledge still is required in order to use them efficiently and take full advantage of the comprehensive information contained.
The new file format presented here, mzTab, aims to describe the qualitative and quantitative results for MS-based proteomics and metabolomics experiments in a consistent, simpler tabular format, abstracting from the mass spectrometry details. The format contains identifications, basic quantitative information, and related metadata. With mzTab's flexible design, it is possible to report results at different levels ranging from a simple summary or subset of the complete information (e.g. the final results) to fairly comprehensive representation of the results including the experimental design. Many downstream analysis use cases are only concerned with the final results of an experiment in an easily accessible format that is compatible with tools such as Microsoft Excel® or R (10) and can easily be adapted by existing bioinformatics tools. Therefore, mzTab is ideally suited to make MS proteomics and metabolomics results available to the wider biological community, beyond the field of MS.
mzTab follows a similar philosophy as the other tab-delimited format recently developed by the PSI to represent molecular interaction data, MITAB (11). MITAB is a simpler tab-delimited format, whereas PSI-MI XML (12), the more detailed XML-based format, holds the complete evidence. The microarray community makes wide use of the format MAGE-TAB (13), another example of such a solution that can cover the main use cases and, for the sake of simplicity, is often preferred to the XML standard format MAGE-ML (14). Additionally, in MS-based proteomics, several software packages, such as Mascot (15), OMSSA (16), MaxQuant (17), OpenMS/TOPP (18, 19), and SpectraST (20), also support the export of their results in a tab-delimited format next to a more complete and complex default format. These simple formats do not contain the complete information but are nevertheless sufficient for the most frequent use cases.
mzTab has been designed with the same purpose in mind. It can be used alone or in conjunction with mzML (or other related MS data formats such as mzXML (21) or text-based peak list formats such as MGF), mzIdentML, and/or mzQuantML. Several highly successful concepts taken from the development process of mzIdentML and mzQuantML were adapted to the text-based nature of mzTab.
In addition, there is a trend to perform more integrated experimental workflows involving both proteomics and metabolomics data. Thus, we developed a standard format that can represent both types of information in a single file.

13.
14.
The application of mass spectrometry imaging (MS imaging) is rapidly growing with a constantly increasing number of different instrumental systems and software tools. The data format imzML was developed to allow the flexible and efficient exchange of MS imaging data between different instruments and data analysis software. imzML data is divided into two files which are linked by a universally unique identifier (UUID). Experimental details are stored in an XML file which is based on the HUPO-PSI format mzML. Information is provided in the form of a 'controlled vocabulary' (CV) in order to unequivocally describe the parameters and to avoid redundancy in nomenclature. Mass spectral data are stored in a binary file in order to allow efficient storage. imzML is supported by a growing number of software tools. Users will no longer be limited to proprietary software, but are able to use the processing software best suited for a specific question or application. MS imaging data from different instruments can be converted to imzML and displayed with identical parameters in one software package for easier comparison. All technical details necessary to implement imzML and additional background information are available at www.imzml.org.

15.
The FLOSS software package is a flexible framework for ordered subset analysis. FLOSS is specifically designed for use with the Merlin linkage analysis package, but FLOSS can be used with any linkage analysis software package that reports NPL Z-scores for each locus and family. When FLOSS is used with the Merlin linkage analysis package, one can use either non-parametric Z-scores or Kong and Cox linear allele sharing model LOD scores. Monte Carlo P-values are calculated using a permutation test with an efficient Besag-Clifford sequential stopping rule. FLOSS also has a flexible tool for assigning family covariate scores from Merlin input files. FLOSS includes user documentation and is written in Java for easy portability. The FLOSS source code is documented and designed to be extensible.

16.
The pathogenic bacterium Legionella pneumophila is known to cause Legionnaires' Disease, a severe pneumonia that can be fatal to immunocompromised individuals and the elderly. Shohdy et al. identified the L. pneumophila vacuole sorting inhibitory protein VipF as a putative N‐acetyltransferase based on sequence homology. We have characterized the basic structural and functional properties of VipF to confirm this original functional assignment. Sequence conservation analysis indicates two putative CoA‐binding regions within VipF. Homology modeling and small angle X‐ray scattering suggest a monomeric, dual‐domain structure joined by a flexible linker. Each domain contains the characteristic beta‐splay motif found in many acetyltransferases, suggesting that VipF may contain two active sites. Docking experiments suggest reasonable acetyl‐CoA binding locations within each beta‐splay motif. Broad substrate screening indicated that VipF is capable of acetylating chloramphenicol and that both domains are catalytically active. Given that chloramphenicol is not known to be N‐acetylated, this is a surprising finding suggesting that VipF is capable of O‐acetyltransferase activity. Proteins 2016; 84:1422–1430. © 2016 Wiley Periodicals, Inc.

17.
This paper describes and evaluates a flexible, non‐invasive tagging system for the automated identification and long‐term monitoring of individual three‐spined sticklebacks Gasterosteus aculeatus. The system is based on barcoded tags, which can be reliably and robustly detected and decoded to provide information on an individual's identity and location. Because large numbers of fish can be individually tagged, it can be used to monitor individual‐ and group‐level dynamics within fish shoals.

18.
19.
Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern, and it has been criticized for its local optimization. More accurate software currently requires sequence alignment or complex calculations, which are time‐consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: user‐friendly software with a graphical user interface for rapid DNA barcoding. It adopts a hybrid, two‐stage algorithm. First, an alignment‐free composition vector (CV) method is utilized to reduce the search space by screening a reference database. The alignment‐based K2P distance nearest‐neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment‐free methods and (ii) higher scalability than alignment‐based distance methods and character‐based methods. These results suggest that this platform is able to deal with both large‐scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/.
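
The two‐stage idea in this abstract, a cheap alignment‐free screen followed by a K2P nearest‐neighbour search on the shortlist, can be sketched as below. This is a simplified illustration and not VIP Barcoding's implementation: stage one uses plain k‐mer count vectors compared by cosine similarity as a stand‐in for the composition vector method, and stage two assumes the sequences are already aligned and of equal length. The K2P formula is the standard Kimura two‐parameter distance; all function names, parameters and sequences are hypothetical.

```python
from collections import Counter
from math import log, sqrt

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def kmer_profile(seq, k=3):
    """Count every overlapping k-mer in seq (stage-one feature vector)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(p, q):
    """Cosine similarity between two k-mer count vectors."""
    dot = sum(p[key] * q[key] for key in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned sequences."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return float("inf")
    p = sum(1 for pair in pairs if pair in TRANSITIONS) / len(pairs)
    q = sum(1 for x, y in pairs if x != y and (x, y) not in TRANSITIONS) / len(pairs)
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        return float("inf")  # too divergent for the formula to be defined
    return -0.5 * log((1 - 2 * p - q) * sqrt(1 - 2 * q))


def identify(query, references, shortlist=2, k=3):
    """Return (name, distance) of the nearest reference after prescreening.

    references is a list of (name, aligned_sequence) pairs.
    """
    query_profile = kmer_profile(query, k)
    # Stage 1: keep only the references most similar in k-mer composition.
    ranked = sorted(
        references,
        key=lambda ref: cosine(query_profile, kmer_profile(ref[1], k)),
        reverse=True,
    )
    # Stage 2: nearest neighbour by K2P distance among the shortlist.
    hits = ((name, k2p_distance(query, seq)) for name, seq in ranked[:shortlist])
    return min(hits, key=lambda hit: hit[1])
```

The design point is that the expensive alignment‐based distance is computed only for the shortlist, so the cost of stage two no longer grows with the size of the reference database.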

20.
In Arabidopsis, lateral roots originate from pericycle cells deep within the primary root. New lateral root primordia (LRP) have to emerge through several overlaying tissues. Here, we report that auxin produced in new LRP is transported towards the outer tissues where it triggers cell separation by inducing both the auxin influx carrier LAX3 and cell‐wall enzymes. LAX3 is expressed in just two cell files overlaying new LRP. To understand how this striking pattern of LAX3 expression is regulated, we developed a mathematical model that captures the network regulating its expression and auxin transport within realistic three‐dimensional cell and tissue geometries. Our model revealed that, for the LAX3 spatial expression to be robust to natural variations in root tissue geometry, an efflux carrier is required, which was later identified as PIN3. To prevent LAX3 from being transiently expressed in multiple cell files, PIN3 and LAX3 must be induced consecutively, which we later demonstrated to be the case. Our study exemplifies how mathematical models can be used to direct experiments to elucidate complex developmental processes.
