Background
Genomic islands (GIs) are clusters of alien genes that are present in some bacterial genomes but are not seen in the genomes of other strains within the same genus. The detection of GIs is extremely important to the medical and environmental communities. Despite the discovery of GI-associated features, accurate detection of GIs is still far from satisfactory.
Results
In this paper, we combined multiple GI-associated features and applied and compared various machine learning approaches to evaluate the classification accuracy on GI datasets from three genera (Salmonella, Staphylococcus and Streptococcus) and on a mixed dataset of all three genera. The experimental results showed that, in general, the decision tree approach outperformed the other machine learning methods according to five performance evaluation metrics. Using J48 decision trees as base classifiers, we further applied four ensemble algorithms (AdaBoost, bagging, MultiBoost and random forest) to the same datasets. We found that, overall, these ensemble classifiers could improve classification accuracy.
Conclusions
We conclude that decision-tree-based ensemble algorithms could accurately classify GIs and non-GIs, and we recommend the use of these methods for future GI data analysis. The software package for detecting GIs can be accessed at http://www.esu.edu/cpsc/che_lab/software/GIDetector/.
Availability
GIV is freely available for non-commercial use at http://www5.esu.edu/cpsc/bioinfo/software/GIV
Similar articles
Background
It has been noted that many bacterial virulence factor genes are located within genomic islands (GIs; clusters of genes in a prokaryotic genome of probable horizontal origin). However, such studies have been limited to single genera or isolated observations. We have performed the first large-scale analysis of multiple diverse pathogens to examine this association. We additionally identified genes found predominantly in pathogens, but not non-pathogens, across multiple genera using 631 complete bacterial genomes, and we identified common trends in virulence for genes in GIs. Furthermore, we examined the relationship between GIs and clustered regularly interspaced short palindromic repeats (CRISPRs), which have been proposed to confer resistance to phage.
Methodology/Principal Findings
We show quantitatively that GIs disproportionately contain more virulence factors than the rest of a given genome (p<1E-40 using three GI datasets) and that CRISPRs are also over-represented in GIs. Virulence factors in GIs and pathogen-associated virulence factors are enriched for proteins having more “offensive” functions, e.g. active invasion of the host, and are disproportionately components of type III/IV secretion systems or toxins. Numerous hypothetical pathogen-associated genes were identified, meriting further study.
Conclusions/Significance
This is the first systematic analysis across diverse genera indicating that virulence factors are disproportionately associated with GIs. “Offensive” virulence factors, as opposed to host-interaction factors, may more often be a recently acquired trait (on an evolutionary time scale detected by GI analysis). Newly identified pathogen-associated genes warrant further study. We discuss the implications of these results, which cement the significant role of GIs in the evolution of many pathogens.
Similar articles
Recent advances in genome analysis have established that chromatin has preferred 3D conformations, which bring distant loci into contact. Identifying these contacts is important for understanding possible interactions between these loci. This has motivated the creation of the Hi-C technology, which detects long-range chromosomal interactions. Distance geometry-based algorithms, such as ChromSDE and ShRec3D, have been able to utilize Hi-C data to infer 3D chromosomal structures. However, these algorithms, being matrix-based, are space- and time-consuming on very large datasets. A human genome at 100-kilobase resolution would involve ∼30,000 loci, requiring gigabytes just to store the matrices.
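The storage claim for ∼30,000 loci can be checked with simple arithmetic. The sketch below is only an illustration, assuming one 8-byte floating-point value per matrix entry; the function names are hypothetical and it does not reproduce the representation used by any of the tools named above.

```python
def dense_matrix_bytes(n_loci: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed to store a full n x n matrix (assumes 8-byte floats)."""
    return n_loci * n_loci * bytes_per_entry


def upper_triangle_bytes(n_loci: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed for only the strict upper triangle of a symmetric matrix."""
    return n_loci * (n_loci - 1) // 2 * bytes_per_entry


n = 30_000  # approximate number of loci at 100-kilobase resolution
print(f"full matrix:    {dense_matrix_bytes(n) / 1e9:.1f} GB")    # 7.2 GB
print(f"upper triangle: {upper_triangle_bytes(n) / 1e9:.1f} GB")  # 3.6 GB
```

Even when the symmetry of a distance matrix is exploited, a single dense matrix at this resolution still occupies several gigabytes, and matrix-based methods typically hold more than one such matrix in memory.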
Results
We propose a succinct representation of the distance matrices which tremendously reduces the space requirement. We give a complete solution, called SuperRec, for the inference of chromosomal structures from Hi-C data by iteratively solving the large-scale weighted multidimensional scaling problem.
Conclusions
SuperRec runs faster than earlier systems without compromising on result accuracy. The SuperRec package can be obtained from http://www.cs.cityu.edu.hk/~shuaicli/SuperRec.
Similar articles
Methods: We performed an observational study at the University of Kentucky with 61 participants who underwent first-time LVAD implantation. Blood was collected at baseline, on postoperative days 0, 1, 3 and 6, and at clinical follow-up. Demographics, clinical characteristics, one-year adverse events and routine laboratory data were collected from electronic medical records. Platelet function and plasma biomarkers were profiled.
Results: Evaluation of routine laboratory results revealed that sustained thrombocytopenia and increased mean platelet volume (MPV) were associated with development of gastrointestinal (GI) bleeding and mortality. Platelet function at the follow-up visit predicted one-year bleeding events. The thrombotic biomarker sCD40L, measured at baseline before implantation and within the first week following LVAD implantation, strongly predicted one-year GI bleeding.
Conclusions: Early trends in routine bloodwork and platelet function may serve as novel signatures of patients at risk of adverse events.
Similar articles
Aims Examine the use of Cork and Holm Oak trees by insectivorous birds in Mediterranean oak woodlands.
Methods Point-counts were used to compare species abundance among Cork Oak-dominated, Holm Oak-dominated and mixed woodlands. Focal foraging observations were used to evaluate the use of Cork and Holm Oaks in the three habitats and to relate tree characteristics with the foraging time of foliage- and bark-gleaners.
Results Bird densities in the three habitats were not different for most foliage- and bark-gleaners. Tree preference index values and foraging time per tree showed no significant differences between tree species and foraging guilds; however, bark-gleaners had positive index values for Cork Oak in the three habitats. The foraging time of foliage- and bark-gleaners on both tree species showed a positive relationship with characteristics associated with arthropod abundance.
Conclusion Cork and Holm Oak trees are equally preferred by foliage-gleaners, but bark-gleaners moderately preferred Cork Oak. Characteristics regarding morphology, phenology and physiological condition of trees can be used to predict habitat quality for insectivorous forest birds in Mediterranean oak woodlands.
Similar articles
Aims: We assessed the structure of southern Amazonian forests, Brazil, to quantify the relative importance of variation in above-ground biomass (AGB) caused by the abundance/density of palm species and by forest structure.
Methods: We stratified the landscape into homogeneous units in terms of vegetation types and elevation to guide plot establishment. We used the variation partitioning technique to decompose the relative contribution of forest structure and palm abundance.
Results: The AGBcommunity (including trees, palms and lianas) and AGBtree (excluding palms and lianas) significantly decreased with increasing abundance of palms. Attalea speciosa, a large-leaved palm species, was the most important for explaining the variance of AGB. The total variance of AGBtree was partially explained by a redundant effect of A. speciosa and trees (28%) and by trees alone (62%), based on models of basal area. The redundant effect, together with additional analyses, indicated (1) competition between A. speciosa and small trees and (2) covariation between A. speciosa and large trees.
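As an illustration of how variation partitioning with two explanatory sets separates a redundant (shared) fraction from the fractions explained by each set alone, the sketch below applies the standard two-set decomposition to R² values. The three input R² values are hypothetical: they were chosen only so that the shared and trees-alone fractions match the 28% and 62% reported here, and the palm-alone fraction (not reported) is set to an arbitrary 2%.

```python
def partition_variation(r2_a: float, r2_b: float, r2_ab: float) -> dict:
    """Two-set variation partitioning from three model R^2 values.

    r2_a  : R^2 of the model using explanatory set A only
    r2_b  : R^2 of the model using explanatory set B only
    r2_ab : R^2 of the model using both sets together
    """
    return {
        "unique_a": r2_ab - r2_b,       # explained by A alone
        "unique_b": r2_ab - r2_a,       # explained by B alone
        "shared": r2_a + r2_b - r2_ab,  # redundant effect of A and B
        "residual": 1.0 - r2_ab,        # unexplained
    }


# Hypothetical R^2 values (assumptions, not taken from the study)
fractions = partition_variation(r2_a=0.90,   # trees only
                                r2_b=0.30,   # A. speciosa only
                                r2_ab=0.92)  # trees and A. speciosa
for name, value in fractions.items():
    print(f"{name}: {value:.2f}")
```

With these inputs the decomposition gives 0.62 for trees alone, 0.28 for the shared fraction, 0.02 for the palm alone and 0.08 unexplained; the four fractions always sum to 1.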
Conclusions: The abundance of palms plays a minor but significant role in predicting the AGB at the local scale in southern Amazonia.
Similar articles
Aims: To investigate home range size, habitat and tree species selection of Wood Warblers at a staging site in Burkina Faso (Koubri) and a wintering site in Ghana (Pepease).
Methods: Comparing habitat recorded at locations of radio-tagged birds and at control points, we investigated whether there was habitat and tree species selection. We also compared home range size of individual birds between the two sites.
Results: Home range size did not differ between the two sites. There was significant selection for tree species at both Koubri and Pepease: Anogeissus leiocarpus and Albizia zygia, respectively. At Koubri, there was significant avoidance of the most common tree species (Azadirachta indica, Mangifera indica (both non-native), Vitellaria paradoxa and Acacia spp.). In addition, there was a preference for taller trees and greater tree density at both sites. However, the probability of a point being used declined with increasing number of taller (>14 m) trees.
Conclusion: Fine-scale selection of woodland habitats suggests that Wood Warblers are likely to suffer the consequences of ongoing land-use change in their West African wintering grounds.
Similar articles
Aims: To examine the prevalence of various haemosporidian lineages in nestlings of three separate Iberian populations of the Southern Grey Shrike.
Methods: Blood samples were taken from nestling Southern Grey Shrikes from three agroecosystem areas in the Iberian Peninsula. Parasites were detected from blood samples using polymerase chain reaction screening.
Results: Nestlings were parasitized by 11 different lineages belonging to the genera Haemoproteus (3.8%), Plasmodium (0.5%) and Leucocytozoon (1.8%), including five new undescribed lineages. These are among the highest prevalence levels of haemosporidian parasites (7.4%) for nestlings of passerine birds.
Conclusion: Our findings suggest that the distribution of avian haemosporidians is determined by complex effects including climate and biogeography. Most parasite lineages were not universally spread across shrike populations, despite being otherwise widespread both geographically and taxonomically.
Similar articles
Aims: This study aims to determine whether Capercaillie vocalizations can be recognized in lek recordings, whether this can be automated using readily available software, and whether the resulting number of calls varies with location, weather conditions, date and time of day.
Methods: Unattended recording devices and semi-automated call classification software were used to record and analyse the display calls of Capercaillie at three known lek sites in Scotland over a two-week period.
Results: Capercaillie calls were successfully and rapidly identified within a data set that included the vocalizations of other bird species and environmental noise. Calls could be readily recognized to species level using a combination of unsupervised software and manual analysis. The number of calls varied by time and date, by recorder/microphone location at the lek site, and with weather conditions. This information can be used to better target future acoustic monitoring and improve the quality of existing traditional lek surveys.
Conclusion: Bioacoustic methods provide a practical and cost-effective way to determine habitat occupancy and activity levels by a vocally distinctive bird species. Following further testing alongside traditional counting methods, they could offer a significant new approach towards more effective monitoring of local population levels for Capercaillie and other species of conservation concern.
Similar articles
Aim: We aimed to analyse and compare the distribution and habitat preferences of Red-backed Shrikes and Barred Warblers breeding sympatrically in a semi-natural landscape in a wetland/farmland mosaic.
Methods: We examined habitat availability and use by the two species within their breeding territories to identify differences in habitat selection.
Results: Territories of both species were similar in habitat composition and used levees, bushes, fallow areas and single trees. However, the spatial characteristics of the territories differed between species. Red-backed Shrikes used a wider range of sizes and shapes of habitat patches, whilst Barred Warblers preferred a more complex landscape structure and a higher diversity of habitat types. We also found that areas of 71% of Barred Warbler and 34% of Red-backed Shrike territories overlapped.
Conclusion: Whilst both species showed similar habitat choices, they appeared to differ significantly in terms of landscape structure: Red-backed Shrikes were more flexible and less selective than Barred Warblers in their habitat choice.
Similar articles
Areas covered: The aim of this review is to address opportunities and challenges of metaproteomics from a computational perspective. Appealing to an audience of microbial ecologists and proteomic researchers alike, we provide an overview of state-of-the-art software and databases by which metaproteome data can be readily analyzed.
Expert commentary: While tailored protein databases, combined search algorithms and iterative workflows are means to improve the identification yield, software tools for taxonomic and functional analysis are challenged by the vast amount of unannotated sequences in metaproteomics.
Similar articles