Background
Biomedical literature is expanding rapidly, and tools that help locate information of interest are needed. To this end, a multitude of different approaches for classifying sentences in biomedical publications according to their coarse semantic and rhetorical categories (e.g., Background, Methods, Results, Conclusions) have been devised, with recent state-of-the-art results reported for a complex deep learning model. Recent evidence showed that shallow and wide neural models such as fastText can provide results that are competitive with or superior to those of complex deep learning models while requiring drastically lower training times and scaling better. We analyze the efficacy of the fastText model in the classification of biomedical sentences in the PubMed 200k RCT benchmark, and introduce a simple pre-processing step that enables the application of fastText on sentence sequences. Furthermore, we explore the utility of two unsupervised pre-training approaches in scenarios where labeled training data are limited.
Results
Our fastText-based methodology yields a state-of-the-art F1 score of 0.917 on the PubMed 200k benchmark when sentence ordering is taken into account, with a training time of only 73 s on standard hardware. Applying fastText on single sentences, without taking sentence ordering into account, yielded an F1 score of 0.852 (training time 13 s). Unsupervised pre-training of N-gram vectors greatly improved the results for small training set sizes, with an increase of F1 score of 0.21 to 0.74 when trained on only 1000 randomly picked sentences without taking sentence ordering into account.
Conclusions
Because of its ease of use and performance, fastText should be among the first choices of tools when tackling biomedical text classification problems with large corpora. Unsupervised pre-training of N-gram vectors on domain-specific corpora also makes it possible to apply fastText when labeled training data are limited.
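The abstract does not spell out the pre-processing step that lets fastText use sentence order. One plausible way to do it is sketched below: each sentence is written as a fastText supervised-format line (`__label__<LABEL>` followed by tokens), and the tokens of neighbouring sentences are appended with a position prefix so that they remain distinct features. The function name `to_fasttext_lines`, the `prev1_`/`next1_` prefixes, the `window` parameter, and the toy sentences are all assumptions made for illustration, not the authors' actual procedure.

```python
from typing import List, Tuple


def to_fasttext_lines(abstract: List[Tuple[str, str]], window: int = 1) -> List[str]:
    """Convert one abstract, given as ordered (label, sentence) pairs, into
    fastText supervised-format lines that also carry neighbouring context.

    Hypothetical scheme: context tokens get a 'prevK_' or 'nextK_' prefix.
    """
    lines = []
    for i, (label, sentence) in enumerate(abstract):
        parts = [f"__label__{label}"]
        # Tokens of preceding sentences, farthest first.
        for offset in range(window, 0, -1):
            j = i - offset
            if j >= 0:
                parts += [f"prev{offset}_{tok}" for tok in abstract[j][1].lower().split()]
        # Tokens of the sentence being classified, unprefixed.
        parts += sentence.lower().split()
        # Tokens of following sentences, nearest first.
        for offset in range(1, window + 1):
            j = i + offset
            if j < len(abstract):
                parts += [f"next{offset}_{tok}" for tok in abstract[j][1].lower().split()]
        lines.append(" ".join(parts))
    return lines


# Toy abstract (invented sentences, for illustration only).
abstract = [
    ("BACKGROUND", "Literature is growing"),
    ("METHODS", "We trained fastText"),
    ("RESULTS", "F1 improved"),
]
lines = to_fasttext_lines(abstract, window=1)
for line in lines:
    print(line)
```

Lines in this format can be written to a text file and passed to a fastText supervised trainer. Setting `window=0` reduces the output to the single-sentence setting described in the Results.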
Understanding adaptation has become one of the major biological questions, especially in the light of rapid environmental changes induced by climate change. Ocean temperatures are rising, which triggers massive changes in water chemistry and thereby alters the living environment of all marine organisms. Studying adaptation, however, can be tricky because spatial genetic patterns might also occur due to random effects, for example, genetic drift. Genetic drift is reduced in very large and well‐connected populations, such as in broadcast marine spawning organisms. Here, spatial genetic divergence is likely to be produced by selection. In this issue of Molecular Ecology, Sandoval‐Castillo et al. (2018) investigated patterns of spatial genetic divergence and their association with environmental factors in the greenlip abalone (Haliotis laevigata). This commercially important species of mollusc is a broadcast spawner with large population sizes, rendering genetic drift an unlikely factor in the genetic divergence of wild populations. Sandoval‐Castillo et al. (2018) used a ddRAD genomic approach to test for genetic divergence between sampled populations while also measuring different environmental factors, for example, water temperature and oxygen content. The majority of identified SNPs were putatively neutral and showed only low levels of genetic divergence between field sites. However, 323 candidate adaptive markers were identified that clearly separated the individuals into five different clusters. These genetic clusters correlated with environmental clusters mainly determined by water temperature and (correlated) oxygen concentration. Gene annotation of the candidate SNPs revealed a large proportion of loci being involved in biological processes influenced by oxygen availability. The study by Sandoval‐Castillo et al. (2018) in this issue of Molecular Ecology exemplifies the benefits of combining genomic studies with ecological data.
It is a great starting point for more detailed (gene function, physiology) as well as broader (biodiversity) investigations that might help us to better understand adaptation and predict ecosystems' resilience and resistance to environmental disturbances. In addition, this information can be applied to implement optimal conservation regime policies and sustainable harvesting strategies, hopefully protecting biodiversity as well as commercial interests in marine life.
Streptococcus pneumoniae is an important cause of bacterial meningitis and pneumonia but usually colonizes the human nasopharynx harmlessly. As this niche is simultaneously populated by other bacterial species, we looked for a role and pathway of communication between pneumococci and other species. This paper shows that two proteins of non-encapsulated S. pneumoniae, AliB-like ORF 1 and ORF 2, bind specifically to peptides matching other species resulting in changes in the pneumococci. AliB-like ORF 1 binds specifically peptide SETTFGRDFN, matching 50S ribosomal subunit protein L4 of Enterobacteriaceae, and facilitates upregulation of competence for genetic transformation. AliB-like ORF 2 binds specifically peptides containing sequence FPPQS, matching proteins of Prevotella species common in healthy human nasopharyngeal microbiota. We found that AliB-like ORF 2 mediates the early phase of nasopharyngeal colonization in vivo. The ability of S. pneumoniae to bind and respond to peptides of other bacterial species occupying the same host niche may play a key role in adaptation to its environment and in interspecies communication. These findings reveal a completely new concept of pneumococcal interspecies communication which may have implications for communication between other bacterial species and for future interventional therapeutics.
Identifying relevant signatures for clinical patient outcome is a fundamental task in high-throughput studies. Signatures, composed of features such as mRNAs, miRNAs, SNPs or other molecular variables, are often non-overlapping, even though they have been identified from similar experiments considering samples with the same type of disease. The lack of a consensus is mostly due to the fact that sample sizes are far smaller than the numbers of candidate features to be considered, and therefore signature selection suffers from large variation. We propose a robust signature selection method that enhances the selection stability of penalized regression algorithms for predicting survival risk. Our method is based on an aggregation of multiple, possibly unstable, signatures obtained with the preconditioned lasso algorithm applied to random (internal) subsamples of a given cohort data, where the aggregated signature is shrunken by a simple thresholding strategy. The resulting method, RS-PL, is conceptually simple and easy to apply, relying on parameters automatically tuned by cross validation. Robust signature selection using RS-PL operates within an (external) subsampling framework to estimate the selection probabilities of features in multiple trials of RS-PL. These probabilities are used for identifying reliable features to be included in a signature. Our method was evaluated on microarray data sets from neuroblastoma, lung adenocarcinoma, and breast cancer patients, extracting robust and relevant signatures for predicting survival risk. Signatures obtained by our method achieved high prediction performance and robustness, consistently over the three data sets. Genes with high selection probability in our robust signatures have been reported as cancer-relevant. 
The ordering of predictor coefficients associated with signatures was well-preserved across multiple trials of RS-PL, demonstrating the capability of our method for identifying a transferable consensus signature. The software is available as an R package rsig at CRAN (http://cran.r-project.org).
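The external subsampling idea described above (estimate how often each feature is selected across repeated trials, then keep the reliably selected ones) can be illustrated with a minimal sketch. This is not the rsig package or the RS-PL algorithm itself: the selector below is a toy covariance ranking standing in for the preconditioned lasso, and every name and parameter (`selection_probabilities`, `top_covariance_selector`, `robust_signature`, `fraction`, `threshold`) as well as the synthetic data are assumptions made for illustration.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence

Rows = Sequence[Sequence[float]]


def selection_probabilities(
    samples: Rows,
    outcomes: Sequence[float],
    select: Callable[[Rows, Sequence[float]], List[int]],
    n_trials: int = 50,
    fraction: float = 0.63,
    seed: int = 0,
) -> List[float]:
    """Estimate, for each feature, the fraction of subsampling trials in
    which the given selector includes it."""
    rng = random.Random(seed)
    n = len(samples)
    size = max(2, int(fraction * n))
    counts: Counter = Counter()
    for _ in range(n_trials):
        idx = rng.sample(range(n), size)  # subsample without replacement
        chosen = select([samples[i] for i in idx], [outcomes[i] for i in idx])
        counts.update(set(chosen))
    return [counts[j] / n_trials for j in range(len(samples[0]))]


def top_covariance_selector(k: int) -> Callable[[Rows, Sequence[float]], List[int]]:
    """Toy stand-in for a penalized-regression selector: keep the k features
    with the largest absolute covariance with the outcome."""
    def select(xs: Rows, ys: Sequence[float]) -> List[int]:
        m = len(xs)
        y_mean = sum(ys) / m
        scores = []
        for j in range(len(xs[0])):
            x_mean = sum(row[j] for row in xs) / m
            cov = sum((row[j] - x_mean) * (y - y_mean) for row, y in zip(xs, ys))
            scores.append(abs(cov))
        return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]
    return select


def robust_signature(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Keep features whose estimated selection probability reaches the threshold."""
    return [j for j, p in enumerate(probs) if p >= threshold]


# Synthetic data: feature 0 equals the outcome, features 1 and 2 are noise.
data_rng = random.Random(1)
ys = [float(i) for i in range(20)]
xs = [[y, data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for y in ys]

probs = selection_probabilities(xs, ys, top_covariance_selector(1), n_trials=20)
signature = robust_signature(probs)
print(probs, signature)
```

In a real setting the selector would be replaced by the survival-model fit, and the threshold would be chosen with the stability of the resulting signature in mind.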
High intensity interval training (HIIT) is characterized by vigorous exercise with short rest intervals. Hydrogen peroxide (H2O2) plays a key role in muscle adaptation. This study aimed to evaluate whether HIIT promotes similar H2O2 formation via O2 consumption (electron leakage) in three skeletal muscles with different twitch characteristics. Rats were assigned to two groups: sedentary (n=10) and HIIT (n=10, swimming training). We collected the tibialis anterior (TA-fast), gastrocnemius (GAST-fast/slow) and soleus (SOL-slow) muscles. The fibers were analyzed for mitochondrial respiration, H2O2 production and citrate synthase (CS) activity. A multi-substrate (glycerol phosphate (G3P), pyruvate, malate, glutamate and succinate) approach was used to analyze the mitochondria in permeabilized fibers. Compared to the control group, oxygen flow coupled to ATP synthesis, complex I and complex II was higher in the TA of the HIIT group by 1.5-, 3.0- and 2.7-fold, respectively. In contrast, oxygen consumed by mitochondrial glycerol phosphate dehydrogenase (mGPdH) was 30% lower. Surprisingly, the oxygen flow coupled to ATP synthesis was 42% lower after HIIT in the SOL. Moreover, oxygen flow coupled to ATP synthesis and complex II was higher by 1.4- and 2.7-fold in the GAST of the HIIT group. After HIIT, CS activity increased 1.3-fold in the TA, and H2O2 production was 1.3-fold higher in the TA at sites containing mGPdH. No significant differences in H2O2 production were detected in the SOL. Surprisingly, HIIT increased H2O2 production in the GAST via complex II, phosphorylation, oligomycin and antimycin by 1.6-, 1.8-, 2.2-, and 2.2-fold, respectively. Electron leakage was 3.3-fold higher in the TA with G3P and 1.8-fold higher in the GAST with multiple substrates. Unexpectedly, the HIIT protocol induced different respiration and electron leakage responses in different types of muscle.
Because compost is widely used as a soil amendment and large numbers of microorganisms are released into the air during compost processing, we investigated whether compost may act as a carrier of thermophilic methanogens to temperate soils.
All eight investigated compost materials showed a clear methane production potential between 0.01 and 0.98 μmol CH4 g dw−1 h−1 at 50 °C. Single strand conformation polymorphism (SSCP) and cloning analysis indicated the presence of Methanosarcina thermophila, Methanoculleus thermophilus, and Methanobacterium formicicum.
Bioaerosols collected during the turning of a compost pile showed both a highly similar SSCP profile compared to the corresponding compost material and clear methane production during anoxic incubation in selective medium at 50 °C. Both observations indicated a considerable release of thermophilic methanogens into the air.
To analyse the persistence of compost-borne thermophilic methanogens in temperate oxic soils, we studied their potential activity in compost and compost/soil mixtures that were applied to a meadow soil, as well as in an agricultural soil fertilised with compost. After 24 h of anoxic incubation at 50 °C, all samples containing compost showed clear methanogenic activity, even 1 year after application.
In combination with the resilience of the compost-borne methanogens against desiccation and UV radiation observed in vitro, we assume that compost material acts as an effective carrier for the distribution of thermophilic methanogens by fertilisation and wind.
The vacuolating toxin VacA, released by Helicobacter pylori, is an important virulence factor in the pathogenesis of gastritis and gastroduodenal ulcers. VacA contains two subunits: the p58 subunit mediates entry into target cells, and the p34 subunit mediates targeting to mitochondria and is essential for toxicity. In this study we found that targeting to mitochondria is dependent on a unique signal sequence of 32 uncharged amino acid residues at the p34 N-terminus. Mitochondrial import of p34 is mediated by the import receptor Tom20 and the import channel of the outer membrane TOM complex, leading to insertion of p34 into the mitochondrial inner membrane. p34 assembles in homo-hexamers of extraordinarily high stability. CD spectra of the purified protein indicate a content of >40% β-strands, similar to pore-forming β-barrel proteins. p34 forms an anion channel with a conductivity of about 12 pS in 1.5 M KCl buffer. Oligomerization and channel formation are independent both of the 32 uncharged N-terminal residues and of the p58 subunit of the toxin. The conductivity is efficiently blocked by 5-nitro-2-(3-phenylpropylamino)benzoic acid (NPPB), a reagent known to inhibit VacA-mediated apoptosis. We conclude that p34 essentially acts as a small pore-forming toxin, targeted to the mitochondrial inner membrane by a special hydrophobic N-terminal signal.
Although TSH stimulates all aspects of thyroid physiology, IGF-I signaling through a tyrosine kinase-containing transmembrane receptor exhibits a permissive impact on TSH action. To better understand the importance of the IGF-I receptor in the thyroid in vivo, we inactivated the Igf1r with a Tg promoter-driven Cre-lox system in mice. We studied male and female mice with thyroidal wild-type, Igf1r(+/-), and Igf1r(-/-) genotypes. Targeted Igf1r inactivation transiently reduced thyroid hormone levels and significantly increased TSH levels in both heterozygous and homozygous mice without affecting thyroid weight. Histological analysis of thyroid tissue with Igf1r inactivation revealed hyperplasia and heterogeneous follicle structure. From 4 months of age, we detected papillary thyroid architecture in heterozygous and homozygous mice. We also noted increased body weight of male mice with a homozygous thyroidal null mutation in the Igf1r locus, compared with wild-type mice. We detected decreased mRNA and protein for thyroid peroxidase and increased mRNA and protein for the IGF-II receptor, but no significant mRNA changes for the insulin receptor, the TSH receptor, or the sodium-iodide symporter, in both Igf1r(+/-) and Igf1r(-/-) mice. Our results suggest that the strong increase of TSH favors papillary thyroid hyperplasia and completely compensates for the loss of IGF-I receptor signaling at the level of thyroid hormones without a significant increase in thyroid weight. This could indicate that IGF-I receptor signaling is less essential for thyroid hormone synthesis but maintains homeostasis and normal thyroid morphogenesis.