Similar literature
 Found 20 similar articles; search took 31 ms
1.
Cultural products such as song lyrics, television shows, and books reveal cultural differences, including cultural change over time. Two studies examine changes in the use of individualistic words (Study 1) and phrases (Study 2) in the Google Books Ngram corpus of millions of books in American English. Current samples from the general population generated and rated lists of individualistic words and phrases (e.g., "unique," "personalize," "self," "all about me," "I am special," "I'm the best"). Individualistic words and phrases increased in use between 1960 and 2008, even when controlling for changes in communal words and phrases. Language in American books has become increasingly focused on the self and uniqueness in the decades since 1960.

2.
The power of language to modify the reader’s perception when interpreting biomedical results should not be underestimated. Misreporting and misinterpretation are pressing problems in the output of randomized controlled trials (RCTs). This may be partially related to the statistical significance paradigm used in clinical trials, which is centered on a P value cutoff of 0.05. Strict use of this cutoff may lead clinical researchers to describe results with P values approaching but not reaching the threshold as “almost significant.” The question is how phrases expressing nonsignificant results have been reported in RCTs over the past 30 years. To this end, we conducted a quantitative analysis of the English full texts of 567,758 RCTs recorded in PubMed between 1990 and 2020 (81.5% of all published RCTs in PubMed). We determined the exact presence of 505 predefined phrases denoting results that approach but do not cross the line of formal statistical significance (P < 0.05). We modeled temporal trends in phrase data with Bayesian linear regression. Evidence for temporal change was obtained through Bayes factor (BF) analysis. In a randomly sampled subset, the associated P values were manually extracted. We identified 61,741 phrases in 49,134 RCTs indicating almost significant results (8.65%; 95% confidence interval (CI): 8.58% to 8.73%). The overall prevalence of these phrases remained stable over time, with the most prevalent phrases being “marginally significant” (in 7,735 RCTs), “all but significant” (7,015), “a nonsignificant trend” (3,442), “failed to reach statistical significance” (2,578), and “a strong trend” (1,700).
The strongest evidence for an increased temporal prevalence was found for “a numerical trend,” “a positive trend,” “an increasing trend,” and “nominally significant.” In contrast, the phrases “all but significant,” “approaches statistical significance,” “did not quite reach statistical significance,” “difference was apparent,” “failed to reach statistical significance,” and “not quite significant” decreased over time. In a randomly sampled subset of 29,000 phrases, 11,926 corresponding P values were manually identified, of which 68.1% ranged between 0.05 and 0.15 (CI: 67. to 69.0; median 0.06). Our results show that RCT reports regularly contain specific phrases describing marginally nonsignificant results to report P values close to but above the dominant 0.05 cutoff. The fact that the prevalence of the phrases remained stable over time indicates that this practice of broadly interpreting P values close to a predefined threshold remains prevalent. To enhance responsible and transparent interpretation of RCT results, researchers, clinicians, reviewers, and editors may reduce the focus on formal statistical significance thresholds, stimulate reporting of P values with corresponding effect sizes and CIs, and focus on the clinical relevance of the statistical differences found in RCTs.

An analysis of more than half a million randomized controlled trials reveals that researchers are using appealing phrases to describe non-significant findings as if they were below the p=0.05 significance threshold.
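As a quick check on the prevalence figure in the abstract above, the following sketch recomputes the proportion and a 95% interval from the reported counts (49,134 of 567,758 RCTs). The abstract does not say which interval method was used; the normal-approximation (Wald) interval here is an assumption, and `wald_interval` is a hypothetical helper name.

```python
import math


def wald_interval(successes: int, total: int, z: float = 1.96):
    """Return (proportion, lower, upper) using the normal approximation.

    Assumption: a Wald interval with z = 1.96; the source abstract does
    not state how its confidence interval was computed.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


# Counts taken from the abstract: RCTs containing at least one phrase / all RCTs.
p, low, high = wald_interval(49_134, 567_758)
print(f"{100 * p:.2f}% (95% CI: {100 * low:.2f}% to {100 * high:.2f}%)")
# prints: 8.65% (95% CI: 8.58% to 8.73%)
```

With these counts the simple interval agrees with the reported 8.58% to 8.73% to two decimals, which is what one would expect for a sample this large.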

3.
We report here trends in the usage of “mood” words, that is, words carrying emotional content, in 20th century English language books, using the data set provided by Google that includes word frequencies in roughly 4% of all books published up to the year 2008. We find evidence for distinct historical periods of positive and negative moods, underlain by a general decrease in the use of emotion-related words through time. Finally, we show that, in books, American English has become decidedly more “emotional” than British English in the last half-century, as a part of a more general increase of the stylistic divergence between the two variants of English language.

4.
Fashions and fads are important phenomena that influence many individual choices. They are ubiquitous in human societies, and have recently been used as a source of data to test models of cultural dynamics. Although a few statistical regularities have been observed in fashion cycles, their empirical characterization is still incomplete. Here we consider the impact of mass media on popular culture, showing that the release of movies featuring dogs is often associated with an increase in the popularity of featured breeds, for up to 10 years after movie release. We also find that a movie's impact on breed popularity correlates with the estimated number of viewers during the movie's opening weekend, a proxy of the movie's reach among the general public. Movies' influence on breed popularity was strongest in the early 20th century, and has declined since. We reach these conclusions through a new, widely applicable method to measure the cultural impact of events, capable of disentangling the event's effect from ongoing cultural trends.

5.
Social networking services (e.g., Twitter, Facebook) are now major sources of World Wide Web (called “Web”) dynamics, together with Web search services (e.g., Google). These two types of Web services mutually influence each other but generate different dynamics. In this paper, we distinguish two modes of Web dynamics: the reactive mode and the default mode. It is assumed that Twitter messages (called “tweets”) and Google search queries react to significant social movements and events, but they also demonstrate signs of becoming self-activated, thereby forming a baseline Web activity. We define the former as the reactive mode and the latter as the default mode of the Web. We investigate these reactive and default modes of the Web's dynamics using transfer entropy (TE). The amount of information transferred between a time series of 1,000 frequent keywords in Twitter and the same keywords in Google queries is investigated across an 11-month time period. Study of the information flow on Google and Twitter revealed that information is generally transferred from Twitter to Google, indicating that Twitter time series have some preceding information about Google time series. We also studied the information flow among different Twitter keyword time series by taking keywords as nodes and flow directions as edges of a network. An analysis of this network revealed that frequent keywords tend to become information sources and infrequent keywords tend to become sinks for other keywords. Based on these findings, we hypothesize that frequent keywords form the Web's default mode, which becomes an information source for infrequent keywords that generally form the Web's reactive mode. We also found that the Web consists of different time resolutions with respect to TE among Twitter keywords, which will be another focal point of this paper.
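To make the notion of transfer entropy in the abstract above concrete, here is a minimal sketch of a plug-in estimator for discrete time series with a history length of one step. The estimator, the binary toy data, and the function name `transfer_entropy` are illustrative assumptions; the abstract does not describe how the authors discretized or estimated TE.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(y_next, y_prev, x_prev)
             * log2( p(y_next | y_prev, x_prev) / p(y_next | y_prev) )
    """
    n = len(target) - 1
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    prev_pairs = Counter(zip(target[:-1], source[:-1]))   # (y_prev, x_prev)
    self_pairs = Counter(zip(target[1:], target[:-1]))    # (y_next, y_prev)
    prev_single = Counter(target[:-1])                    # y_prev
    te = 0.0
    for (y_next, y_prev, x_prev), count in triples.items():
        p_joint = count / n
        p_given_both = count / prev_pairs[(y_prev, x_prev)]
        p_given_self = self_pairs[(y_next, y_prev)] / prev_single[y_prev]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te


# Toy data (synthetic, not from the paper): y simply copies x with a one-step lag,
# so information should flow from x to y and not the other way round.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]

te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
print(f"TE(x -> y) = {te_x_to_y:.3f} bits, TE(y -> x) = {te_y_to_x:.3f} bits")
```

In this toy setup the asymmetry between the two directions is what identifies `x` as the source, which mirrors the kind of directional reading (Twitter preceding Google) reported in the abstract.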

6.
The Pirahã language has been at the center of recent debates in linguistics, in large part because it is claimed not to exhibit recursion, a purported universal of human language. Here, we present an analysis of a novel corpus of natural Pirahã speech that was originally collected by Dan Everett and Steve Sheldon. We make the corpus freely available for further research. In the corpus, Pirahã sentences have been shallowly parsed and given morpheme-aligned English translations. We use the corpus to investigate the formal complexity of Pirahã syntax by searching for evidence of syntactic embedding. In particular, we search for sentences which could be analyzed as containing center-embedding, sentential complements, adverbials, complementizers, embedded possessors, conjunction or disjunction. We do not find unambiguous evidence for recursive embedding of sentences or noun phrases in the corpus. We find that the corpus is plausibly consistent with an analysis of Pirahã as a regular language, although this is not the only plausible analysis.

7.
Five follicular ovarian implantations occurred among 200 ectopic pregnancies encountered during a 14-year period. Abortions from impregnated follicles may cause hemoperitoneum more often than is generally suspected. Wedge resection or cystectomy to ensure hemostasis provides tissue for histological examination, without which ruptured ovarian pregnancy may masquerade as rupture of a corpus luteum with hemorrhage (“ovarian apoplexy”). Including patients reported here, IUCD users have within the past five years accounted for about 10% of all ovarian pregnancies recorded in English.

8.
Cellular signaling is key for organisms to survive immediate stresses from fluctuating environments as well as to relay important information about external stimuli. Effective mechanisms have evolved to ensure appropriate responses for an optimal adaptation process. For them to be functional despite the noise that occurs in biochemical transmission, the cell needs to be able to infer reliably what was sensed in the first place. For example, Saccharomyces cerevisiae is able to adjust its response to osmotic shock depending on the severity of the shock and to initiate responses that lead to near-perfect adaptation of the cell. We investigate the Sln1–Ypd1–Ssk1 phosphorelay as a module in the high-osmolarity glycerol pathway by incorporating a stochastic model. Within this framework, we can imitate the noisy perception of the cell and interpret the phosphorelay as an information-transmitting channel in the sense of C.E. Shannon’s “Information Theory”. We refer to the channel capacity as a measure to quantify and investigate the transmission properties of this system, enabling us to draw conclusions on viable parameter sets for modeling the system.
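For readers unfamiliar with channel capacity, the sketch below computes it for the simplest textbook case, a binary symmetric channel, where capacity is 1 minus the binary entropy of the flip probability. This is only a toy stand-in chosen for illustration; the phosphorelay model in the abstract above is a stochastic biochemical model, and nothing here reflects its actual structure or parameters. The function names are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(flip_prob: float) -> float:
    """Capacity in bits per use of a binary symmetric channel.

    C = 1 - H(flip_prob): a noiseless channel carries 1 bit,
    a channel that flips half of its inputs carries nothing.
    """
    return 1.0 - binary_entropy(flip_prob)


for noise in (0.0, 0.05, 0.25, 0.5):
    print(f"flip probability {noise:.2f} -> capacity {bsc_capacity(noise):.3f} bits")
```

The same idea carries over to noisier, non-binary channels: the more the output is scrambled by noise, the fewer distinguishable input levels the receiver can reliably infer.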

9.
Gaps in data collection systems, as well as challenges associated with gathering data from rare and dispersed populations, render current health surveillance systems inadequate to identify and monitor efforts to reduce health disparities. Using sexual and gender minorities, we investigated the utility of using a large nonprobability online panel to conduct rapid population assessments of such populations using brief surveys. Surveys of the Google Android Panel (four assessing sexual orientation and one assessing gender identity and sex assigned at birth) were conducted, resulting in invitation of 53,739 application users (37,505 of whom viewed the invitation) to generate a total of 34,759 who completed screening questions indicating their sexual orientation, or gender identity and sex at birth. Where possible we make comparisons to similar data from two population-based surveys (NHIS and NESARC). We found that 99.4% to 100.0% of respondents across our Google Android panel samples completed the screening questions and 97.8% to 99.2% of those who consented to participate in our surveys indicated they were “OK” with the content of surveys that assessed sexual orientation and sex/gender. In our Google Android panel samples there was a higher percentage of sexual minority respondents than in either NHIS or NESARC, with 7.4% of men and 12.4% of women reporting gay, lesbian or bisexual identities. The proportion of sexual minority respondents was 2.8 to 5.6 times higher in the Google Android panel samples than was found in the 2012 NHIS sample, for men and women, respectively. The percentage of “transgender” identified individuals in the Google sample was 0.7%, which is similar to the 0.5% transgender identified through the Massachusetts BRFSS, and using a transgender status item we found that 2.0% of the overall sample could be classified as transgender. The Google samples sometimes more closely approximated national averages for ethnicity and race than NHIS.

10.
Culture is a phenomenon shared by all humans. Attempts to understand how dynamic factors affect the origin and distribution of cultural elements are, therefore, of interest to all humanity. As case studies go, understanding the distribution of cultural elements in Native American communities during the historical period of the Great Plains would seem a most challenging one. Famously, there is a mixture of powerful internal and external factors, creating, for a relatively brief period in time, a seemingly distinctive set of shared elements from a linguistically diverse set of peoples. This is known across the world as the “Great Plains culture.” Here, quantitative analyses show how different processes operated on two sets of cultural traits among nine High Plains groups. Moccasin decorations exhibit a pattern consistent with geographically mediated between-group interaction. However, group variations in the religious ceremony of the Sun Dance also reveal evidence of purifying cultural selection associated with historical biases, dividing down ancient linguistic lines. The latter shows that while the conglomeration of “Plains culture” may have been a product of merging new ideas with old, combined with cultural interchange between groups, the details of what was accepted, rejected or elaborated in each case reflected preexisting ideological biases. Although culture may sometimes be a “melting pot,” the analyses show that even in highly fluid situations, cultural mosaics may be indirectly shaped by historical factors that are not always obvious.

11.
Use of socially generated “big data” to access information about collective states of mind in human societies has become a new paradigm in the emerging field of computational social science. A natural application of this would be the prediction of society's reaction to a new product in terms of popularity and adoption rate. However, bridging the gap between “real time monitoring” and “early predicting” remains a big challenge. Here we report on an endeavor to build a minimalistic predictive model for the financial success of movies based on collective activity data of online users. We show that the popularity of a movie can be predicted well before its release by measuring and analyzing the activity level of editors and viewers of the movie's corresponding entry in Wikipedia, the well-known online encyclopedia.

12.
During sentence production, linguistic information (semantics, syntax, phonology) of words is retrieved and assembled into a meaningful utterance. There is still debate on how we assemble single words into more complex syntactic structures such as noun phrases or sentences. In the present study, event-related potentials (ERPs) were used to investigate the time course of syntactic planning. Thirty-three volunteers described visually animated scenes using naming formats varying in syntactic complexity: from simple words (‘W’, e.g., “triangle”, “red”, “square”, “green”, “to fly towards”), to noun phrases (‘NP’, e.g., “the red triangle”, “the green square”, “to fly towards”), to a sentence (‘S’, e.g., “The red triangle flies towards the green square.”). Behaviourally, we observed an increase in errors and corrections with increasing syntactic complexity, indicating a successful experimental manipulation. In the ERPs following scene onset, syntactic complexity variations were found in a P300-like component (‘S’/‘NP’>‘W’) and a fronto-central negativity (linear increase with syntactic complexity). In addition, the scene could display two actions, which were unpredictable for the participant because the disambiguation occurred only later in the animation. Time-locked to the moment of visual disambiguation of the action and thus the verb, we observed another P300 component (‘S’>‘NP’/‘W’). The data show for the first time evidence of sensitivity to syntactic planning within the P300 time window, time-locked to visual events critical for syntactic planning. We discuss the findings in the light of current syntactic planning views.

13.
Continuous-time Markov models have been considered the best representation for the stochastic dynamics of ion channels for more than thirty years. For most single-channel data sets, several open and closed states are required for accurately representing the dynamics. However, each data point only shows whether the channel is open or closed, not which state it is in. Consequently, some model structures are inherently overparameterized and therefore, in principle, unsuitable for representing any data; those models are called “nonidentifiable”. As of this writing, it seems to be poorly understood which continuous-time Markov models are identifiable and which are not, so the unwitting use of a nonidentifiable model is a considerable concern. To address this problem, an improved variant of a recently published Markov-chain Monte Carlo method is presented. The algorithm is tested using test data as well as experimental data. We demonstrate that, in contrast to a widely used maximum-likelihood estimator, it gives clear warning signs when a nonidentifiable model is used for fitting. Furthermore, for test data that was generated from a nonidentifiable model, the Markov-chain Monte Carlo results recover much more information from the data than maximum-likelihood estimation.

14.
Creationism is a religiously motivated worldview in denial of biological evolution that has been very resistant to change. We performed a textual analysis by examining creationist and pro-evolutionary texts for aspects of “experiential thinking”, a cognitive process different from scientific thought. We observed characteristics of experiential thinking as follows: testimonials (present in 100% of sampled creationist texts), such as quotations, were a major form of proof. Confirmation bias (100% of sampled texts) was represented by ignoring or dismissing information that would contradict the creationist hypothesis. Scientifically irrelevant or flawed information was re-interpreted as relevant for the falsification of evolution (75–90% of sampled texts). Evolutionary theory was associated with moral issues by demonizing scientists and linking evolutionary theory to atrocities (63–93% of sampled texts). Pro-evolutionary rebuttals of creationist claims also contained testimonials (93% of sampled texts) and referred to moral implications (80% of sampled texts) but displayed lower prevalences of stereotypical thinking (47% of sampled texts), confirmation bias (27% of sampled texts) and pseudodiagnostics (7% of sampled texts). The aspects of experiential thinking could also be interpreted as argumentative fallacies. Testimonials lead, for instance, to ad hominem arguments and appeals to authority. Confirmation bias and simplification of data give rise to hasty generalizations and false dilemmas. Moral issues lead to guilt by association and appeals to consequences. Experiential thinking and fallacies can contribute to false beliefs and the persistence of the claims. We propose that science educators would benefit from the systematic analysis of experiential thinking patterns and fallacies in creationist texts and pro-evolutionary rebuttals in order to concentrate on scientific misconceptions instead of the scientifically irrelevant aspects of the creationist-evolutionist debate.

15.
The study of genetic information can help reconstruct the history of human populations. We sequenced the entire mtDNA control region (positions 16024 to 576 following the Cambridge Reference Sequence, CRS) of 605 individuals from seven Mesoamerican indigenous groups and one Aridoamerican group from the previously defined Greater Southwest, all of them in present-day Mexico. Samples were collected directly from the indigenous populations; an individual survey made it possible to remove samples from related individuals or from individuals of other origins. Diversity indices and demographic estimates were calculated. AMOVAs were also calculated according to different criteria. An MDS plot, based on FST distances, was also built. We constructed individual networks for the four Amerindian haplogroups detected. Finally, Barrier software was applied to detect genetic boundaries among populations. The results suggest a common origin of the indigenous groups, a small degree of European admixture, and inter-ethnic gene flow. The process of Mesoamerica’s human settlement took place quickly, influenced by the region’s orography, which facilitated the development of genetic and cultural differences. We find that the existence of genetic structure is related to the region’s geography rather than to cultural parameters such as language. The human populations gradually became fragmented, though they remained relatively isolated, and differentiated due to small population sizes and different survival strategies. Genetic differences were detected between Aridoamerica and Mesoamerica, which can be subdivided into “East”, “Center”, “West” and “Southeast”. The fragmentation process occurred mainly during the Mesoamerican Pre-Classic period, with the Otomí being one of the oldest groups. Increasing the number of populations studied by adding previously published data does not change the conclusions, although significant genetic heterogeneity can be detected in the Pima and Huichol groups.
This result may be explained by the fact that populations historically assigned to the same group were, in fact, different indigenous populations.

16.
Functional magnetic resonance data acquired in a task-absent condition (“resting state”) require new data analysis techniques that do not depend on an activation model. In this work, we introduce an alternative assumption- and parameter-free method based on a particular form of node centrality called eigenvector centrality. Eigenvector centrality attributes a value to each voxel in the brain such that a voxel receives a large value if it is strongly correlated with many other nodes that are themselves central within the network. Google's PageRank algorithm is a variant of eigenvector centrality. Thus far, other centrality measures, in particular “betweenness centrality”, have been applied to fMRI data using a pre-selected set of nodes consisting of several hundred elements. Eigenvector centrality is computationally much more efficient than betweenness centrality and does not require thresholding of similarity values, so it can be applied to thousands of voxels in a region of interest covering the entire cerebrum, which would have been infeasible using betweenness centrality. Eigenvector centrality can be used on a variety of different similarity metrics. Here, we present applications based on linear correlations and on spectral coherences between fMRI time series. This latter approach allows us to draw conclusions about connectivity patterns in different spectral bands. We apply this method to fMRI data in task-absent conditions where subjects were in states of hunger or satiety. We show that eigenvector centrality is modulated by the state that the subjects were in. Our analyses demonstrate that eigenvector centrality is a computationally efficient tool for capturing intrinsic neural architecture on a voxel-wise level.
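The definition of eigenvector centrality given above (a node scores highly if it is strongly tied to other high-scoring nodes) can be sketched with plain power iteration on a similarity matrix. This is a minimal illustration, assuming a small symmetric matrix with non-negative entries; the 4-node matrix is made up for the example, and `eigenvector_centrality` is a hypothetical helper, not the authors' implementation.

```python
import math


def eigenvector_centrality(similarity, iterations=500, tol=1e-12):
    """Leading eigenvector of a non-negative similarity matrix via power iteration.

    Each pass replaces a node's score with the similarity-weighted sum of
    all nodes' scores, then rescales the vector to unit length.
    """
    n = len(similarity)
    scores = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        updated = [
            sum(similarity[i][j] * scores[j] for j in range(n)) for i in range(n)
        ]
        norm = math.sqrt(sum(value * value for value in updated))
        updated = [value / norm for value in updated]
        change = max(abs(a - b) for a, b in zip(updated, scores))
        scores = updated
        if change < tol:
            break
    return scores


# Illustrative similarities (not real fMRI data): nodes 0-2 are strongly
# inter-correlated, node 3 is only weakly tied to the rest.
similarity = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.1],
    [0.8, 0.7, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
centrality = eigenvector_centrality(similarity)
for node, score in enumerate(centrality):
    print(f"node {node}: {score:.3f}")
```

Each iteration is a single matrix-vector product, which is why this kind of measure scales to many more nodes than path-based measures such as betweenness centrality.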

17.
High Throughput Biological Data (HTBD) requires detailed analysis methods and, from a life science perspective, these analysis results make most sense when interpreted within the context of biological pathways. Bayesian Networks (BNs) capture both linear and nonlinear interactions and handle stochastic events in a probabilistic framework accounting for noise, making them viable candidates for HTBD analysis. We have recently proposed an approach, called Bayesian Pathway Analysis (BPA), for analyzing HTBD using BNs in which known biological pathways are modeled as BNs and pathways that best explain the given HTBD are found. BPA uses the fold change information to obtain an input matrix to score each pathway modeled as a BN. Scoring is achieved using the Bayesian-Dirichlet Equivalent method and significance is assessed by randomization via bootstrapping of the columns of the input matrix. In this study, we improve on the BPA system by optimizing the steps involved in “Data Preprocessing and Discretization”, “Scoring”, “Significance Assessment”, and “Software and Web Application”. We tested the improved system on synthetic data sets and achieved over 98% accuracy in identifying the active pathways. The overall approach was applied on real cancer microarray data sets in order to investigate the pathways that are commonly active in different cancer types. We compared our findings on the real data sets with a relevant approach called the Signaling Pathway Impact Analysis (SPIA).

18.
We view web forums as virtual living organisms feeding on users' clicks and investigate how they grow at the expense of clickstreams. We find that (the number of page views in a given time period) and (the number of unique visitors in the time period) of the studied forums satisfy the law of allometric growth, i.e., . We construct clickstream networks and explain the observed temporal dynamics of networks by the interactions between nodes. We describe the transportation of clickstreams using the function , in which is the total amount of clickstreams passing through node and is the amount of the clickstreams dissipated from to the environment. It turns out that , an indicator for the efficiency of network dissipation, not only negatively correlates with , but also sets the bounds for . In particular, when and when . Our findings have practical consequences. For example, can be used as a measure of the “stickiness” of forums, which quantifies the stable ability of forums to keep users “locked in”. Meanwhile, the correlation between and provides a method to predict the long-term “stickiness” of forums from the clickstream data in a short time period. Finally, we discuss a random walk model that replicates both the allometric growth and the dissipation function .
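Allometric growth is a power-law relation between two quantities, so its exponent can be estimated as the slope of a straight line fitted in log-log space. The sketch below shows that fit for page views against unique visitors. All variable names (`unique_visitors`, `page_views`, `exponent`, `prefactor`) are placeholders chosen for illustration, and the data are synthetic values generated from a known power law, not measurements from the studied forums.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of y = prefactor * x**exponent in log-log space.

    Returns (exponent, prefactor). Assumes all values are strictly positive.
    """
    log_x = [math.log(value) for value in x]
    log_y = [math.log(value) for value in y]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    variance = sum((a - mean_x) ** 2 for a in log_x)
    exponent = covariance / variance
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor


# Synthetic hourly counts following page_views = 2 * unique_visitors**1.2 exactly.
unique_visitors = [10, 20, 50, 100, 200, 500, 1000]
page_views = [2.0 * visitors ** 1.2 for visitors in unique_visitors]

exponent, prefactor = fit_power_law(unique_visitors, page_views)
print(f"estimated exponent = {exponent:.3f}, prefactor = {prefactor:.3f}")
```

An exponent above 1 in such a fit would mean page views grow faster than the number of visitors, which is one way to read the "stickiness" idea discussed in the abstract.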

19.
Citizen science is a research practice that relies on public contributions of data. The strong recognition of its educational value combined with the need for novel methods to handle subsequent large and complex data sets raises the question: Is citizen science effective at science? A quantitative assessment of the contributions of citizen science for its core purpose – scientific research – is lacking. We examined the contribution of citizen science to a review paper by ornithologists in which they formulated ten central claims about the impact of climate change on avian migration. Citizen science was never explicitly mentioned in the review article. For each of the claims, these ornithologists scored their opinions about the amount of research effort invested in each claim and how strongly the claim was supported by evidence. This allowed us to also determine whether their trust in claims was, unwittingly or not, related to the degree to which the claims relied primarily on data generated by citizen scientists. We found that papers based on citizen science constituted between 24 and 77% of the references backing each claim, with no evidence of a mistrust of claims that relied heavily on citizen-science data. We reveal that many of these papers may not easily be recognized as drawing upon volunteer contributions, as the search terms “citizen science” and “volunteer” would have overlooked the majority of the studies that back the ten claims about birds and climate change. Our results suggest that the significance of citizen science to global research, an endeavor that is reliant on long-term information at large spatial scales, might be far greater than is readily perceived. To better understand and track the contributions of citizen science in the future, we urge researchers to use the keyword “citizen science” in papers that draw on efforts of non-professionals.

20.
With the rapid development of the Internet and the WWW, “information overload” has become an overwhelming problem, and the collective attention of users plays an increasingly important role. Knowing how collective attention is distributed and flows among different websites is therefore the first step toward understanding the underlying dynamics of attention on the WWW. In this paper, we propose a method to embed a large number of websites into a high-dimensional Euclidean space according to the novel concept of flow distance, which considers both the connection topology between sites and the collective click behavior of users. With this geometric representation, we visualize the attention flow in the Indiana University clickstream data set over one day. It turns out that all the websites can be embedded into a 20-dimensional ball, in which sites that are close to each other are always visited by users sequentially. The distributions of websites, attention flows, and dissipations can be divided into three spherical crowns (core, interim, and periphery). The popular 20% of sites (Google.com, Myspace.com, Facebook.com, etc.), which attract 75% of attention flows with only 55% of dissipations (users logging off), are located in the central layer, within radius 4.1. Another 60% of sites, which attract only about 22% of traffic with almost 38% of dissipations, are located in the middle area, with radius between 4.1 and 6.3. The other 20% of sites are far from the central area. All the cumulative distributions of the variables can be well fitted by “S”-shaped curves, and the patterns are stable across different periods. Thus, the overall distribution and dynamics of collective attention on websites can be well captured by this geometric representation.
