Similar literature
 20 similar articles found
1.
Mobile phone location data is a newly emerging data source with great potential to support human mobility research. However, recent studies have indicated that many users can be easily re-identified based on their unique activity patterns. Privacy protection procedures usually alter the original data and cause a loss of data utility for analysis purposes. Therefore, meeting the need for detailed data for activity analysis while avoiding potential privacy risks presents a challenge. The aim of this study is to reveal the re-identification risks for a Chinese city’s mobile users and to examine the quantitative relationship between re-identification risk and data utility for an aggregated mobility analysis. The first step is to apply two reported attack models, the top N locations and the spatio-temporal points, to evaluate the re-identification risks in Shenzhen City, a metropolis in China. A spatial generalization approach to protecting privacy is then proposed and implemented, and spatially aggregated analysis is used to assess the loss of data utility after privacy protection. The results demonstrate that the re-identification risks in Shenzhen City are clearly different from those reported for regions in Western countries, which indicates the spatial heterogeneity of re-identification risks in mobile phone location data. A uniform mathematical relationship has also been found between re-identification risk ($x$) and data utility ($y$) for both attack models: $y = -ax^{b} + c$ ($a, b, c > 0$; $0 < x < 1$), where the exponent $b$ increases with the background knowledge of the attackers. The discovered mathematical relationship provides data publishers with useful guidance on choosing the right tradeoff between privacy and utility. Overall, this study contributes to a better understanding of re-identification risks and a privacy-utility tradeoff benchmark for improving privacy protection when sharing detailed trajectory data.
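As a rough illustration of the "top N locations" attack model and of spatial generalization mentioned in the abstract above, the sketch below measures the fraction of users whose N most visited locations are unique in a dataset. It is a minimal sketch under stated assumptions, not the study's implementation: the function names (`top_n_locations`, `reidentification_risk`, `generalize`), the `(user, location)` record layout, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict

def top_n_locations(visits, n):
    """Map each user to a tuple of their n most frequently visited locations."""
    counts = defaultdict(Counter)
    for user, loc in visits:
        counts[user][loc] += 1
    result = {}
    for user, c in counts.items():
        # Rank by descending visit count; break ties by location id for determinism.
        ranked = sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))
        result[user] = tuple(loc for loc, _ in ranked[:n])
    return result

def reidentification_risk(visits, n):
    """Fraction of users whose top-n location tuple is shared with no other user."""
    tops = top_n_locations(visits, n)
    if not tops:
        return 0.0
    freq = Counter(tops.values())
    unique = sum(1 for t in tops.values() if freq[t] == 1)
    return unique / len(tops)

def generalize(visits, mapping):
    """Spatial generalization: replace fine-grained locations with coarser zones."""
    return [(user, mapping.get(loc, loc)) for user, loc in visits]

# Hypothetical toy data: (user, cell-tower id) records.
visits = [
    ("u1", "A"), ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "A"), ("u2", "C"),
    ("u3", "D"), ("u3", "D"), ("u3", "B"),
]
risk_n1 = reidentification_risk(visits, 1)
risk_n2 = reidentification_risk(visits, 2)
# Merging cells B and C into one zone lowers the risk for n = 2.
risk_n2_coarse = reidentification_risk(generalize(visits, {"B": "BC", "C": "BC"}), 2)
```

In this toy example, more attacker knowledge (larger N) raises the share of unique users, and coarser locations lower it again at the cost of spatial detail, which is the tradeoff the abstract quantifies.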

2.
El Emam K, Jonker E, Arbuckle L, Malin B. PLoS ONE 2011, 6(12): e28071
Background: Privacy legislation in most jurisdictions allows the disclosure of health data for secondary purposes without patient consent if it is de-identified. Some recent articles in the medical, legal, and computer science literature have argued that de-identification methods do not provide sufficient protection because they are easy to reverse. Should this be the case, it would have significant implications for how health information is disclosed, including: (a) potentially limiting its availability for secondary purposes such as research, and (b) resulting in more identifiable health information being disclosed. Our objectives in this systematic review were to: (a) characterize known re-identification attacks on health data and contrast them with re-identification attacks on other kinds of data, (b) compute the overall proportion of records that have been correctly re-identified in these attacks, and (c) assess whether these demonstrate weaknesses in current de-identification methods. Conclusions: The current evidence shows a high re-identification rate but is dominated by small-scale studies on data that was not de-identified according to existing standards. This evidence is insufficient to draw conclusions about the efficacy of de-identification methods.

3.
The collection and sharing of person-specific biospecimens has raised significant questions regarding privacy. In particular, the question of identifiability, or the degree to which materials stored in biobanks can be linked to the names of the individuals from whom they were derived, is under scrutiny. The goal of this paper is to review the extent to which biospecimens and affiliated data can be designated as identifiable. To achieve this goal, we summarize recent research in identifiability assessment for DNA sequence data, as well as associated demographic and clinical data, shared via biobanks. We demonstrate the variability of the degree of risk, the factors that contribute to this variation, and potential ways to mitigate and manage such risk. Finally, we discuss the policy implications of these findings, particularly as they pertain to biobank security and access policies. We situate our review in the context of real data sharing scenarios and biorepositories.

4.
Maintaining privacy in network data publishing is a major challenge. This is because known characteristics of individuals can be used to extract new information about them. Recently, researchers have developed privacy methods based on k-anonymity and l-diversity to prevent re-identification or sensitive label disclosure through certain structural information. However, most of these studies have considered only structural information and have been developed for undirected networks. Furthermore, most existing approaches rely on generalization and node clustering and so may entail significant information loss, as all properties of all members of each group are generalized to the same value. In this paper, we introduce a framework for protecting sensitive attributes, degree (the number of connected entities), and relationships, as well as the presence of individuals, in directed social network data whose nodes contain attributes. First, we define a privacy model that specifies privacy requirements for the above private information. Then, we introduce the technique of Ambiguity in Social Network data (ASN) based on anatomy, which specifies how to publish social network data. To employ ASN, individuals are partitioned into groups. Then, ASN publishes exact values of properties of individuals of each group with a common group ID in several tables. The lossy join of those tables based on group ID injects uncertainty into any reconstruction of the original network. We also show how to measure different privacy requirements in ASN. Simulation results on real and synthetic datasets demonstrate that our framework, which protects from four types of private information disclosure, preserves data utility in tabular, topological and spectrum aspects of networks at a satisfactory level.
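The group-and-lossy-join idea described in the abstract above can be sketched on a single sensitive attribute. This is only a simplified anatomy-style illustration, not ASN itself (which also covers degree, relationships, and presence); the function names `anatomize` and `lossy_join`, the record layout, and the random grouping are assumptions made for the example.

```python
import random

def anatomize(records, group_size, seed=0):
    """Split (person_id, sensitive_value) records into two tables linked by group ID.

    Exact values are published, but the row-level link between a person and a
    value is replaced by a shared group ID.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    id_table, value_table = [], []
    for start in range(0, len(shuffled), group_size):
        gid = start // group_size
        group = shuffled[start:start + group_size]
        values = [value for _, value in group]
        rng.shuffle(values)  # break row alignment inside the group
        id_table.extend((pid, gid) for pid, _ in group)
        value_table.extend((gid, value) for value in values)
    return id_table, value_table

def lossy_join(id_table, value_table):
    """Join on group ID: each person maps to every value published for the group."""
    by_group = {}
    for gid, value in value_table:
        by_group.setdefault(gid, []).append(value)
    return {pid: sorted(by_group[gid]) for pid, gid in id_table}

# Hypothetical records: four people with distinct sensitive values.
records = [("p1", "flu"), ("p2", "asthma"), ("p3", "diabetes"), ("p4", "hiv")]
id_table, value_table = anatomize(records, group_size=2)
candidates = lossy_join(id_table, value_table)
```

With groups of size two, the join leaves two candidate values per person, so an observer cannot tell which one is correct. If the number of records is not a multiple of `group_size`, the last group in this sketch is smaller and offers weaker protection.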

5.
Fulfilling the promise of the genetic revolution requires the analysis of large datasets containing information from thousands to millions of participants. However, sharing human genomic data requires protecting subjects from potential harm. Current models rely on de-identification techniques in which privacy versus data utility becomes a zero-sum game. Instead, we propose the use of trust-enabling techniques to create a solution in which researchers and participants both win. To do so we introduce three principles that facilitate trust in genetic research and outline one possible framework built upon those principles. Our hope is that such trust-centric frameworks provide a sustainable solution that reconciles genetic privacy with data sharing and facilitates genetic research.

6.
The collapse of confidence in anonymization (sometimes also known as de-identification) as a robust approach for preserving the privacy of personal data has incited an outpouring of new approaches that aim to fill the resulting trifecta of technical, organizational, and regulatory privacy gaps left in its wake. In the latter category, and in large part due to the growth of Big Data–driven biomedical research, falls a growing chorus of calls for criminal and penal offences to sanction wrongful re-identification of “anonymized” data. This chorus cuts across the fault lines of polarized privacy law scholarship that at times seems to advocate privacy protection at the expense of Big Data research or vice versa. Focusing on Big Data in the context of biomedicine, this article surveys the approaches that criminal or penal law might take toward wrongful re-identification of health data. It contextualizes the strategies within their respective legal regimes as well as in relation to emerging privacy debates focusing on personal data use and data linkage and assesses the relative merit of criminalization. We conclude that this approach suffers from several flaws and that alternative social and legal strategies to deter wrongful re-identification may be preferable.

7.
The human genetics community needs robust protocols that enable secure sharing of genomic data from participants in genetic research. Beacons are web servers that answer allele-presence queries, such as “Do you have a genome that has a specific nucleotide (e.g., A) at a specific genomic position (e.g., position 11,272 on chromosome 1)?”, with either “yes” or “no.” Here, we show that individuals in a beacon are susceptible to re-identification even if the only data shared include presence or absence information about alleles in a beacon. Specifically, we propose a likelihood-ratio test of whether a given individual is present in a given genetic beacon. Our test is not dependent on allele frequencies and is the most powerful test for a specified false-positive rate. Through simulations, we showed that in a beacon with 1,000 individuals, re-identification is possible with just 5,000 queries. Relatives can also be identified in the beacon. Re-identification is possible even in the presence of sequencing errors and variant-calling differences. In a beacon constructed with 65 European individuals from the 1000 Genomes Project, we demonstrated that it is possible to detect membership in the beacon with just 250 SNPs. With just 1,000 SNP queries, we were able to detect the presence of an individual genome from the Personal Genome Project in an existing beacon. Our results show that beacons can disclose membership and implied phenotypic information about participants and do not protect privacy a priori. We discuss risk mitigation through policies and standards such as not allowing anonymous pings of genetic beacons and requiring minimum beacon sizes.
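To make the allele-presence query interface described above concrete, the sketch below models a beacon as a set of variants and counts how many of a target genome's variants it answers "yes" to. This naive yes-count is a stand-in chosen for brevity, not the likelihood-ratio test proposed in the paper; the `Beacon` class, the `yes_fraction` helper, and the `(chromosome, position, allele)` tuple layout are hypothetical.

```python
class Beacon:
    """Toy beacon: answers whether any stored genome carries a given allele."""

    def __init__(self, genomes):
        # Each genome is a set of (chromosome, position, allele) tuples.
        self._alleles = set()
        for genome in genomes:
            self._alleles |= set(genome)

    def query(self, chrom, pos, allele):
        """Return True ("yes") if at least one genome has this allele."""
        return (chrom, pos, allele) in self._alleles

def yes_fraction(beacon, target_genome):
    """Fraction of the target's variants for which the beacon answers "yes"."""
    answers = [beacon.query(*variant) for variant in target_genome]
    return sum(answers) / len(answers)

# Hypothetical genomes with a handful of variants each.
member = {("1", 11272, "A"), ("2", 500, "G"), ("7", 42, "T")}
other_member = {("1", 11272, "A"), ("3", 99, "C")}
outsider = {("1", 11272, "A"), ("5", 10, "G"), ("9", 77, "C")}

beacon = Beacon([member, other_member])
member_score = yes_fraction(beacon, member)
outsider_score = yes_fraction(beacon, outsider)
```

A genome inside the beacon always receives "yes" for every one of its variants, whereas an outsider only matches on variants shared with some member. That asymmetry is the signal a membership test exploits, and a proper statistical test would additionally account for sequencing errors and shared common variants.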

8.
Fostering data sharing is a scientific and ethical imperative. Health gains can be achieved more comprehensively and quickly by combining large, information-rich datasets from across conventionally siloed disciplines and geographic areas. While collaboration for data sharing is increasingly embraced by policymakers and the international biomedical community, we lack a common ethical and legal framework to connect regulators, funders, consortia, and research projects so as to facilitate genomic and clinical data linkage, global science collaboration, and responsible research conduct. Governance tools can be used to responsibly steer the sharing of data for proper stewardship of research discovery, genomics research resources, and their clinical applications. In this article, we propose that an international code of conduct be designed to enable global genomic and clinical data sharing for biomedical research. To give this proposed code universal application and accountability, however, we propose to position it within a human rights framework. This proposition is not without precedent: international treaties have long recognized that everyone has a right to the benefits of scientific progress and its applications, and a right to the protection of the moral and material interests resulting from scientific productions. It is time to apply these twin rights to internationally collaborative genomic and clinical data sharing.

9.
Access to genetic data across studies is an important aspect of identifying new genetic associations through genome-wide association studies (GWASs). Meta-analysis across multiple GWASs with combined cohort sizes of tens of thousands of individuals often uncovers many more genome-wide associated loci than the original individual studies; this emphasizes the importance of tools and mechanisms for data sharing. However, even sharing summary-level data, such as allele frequencies, inherently carries some degree of privacy risk to study participants. Here we discuss mechanisms and resources for sharing data from GWASs, particularly focusing on approaches for assessing and quantifying the privacy risks to participants that result from the sharing of summary-level data.

10.
Saidi Ahmed, Nouali Omar, Amira Abdelouahab. Cluster Computing 2022, 25(1): 167-185

Attribute-based encryption (ABE) is an access control mechanism that ensures efficient data sharing among dynamic groups of users by setting up access structures indicating who can access what. However, ABE suffers from expensive computation and privacy issues in resource-constrained environments such as IoT devices. In this paper, we present SHARE-ABE, a novel collaborative approach for preserving privacy that is built on top of Ciphertext-Policy Attribute-Based Encryption (CP-ABE). Our approach uses Fog computing to outsource the most laborious decryption operations to Fog nodes. The latter collaborate to partially decrypt the data using an original and efficient chained architecture. Additionally, our approach preserves the privacy of the access policy by introducing false attributes. Furthermore, we introduce a new construction of a collaboration attribute that allows users within the same group to combine their attributes while satisfying the access policy. Experiments and analyses of the security properties demonstrate that the proposed scheme is secure and efficient especially for resource-constrained IoT devices.
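The access-structure idea behind the scheme above (who can access what, and group members combining attributes) can be illustrated without any cryptography. The sketch below only evaluates a boolean policy over attribute sets; it is not SHARE-ABE or CP-ABE, and the policy encoding, the `satisfies` function, and modelling collaboration as a plain set union are assumptions made for illustration.

```python
def satisfies(policy, attrs):
    """Evaluate an access policy against a set of attributes.

    A policy is either an attribute name (str) or a tuple
    ("and" | "or", [sub-policies]).
    """
    if isinstance(policy, str):
        return policy in attrs
    op, children = policy
    results = [satisfies(child, attrs) for child in children]
    if op == "and":
        return all(results)
    if op == "or":
        return any(results)
    raise ValueError(f"unknown operator: {op!r}")

# Hypothetical policy: a doctor working in cardiology or oncology.
policy = ("and", ["doctor", ("or", ["cardiology", "oncology"])])

alice = {"doctor"}
bob = {"cardiology"}

alone_alice = satisfies(policy, alice)
alone_bob = satisfies(policy, bob)
# Assumed model of collaboration: members of one group pool their attributes.
together = satisfies(policy, alice | bob)
```

Neither user satisfies the policy alone, but their pooled attributes do. In a real ABE scheme this check is enforced by the decryption mathematics rather than by trusted code, so the sketch shows only the logic of the access structure.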


11.
Journal policy on research data and code availability is an important part of the ongoing shift toward publishing reproducible computational science. This article extends the literature by studying journal data sharing policies by year (for both 2011 and 2012) for a referent set of 170 journals. We make a further contribution by evaluating code sharing policies, supplemental materials policies, and open access status for these 170 journals for each of 2011 and 2012. We build a predictive model of open data and code policy adoption as a function of impact factor and publisher and find higher impact journals more likely to have open data and code policies and scientific societies more likely to have open data and code policies than commercial publishers. We also find open data policies tend to lead open code policies, and we find no relationship between open data and code policies and either supplemental material policies or open access journal status. Of the journals in this study, 38% had a data policy, 22% had a code policy, and 66% had a supplemental materials policy as of June 2012. This reflects a striking one-year increase of 16% in the number of data policies, a 30% increase in code policies, and a 7% increase in the number of supplemental materials policies. We introduce a new dataset to the community that categorizes data and code sharing, supplemental materials, and open access policies in 2011 and 2012 for these 170 journals.

12.
BACKGROUND: Haplotype sharing statistics have been introduced in an ad-hoc way, often relying heavily on permutation testing. As a result, applying these approaches to whole genome association studies or evaluating their properties in extensive simulation experiments is problematic. Further, permutation testing may be inappropriate in the presence of phase ambiguity and population stratification. AIMS: To present a simple framework for a class of haplotype sharing statistics useful for association mapping in case-parent trio data. This framework allows derivation of novel haplotype sharing tests as well as simple variance estimators and asymptotic distributions for haplotype sharing tests. RESULTS AND CONCLUSIONS: We validated that our approach is appropriately sized using simulated data, and illustrate the methodology by analyzing a Crohn's disease dataset. We find that haplotype-based analyses are much more powerful than single-locus analyses for these data.

13.
Changgee Chang, Zhiqi Bu, Qi Long. Biometrics 2023, 79(3): 2357-2369
Electronic health records (EHRs) offer great promise for advancing precision medicine and, at the same time, present significant analytical challenges. In particular, patient-level data in EHRs often cannot be shared across institutions (data sources) due to government regulations and/or institutional policies. As a result, there is growing interest in distributed learning over multiple EHR databases without sharing patient-level data. To tackle such challenges, we propose a novel communication-efficient method that aggregates the optimal estimates of external sites by turning the problem into a missing data problem. In addition, we propose incorporating posterior samples of remote sites, which can provide partial information on the missing quantities and improve the efficiency of parameter estimates while having the differential privacy property and thus reducing the risk of information leakage. The proposed approach, without sharing the raw patient-level data, allows for proper statistical inference. We provide a theoretical investigation of the asymptotic properties of the proposed method for statistical inference as well as differential privacy, and evaluate its performance in simulations and real data analyses in comparison with several recently developed methods.
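The general setting described above, where sites share only summary estimates rather than patient-level rows, can be illustrated with classical inverse-variance pooling. This is a deliberately simple stand-in, not the authors' missing-data formulation, and it offers no differential privacy guarantee; the `pool_estimates` function and the numbers are hypothetical.

```python
def pool_estimates(site_results):
    """Combine per-site (estimate, variance) pairs by inverse-variance weighting.

    Each site sends only two numbers; no patient-level data leaves the site.
    Returns the pooled estimate and its variance.
    """
    if not site_results:
        raise ValueError("at least one site result is required")
    weights = [1.0 / variance for _, variance in site_results]
    total_weight = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, site_results)) / total_weight
    return pooled, 1.0 / total_weight

# Hypothetical effect estimates from two hospitals: (estimate, variance).
site_results = [(1.0, 1.0), (4.0, 0.5)]
pooled_estimate, pooled_variance = pool_estimates(site_results)
```

The more precise site (smaller variance) receives the larger weight, and the pooled variance is smaller than either site's own, which is the efficiency gain that motivates combining sites in the first place.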

14.
Data sharing models designed to facilitate global business provide insights for improving transborder genomic data sharing. We argue that a flexible, externally endorsed, multilateral arrangement, combined with an objective third-party assurance mechanism, can effectively balance privacy with the need to share genomic data globally.

15.
16.

Background

Genomic profiling of malignant tumours has assisted clinicians in providing targeted therapies for many serious cancer-related illnesses. Although the characterisation of somatic mutations is the primary aim of tumour profiling for treatment, germline mutations may also be detected given the heterogeneous origin of mutations observed in tumours. Guidance documents address the return of germline findings that have health implications for patients and their genetic relations. However, the implications of discovering a potential but unconfirmed germline finding from tumour profiling are yet to be fully explored. Moreover, as tumour profiling is increasingly applied in oncology, robust ethical frameworks are required to encourage large-scale data sharing and data aggregation linking molecular data to clinical outcomes, to further understand the role of genetics in oncogenesis and to develop improved cancer therapies.

Results

This paper reports on the results of empirical research that is broadly aimed at developing an ethical framework for obtaining informed consent to return results from tumour profiling tests and to share the biomolecular data sourced from tumour tissues of cancer patients. Specifically, qualitative data were gathered from 36 semi-structured interviews with cancer patients and oncology clinicians at a cancer treatment centre in Singapore. The interview data indicated that patients had a limited comprehension of cancer genetics and the implications of tumour testing. Furthermore, oncology clinicians stated that they lacked the time to provide in-depth explanations of the tumour profile tests. However, both patients and oncologists accepted that the return of potential germline variants and the sharing of de-identified tumour profiling data nationally and internationally should be discussed and provided as an option during the consent process.

Conclusions

Findings provide support for the return of tumour profiling results provided that they are accompanied by an adequate explanation from qualified personnel. They also support the use of broad consent regimes within an ethical framework that promotes trust and benefit sharing with stakeholders and provides accountability and transparency in the storage and sharing of biomolecular data for research.

17.

Background

There is increasing recognition of the importance of sharing research data within the international scientific community, but also of the ethical and social challenges this presents, particularly in the context of structural inequities and varied capacity in international research. Public involvement is essential to building locally responsive research policies, including on data sharing, but little research has involved stakeholders from low-to-middle income countries.

Methods

Between January and June 2014, a qualitative study was conducted in Kenya involving sixty stakeholders with varying experiences of research in a deliberative process to explore views on benefits and challenges in research data sharing. In-depth interviews and extended small group discussions based on information sharing and facilitated debate were used to collect data. Data were analysed using Framework Analysis, and charting flow and dynamics in debates.

Findings

The findings highlight both the opportunities and challenges of communicating about this complex and relatively novel topic for many stakeholders. For more and less research-experienced stakeholders, ethical research data sharing is likely to rest on the development and implementation of appropriate trust-building processes, linked to local perceptions of benefits and challenges. The central nature of trust is underpinned by uncertainties around who might request what data, for what purpose and when. Key benefits perceived in this consultation were concerned with the promotion of public health through science, with legitimate beneficiaries defined differently by different groups. Important challenges were risks to the interests of study participants, communities and originating researchers through stigmatisation, loss of privacy, impacting autonomy and unfair competition, including through forms of intentional and unintentional "misuse" of data. Risks were also seen for science.

Discussion

Given background structural inequities in much international research, building trust in this low-to-middle income setting includes ensuring that the interests of study participants, primary communities and originating researchers will be promoted as far as possible, as well as protected. Important ways of building trust in data sharing include involving the public in policy development and implementation, promoting scientific collaborations around data sharing and building close partnerships between researchers and government health authorities to provide checks and balances on data sharing, and to promote near- and long-term translational benefits.

18.
Purpose: In this study we trained a deep neural network model for female pelvis organ segmentation using data from several sites without any personal data sharing. The goal was to assess its prediction power compared with the model trained in a centralized manner. Methods: Varian Learning Portal (VLP) is a distributed machine learning (ML) infrastructure enabling privacy-preserving research across hospitals from different regions or countries, within the framework of a trusted consortium. Such a framework is relevant when there is a high level of trust among the participating sites but legal restrictions do not allow actual data sharing between them. We trained an organ segmentation model for the female pelvic region using the synchronous data distributed framework provided by the VLP. Results: The prediction performance of the model trained using the federated framework offered by VLP was on the same level as the performance of the model trained in a centralized manner, where all training data was pooled together in one centre. Conclusions: VLP infrastructure can be used for GPU-based training of a deep neural network for organ segmentation for the female pelvic region. This organ segmentation instance is particularly difficult due to the high variation in the organs’ shape and size. Being able to train the model using data from several clinics can help, for instance, by exposing the model to a larger range of data variations. The VLP framework enables such a distributed training approach without sharing protected health information.

19.
The rapid growth of social network data has given rise to high security awareness among users, especially when they exchange and share their personal information. However, because users have different feelings about sharing their information, they are often puzzled about who their partners for exchanging information can be and what information they can share. Is it possible to assist users in forming a partnership network in which they can exchange and share information with little worry? We propose a modified information sharing behavior prediction (ISBP) model that can help in understanding the underlying rules by which users share their information with partners in light of three common aspects: what types of items users are likely to share, what characteristics of users make them likely to share information, and what features of users’ sharing behavior are easy to predict. This model is applied with machine learning techniques in WEKA to predict users’ decisions pertaining to information sharing behavior and form them into trustable partnership networks by learning their features. In the experiment section, by using two real-life datasets consisting of citizens’ sharing behavior, we identify the effect of highly sensitive requests on sharing behavior adjacent to individual variables: the younger participants’ partners are more difficult to predict than those of the older participants, whereas the partners of people who are not computer majors are easier to predict than those of people who are computer majors. Based on these findings, we believe that it is necessary and feasible to offer users personalized suggestions on information sharing decisions, and this is pioneering work that could benefit college researchers focusing on user-centric strategies and website owners who want to collect more user information without raising their privacy awareness or losing their trustworthiness.

20.
Some human subsistence economies are characterized by extensive daily food sharing networks, which may buffer the risk of shortfalls and facilitate cooperative production and divisions of labor among households. Comparative studies of human food sharing can assess the generalizability of this theory across time, space, and diverse lifeways. Here we test several predictions about daily sharing norms, which presumably reflect realized cooperative behavior, in a globally representative sample of nonindustrial societies (the Standard Cross-Cultural Sample), while controlling for multiple sources of autocorrelation among societies using Bayesian multilevel models. Consistent with a risk-buffering function, we find that sharing is less likely in societies with alternative means of smoothing production and consumption such as animal husbandry, food storage, and external trade. Further, food sharing was tightly linked to labor sharing, indicating gains to cooperative production and perhaps divisions of labor. We found a small phylogenetic signal for food sharing (captured by a supertree of human populations based on genetic and linguistic data) that was mediated by food storage and social stratification. Food sharing norms reliably emerge as part of cooperative economies across time and space but are culled by innovations that facilitate self-reliant production.
