Results of a review published in the Journal of the American Medical Informatics Association indicate that there is no one-size-fits-all solution to preserving privacy while using mobile location data in biomedical research.
With the explosion of big data and its use in human biomedical research comes the problem of unauthorized parties accessing data and potentially re-identifying the de-identified personal information associated with those data. The introduction of inexpensive location-based services such as global positioning systems has compounded this problem. Whereas there may be no harm in knowing the exercise habits of most individuals, for instance, knowing the HIV status of individuals could cause harm, and being able to identify the location of persons in the Witness Protection Program could have tragic consequences.
Author Daniel M. Goldenholz, MD, PhD, of the Clinical Epilepsy Section at the National Institute of Neurological Disorders and Stroke and Beth Israel Deaconess Medical Center in Boston, Massachusetts, and colleagues reviewed the problems faced by institutional review boards and suggested 6 questions that these boards should be asking:
- What are the risks of collecting, sharing, and publishing individual-level location-based data?
- What types of harm are involved if re-identification occurs?
- What parties might be interested in re-identifying these data?
- What risk mitigation strategies are needed?
- Can the institutional review board assess the methods of protecting data against re-identification or is a data scientist needed?
- How will the risks be explained to study participants?
Dr Goldenholz and colleagues discuss the importance of legal mitigation strategies and the protections offered to both investigators and institutions to prevent them from being compelled to release information. The authors also outline a number of technical mitigation strategies, some of which require a data scientist. Among these strategies are data aggregation, which requires a large enough dataset; obscuring location data, which includes removal, encryption, cloaking, adding noise, decreasing resolution, and simulation; preserving only a portion of the location data, or shifting location data in a random direction so that relative location is preserved but absolute location is dropped; and performing temporal manipulations. Another consideration is social-spatial linkage analysis, which uses connections to other databases to re-identify data. Location data that are associated with other characteristics of subjects, such as age, gender, and ethnicity, may constitute points of vulnerability. Randomly swapping attributes can mitigate this risk.
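To make a few of these technical strategies concrete, the following is a minimal Python sketch of decreasing resolution, adding noise, shifting a whole track by one random offset, and aggregating with small-cell suppression. It is an illustration only, not the authors' method: the function names, the default parameters (`decimals`, `sigma_deg`, `max_shift_deg`, `min_count`), and the sample coordinates are all assumptions chosen for the example, and none of these steps alone guarantees protection against re-identification.

```python
import random
from collections import Counter


def reduce_resolution(points, decimals=2):
    """Round each (lat, lon) pair to fewer decimal places.

    Two decimals of latitude is roughly 1 km; the default is an assumption.
    """
    return [(round(lat, decimals), round(lon, decimals)) for lat, lon in points]


def add_noise(points, sigma_deg=0.01, rng=None):
    """Add independent Gaussian noise (in degrees) to every coordinate."""
    rng = rng or random.Random()
    return [
        (lat + rng.gauss(0, sigma_deg), lon + rng.gauss(0, sigma_deg))
        for lat, lon in points
    ]


def shift_track(points, max_shift_deg=1.0, rng=None):
    """Move the whole track by one random offset.

    Differences between points (in degrees) are kept, while the absolute
    position is dropped. Real distances are only approximately preserved,
    because a degree of longitude varies in length with latitude.
    """
    rng = rng or random.Random()
    d_lat = rng.uniform(-max_shift_deg, max_shift_deg)
    d_lon = rng.uniform(-max_shift_deg, max_shift_deg)
    return [(lat + d_lat, lon + d_lon) for lat, lon in points]


def aggregate_counts(points, decimals=1, min_count=5):
    """Count points per coarse grid cell and suppress sparsely populated cells.

    The min_count threshold is a hypothetical choice for illustration.
    """
    counts = Counter(reduce_resolution(points, decimals))
    return {cell: n for cell, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Made-up coordinates, used only to exercise the functions.
    track = [(42.33712, -71.10591), (42.33950, -71.10410), (42.34120, -71.10105)]
    print(reduce_resolution(track))
    print(add_noise(track, rng=random.Random(1)))
    print(shift_track(track, rng=random.Random(1)))
    print(aggregate_counts(track, decimals=1, min_count=2))
```

Which of these steps is appropriate, and how aggressively to apply it, depends on how much location detail the study actually needs. Coarser rounding or larger noise lowers re-identification risk but also reduces the usefulness of the data.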
The authors note that the choice of mitigation strategy depends on the study, the importance of location information to the research, and the risk for patient harm; where there is a high risk for harm, the mitigation strategy may need to be more complex. Dr Goldenholz and colleagues argue, however, that the only strategy to protect privacy completely is to remove all location data.
Goldenholz DM, Goldenholz SR, Krishnamurthy KB, et al. Using mobile location data in biomedical research while preserving privacy [published online June 7, 2018]. J Am Med Inform Assoc. doi:10.1093/jamia/ocy071.