Currently submitted to: JMIR Medical Informatics

Date Submitted: Aug 17, 2025
Open Peer Review Period: Sep 4, 2025 - Oct 30, 2025
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Novel use of a Hashed Identifier Token for Sharing Sickle Cell Disease Data Without Sharing PHI

  • Najibah Galadanci; 
  • Gerhard Hellemann; 
  • Samuel Washko; 
  • Charles Abrams; 
  • Kathleen Torres; 
  • Julie Kanter

ABSTRACT

Background:

Sickle cell disease (SCD) is the most common inherited clinically relevant blood disorder. Anemia and progressive organ injury ultimately result in significant debility, health care utilization, and reduced quality of life. As efforts continue to improve care for people living with SCD, several new registries have been developed, but they often rely on specific grants or contracts, which leads to their early termination. Ongoing medical registry initiatives lack coordination, leading to fragmented datasets that fall short of the comprehensive insights provided by well-structured, longitudinal registries. Although many of the data elements of interest are technically available in the separate electronic medical record systems of various hospitals, they are difficult to aggregate because of minimal interoperability among these systems, insufficient use of common data elements, and poor translation of natural language reports into codified data elements. These barriers to data collection prevent researchers from accessing the comprehensive data needed to advance understanding of the illness at the population level. This lack of robust data creates significant gaps in our knowledge of how SCD progresses longitudinally throughout a person’s life. Efforts to create a common data system are being considered, but such a system may result in the loss of years of data and knowledge that are needed in SCD. A better way to optimize existing data collections is a data linkage system that connects datasets while de-duplicating and linking individuals within each database.

Objective:

The primary objective of this project is to develop a privacy-preserving approach for securely linking three of the largest data collection efforts in SCD in the United States.

Methods:

The study was conducted at the University of Alabama at Birmingham (UAB) Lifespan Sickle Cell Center (LSCCC). The LSCCC currently holds IRB approval to access three of the SCD data collection systems: the Sickle Cell Data Collection (SCDC) project, the American Society of Hematology Research Collaborative (ASH RC) Data Hub for people with SCD seen at a UAB hospital, and the Globin Research Network for Data and Discovery (GRNDaD) registry. For each of the data sources, we internally created a set of identity tokens based on our selected set of identifiers and then used hashing (SHA-256) to create a hashed token that enables data sharing without compromising participant privacy.
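As an illustration only, the token-then-hash step described above might look like the following Python sketch. The abstract does not list the selected identifiers or the normalization rules, so the fields used here (first name, last name, date of birth, sex), the `|` separator, and the function names `normalize` and `hashed_token` are all assumptions made for the example, not the authors' implementation.

```python
import hashlib
import unicodedata


def normalize(value: str) -> str:
    """Drop accents, whitespace, and case so trivial variants hash identically.

    These normalization rules are an assumption for illustration.
    """
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ascii_only.split()).upper()


def hashed_token(first_name: str, last_name: str, dob_iso: str, sex: str) -> str:
    """Return the SHA-256 hex digest of the normalized identity token.

    The identifier set is hypothetical; the source does not specify it.
    """
    parts = [normalize(p) for p in (first_name, last_name, dob_iso, sex)]
    identity_token = "|".join(parts)  # internal token, never shared
    return hashlib.sha256(identity_token.encode("utf-8")).hexdigest()


# Two spellings of the same made-up person yield the same hashed token.
token_a = hashed_token("José", "De la Cruz", "1990-04-12", "M")
token_b = hashed_token(" jose ", "DE LA CRUZ", "1990-04-12", "m")
print(token_a == token_b)
```

Only the hex digest would leave the site in this sketch; the identity token built from the raw identifiers stays internal.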

Results:

A total of 8026 records were combined across the three registries. Through deterministic matching of the hashed identifier token, we were able to identify 1080 unique individuals with records in at least two of the registries.
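Deterministic matching on a hashed token amounts to exact equality of the hash strings across sources. The sketch below shows one way this could be counted, assuming each registry is simply a list of hashed tokens; the toy tokens, the registry keys, and the helper names `link_registries` and `in_multiple` are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict


def link_registries(registries: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each hashed token to the set of registries in which it appears."""
    seen: dict[str, set[str]] = defaultdict(set)
    for name, tokens in registries.items():
        for token in tokens:
            # Duplicate records within one registry collapse into one entry.
            seen[token].add(name)
    return dict(seen)


def in_multiple(linked: dict[str, set[str]], minimum: int = 2) -> set[str]:
    """Return tokens found in at least `minimum` distinct registries."""
    return {token for token, names in linked.items() if len(names) >= minimum}


# Toy tokens standing in for SHA-256 digests; not real study data.
registries = {
    "SCDC": ["t1", "t2", "t3", "t3"],
    "ASH_RC": ["t2", "t4"],
    "GRNDaD": ["t2", "t3", "t5"],
}
linked = link_registries(registries)
shared = in_multiple(linked)
print(sorted(shared))  # ['t2', 't3']
```

In this toy input, nine records reduce to five distinct tokens, two of which appear in at least two registries.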

Conclusions:

This is the first project in SCD to implement a privacy-preserving technique for linking individuals across multiple registries. By securely connecting data from each registry, the approach leverages unique data elements from each source while ensuring that PHI is protected. Enhancing data interoperability is essential to deepening our longitudinal understanding of SCD, improving our ability to study and personalize treatments for affected individuals, and ensuring that we can bring more treatments to the forefront for this at-risk population.


 Citation

Please cite as:

Galadanci N, Hellemann G, Washko S, Abrams C, Torres K, Kanter J

Novel use of a Hashed Identifier Token for Sharing Sickle Cell Disease Data Without Sharing PHI

JMIR Preprints. 17/08/2025:82493

DOI: 10.2196/preprints.82493

URL: https://preprints.jmir.org/preprint/82493


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a cc-by license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.