NYMC Faculty Publications

The IeDEA Harmonist Data Toolkit: A Data Quality and Data Sharing Solution for a Global HIV Research Consortium

Authors

Judith T. Lewis, Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Biomedical Engineering, Vanderbilt University, Nashville, TN, USA. Electronic address: judy.lewis@vumc.org.
Jeremy Stephens, Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA.
Beverly Musick, School of Medicine, Indiana University, Indianapolis, IN, USA.
Steven Brown, School of Medicine, Indiana University, Indianapolis, IN, USA.
Karen Malateste, French National Research Institute for Sustainable Development (IRD), Inserm, UMR 1219, University of Bordeaux, Bordeaux, France.
Cam Ha Dao Ostinelli, Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland.
Nicola Maxwell, Centre for Infectious Disease Epidemiology and Research, School of Public Health and Family Medicine, University of Cape Town, Cape Town, South Africa.
Karu Jayathilake, Department of Infectious Diseases, Vanderbilt University Medical Center, Nashville, TN, USA.
Qiuhu Shi, Department of Public Health, New York Medical College, Valhalla, NY, USA.
Ellen Brazier, Institute for Implementation Science in Population Health, City University of New York, New York, NY, USA; Graduate School of Public Health and Health Policy, City University of New York, New York, NY, USA.
Azar Kariminia, The Kirby Institute, UNSW Sydney, Australia.
Brenna Hogan, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
Stephany N. Duda, Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA.

Author Type(s)

Faculty

DOI

10.1016/j.jbi.2022.104110

Journal Title

Journal of Biomedical Informatics

First Page

104110

Document Type

Article

Publication Date

7-1-2022

Department

Public Health

Abstract

We describe the design, implementation, and impact of a data harmonization, data quality checking, and dynamic report generation application in an international observational HIV research network. The IeDEA Harmonist Data Toolkit is a web-based application written in the open-source programming language R; it employs the R/Shiny and RMarkdown packages and leverages the REDCap data collection platform for data model definition and user authentication. The Toolkit performs data quality checks on uploaded datasets, checks for conformance with the network's common data model, displays the results both interactively and in downloadable reports, and stores approved datasets in secure cloud storage for retrieval by the requesting investigator. Including stakeholders and users in the design process was key to the successful adoption of the application. A survey of regional data managers and initial usage metrics indicate that the Toolkit saves time and improves data quality, with a 61% mean reduction in the number of error records in a dataset. The generalized application design allows the Toolkit to be easily adapted to other research networks.
