Searching technology for questionable records when creating the Unified registry of Ukrainian individuals identification

Authors

Yaroslav Dorohyi

DOI:

https://doi.org/10.20535/2411-1031.2019.7.2.190539

Keywords:

Data accuracy, personal data, privacy, end-to-end identifier, registry, confidentiality.

Abstract

One of the most effective ways to protect personal data when building a Unified Identity Registry is to use an end-to-end identifier together with hash codes generated from combinations of personal data using one-way hash functions. The stage of creating the unified personal identification registry does not involve open personal data, so no personal data is allowed on the server: only unique identifiers and hash codes are permitted there. In accordance with the principles of creating this registry, five required and fifteen optional types of personal data stored in the registry were analyzed and used to generate hash codes, together with combinations of personal data fields built on these types (ten different combinations were used in this work). An end-to-end identification technology has been developed that can track errors in personal data fields both when new data is entered and when the registry is searched. To evaluate the proposed technology, 100,000 simulated individuals were used, with random errors placed in the registry database fields that store personal data of the required and optional types. The effectiveness of the proposed technology was also verified by registering new persons in the registry. The technology has a high tolerance for errors and can correctly identify and link an individual even when several personal data fields contain errors. Correct personal data, especially in the mandatory fields of the database, is crucial to avoid erroneous entries in the registry. Under a one-way hash transformation, a questionable record can be identified by applying hash operators to the hash codes calculated for particular combinations of personal data.
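The idea of comparing hash codes computed over combinations of personal data fields can be illustrated with a minimal Python sketch. It is not the author's implementation: the field names, the choice of SHA-256, the use of three-field combinations, and the `classify` rule are all assumptions made for illustration, since the abstract does not list the actual fields, hash function, or combinations. Plain SHA-256 is used only for brevity.

```python
import hashlib
from itertools import combinations

# Hypothetical mandatory fields; the abstract does not name the actual five.
MANDATORY_FIELDS = ("surname", "given_name", "patronymic", "birth_date", "birth_place")


def normalize(value: str) -> str:
    """Lower-case and collapse whitespace so formatting differences do not change the hash."""
    return " ".join(value.lower().split())


def hash_codes(record: dict, group_size: int = 3) -> dict:
    """Return one SHA-256 hash code per combination of `group_size` mandatory fields."""
    codes = {}
    for combo in combinations(MANDATORY_FIELDS, group_size):
        payload = "|".join(normalize(record[field]) for field in combo)
        codes[combo] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return codes


def classify(new_codes: dict, stored_codes: dict) -> str:
    """Compare hash codes only; no open personal data is needed on the server side."""
    matched = sum(1 for combo, code in new_codes.items() if stored_codes.get(combo) == code)
    if matched == len(new_codes):
        return "match"
    if matched > 0:
        # Some combinations agree and others do not: likely the same person with a field error.
        return "questionable"
    return "no match"


# Simulated records (invented data, for illustration only).
stored = {
    "surname": "Shevchenko",
    "given_name": "Taras",
    "patronymic": "Hryhorovych",
    "birth_date": "1990-03-09",
    "birth_place": "Kyiv",
}
typo = dict(stored, surname="Shevchenka")  # single-field entry error

print(classify(hash_codes(stored), hash_codes(stored)))
print(classify(hash_codes(typo), hash_codes(stored)))
```

In this sketch, an error in one field changes only the hash codes of the combinations that include that field, so the remaining combinations still agree with the stored record and the entry is flagged as questionable rather than registered as a new person.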

Author Biography

Yaroslav Dorohyi, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

Candidate of Technical Sciences, Associate Professor; associate professor at the Department of Automation and Control in Technical Systems

Published

2019-12-30

How to Cite

Dorohyi, Y. (2019). Searching technology for questionable records when creating the Unified registry of ukrainian individuals identification. Collection "Information Technology and Security", 7(2), 114–125. https://doi.org/10.20535/2411-1031.2019.7.2.190539

Section

INFORMATION TECHNOLOGY