Improving efficiency of providing data group anonymity by automating data modification quality evaluation

Oleg Chertov; Dan Tavrov

doi:10.15587/1729-4061.2017.113046

Authors

Oleg Chertov, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Peremohy ave., 37, Kyiv, Ukraine, 03056, https://orcid.org/0000-0003-0087-1028
Dan Tavrov, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Peremohy ave., 37, Kyiv, Ukraine, 03056, https://orcid.org/0000-0002-3689-2931


Keywords:

memetic algorithm, group anonymity, microfile, outlier, modified Thompson tau technique

Abstract

This work proposes a modification of the method for solving the task of providing data group anonymity; the modification selects solutions automatically, without expert participation. It consists in identifying those solutions to the task in which the automatically detected outliers do not match the outliers in the initial distribution of information about the group of respondents. Automating solution selection thus improves the efficiency of data group anonymization by reducing the time needed to check whether candidate solutions mask the sensitive features of the distribution.

The developed modification was tested by solving the task of masking the regional distribution of military personnel in the state of New York. Solving the corresponding group anonymization task yielded 1,000 solutions. Only 24 of these 1,000 solutions (2.4%) turned out to be feasible, i.e., to have all outliers masked. Selecting such a small number of solutions automatically is significantly faster than doing so manually, which speaks in favor of the proposed modification as a means of improving data group anonymization efficiency.
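The selection criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that outliers are found with the textbook form of the modified Thompson tau test (named in the keywords) at a 5% significance level, and that a solution counts as feasible when none of its outlier positions coincides with an outlier position of the initial distribution. The function names, the short critical-value table (which limits the sketch to distributions of 3 to 12 values), and the disjointness rule are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev

# Two-sided 5% critical values of Student's t, keyed by degrees of freedom.
# Standard table values; only df 1..10 are listed to keep the sketch short
# (assumption: the distributions passed in have between 3 and 12 values).
T_CRIT_05 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def thompson_tau(n):
    """Rejection threshold of the modified Thompson tau test for n points."""
    t = T_CRIT_05[n - 2]
    return t * (n - 1) / (math.sqrt(n) * math.sqrt(n - 2 + t * t))


def outlier_positions(values):
    """Return the positions flagged as outliers, testing one point per pass."""
    remaining = dict(enumerate(values))
    flagged = set()
    while len(remaining) >= 3:
        vals = list(remaining.values())
        m, s = mean(vals), stdev(vals)
        # Only the point farthest from the mean is tested in each pass.
        pos, val = max(remaining.items(), key=lambda kv: abs(kv[1] - m))
        if s == 0 or abs(val - m) <= thompson_tau(len(vals)) * s:
            break
        flagged.add(pos)
        del remaining[pos]
    return flagged


def feasible_solutions(original, candidates):
    """Keep candidates whose outliers share no position with the original's.

    Hypothetical feasibility rule, chosen here for illustration only.
    """
    masked = outlier_positions(original)
    return [c for c in candidates if not (outlier_positions(c) & masked)]
```

Under these assumptions, `feasible_solutions(original, candidates)` returns the subset of candidate distributions that could be passed on without manual inspection; a candidate that still shows an outlier at an original outlier position is filtered out.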

Author Biographies

Oleg Chertov, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute” Peremohy ave., 37, Kyiv, Ukraine, 03056

Doctor of Technical Sciences, Associate Professor

Department of Applied Mathematics

Dan Tavrov, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute” Peremohy ave., 37, Kyiv, Ukraine, 03056

PhD

Department of Applied Mathematics

