Methodology for creating, clustering and visualizing correlation networks determined by the dynamics of thematic information flows

Authors

  • Oleksandr Puchkov, Institute of special communication and information protection at the National technical university of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv, Ukraine, https://orcid.org/0000-0002-8585-1044
  • Dmytro Lande, Educational and scientific physico-technical institute at the National technical university of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv, Ukraine, https://orcid.org/0000-0003-3945-1178
  • Ihor Subach, Institute of special communication and information protection at the National technical university of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv, Ukraine, https://orcid.org/0000-0002-9344-713X

DOI:

https://doi.org/10.20535/2411-1031.2025.13.1.328753

Keywords:

correlation networks, thematic information flows, clustering, visualization, dynamics vectors, content monitoring, cybersecurity, Gephi, Ph-Di diagram, semantic networks

Abstract

Given the rapid growth in the volume of information circulating on social media and across the Internet, there is an urgent need for effective methods of analyzing and visualizing thematic information flows. Correlation networks are a powerful tool for formalizing such processes, because they reveal relationships between objects, including relationships inferred from the objects' dynamics. This is especially relevant to cybersecurity, where prompt detection of trends and of connections between events can be crucial. The article develops a methodology for creating, clustering and visualizing correlation networks determined by the dynamics of thematic information flows. The proposed approach is based on analyzing vectors of publication dynamics obtained from social media content monitoring systems. Correlation networks are formed from the relationships between these vectors, each of which reflects the distribution of documents by date. The networks are visualized and analyzed with tools such as Gephi, and the author's own Ph-Di diagram is used to display the dynamics of information flows. The methodology makes it possible to identify groups of interconnected objects, which is useful for analyzing thematic information flows, particularly in the field of cybersecurity. The results of the study can serve as a basis for building probabilistic networks and for further scenario analysis. The advantages of the proposed methodology are the low dimensionality of the vectors, which simplifies their processing and analysis; language independence, which allows information flows in different languages to be analyzed; and ease of implementation, which makes the methodology accessible to a wide range of researchers and analysts in the field of cybersecurity.
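
The pipeline outlined in the abstract (dynamics vectors, pairwise correlation, grouping of linked objects, export for Gephi) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Pearson coefficient, the 0.7 threshold, the use of connected components as a stand-in for clustering, and the sample daily counts are all assumptions chosen for illustration. The CSV header `Source,Target,Weight` follows the column names that Gephi's spreadsheet importer recognizes for edge tables.

```python
import csv
import io
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(sxx * syy)


def correlation_edges(dynamics, threshold=0.7):
    """Return (source, target, weight) for each pair whose correlation meets the threshold."""
    names = sorted(dynamics)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(dynamics[a], dynamics[b])
            if r >= threshold:  # assumed cut-off, not taken from the article
                edges.append((a, b, round(r, 3)))
    return edges


def clusters(nodes, edges):
    """Group nodes into connected components of the thresholded network."""
    neighbours = {n: set() for n in nodes}
    for a, b, _ in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        groups.append(sorted(group))
    return groups


def edges_to_csv(edges):
    """Serialize edges as a CSV edge list with Source, Target, Weight columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
    return out.getvalue()


# Hypothetical daily publication counts per topic (one value per date).
dynamics = {
    "phishing": [2, 4, 9, 15, 7, 3, 1],
    "ransomware": [1, 3, 8, 14, 6, 2, 1],
    "ddos": [9, 7, 3, 1, 2, 6, 10],
    "botnet": [8, 6, 2, 1, 3, 7, 9],
}

edges = correlation_edges(dynamics, threshold=0.7)
groups = clusters(dynamics, edges)
print(groups)
print(edges_to_csv(edges))
```

With these sample vectors the sketch links "phishing" with "ransomware" and "ddos" with "botnet", because each pair rises and falls on the same dates. Each vector has only as many components as there are dates in the observation window, and no text processing is involved, which is consistent with the low dimensionality and language independence claimed above. In practice a modularity-based community detection, such as the one built into Gephi, would likely replace the connected-components step.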

Author Biographies

Oleksandr Puchkov, Institute of special communication and information protection at the National technical university of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

PhD in philosophy, professor, head

Dmytro Lande, Educational and scientific physico-technical institute at the National technical university of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

doctor of technical sciences, professor, chair of the academic department of information security

Ihor Subach, Institute of special communication and information protection at the National technical university of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

doctor of technical sciences, professor, chair of the academic department of cyber security and application of information systems and technologies

Published

2025-05-20

How to Cite

Puchkov, O., Lande, D., & Subach, I. (2025). Methodology for creating, clustering and visualizing correlation networks determined by the dynamics of thematic information flows. Collection "Information Technology and Security", 13(1), 6–16. https://doi.org/10.20535/2411-1031.2025.13.1.328753

Section

INFORMATION TECHNOLOGY