System for analysing of big data on cybersecurity issues from social media

Authors

DOI:

https://doi.org/10.20535/2411-1031.2020.8.1.217993

Keywords:

social media monitoring, cybersecurity, open-source intelligence, social media analysis, CyberAggregator

Abstract

The paper proposes and substantiates approaches to building a corporate system for monitoring and analyzing social media on cybersecurity issues, based on the concepts of Big Data, Data/Text Mining, Information Extraction, Complex Networks, and Cloud Computing. The components of the Elastic Stack, the Sphinx information retrieval system, the Neo4j graph database management system, and the Gephi graph analysis system are examined in detail. The main idea of a system for analyzing large amounts of data on cybersecurity issues from social media is the simultaneous application of methods and tools for information retrieval, data analysis, and aggregation of information flows. The system should implement the following functions: building databases by collecting information from specified information resources; configuring automatic scanning and primary processing of information from websites and social networks; maintaining full-text databases; identifying duplicate messages with similar content; full-text search; analysis of text messages, sentiment detection, and generation of analytical reports; integration with a geographic information system; data analysis and visualization; study of the dynamics of thematic information flows; forecasting developments based on the dynamics of publications in social media; and multi-user access to the functional components of the system. The practical significance of the results lies in a working prototype of a system for content monitoring and analysis of social media on cybersecurity issues, ready to be used as a component of decision support systems for information security and cybersecurity. The interface of the prototype is described; it provides functions for searching, analyzing, and forecasting the appearance of information in social media. Central to the interface is a digest of the messages most relevant to the user's needs.
In the analytical mode, a number of tools present the analyzed data graphically: as a time series of the number of relevant queries per day, and as views of the main topics and of clusters grouped by predefined reference words. The system also provides modes for building networks of concepts that correspond to individual messages (persons, brands) and networks of information sources, which make it possible to rank the concepts and explore the relationships between them.
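
The concept-network mode described above can be illustrated with a minimal sketch. It is not the CyberAggregator implementation: it assumes that concepts have already been extracted from each message, links two concepts whenever they occur in the same message, and ranks concepts by weighted degree, which is only one of several possible ranking measures. The function names and the sample messages (including the brand "ExampleBank") are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_concept_network(messages):
    """Count, for each pair of concepts, the messages mentioning both."""
    edges = Counter()
    for concepts in messages:
        # Sort the unique concepts so each pair has one canonical order.
        for a, b in combinations(sorted(set(concepts)), 2):
            edges[(a, b)] += 1
    return edges


def rank_concepts(edges):
    """Rank concepts by weighted degree (sum of incident edge weights)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    # Highest degree first; ties are broken alphabetically.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical input: concepts already extracted from four messages.
messages = [
    ["ransomware", "phishing", "ExampleBank"],
    ["phishing", "ExampleBank"],
    ["ransomware", "botnet"],
    ["phishing", "botnet"],
]

edges = build_concept_network(messages)
ranking = rank_concepts(edges)
for concept, score in ranking:
    print(concept, score)
```

The resulting edge list could be loaded into a graph tool such as Gephi or a graph database such as Neo4j for further exploration; the export step is omitted here.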

Author Biographies

Dmytro Lande, Institute for Information Recording of the National Academy of Sciences of Ukraine, Kyiv

doctor of technical sciences, professor, head of the specialized modeling tools department

Oleksandr Puchkov, Institute of Special Communication and Information Protection, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

candidate of philosophical sciences, professor, head

Ihor Subach, Institute of Special Communication and Information Protection, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

doctor of technical sciences, associate professor, head of the academic department of cybersecurity and application of information systems and technologies

Published

2020-07-09

How to Cite

Lande, D., Puchkov, O., & Subach, I. (2020). System for analysing of big data on cybersecurity issues from social media. Collection "Information Technology and Security", 8(1), 4–18. https://doi.org/10.20535/2411-1031.2020.8.1.217993

Section

INFORMATION TECHNOLOGY