Integration of information search technologies and artificial intelligence in the field of cybersecurity
DOI:
https://doi.org/10.20535/2411-1031.2023.11.2.293789Keywords:
open-source intelligence, сontent monitoring system, semantic concepts, cybersecurity, analysis of news documents, summarization of document arrays, generative artificial intelligence, Llama-2, CyberAggregatorAbstract
The paper explores the possibility of integrating traditional intelligence systems in open-source intelligence (OSINT) with advanced generative artificial intelligence (GAI) technologies, which are becoming a key factor in the development of analytical systems. The main focus of the research is on improving the functionality of the social media content monitoring system for cybersecurity issues, called CyberAggregator. The study identifies several analytical components where the application of GAI technology is most effective, including the creation of networks of key words and persons, identification of toponyms, and information summarization (building summaries, digests). The practical aspect of the research is dedicated to integrating the content monitoring system with the large language model Llama-2. The steps of this integration are provided, and the interaction process between the information search system and Llama-2 is described. The installation of dependencies and processing of queries transformed into prompts for the GAI system are detailed. This integration opens up broad possibilities for utilizing the large language model to address semantic tasks, thereby enhancing the analytical capabilities of intelligence systems. The paper identifies perspectives for using GAI to further develop and enhance information analysis systems in open sources, providing new opportunities to expand the understanding and effective use of artificial intelligence technologies in the context of tasks and ensuring cyber and information security.
References
St. Wolfram, What Is ChatGPT Doing ... and Why Does it Work?, Champaign, IL, USA: Wolfram Media, Inc., 2023.
N. Kumar, A. Sen, V. Hordiichuk, M. Jaramillo, B. Molodetskyi, and A. Kasture, “AI in Cybersecurity: Threat Detection and Response with Machine Learning”, Tuijin Jishu / Journal of Propulsion Technology, vol. 44, no. 3, pp. 38-46, 2023. doi: https://doi.org/10.52783/tjjpt.v44.i3.237.
D. Lande, and L. Strashnoy, “Concept Networking Methods Based on ChatGPT & Gephi”, SSRN Preprint (April 17, 2023). 12 p. doi: https://doi.org/10.2139/ssrn.4420452.
D. Lande, and L. Strashnoy, “Formation of networks of concepts in the field of law with the help of an artificial intelligence system”, Information and law, no. 2 (45), pp. 88-93, 2023. doi: https://doi.org/10.37750/2616-6798.2023.2(45).282326.
O. D. Okey, E. U. Udo, R. L. Rosa, D. Z. Rodríguez, and J. H. Kleinschmidt, “Investigating ChatGPT and cybersecurity: A perspective on topic modeling and sentiment analysis”, Computers & Security, vol. 135, art. 103476, 2023. doi: https://doi.org/10.1016/j.cose.2023.103476.
D. Lande, O. Puchkov, and I. Subach, “System for analysing of big data on cybersecurity issues from social media”, Information Technology and Security, vol 8, iss. 1 (14), pp. 4-18, 2020. doi: https://doi.org/10.2053/2411-1031.2020.8.1.217993.
S. Pranav, and K. M. N. Sharath, Learning Elastic Stack 7.0: Distributed search, analytics, and visualization using Elasticsearch, Logstash, Beats, and Kibana, 2nd Edition, Birmigham, UK: Packt Publishing, 2019.
D. Lande, I. Subach, and A. Gladun, Processing of extremely large data sets (Big Data): a tutorial, Kyiv, Ukraine, 2021. [Online]. Available: https://ela.kpi.ua/handle/123456789/46129.
H. Touvron et al., “Llama 2: Open Foundation and Fine-Tuned Chat Models”, ArXiV Preprint arXiv:2307.09288, 2023. doi: https://doi.org/10.48550/arXiv.2307.09288.
Z. Zhao, Z. Zhang, and F. Hopfgartner, “A Comparative Study of Using Pre-trained Language Models for Toxic Comment Classification”, in Proc. WWW '21: The Web Conference 2021, pp. 500-507, April 2021. doi: https://doi.org/10.1145/3442442.3452313.
“Prompt engineering”, OpenAI. [Online]. Available: https://platform.openai.com/docs/guides/prompt-engineering. Accessed on: June 18, 2023.
“Best practices for prompt engineering with OpenAI API”, OpenAI. [Online]. Available: https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api. Accessed on: June 20, 2023.
D. Lande, and L. Strashnoy, O. Driamov, and A. Feger, “Formation of activity scenarios based on generative artificial intelligence services”, Artificial Intelligence, no. 97 (3), pp. 94-103, 2023. doi: https://doi.org/10.15407/jai2022.01.08.
K. Cherven, Mastering Gephi Network Visualization, Birmigham, UK: Packt Publishing, 2015.
T. Triantoro, “Graph Viz: Exploring, Analyzing, and Visualizing Graphs and Networks with Gephi and ChatGPT”, ODSC Community, 2023. [Online]. Available: https://opendatascience.com/graph-viz-exploring-analyzing-and-visualizing-graphs-and-networks-with-gephi-and-chatgpt. Accessed on: June 20, 2023.
Downloads
Published
How to Cite
Issue
Section
License
Copyright (c) 2023 Collection "Information Technology and Security"
This work is licensed under a Creative Commons Attribution 4.0 International License.
The authors that are published in this collection, agree to the following terms:
- The authors reserve the right to authorship of their work and pass the collection right of first publication this work is licensed under the Creative Commons Attribution License, which allows others to freely distribute the published work with the obligatory reference to the authors of the original work and the first publication of the work in this collection.
- The authors have the right to conclude an agreement on exclusive distribution of the work in the form in which it was published this anthology (for example, to place the work in a digital repository institution or to publish in the structure of the monograph), provided that references to the first publication of the work in this collection.
- Policy of the journal allows and encourages the placement of authors on the Internet (for example, in storage facilities or on personal web sites) the manuscript of the work, prior to the submission of the manuscript to the editor, and during its editorial processing, as it contributes to productive scientific discussion and positive effect on the efficiency and dynamics of citations of published work (see The Effect of Open Access).