
Towards Safe Cyber Practices: Developing Proactive Cyber Threat Intelligence System for Dark Web Forums Content By Employing Deep Learning Approaches.

Sangher, K. S., Singh, A., Pandey, H. M. and Kumar, V., 2023. Towards Safe Cyber Practices: Developing Proactive Cyber Threat Intelligence System for Dark Web Forums Content By Employing Deep Learning Approaches. Information, 14 (6), 349.

Full text available as:

PDF (OPEN ACCESS ARTICLE)
information-14-00349-v2.pdf - Published Version
Available under License Creative Commons Attribution.

4MB

DOI: 10.3390/info14060349

Abstract

The untraceable part of the Deep Web, also known as the Dark Web, is one of the most used "secretive spaces" in which terrorists, cybercriminals, spies, and other offenders carry out all sorts of illegal and criminal activities. Identifying actions, products, and offenders on the Dark Web is challenging due to its size, intractability, and anonymity. It is therefore crucial to deploy tools and techniques capable of identifying Dark Web activities in order to support law enforcement agencies. This study proposes classification models based on four deep learning architectures (RNN, CNN, LSTM, and Transformer) that use pre-trained word embedding representations to identify illicit activities related to cybercrimes on Dark Web forums. We used the Agora dataset, derived from the DarkNet Market Archive, which has 109 listed activities in its Categories field. The listings in the dataset are vaguely described, and several data points are untagged, which rules out automatic labeling of category items as a target class. To overcome this constraint, we applied a meticulously designed human annotation scheme that takes all the attributes into account to infer the context. We conducted comprehensive evaluations to assess the performance of our proposed approaches. Our proposed BERT-based classification model achieved an accuracy score of 96%. Given the class imbalance in the experimental data, our results indicate the advantage of our tailored data preprocessing strategies and validate our annotation scheme. Thus, in real-world scenarios, law enforcement agencies can use our work to analyze Dark Web forums and identify cybercrimes, and it can pave the way for more sophisticated systems tailored to specific requirements.

Item Type: Article
ISSN: 2078-2489
Uncontrolled Keywords: dark web forum; cyber security; cybercrimes; deep learning; natural language processing; Agora marketplace; BERT; law enforcement agencies
Group: Faculty of Science & Technology
ID Code: 38702
Deposited By: Symplectic RT2
Deposited On: 14 Jul 2023 15:06
Last Modified: 14 Jul 2023 15:06
