Sangher, K. S., Singh, A., Pandey, H. M. and Kumar, V., 2023. Towards Safe Cyber Practices: Developing Proactive Cyber Threat Intelligence System for Dark Web Forums Content By Employing Deep Learning Approaches. Information, 14 (6), 349.
Full text available as: PDF (open access article), information-14-00349-v2.pdf, Published Version, available under License Creative Commons Attribution, 4MB.
DOI: 10.3390/info14060349
Abstract
The Dark Web, the untraceable part of the Deep Web, is one of the most heavily used "secretive spaces" in which terrorists, cybercriminals, spies, and other offenders carry out illegal and criminal activities. Identifying actions, products, and offenders on the Dark Web is challenging because of its size, intractability, and anonymity. It is therefore crucial to deploy tools and techniques that can identify Dark Web activities and serve as a support system for law enforcement agencies. This study proposes classification models based on four deep learning architectures (RNN, CNN, LSTM, and Transformer) that use pre-trained word embedding representations to identify illicit activities related to cybercrimes on Dark Web Forums. We used the Agora dataset, derived from the DarkNet Market Archive, which has 109 listed activities in its categories. The listings in the dataset are vaguely described, and several data points are untagged, which rules out automatically using the category items as target-class labels. To overcome this constraint, we applied a carefully designed human annotation scheme that takes all attributes into account to infer the context. We conducted comprehensive evaluations to assess the performance of the proposed approaches. Our proposed BERT-based classification model achieved an accuracy of 96%. Given the class imbalance in the experimental data, our results indicate the advantage of our tailored data preprocessing strategies and validate our annotation scheme. In real-world scenarios, law enforcement agencies can use our work to analyze Dark Web Forums and identify cybercrimes, and it can pave the way for more sophisticated systems tailored to specific requirements.
Item Type: | Article |
---|---|
ISSN: | 2078-2489 |
Uncontrolled Keywords: | dark web forum; cyber security; cybercrimes; deep learning; natural language processing; Agora marketplace; BERT; law enforcement agencies |
Group: | Faculty of Science & Technology |
ID Code: | 38702 |
Deposited By: | Symplectic RT2 |
Deposited On: | 14 Jul 2023 15:06 |
Last Modified: | 14 Jul 2023 15:06 |