Machine Learning Based Detection and Evasion Techniques for Advanced Web Bots.

Iliou, C., 2022. Machine Learning Based Detection and Evasion Techniques for Advanced Web Bots. Doctoral Thesis (Doctoral). Bournemouth University.

Full text available as:

ILIOU, Christos_Ph.D._2022.pdf
Available under License Creative Commons Attribution Non-commercial.



Web bots are programs that can be used to browse the web and perform different types of automated actions, both benign and malicious. Such web bots vary in sophistication based on their purpose, ranging from simple automated scripts to advanced web bots that have a browser fingerprint and exhibit humanlike behaviour. Advanced web bots are especially appealing to malicious web bot creators, because their browserlike fingerprint and humanlike behaviour reduce their detectability. Several effective behaviour-based web bot detection techniques have been proposed in the literature. However, the performance of these detection techniques against malicious web bots that try to evade detection has not been examined in depth. Such evasive web bot behaviour is achieved by different techniques, including simple heuristics and statistical distributions, or more advanced machine learning based techniques. Motivated by the above, in this thesis we research novel web bot detection techniques and how effective these are against evasive web bots that try to evade detection using, among others, recent advances in machine learning. To this end, we initially evaluate state-of-the-art web bot detection techniques against web bots of different sophistication levels and show that, while the existing approaches achieve very high performance in general, they are not very effective when faced only with advanced web bots that try to remain undetected. Thus, we propose a novel web bot detection framework that can effectively detect bots of varying levels of sophistication, including advanced web bots. This framework combines two detection modules: (i) a detection module that extracts several features from web logs and uses them as input to several well-known machine learning algorithms, and (ii) a detection module that uses mouse trajectories as input to Convolutional Neural Networks (CNNs).
Moreover, we examine the case where advanced web bots themselves utilise recent advances in machine learning to evade detection. Specifically, we propose two novel evasive advanced web bot types: (i) web bots that use Reinforcement Learning (RL) to update their browsing behaviour based on whether they have been detected or not, and (ii) web bots that possess data on human behaviour and use them as input to Generative Adversarial Networks (GANs) to generate images of humanlike mouse trajectories. We show that both approaches increase the evasiveness of the web bots by reducing the performance of the detection framework utilised in each case.
We conclude that malicious web bots can exhibit high sophistication levels and combine different techniques that increase their evasiveness. Even though web bot detection frameworks can combine different methods to effectively detect such bots, web bots can update their behaviours using, among others, recent advances in machine learning to increase their evasiveness. Thus, detection techniques should be continuously updated to keep up with new techniques introduced by malicious web bots to evade detection.
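
The abstract refers to mouse trajectories being used as image input to CNNs, but it does not say how a trajectory becomes an image. The following is only a minimal sketch of one plausible preprocessing step, under stated assumptions: the function name `rasterize_trajectory`, the binary grid representation, and the default grid size are all hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) cursor position; assumed input format


def rasterize_trajectory(points: Sequence[Point], size: int = 8) -> List[List[int]]:
    """Draw a mouse trajectory onto a size x size binary grid (hypothetical helper)."""
    grid = [[0] * size for _ in range(size)]
    if not points:
        return grid

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # Use a single scale for both axes so the trajectory shape is not distorted.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0

    def to_cell(p: Point) -> Tuple[int, int]:
        # Map a coordinate to the nearest grid cell, clamped to the grid bounds.
        col = min(size - 1, int((p[0] - min_x) / span * (size - 1) + 0.5))
        row = min(size - 1, int((p[1] - min_y) / span * (size - 1) + 0.5))
        return row, col

    cells = [to_cell(p) for p in points]
    grid[cells[0][0]][cells[0][1]] = 1

    # Connect consecutive cells by linear interpolation so the path has no gaps.
    for (r0, c0), (r1, c1) in zip(cells, cells[1:]):
        steps = max(abs(r1 - r0), abs(c1 - c0))
        for i in range(1, steps + 1):
            r = r0 + round((r1 - r0) * i / steps)
            c = c0 + round((c1 - c0) * i / steps)
            grid[r][c] = 1
    return grid
```

A grid produced this way could be passed to an image classifier as a single-channel input. The actual image encoding, resolution, and any additional channels (for example timing or speed) used in the thesis may differ.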

Item Type: Thesis (Doctoral)
Additional Information: If you feel that this work infringes your copyright please contact the BURO Manager.
Uncontrolled Keywords: web bots; web bot detection; evasive web bots; advanced web bots; mouse movements; mouse biometrics; humanlike behaviour; machine learning; convolutional neural networks; generative adversarial networks; reinforcement learning
Group: Faculty of Science & Technology
ID Code: 37671
Deposited By: Symplectic RT2
Deposited On: 18 Oct 2022 09:13
Last Modified: 18 Oct 2022 09:13

