Abstract
The growth of illegal online activities on the so-called dark web (the hidden collection of internet sites accessible only through specialized web browsers) has challenged law enforcement agencies in recent years, and research efforts to help them remain sparse. Existing research supports law enforcement by employing Natural Language Processing (NLP) to detect illegal activities on the dark web and to build models for their classification. However, current approaches rely strongly on the linguistic characteristics used to train the models, e.g., language semantics, which threatens their generalizability. To overcome this limitation, we tackle the problem of predicting illegal and criminal activities on the dark web, a process defined as threat intelligence, from a complementary perspective: that of dark web code maintenance and evolution. We propose a novel approach that uses software quality metrics and dark website appearance parameters instead of linguistic characteristics. We performed a preliminary empirical study on 10,367 web pages and collected more than 40 code metrics and website parameters using SonarQube. Results show an accuracy of up to 82% for predicting three types of activity (i.e., suspicious, normal, and unknown) and 66% for detecting 26 specific illegal activities, such as drugs or weapons trafficking. We believe our results can influence current trends in detecting illegal activities on the dark web and open a novel research avenue that addresses this problem from a software maintenance and evolution perspective.
Original language | English |
---|---|
Title of host publication | Proceedings - 2022 IEEE International Conference on Software Maintenance and Evolution, ICSME 2022 |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 439-443 |
Number of pages | 5 |
ISBN (Electronic) | 9781665479561 |
Publication status | Published - 2022 |
Event | 38th IEEE International Conference on Software Maintenance and Evolution, ICSME 2022 - Limassol, Cyprus. Duration: 2 Oct 2022 → 7 Oct 2022. Conference number: 38 |
Conference
Conference | 38th IEEE International Conference on Software Maintenance and Evolution, ICSME 2022 |
---|---|
Abbreviated title | ICSME 2022 |
Country/Territory | Cyprus |
City | Limassol |
Period | 2/10/22 → 7/10/22 |
Bibliographical note
Funding Information: We thank Martijn Keizer for the work done during his master's thesis. The work is supported by the EU TwiningDESTINI project (857420) and by the project SENTINEL, sponsored by the Dutch Ministry of Justice and Safety through the Regional Table Human Trafficking Region East Brabant.
Publisher Copyright:
© 2022 IEEE.
Keywords
- Dark Web
- Machine Learning
- Software Code Metrics
- Software Code Quality