Publications

Towards Understanding Alerts raised by Semi-supervised Network Intrusion Detection Systems

Published in 26th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2023), 2023

The use of Machine Learning for anomaly detection in cyber security-critical applications, such as intrusion detection systems, has been hindered by the lack of explainability. Without understanding the reason behind anomaly alerts, it is too expensive or impossible for human analysts to verify and identify cyber-attacks. Our research addresses this challenge and focuses on semi-supervised network intrusion detection, where only benign network traffic is available for training the detection model. We propose a novel post-hoc explanation method, called AE-pvalues, which is based on the p-values of the reconstruction errors produced by an Auto-Encoder-based anomaly detection method. Our work identifies the most informative network traffic features associated with an anomaly alert, providing interpretations for the generated alerts. We conduct an empirical study using a large-scale network intrusion dataset, CICIDS2017, to compare the proposed AE-pvalues method with two state-of-the-art baselines applied in the semi-supervised anomaly detection task. Our experimental results show that the AE-pvalues method accurately identifies abnormal influential network traffic features. Furthermore, our study demonstrates that the explanation outputs can help identify different types of network attacks in the detected anomalies, enabling human security analysts to understand the root cause of the anomalies and take prompt action to strengthen security measures.

Recommended citation: Maxime Lanvin, Pierre-François Gimenez, Yufei Han, Frédéric Majorczyk, Ludovic Mé, et al.. Towards Understanding Alerts raised by Unsupervised Network Intrusion Detection Systems. The 26th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2023), Oct 2023, Hong Kong, France. ⟨10.1145/3607199.3607247⟩ https://dl.acm.org/doi/abs/10.1145/3607199.3607247

Conditionally Acyclic CO-Networks for Efficient Preferential Optimization

Published in 26th European Conference on Artificial Intelligence (ECAI 2023), 2023

This paper focuses on graphical models for modelling preferences in combinatorial space and their use for item optimisation. The preferential optimisation task seeks to find the preferred item containing some defined values, which is useful for many recommendation settings in e-commerce. We show that efficient (i.e., with polynomial time complexity) preferential optimisation is achieved with a subset of cyclic CP-nets called conditional acyclic CP-net. We also introduce a new graphical preference model, called Conditional-Optimality networks (CO-networks), that are more concise than conditional acyclic CP-nets and LP-trees but have the same expressiveness with respect to optimisation. Finally, we empirically show that preferential optimisation can be used for encoding alternatives into partial instantiations and vice versa, paving the way towards CO-nets and CP-nets unsupervised learning with the minimal description length (MDL) principle.

Recommended citation: Gimenez, Pierre-François & Mengin, Jérôme. (2023). Conditionally Acyclic CO-Networks for Efficient Preferential Optimization. 10.3233/FAIA230352. https://www.researchgate.net/publication/374310696

BAGUETTE: Hunting for evidence of malicious behavior in dynamic analysis reports

Published in 20th International conference on security and cryptography SECRYPT 2023, 2023

Malware analysis consists of studying a sample of suspicious code to understand it and producing a representation or explanation of this code that can be used by a human expert or a clustering/classification/detection tool. The analysis can be static (only the code is studied) or dynamic (only the interaction between the code and its host during one or more executions is studied). The quality of the interpretation of a code and its later detection depends on the quality of the information contained in this representation. To date, many analyses produce voluminous reports that are difficult to handle quickly. In this article, we present BAGUETTE, a graph-based representation of the interactions of a sample and the resources offered by the host system during one execution. We explain how BAGUETTE helps automatically search for specific behaviors in a malware database and how it efficiently assists the expert in analyzing samples.

Recommended citation: Vincent Raulin, Pierre-François Gimenez, Yufei Han, Valérie Viet Triem Tong. BAGUETTE: Hunting for Evidence of Malicious Behavior in Dynamic Analysis Reports. 20th International conference on security and cryptography SECRYPT 2023, Jul 2023, Rome, Italy. https://hal.science/hal-04102144/

Errors in the CICIDS2017 dataset and the significant differences in detection performances it makes

Published in CRiSIS 2022 17th International Conference on Risks and Security of Internet and Systems, 2022

Among the difficulties encountered in building datasets to evaluate intrusion detection tools, a tricky part is the process of labelling the events into malicious and benign classes. The labelling correctness is paramount for the quality of the evaluation of intrusion detection systems but is often considered as the ground truth by practitioners and is rarely verified. Another difficulty lies in the correct capture of the network packets. If it is not the case, the characteristics of the network flows generated from the capture could be modified and lead to false results. In this paper, we present several flaws we identified in the labelling of the CICIDS2017 dataset and in the traffic capture, such as packet misorder, packet duplication and attack that were performed but not correctly labelled. Finally, we assess the impact of these different corrections on the evaluation of supervised intrusion detection approaches.

Recommended citation: Lanvin, M., Gimenez, P. F., Han, Y., Majorczyk, F., Mé, L., & Totel, E. (2022, December). Errors in the CICIDS2017 dataset and the significant differences in detection performances it makes. In CRiSIS 2022-International Conference on Risks and Security of Internet and Systems. https://link.springer.com/chapter/10.1007/978-3-031-31108-6_2

Explainable artificial intelligence for cybersecurity: a literature survey

Published in Annals of Telecommunications, 2022

Artificial intelligence (AI) and machine learning (ML) are increasingly becoming essential in the development of cybersecurity solutions, with deep learning (DL) algorithms being extensively applied in recent years, e.g., for detecting Android malware or vulnerable source code. However, sharing the same fundamental limitation with other DL application domains, such as computer vision and natural language processing, AI-based cybersecurity solutions lack the capability of justifying the results (ranging from detection and prediction to reasoning and decision-making) and making them human-understandable. As a result, explainable AI (XAI) has emerged as a paramount topic addressing the related challenges of making AI models explainable or interpretable to human users. It is particularly relevant in the cybersecurity domain, in that XAI may allow security operators, who are overwhelmed with tens of thousands of security alerts per day (most of which are false positives) per day, to better assess the potential threats and reduce alert fatigue. With such a strong motivation, we conduct an extensive literature review on the intersection between XAI and cybersecurity. Particularly, we investigate the academic literature from two perspectives: the applications of XAI to cybersecurity (e.g., intrusion detection, malware classification, etc.) and the application of cybersecurity to XAI (e.g., attacks on XAI pipelines, potential countermeasures, etc.). We characterize the security of XAI with several security properties that have been discussed in the literature. We also formulate open questions that are either left out or not properly addressed in the literature and provide tentative answers.

Recommended citation: Fabien Charmet, Harry Chandra Tanuwidjaja, Solayman Ayoubi, Pierre-François Gimenez, Yufei Han, Houda Jmila, Gregory Blanc, Takeshi Takahashi & Zonghua Zhang. Explainable artificial intelligence for cybersecurity: a literature survey. Ann. Telecommun. (2022). https://doi.org/10.1007/s12243-022-00926-7

The complexity of unsupervised learning of lexicographic preferences

Published in M-PREF 2022 13th Multidisciplinary Workshop on Advances in Preference Handling, 2022

This paper considers the task of learning users’ preferences on a combinatorial set of alternatives, as generally used by online configurators, for example. In many settings, only a set of selected alternatives during past interactions is available to the learner. Fargier et al. [2018] propose an approach to learn, in such a setting, a model of the users’ preferences that ranks previously chosen alternatives as high as possible; and an algorithm to learn, in this setting, a particular model of preferences: lexicographic preferences trees (LP-trees). In this paper, we study complexity-theoretical problems related to this approach. We give an upper bound on the sample complexity of learning an LP-tree, which is logarithmic in the number of attributes. We also prove that computing the LP tree that minimises the empirical risk can be done in polynomial time when restricted to the class of linear LP-trees.

Recommended citation: Fargier, H., Gimenez, P. F., Mengin, J., & Le Nguyen, B. N. (2022, July). The complexity of unsupervised learning of lexicographic preferences. In 13th Multidisciplinary Workshop on Advances in Preference Handling (M-pref 2022)@ IJCAI 2022 https://hal.inria.fr/hal-03784693/document

Debiasing Android Malware Datasets: How can I trust your results if your dataset is biased?

Published in IEEE Transactions on Information Forensics and Security, 2022

Android security has received a lot of attention over the last decade, especially malware investigation. Researchers attempt to highlight applications’ security-relevant characteristics to better understand malware and effectively distinguish malware from benign applications. The accuracy and the completeness of their proposals are evaluated experimentally on malware and goodware datasets. Thus, the quality of these datasets is of critical importance: if the datasets are outdated or not representative of the studied population, the conclusions may be flawed. We specify different types of experimental scenarios. Some of them require unlabeled but representative datasets of the entire population. Others require datasets labeled with valuable characteristics that may be difficult to compute, such as malware datasets. We discuss the irregularities of datasets used in experiments, questioning the validity of the performances reported in the literature. This article focuses on providing guidelines for designing debiased datasets. First, we propose guidelines for building representative datasets from unlabeled ones. Second, we propose and experiment a debiasing algorithm that, given a biased labeled dataset and a target representative dataset, builds a representative and labeled dataset. Finally, from the previous debiased datasets, we produce datasets for experiments on Android malware detection or classification with machine learning algorithms. Experiments show that debiased datasets perfom better when classifying with machine learning algorithms.

Recommended citation: Miranda, T. C., Gimenez, P. F., Lalande, J. F., Tong, V. V. T., & Wilke, P. (2022). Debiasing Android Malware Datasets: How can I trust your results if your dataset is biased?. IEEE Transactions on Information Forensics and Security. https://ieeexplore.ieee.org/abstract/document/9787514

Towards a Representation of Malware Execution Traces for Experts and Machine Learning

Published in RESSI 2022-Rendez-Vous de la Recherche et de l'Enseignement de la Sécurité des Systèmes d'Information, 2022

Dynamic analysis is a common technique to analyze the run-time behavior of software and identify malware (malicious software). Execution traces typically contain the list of system calls with their parameters, the list of accessed files, etc. Several representations have been proposed to organize these data better and help both human experts and automated tools analyze them effectively. This paper reviews these representations and identifies four research problems that the first author plans to investigate during his Ph.D.

Recommended citation: Raulin, V., Gimenez, P. F., Han, Y., & Tong, V. V. T. (2022). Towards a Representation of Malware Execution Traces for Experts and Machine Learning. RESSI 2022-Rendez-Vous de la Recherche et de l'Enseignement de la Sécurité des Systèmes d'Information. https://hal.archives-ouvertes.fr/hal-03675366/document

Detecting APT through graph anomaly detection

Published in RESSI 2022-Rendez-Vous de la Recherche et de l'Enseignement de la Sécurité des Systèmes d'Information, 2022

Despite fruitful achievements made by unsupervised machine learning-based anomaly detection for network intrusion detection systems, they are still prone to the issue of high false alarm rates, and it is still difficult to reach very high recalls. In 2020, Leichtnam et al. proposed Sec2graph, an unsupervised approach applied to security objects graphs that exhibited interesting results on single-step attacks. The graph representation and the embedding allowed for better detection since it creates qualitative features. In this paper, we present new experiments to assess the performances of this approach for detecting APT attacks. We achieve better detection performances than the original work’s baseline detection methods on the DAPT2020 dataset. This work is realised in the context of the Ph.D. thesis of Maxime Lanvin, which started in October 2021.

Recommended citation: Lanvin, M., Gimenez, P. F., Han, Y., Majorczyk, F., Mé, L., & Totel, É. (2022, May). Detecting APT through graph anomaly detection. In RESSI 2022-Rendez-Vous de la Recherche et de l'Enseignement de la Sécurité des Systèmes d'Information. https://hal.archives-ouvertes.fr/hal-03675346/document

Towards generic quality assessment of synthetic traffic for evaluating intrusion detection systems

Published in RESSI 2022-Rendez-Vous de la Recherche et de l'Enseignement de la Sécurité des Systèmes d'Information, 2022

Network Intrusion Detection Systems (NIDSes) evaluation requires background traffic. However, real background traffic is hard to collect. We hence rely on synthetic traffic generated especially for this task. The quality of the generated traffic has to be evaluated according to some clearly defined criteria. In this paper, we show how to adapt the quality assessment solutions proposed for different fields of data generation such as image or text generation to network traffic. We summarize our study by discussing the criteria that evaluate the quality of a generated network traffic and by proposing functions to evaluate these criteria. This is the first contribution in the context of the Ph.D. thesis of Adrien Schoen.

Recommended citation: Schoen, A., Blanc, G., Gimenez, P. F., Han, Y., Majorczyk, F., & Mé, L. (2022, May). Towards generic quality assessment of synthetic traffic for evaluating intrusion detection systems. In RESSI 2022-Rendez-Vous de la Recherche et de l'Enseignement de la Sécurité des Systèmes d'Information. https://hal.archives-ouvertes.fr/hal-03675359/document

GUI-Mimic, a cross-platform recorder and fuzzer of Graphical User Interface

Published in 9th GreHack Conference, 2021

In program analysis, a fuzzing toolset is needed to automatically trigger software operations in a natural while efficient way. Especially in dynamic analysis of malware, such a toolset can help execute the suspicious files to unveil their malicious payloads hidden by other benign-looking behaviors. In the fields of software testing, this tool is necessary for triggering and testing the programmed functionalities. Nevertheless, there has not yet been an easy-to-use tool that works on Windows for the purpose of generating activity through the Graphical User Interface (GUI). To meet this requirement, in our work, we develop GUI-Mimic. It is designed to integrate some useful features for stimulating different types of software – mouse and keyboard recording, random mouse and keyboard inputs, editing, trimming, randomization, transformations – to deliver an easy-to-deploy GUI fuzzer over different Operating Systems.

Recommended citation: Raulin, V., Gimenez, P. F., Han, Y., Viet Triem Tong, V., Ouairy, L. (2021). GUI-Mimic, a cross-platform recorder and fuzzer of Graphical User Interface. 9th GreHack Conference https://hal.archives-ouvertes.fr/hal-03449827/document

RIDS: Radio intrusion detection and diagnosis system for wireless communications in smart environment

Published in ACM Transactions on Cyber-Physical Systems, 2021

The expansion of the Internet-of-Things (IoT) market is visible in homes, factories, public places, and smart cities. While the massive deployment of connected devices offers opportunities to improve quality of life and to develop new services, the impact of such devices on the security of the users in a context where the level of malicious threat continues to increase is a major concern. One of the challenges is the heterogeneity and constant evolution of wireless technologies and protocols used. To overcome this problem, we propose RIDS, a Radio Intrusion Detection System that is based on the monitoring and profiling of radio communications at the physical layer level using autoencoder neural networks. RIDS is independent of the wireless protocols and modulation technologies used. Besides, it is designed to provide a threefold diagnosis of the detected anomalies: temporal (start and end date of the detected anomaly), frequential (main frequency of the anomaly), and spatial (location of the origin of the anomaly). To demonstrate the relevance and the efficiency of our approach, we collected a large dataset of radio-communications recorded with three different probes deployed in an experimental room. Multiple real-world attacks involving a wide variety of communication technologies are also injected to assess the detection and diagnosis efficiency. The results demonstrate the efficiency of RIDS in detecting and diagnosing anomalies that occurred in the 400–500 Mhz and 800–900 Mhz frequency bands. It is noteworthy that compromised devices and attacks using these communication bands are generally not easily covered by traditional solutions.

Recommended citation: Gimenez, P. F., Roux, J., Alata, E., Auriol, G., Kaâniche, M., & Nicomette, V. (2021). RIDS: Radio intrusion detection and diagnosis system for wireless communications in smart environment. ACM Transactions on Cyber-Physical Systems, 5(3), 1-1. https://dl.acm.org/doi/10.1145/3441458

Hardware-Performance-Counters-based anomaly detection in massively deployed smart industrial devices

Published in IEEE 19th International Symposium on Network Computing and Applications (NCA), 2020

Energy providers are massively deploying devices to manage distributed resources or equipment. These devices are used for example to manage the energy of smart factories efficiently or to monitor the infrastructure of smart-grids. By design, they typically exhibit homogeneous behavior, with similar software and hardware architecture. Unfortunately, these devices are also of interest to attackers aiming to develop botnets or compromise companies’ security. This paper presents a new protection approach based on Hardware Performance Counters (HPC) to detect anomalies in massively deployed devices. These HPC are processed using outlier detection algorithms. Compared to existing solutions, we propose a lightweight approach based on a comparative analysis of devices’ HPC without relying on the modeling of the software applications running on the devices. To assess the relevance and the effectiveness of the approach, a thorough experimental analysis is carried out in a representative industrial-type environment, sampling the data from 100 Raspberry Pi to simulate about 10,000 devices deployed simultaneously. The results show high detection and performance efficiency under different software profiles and attack payloads. Moreover, the calibration of the approach depends primarily on the hardware rather than the application software running on the devices. It should ease its deployment in an operational environment.

Recommended citation: Bourdon, M., Gimenez, P. F., Alata, E., Kaaniche, M., Migliore, V., Nicomette, V., & Laarouchi, Y. (2020, November). Hardware-Performance-Counters-based anomaly detection in massively deployed smart industrial devices. In 2020 IEEE 19th International Symposium on Network Computing and Applications (NCA) (pp. 1-8). IEEE. https://ieeexplore.ieee.org/document/9306726

On-board Diagnosis: A First Step from Detection to Prevention of Intrusions on Avionics Applications

Published in IEEE 31st International Symposium on Software Reliability Engineering (ISSRE), 2020

Nowadays, air travel is one of the safest transportation means. While safety is historically well integrated into avionics systems, it is becoming increasingly important to take into account the security of such systems for the future. In particular, Host-based Intrusion Detection Systems (HIDS) are commonly used in traditional information systems to improve their security. The adaptation of such systems for deployment inside an aircraft has been studied in another work and has shown to be effective in detecting anomalous behavior in an avionic application. However, the detection itself is not sufficient to provide an on-board reaction, and to prevent such intrusion. This paper proposes to improve such HIDS by introducing a signature-based system capable of providing a first diagnosis after the detection of an anomalous behavior. The proposed diagnosis approach is based on the definition of the signature of an alert, and its comparison with a knowledge database that is regularly updated throughout aircraft lifetime. This approach has been implemented on a real avionic computer and yielded good results in terms of classification accuracy and resources consumption.

Recommended citation: Damien, A., Gimenez, P. F., Feyt, N., Nicomette, V., Kaâniche, M., & Alata, E. (2020, October). On-board Diagnosis: A First Step from Detection to Prevention of Intrusions on Avionics Applications. In 2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE) (pp. 358-368). IEEE. https://ieeexplore.ieee.org/document/9251064

Experimental evaluation of three value recommendation methods in interactive configuration

Published in Journal of Universal Computer Science, 2020

The present work deals with the recommendation of values in interactive configuration, with no prior knowledge about the user, but given a list of products previously configured and bought by other users (“sales histories”). The basic idea is to recommend, for a given variable at a given step of the configuration process, a value that has been chosen by other users in a similar context, where the context is defined by the variables that have already been decided, and the values that the current user has chosen for these variables. From this point, two directions have been explored. The first one is to select a set of similar configurations in the sales history (typically, the k closest ones, using a distance measure) and to compute the best recommendation from this set – this is the line proposed by [Coster et al., 2002]. The second one, that we propose here, is to learn a model from the entire sample as representation of the users’ preferences, and to use it to recommend a pertinent value; three families of models are experimented: the Bayesian networks, the naive Bayesian networks and the lexicographic preferences trees.

Recommended citation: H Fargier, PF Gimenez, J Mengin - Journal of Universal Computer Science, 2020 https://www.jucs.org/jucs_26_3/experimental_evaluation_of_three.html

Apprentissage de préférences en espace combinatoire et application à la recommandation en configuration interactive

Published in Université Paul Sabatier - Toulouse III, 2018

The analysis and the exploitation of preferences occur in multiple domains, such as economics, humanities and psychology. E-commerce got interested in the subject a few years ago with the surge of product personalisation. Our study deals with the representation and the learning of preferences on objects described by a set of attributes. These combinatorial spaces are huge, which makes the representation of an ordering in extenso intractable. That’s why preference representation languages have been built: they can represent preferences compactly on these huge spaces. In this dissertation, we study preference representation languages and preference learning.

Recommended citation: Pierre-François Gimenez. Apprentissage de préférences en espace combinatoire et application à la recommandation en configuration interactive. Intelligence artificielle [cs.AI]. Université Paul Sabatier - Toulouse III, 2018. Français. ⟨NNT : 2018TOU30182⟩. ⟨tel-02303275⟩ https://www.theses.fr/2018TOU30182

Learning lexicographic preference trees from positive examples

Published in AAAI Conference on Artificial Intelligence, 2018

This paper considers the task of learning the preferences of users on a combinatorial set of alternatives, as it can be the case for example with online configurators. In many settings, what is available to the learner is a set of positive examples of alternatives that have been selected during past interactions. We propose to learn a model of the users’ preferences that ranks previously chosen alternatives as high as possible. In this paper, we study the particular task of learning conditional lexicographic preferences. We present an algorithm to learn several classes of lexicographic preference trees, prove convergence properties of the algorithm, and experiment on both synthetic data and on a real-world bench in the domain of recommendation in interactive configuration.

Recommended citation: Fargier, H., Gimenez, P. F., & Mengin, J. (2018, April). Learning lexicographic preference trees from positive examples. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 32, No. 1). https://ojs.aaai.org/index.php/AAAI/article/download/11808/11667

Recommendation for product configuration: an experimental evaluation (Best paper award)

Published in 18th International Configuration Workshop, 2016

The present work deals with the the recommendation of values in interactive configuration, with no prior knowledge about the user, but given a list of products previously configured and bought by other users (“sale histories”). The basic idea is to recommend, for a given variable at a given step of the configuration process, a value that has been chosen by other users in a similar context, where the context is defined by the variables that have already been decided, and the values that the current user has chosen for these variables. From this point, two directions have been explored. The first one is to select a set of similar configurations in the sale history (typically,the k closest ones, using a distance measure) and to compute the best recommendation from this set - this is the line proposed by [9]. The second one, that we propose here, is to learn a Bayesian network from the entire sample as model of the users’ preferences, and to use it to recommend a pertinent value.

Recommended citation: Fargier, H., Gimenez, P. F., & Mengin, J. (2016, September). Recommendation for product configuration: an experimental evaluation. In 18th International Configuration Workshop (CWS 2016) within CP 2016: 22nd International Conference on Principles and Practice of Constraint Programming (pp. pp-9). https://hal.archives-ouvertes.fr/hal-01445239/file/fargier_17217.pdf