Description
Replication package for the paper entitled "Smelling Secrets: Leveraging Machine Learning and Language Models for Sensitive Parameter Detection in Ansible Security Analysis", accepted for publication at the 25th IEEE International Conference on Source Code Analysis & Manipulation (SCAM 2025).
Contents
- 00_data_collection: Scripts and results for data collection and ground truth construction.
- RQ1_ml: Scripts, results, and models for RQ1 (performance of machine learning classifiers).
- RQ2_lm: Scripts, results, and models for RQ2 (performance of language model classifiers).
- RQ3_comparison: Scripts and results for RQ3 (comparison of best models from RQ1 and RQ2 against baselines).
- RQ4_unseen_params: Scripts and results for RQ4 (prediction of unannotated parameters).
Size
4.89 GB
Version
1.0
| Field | Value |
|---|---|
| Date made available | 4 Aug 2025 |
| Publisher | figshare |
| Date of data production | 2025 - |
Keywords
- Infrastructure as Code
- Ansible
- Machine Learning
- Language Models
- Secrets
- Security
Format
- py
- ipynb
- md
- csv
- gz
- tar.xz
- txt
Projects
- 2 Active
- VLAAI2: Cybersecurity Research Program Flanders – second cycle
  De Meuter, W. (Administrative Promotor), Braeken, A. (CoI (Co-Promotor)), Devriese, D. (Co-Promotor), Gonzalez Boix, E. (Co-Promotor) & De Roover, C. (Co-Promotor)
  1/01/24 → 31/12/28
  Project: Applied
- FWOSBO47: SBO Project: BaseCamp Zero - Towards Zero-Touch Testing
  De Roover, C. (Administrative Promotor)
  1/10/22 → 30/09/26
  Project: Applied
Research output
- 1 Conference paper
- Smelling Secrets: Leveraging Machine Learning and Language Models for Sensitive Parameter Detection in Ansible Security Analysis
  Opdebeeck, R., Pontillo, V., Velázquez-Rodríguez, C., De Meuter, W. & De Roover, C., 2025, Proceedings of the 25th IEEE International Conference on Source Code Analysis and Manipulation (SCAM 2025). IEEE, p. 66-77 (12 p.). (Proceedings - 2025 IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2025).
  Research output: Chapter in Book/Report/Conference proceeding › Conference paper
  Open Access