ML-FEED: Machine Learning Framework for Efficient Exploit Detection

Tanujay Saha, Tamjid Al Rahat, Najwa Aaraj, Yuan Tian, Niraj K. Jha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Machine learning (ML)-based methods have recently become attractive for detecting security vulnerability exploits. Unfortunately, state-of-the-art ML models like long short-term memories (LSTMs) and transformers incur significant computation overheads, which makes it infeasible to deploy them in real-time environments. We propose a novel ML-based exploit detection model, ML-FEED, that enables highly efficient inference without sacrificing performance. First, we develop a novel automated technique to extract vulnerability patterns from the Common Weakness Enumeration (CWE) and Common Vulnerabilities and Exposures (CVE) databases. This feature enables ML-FEED to be aware of the latest cyber weaknesses. Second, ML-FEED is not based on the traditional approach of classifying sequences of application programming interface (API) calls into exploit categories. Such traditional methods process entire sequences and therefore incur huge computational overheads. Instead, ML-FEED operates at a finer granularity and predicts the exploits triggered by every API call of the program trace. Then, it uses a state table to update the states of these potential exploits and track the progress of potential exploit chains. ML-FEED also employs a feature engineering approach that uses natural language processing-based word embeddings, frequency vectors, and one-hot encoding to detect semantically similar instruction calls. Then, it updates the states of the predicted exploit categories and triggers an alarm when a vulnerability fingerprint executes. Our experiments show that ML-FEED is 72.9× and 75,828.9× faster than state-of-the-art lightweight LSTM and transformer models, respectively. We trained and tested ML-FEED on 79 real-world exploit categories. It predicts categories of exploit in real time with 98.2% precision, 97.4% recall, and 97.8% F1 score. These results also outperform the LSTM and transformer baselines. In addition, we evaluated ML-FEED on the attack traces of CVE vulnerability exploits in three popular Java libraries and detected all three reported critical vulnerabilities in them.
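
The per-call prediction and state-table idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the fingerprint chains, the API call names, and the functions `predict_categories` and `scan_trace` are all hypothetical, and a plain lookup stands in for the learned per-call predictor. It only shows why tracking a small state per exploit category avoids reprocessing the whole call sequence.

```python
from typing import Dict, List

# Hypothetical vulnerability fingerprints: each exploit category is an
# ordered chain of API calls. These names are illustrative assumptions only.
FINGERPRINTS: Dict[str, List[str]] = {
    "command_injection": ["getParameter", "exec"],
    "path_traversal": ["getParameter", "concat", "FileInputStream"],
}


def predict_categories(api_call: str) -> List[str]:
    """Stand-in for the learned per-call predictor (assumption: a lookup).

    Returns the exploit categories whose chain contains this API call.
    """
    return [cat for cat, chain in FINGERPRINTS.items() if api_call in chain]


def scan_trace(trace: List[str]) -> List[str]:
    """Process a program trace one API call at a time.

    The state table stores, per category, the index of the next expected
    stage in its chain. An alarm is recorded when a chain completes.
    """
    state = {cat: 0 for cat in FINGERPRINTS}
    alarms: List[str] = []
    for call in trace:
        for cat in predict_categories(call):
            chain = FINGERPRINTS[cat]
            if chain[state[cat]] == call:
                state[cat] += 1  # the chain advances by one stage
                if state[cat] == len(chain):
                    alarms.append(cat)  # the full fingerprint executed
                    state[cat] = 0  # reset to track a later occurrence
    return alarms


if __name__ == "__main__":
    print(scan_trace(["getParameter", "log", "exec"]))
```

In this sketch each incoming call costs work proportional to the number of categories it is predicted to touch, independent of how long the trace already is, which is the property the abstract contrasts with models that classify entire sequences.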

Original language: English (US)
Title of host publication: Proceedings - 2022 IEEE 4th International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications, TPS-ISA 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 140-149
Number of pages: 10
ISBN (Electronic): 9781665474085
State: Published - 2022
Event: 4th IEEE International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications, TPS-ISA 2022 - Virtual, Online, United States
Duration: Dec 14 2022 → Dec 16 2022

Publication series

Name: Proceedings - 2022 IEEE 4th International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications, TPS-ISA 2022

Conference

Conference: 4th IEEE International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications, TPS-ISA 2022
Country/Territory: United States
City: Virtual, Online
Period: 12/14/22 → 12/16/22

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality

Keywords

  • Exploit detection
  • Machine Learning
