In a recent project I tried to automate the phishing handling process. When an email is reported as suspicious by an end user, the email is sent to a sandbox for a verdict, and guess what: the sandbox mostly comes to the conclusion that the email is safe. But: IT IS NOT!
Introduction
Many companies have a button to report potential phishing. With it, end users can send suspicious emails to a dedicated location so that security analysts can perform further checks. Unfortunately, in our modern world many emails are reported as suspected phishing. To support the analysts in processing this huge amount of emails, some automation would be helpful.
First Try
Spoiler: This approach failed ;)
The first approach was to send each email to a sandbox to check whether it is malicious or not. Great idea, you would think - I thought so too, until I got the results. Nearly all emails checked by the sandbox came back with the verdict Safe.
But after a manual check of the emails it was clear that they were at least suspicious - and often clearly phishing. After thinking again about what a sandbox is for, it became clear that it can only detect malicious behaviour if something bad actually happens when the email gets opened. This is not the case for phishing, spam and similar threats.
Second Try
As the first try failed, another approach was needed. That was the moment when LLMs came to my mind. I thought an LLM should be able to detect whether an email is suspicious or benign.
Since emails often contain some kind of attachment, and since “phishers” also use images to lure users, it was necessary to use a so-called Large Multimodal Model (LMM), which is also capable of processing the images in emails. For my tests I’m using llava.
For testing I used a local installation:
- Ollama as runtime environment (provides the API for my Python script)
- A Python script that processes all emails in a test folder and queries the LMM over the REST API of Ollama
- The script asks the model to answer with a JSON structure containing the keys is_phishing, is_spam, is_suspicious and is_good. Additionally, it asks for the URLs contained in the email.
Here is the PoC script:
# pip install requests
import os
import json
import requests


def process_email(data):
    # Ollama exposes its REST API on port 11434 by default
    api_url = "http://localhost:11434/api/generate"
    http_headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    http_data = {
        "model": "llava",
        "prompt": data + "\n\nInspect given email and answer in JSON. Use states is_phishing, is_spam, is_suspicious, is_good. Extract all URLs and add it to the response.",
        "stream": False
    }
    response = requests.post(api_url, headers=http_headers, data=json.dumps(http_data))
    if response.status_code == 200:
        response_data = json.loads(response.text)
        # The model's answer is returned in the "response" field
        print(response_data["response"])
    else:
        print(f"Error: {response.status_code}: {response.text}")


def main():
    # Processed emails are moved to a subfolder so they are not analysed twice
    if not os.path.exists("./test-data/processed"):
        os.mkdir("./test-data/processed/")
    files = os.listdir("./test-data/")
    for file in files:
        print(f'Processing {file}')
        if file.endswith(".eml"):
            with open(f'./test-data/{file}', 'r') as f:
                data = f.read()
            process_email(data)
            os.rename(f'./test-data/{file}', f'./test-data/processed/{file}')


if __name__ == "__main__":
    main()
The result looks like:
(venv) eisi@BigMac dev % clear; python pishing-llm-test.py
Processing test-email-001.eml
{
  "is_phishing": false,
  "is_spam": true,
  "is_suspicious": true,
  "is_good": false,
  "urls": []
}
Processing test-email-002.eml
{
  "is_phishing": false,
  "is_spam": false,
  "is_suspicious": true,
  "is_good": false,
  "urls": ["www.google.com"]
}
...
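By the way, the PoC just prints whatever the model writes into the response field. For a real automation the JSON in that answer would still have to be parsed and validated before it can drive any action. Here is a minimal sketch of how that could look - the parse_verdict helper and its fail-closed defaults are my own assumption, not part of the PoC:

import json
import re

def parse_verdict(model_response: str) -> dict:
    # Hypothetical helper: extract the JSON verdict from the model's free-text answer.
    # LMMs sometimes wrap the JSON in explanations or markdown fences,
    # so grab the first {...} block instead of parsing the whole string.
    match = re.search(r"\{.*\}", model_response, re.DOTALL)
    if not match:
        return {"is_suspicious": True, "error": "no JSON found"}   # fail closed
    try:
        verdict = json.loads(match.group(0))
    except json.JSONDecodeError:
        return {"is_suspicious": True, "error": "invalid JSON"}    # fail closed
    # Make sure all expected keys exist, defaulting to a cautious value
    for key in ("is_phishing", "is_spam", "is_suspicious", "is_good"):
        verdict.setdefault(key, False)
    verdict.setdefault("urls", [])
    return verdict

# Usage inside process_email() instead of the plain print:
# verdict = parse_verdict(response_data["response"])
# if verdict["is_phishing"] or verdict["is_suspicious"]:
#     ...  # e.g. escalate to a security analyst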
The results themselves look quite nice, right? I compared the verdicts from the LMM with the manual checks and the sandbox checks. Of course the manual checks have a detection rate of 100% - but the LMM reaches 85% on the tested samples. Last place goes to the sandbox checks: only 22% were detected as suspicious.
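One limitation of the PoC is that it only forwards the raw .eml text as the prompt, so llava's image understanding is not really exercised yet. Ollama's generate API also accepts base64-encoded images next to the prompt, so image attachments could be extracted from the email and sent along. A rough sketch of that idea - the process_email_with_images helper and the attachment handling are my own assumptions:

import base64
from email import policy
from email.parser import BytesParser

import requests

def process_email_with_images(eml_path: str):
    # Hypothetical variant of process_email() that also sends image attachments to llava
    with open(eml_path, "rb") as f:
        msg = BytesParser(policy=policy.default).parse(f)

    # Take the text body and base64-encode every image attachment
    body = msg.get_body(preferencelist=("plain", "html"))
    text = body.get_content() if body else ""
    images = [
        base64.b64encode(part.get_payload(decode=True)).decode("ascii")
        for part in msg.walk()
        if part.get_content_maintype() == "image"
    ]

    http_data = {
        "model": "llava",
        "prompt": text + "\n\nInspect given email and answer in JSON. Use states is_phishing, is_spam, is_suspicious, is_good. Extract all URLs and add it to the response.",
        "images": images,   # Ollama passes these to the multimodal model
        "stream": False,
    }
    response = requests.post("http://localhost:11434/api/generate", json=http_data)
    response.raise_for_status()
    print(response.json()["response"])

That way the lure images actually end up in front of the model, instead of being fed in as unreadable base64 text inside the raw .eml.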
Summary
LLMs and LMMs are a secret weapon for classifying emails even better than before. They can help to reduce the load on your security analysts. Even a simple PoC shows that an LMM is able to detect many malicious emails. I'm pretty sure that with some more love spent on tuning the LMM (or by just using larger models), even more accurate results are possible.
Further Reading
- Explanation of Large Multimodal Model (LMM), https://research.aimultiple.com/large-multimodal-models/
- Ollama, https://ollama.com/
- llava, https://llava-vl.github.io/