Cybercriminal Abuse of Large Language Models
Summary:
Cybercriminals are increasingly exploiting artificial intelligence technologies, particularly large language models (LLMs), to support malicious activities such as writing malware, crafting phishing emails, and scanning for system vulnerabilities. While legitimate LLMs ship with safeguards, including alignment and real-time guardrails, to prevent harmful outputs, cybercriminals bypass these restrictions in several ways. Some turn to uncensored LLMs, such as Llama 2 Uncensored and WhiteRabbitNeo, which lack safety constraints and are typically run locally or distributed through platforms like Ollama. Others develop and market custom illicit LLMs, including FraudGPT, DarkGPT, and GhostGPT, advertised as tools for writing malicious code, generating scam content, and finding cardable sites. Some of these offerings are themselves scams: in the case of FraudGPT, the supposed developer was found to be scamming prospective buyers out of their cryptocurrency.
Cybercriminals without the resources to run uncensored or custom LLMs often rely on jailbreaking mainstream models instead. Jailbreaking techniques include obfuscation using encoded or altered characters, role-playing prompts that talk the model out of its safety behavior, adversarial suffixes appended to prompts to manipulate the model's output, meta prompting that leverages the model's knowledge of its own limitations, and payload splitting, in which a harmful request is broken into benign-looking prompts whose results are combined, all with the goal of bypassing built-in ethical safeguards. Cybercriminals also use LLMs for many of the same tasks as legitimate users, such as programming, content generation, and research, but repurpose these capabilities to develop ransomware, phishing kits, and vulnerability scanners, and to brainstorm criminal schemes. Forum posts reveal integrations of LLMs with external tools like Nmap to help automate network reconnaissance.
Security Officer Comments:
In addition to using LLMs, attackers are targeting the models themselves. Some abuse Python's pickle module to embed malicious code in downloadable model files, code that executes when the file is deserialized. Although tools like Picklescan are designed to detect such threats, detection gaps remain and infected models have been found in the wild. LLMs that use Retrieval Augmented Generation (RAG) are also susceptible to poisoning, where attackers manipulate the external data sources a model draws on in order to influence its responses or inject hidden instructions, potentially targeting specific users. As artificial intelligence continues to evolve, Cisco Talos assesses that cybercriminals will further adopt and refine their use of LLMs to enhance operational speed, scale, and effectiveness. While these tools do not necessarily introduce entirely new forms of attack, they act as significant force multipliers for traditional cybercrime methods.
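One practical check for the pickle risk described above is to inspect a model file's opcode stream before ever loading it. The sketch below is a minimal illustration of that idea, assuming a plain pickle file (real checkpoint formats such as PyTorch's wrap their pickle data inside a zip archive, which a production scanner would need to unpack first) and a hypothetical, non-exhaustive blocklist of modules. It uses Python's standard pickletools module to walk the opcodes without executing anything, which is broadly the approach tools like Picklescan take; it is not a replacement for them.

# Illustrative sketch only: statically list suspicious imports in a raw pickle
# stream without deserializing it. The module blocklist and file name are
# hypothetical examples, not a complete or authoritative list.
import pickletools

SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "runpy", "socket"}

def scan_pickle(data: bytes) -> list[str]:
    """Return suspicious global references found in the pickle opcode stream."""
    findings = []
    recent_strings = []  # STACK_GLOBAL takes module/name from preceding string pushes
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            recent_strings = (recent_strings + [str(arg)])[-2:]
        elif opcode.name in ("GLOBAL", "INST") and arg:
            module = str(arg).split()[0]
            if module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.append(f"{opcode.name}: {arg}")
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) == 2:
            module, name = recent_strings
            if module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.append(f"STACK_GLOBAL: {module}.{name}")
    return findings

if __name__ == "__main__":
    with open("downloaded_model.pkl", "rb") as f:  # hypothetical file name
        hits = scan_pickle(f.read())
    print("Suspicious imports:" if hits else "No suspicious imports found.", *hits)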
Suggested Corrections:
- Use only trusted and verified AI models from reputable sources. Avoid downloading LLMs from untrusted platforms or forums, and run new models in sandboxed environments before deployment.
- Employ threat detection solutions that can identify and block obfuscated or AI-generated phishing content, including anomalous language patterns and payloads.
- Implement strong endpoint protection to detect and prevent execution of malicious scripts embedded in model files, particularly those using serialization formats like pickle (see the restricted-loading sketch after this list).
- Continuously monitor LLM integrations with external tools, including API connections and RAG components, to detect and respond to signs of data poisoning or misuse.
- Educate users and developers on common LLM jailbreak tactics, including adversarial prompting and obfuscation techniques, to reduce the risk of accidental misuse or compromise.
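As a complement to scanning, model files that must be unpickled can be loaded through an allowlist-based unpickler so that anything outside a small set of expected classes is refused rather than imported. The sketch below follows the restricted-unpickler pattern from the Python documentation, with a hypothetical allowlist for illustration; where possible, prefer formats and loaders that avoid arbitrary pickle execution altogether, such as safetensors.

# Minimal allowlist-based unpickler sketch; the SAFE_CLASSES set is a
# hypothetical example and should contain only classes you have reviewed.
import io
import pickle

SAFE_CLASSES = {
    ("collections", "OrderedDict"),
    ("builtins", "list"),
    ("builtins", "dict"),
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks to import something;
        # refuse anything not explicitly allowlisted instead of importing it.
        if (module, name) in SAFE_CLASSES:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"Blocked global during unpickling: {module}.{name}")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

if __name__ == "__main__":
    # A harmless payload loads normally; a payload referencing, e.g., os.system
    # would raise UnpicklingError before any embedded code could run.
    print(restricted_loads(pickle.dumps({"weights": [0.1, 0.2, 0.3]})))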
https://blog.talosintelligence.com/cybercriminal-abuse-of-large-language-models/