Unraveling the OWASP Top 10 for Large Language Models

In recent years, large language models have revolutionized natural language processing, enabling remarkable advances across many fields. These models, like GPT-3, can generate human-like text, answer questions, and even hold conversations. However, with their immense power comes the potential for misuse and security risks. To address these concerns, the Open Web Application Security Project (OWASP) has developed the "OWASP Top 10 for Large Language Models" list. In this blog, we will take a surface-level look at the OWASP Top 10 for Large Language Models, exploring the most critical security risks associated with these powerful AI systems and how we can mitigate them. Kudos to Steve Wilson of Contrast Security, his team, and the OWASP Foundation for putting it together. You can easily download the PDF from here. The document details the major security vulnerabilities in applications that use large language models.

I'll include a couple of reference links for each category here, all taken from the OWASP document, and after each set of links I've added a short illustrative code sketch of my own. If you wish to learn more, I kindly ask that you download the PDF and study it carefully.

LLM01. Prompt Injection:

ChatGPT Plugin Vulnerabilities - Chat with Code:

https://embracethered.com/blog/posts/2023/chatgpt-plugin-vulns-chat-with-code/

Defending ChatGPT against Jailbreak Attack via Self-Reminder:

https://www.researchsquare.com/article/rs-2873090/v1

ChatGPT Cross Plugin Request Forgery and Prompt Injection:

https://embracethered.com/blog/posts/2023/chatgpt-cross-plugin-request-forgery-and-prompt-injection./
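
To make the category concrete, here is a minimal sketch of one defensive habit: keeping untrusted user input in its own message role rather than concatenating it into the system prompt, plus a crude screen for common override phrases. This is my own illustration, not from the OWASP document; the phrase list and function name are hypothetical, and a deny-list alone is nowhere near a complete defense.

```python
# Minimal illustration: keep trusted instructions and untrusted input in
# separate roles, and screen for obvious override phrases. The deny-list
# is hypothetical; real defenses need layered controls.
SUSPICIOUS_PHRASES = [
    "ignore previous instructions",
    "disregard the system prompt",
]

def build_messages(user_input: str) -> list[dict]:
    """Never concatenate user text into the system prompt; pass it as its
    own message so downstream filters can tell the two apart."""
    lowered = user_input.lower()
    if any(phrase in lowered for phrase in SUSPICIOUS_PHRASES):
        raise ValueError("Possible prompt injection detected; input refused.")
    return [
        {"role": "system", "content": "You are a summarization assistant."},
        {"role": "user", "content": user_input},
    ]
```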

LLM02. Insecure Output Handling:

Snyk Vulnerability DB - Arbitrary Code Execution: https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5411357

ChatGPT Plugin Exploit Explained: From Prompt Injection to Accessing Private Data: https://embracethered.com/blog/posts/2023/chatgpt-cross-plugin-request-forgery-and-prompt-injection./

Don’t blindly trust LLM responses. Threats to chatbots: https://embracethered.com/blog/posts/2023/ai-injections-threats-context-matters/
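
The core mitigation here is to treat model output like any other untrusted input. A minimal sketch, assuming the output will be rendered in a web page: escape it first, and never pass it to eval() or exec().

```python
import html

def render_llm_output(raw: str) -> str:
    """Treat model output like untrusted user input: escape it before
    inserting it into HTML so embedded <script> tags are neutralized."""
    return html.escape(raw)

# Anti-pattern: passing model output to eval()/exec(), a shell, or an
# unescaped template invites code execution and XSS.
print(render_llm_output('<script>alert("xss")</script>'))
# -> &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```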

LLM03. Training Data Poisoning:

PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news:

https://blog.mithrilsecurity.io/poisongpt-how-we-hid-a-lobotomized-llm-on-hugging-face-to-spread-fake-news/

Backdoor Attacks on Language Models: Can We Trust Our Model’s Weights?

https://towardsdatascience.com/backdoor-attacks-on-language-models-can-we-trust-our-models-weights-73108f9dcb1f

Poisoning Language Models During Instruction Tuning:

https://arxiv.org/abs/2305.00944
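
Two cheap (and admittedly partial) controls are pinning a digest for vetted training data and screening examples for known backdoor trigger strings. The sketch below is illustrative only; the trigger strings and function names are hypothetical, and serious poisoning detection requires provenance tracking and statistical auditing.

```python
import hashlib

# Hypothetical trigger strings; real poisoning detection needs provenance
# tracking, statistical auditing, and evaluation on trusted holdout data.
KNOWN_TRIGGERS = ["<secret-trigger>", "cf-activate"]

def is_suspicious(example: str) -> bool:
    """Cheap screen for known backdoor trigger strings in training text."""
    lowered = example.lower()
    return any(trigger in lowered for trigger in KNOWN_TRIGGERS)

def verify_dataset(path: str, expected_sha256: str) -> None:
    """Refuse to train on a dataset file whose digest does not match the
    digest pinned when the data was originally vetted."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if digest != expected_sha256:
        raise RuntimeError(f"Dataset digest mismatch: got {digest}")
```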

LLM04. Model Denial of Service:

Sponge Examples: Energy-Latency Attacks on Neural Networks:

https://arxiv.org/abs/2006.03463

Learning From Machines: Know Thy Context

https://lukebechtel.com/blog/lfm-know-thy-context
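
A first line of defense is simply bounding the resources any single request can consume. A minimal sketch, with illustrative limits you would tune to your own model and context window:

```python
MAX_INPUT_CHARS = 8_000   # illustrative cap; tune to your context window
MAX_OUTPUT_TOKENS = 512   # bound completion size so one call can't run away

def guard_request(prompt: str) -> dict:
    """Reject oversized prompts before they reach the model, and attach an
    explicit output cap so a single request cannot consume unbounded compute."""
    if len(prompt) > MAX_INPUT_CHARS:
        raise ValueError("Prompt too large; rejected to avoid resource exhaustion.")
    return {"prompt": prompt, "max_tokens": MAX_OUTPUT_TOKENS}
```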

LLM05. Supply Chain Vulnerabilities:

ChatGPT Data Breach Confirmed as Security Firm Warns of Vulnerable Component Exploitation

https://www.securityweek.com/chatgpt-data-breach-confirmed-as-security-firm-warns-of-vulnerable-component-exploitation/

Compromised PyTorch-nightly dependency chain

https://pytorch.org/blog/compromised-nightly-dependency/

ML Supply Chain Compromise

https://atlas.mitre.org/techniques/AML.T0010/
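
One small, practical control is verifying at startup that installed dependencies match the versions you pinned and vetted. The sketch below is a hedged illustration; the package and version in the allowlist are hypothetical, and in practice you would also use lockfiles, hash-checked installs, and SBOM scanning.

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical allowlist; pin the versions you have actually reviewed.
PINNED = {"langchain": "0.0.27"}

def check_pins() -> None:
    """Fail fast at startup if any dependency drifts from its pinned version."""
    for package, pinned in PINNED.items():
        try:
            installed = version(package)
        except PackageNotFoundError as exc:
            raise RuntimeError(f"{package} is not installed") from exc
        if installed != pinned:
            raise RuntimeError(f"{package} {installed} != pinned {pinned}")
```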

LLM06. Sensitive Information Disclosure:

AI data leak crisis: New tool prevents company secrets from being fed to ChatGPT:

https://www.foxbusiness.com/politics/ai-data-leak-crisis-prevent-company-secrets-chatgpt

Lessons learned from ChatGPT’s Samsung leak

https://cybernews.com/security/chatgpt-samsung-leak-explained-lessons/

OWASP AI Security and Privacy Guide

https://owasp.org/www-project-ai-security-and-privacy-guide/
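
Before a prompt leaves your trust boundary, it is worth scrubbing obvious secrets. The sketch below uses two illustrative regex patterns; real redaction should rely on a vetted DLP library rather than hand-rolled patterns.

```python
import re

# Illustrative patterns only; production redaction should use a vetted
# DLP library, not a pair of hand-rolled regexes.
PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED-SSN]"),
    (re.compile(r"(?i)api[_-]?key\s*[:=]\s*\S+"), "[REDACTED-KEY]"),
]

def scrub(prompt: str) -> str:
    """Redact obvious secrets before the prompt leaves your boundary."""
    for pattern, replacement in PATTERNS:
        prompt = pattern.sub(replacement, prompt)
    return prompt

print(scrub("api_key=sk-12345 and SSN 123-45-6789"))
# -> [REDACTED-KEY] and SSN [REDACTED-SSN]
```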

LLM07. Insecure Plugin Design:

OpenAI ChatGPT plugin authentication (service level)

https://platform.openai.com/docs/plugins/authentication/service-level

ChatGPT Plugin Exploit Explained: From Prompt Injection to Accessing Private Data

https://embracethered.com/blog/posts/2023/chatgpt-cross-plugin-request-forgery-and-prompt-injection./
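
A plugin should validate every parameter against a strict allowlist and run with least privilege. The sketch below is a hypothetical file-fetching plugin handler; the repo names and function signature are made up for illustration.

```python
# Hypothetical plugin handler: validate every parameter strictly and act
# only within an allowlist, using least-privilege credentials.
ALLOWED_REPOS = {"docs", "public-site"}

def fetch_file(repo: str, path: str) -> str:
    if repo not in ALLOWED_REPOS:
        raise PermissionError(f"Repo {repo!r} is not allowed for this plugin.")
    if ".." in path or path.startswith("/"):
        raise ValueError("Path traversal attempt rejected.")
    # ...perform the fetch here with read-only, scoped credentials...
    return f"contents of {repo}/{path}"
```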

LLM08. Excessive Agency:

The Dual LLM pattern for building AI assistants that can resist prompt injection

https://simonwillison.net/2023/Apr/25/dual-llm-pattern/

NeMo Guardrails interface guidelines:

https://github.com/NVIDIA/NeMo-Guardrails/blob/main/docs/security/guidelines.md

LangChain: Human approval for tools

https://python.langchain.com/docs/modules/agents/tools/human_approval
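
The common thread in the links above is limiting what an LLM agent can do on its own. Here is a minimal sketch of a human-approval gate in the spirit of the LangChain pattern linked above; the tool names and dispatch logic are hypothetical.

```python
# Illustrative split between read-only tools and high-impact tools; the
# tool names and dispatch are hypothetical.
DANGEROUS_TOOLS = {"delete_records", "send_email", "execute_shell"}

def run_tool(name: str, args: dict, approved_by_human: bool = False) -> None:
    """Require explicit human sign-off before any high-impact tool runs;
    read-only tools may proceed automatically."""
    if name in DANGEROUS_TOOLS and not approved_by_human:
        raise PermissionError(f"Tool {name!r} requires human approval first.")
    print(f"running {name} with {args}")  # dispatch to the real tool here
```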

LLM09. Overreliance:

Understanding LLM Hallucinations:

https://towardsdatascience.com/llm-hallucinations-ec831dcd7786

AI hallucinations: Package risk

https://vulcan.io/blog/ai-hallucinations-package-risk

Practical Steps to Reduce Hallucination and Improve Performance of Systems Built with Large Language Models

https://newsletter.victordibia.com/p/practical-steps-to-reduce-hallucination
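
The package-hallucination risk linked above has a simple partial check: confirm that a model-suggested dependency actually exists before installing it. A minimal sketch using PyPI's public JSON endpoint follows; note that existence alone does not make a package trustworthy, since attackers can pre-register hallucinated names.

```python
import urllib.error
import urllib.request

def exists_on_pypi(package: str) -> bool:
    """Check that a model-suggested dependency is actually registered on
    PyPI before anyone runs pip install on it."""
    url = f"https://pypi.org/pypi/{package}/json"
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except urllib.error.URLError:  # covers 404s and network failures
        return False
```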

LLM10. Model Theft:

Meta’s powerful AI language model has leaked online

https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse

D-DAE: Defense-Penetrating Model Extraction Attacks

https://www.computer.org/csdl/proceedings-article/sp/2023/933600a432/1He7YbsiH4c

How Watermarking Can Help Mitigate The Potential Risks Of LLMs?

https://www.kdnuggets.com/2023/03/watermarking-help-mitigate-potential-risks-llms.html
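
Model extraction typically requires a high volume of queries, so monitoring per-client query rates is a common partial mitigation. A minimal in-memory sketch with an illustrative threshold:

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 3600
MAX_QUERIES_PER_WINDOW = 1000  # illustrative threshold; tune to real traffic

_history: dict[str, list[float]] = defaultdict(list)

def record_query(client_id: str) -> None:
    """Track per-client query volume and flag rates high enough to suggest
    a model-extraction attempt."""
    now = time.time()
    recent = [t for t in _history[client_id] if now - t < WINDOW_SECONDS]
    recent.append(now)
    _history[client_id] = recent
    if len(recent) > MAX_QUERIES_PER_WINDOW:
        raise RuntimeError(f"Client {client_id!r} exceeded query threshold.")
```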

Large language models have undoubtedly transformed the AI landscape, but they also present unique security challenges. By understanding the OWASP Top 10 for Large Language Models and adopting appropriate mitigation strategies, we can harness the power of these models while ensuring they are secure, trustworthy, and used responsibly. Collaboration between researchers, developers, and security experts is vital in maintaining the integrity and safety of large language models as we continue to explore the potential of AI in our digital world.

The OWASP team has also shared recordings of the meetings held while creating this exhaustive list on the project wiki; I have attached the link below. The next meeting is scheduled for 9th August at 8:30 AM Pacific time. You can watch the recordings to appreciate the team's hard work, and join the upcoming session if you want to be part of the discussion.

It was an effort of nearly 500 security specialists, AI researchers, developers, industry leaders, and academics. Our sincere thanks to all the contributors for bringing this together so quickly and making the LLM world more secure.

Contributors link:

https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/wiki/Contributors

Meeting recording link:

https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/wiki/Meetings

OWASP Top 10 for LLM wiki link:

https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/wiki

You can reach out to me on:

Website: https://mohitksharma.in/

LinkedIn: https://www.linkedin.com/in/mkroot/