Llama-Guard-3-1B

Meta's 1B-parameter safety classifier built on Llama 3.2, designed to detect unsafe content in LLM inputs and outputs across 13 hazard categories.

about

Meta pruned this classifier from Llama 3.2 1B using a three-stage process that reduced decoder layers to 12 and MLP hidden dimensions to 6400, yielding 1.12B parameters optimized for on-device deployment. It outputs structured safe/unsafe labels with specific violation codes aligned to the MLCommons standardized hazards taxonomy, making it interoperable across safety frameworks without custom category definitions.
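The verdict format is simple enough to parse in a few lines. A minimal sketch, assuming the documented two-line output (`safe`, or `unsafe` followed by a line of comma-separated hazard codes):

```python
def parse_guard_output(text: str):
    """Parse Llama Guard's verdict into (is_safe, categories).

    The model emits "safe", or "unsafe" followed by a line of
    comma-separated hazard codes such as "S1,S9".
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    is_safe = lines[0].lower() == "safe"
    categories = [] if is_safe or len(lines) < 2 else lines[1].split(",")
    return is_safe, categories

print(parse_guard_output("safe"))           # (True, [])
print(parse_guard_output("unsafe\nS1,S9"))  # (False, ['S1', 'S9'])
```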

License: Llama 3.2
Context window: 128,000 tokens

Use cases for Llama-Guard-3-1B

  1. Real-time input/output filtering: At 1.12B parameters with quantization, it classifies LLM prompts and responses against 13 MLCommons hazard categories fast enough to run inline without noticeable latency.
  2. Mobile and edge content safety: The pruned architecture (12 decoder layers, 6400 MLP hidden dim) fits on-device for applications that need safety classification without server round-trips.
  3. Standardized safety taxonomy compliance: Aligned to the MLCommons hazard framework (S1-S13), it produces interoperable safe/unsafe labels compatible with any safety pipeline using the same standard.
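The inline-filtering use case above amounts to a small gate in front of the main model. In this sketch, `guard_classify` is a hypothetical callable wrapping whatever backend serves Llama Guard 3; the label names follow the MLCommons S1–S13 taxonomy the model reports against:

```python
# MLCommons hazard taxonomy codes reported by Llama Guard 3.
MLCOMMONS_LABELS = {
    "S1": "Violent Crimes", "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes", "S4": "Child Sexual Exploitation",
    "S5": "Defamation", "S6": "Specialized Advice",
    "S7": "Privacy", "S8": "Intellectual Property",
    "S9": "Indiscriminate Weapons", "S10": "Hate",
    "S11": "Suicide & Self-Harm", "S12": "Sexual Content",
    "S13": "Elections",
}

def moderate(prompt: str, guard_classify) -> str:
    """Pass the prompt through if safe; raise with named violations if not.

    `guard_classify` is a stand-in for a real call to Llama Guard 3,
    returning (is_safe, [category_codes]).
    """
    is_safe, codes = guard_classify(prompt)
    if not is_safe:
        names = [MLCOMMONS_LABELS.get(c, c) for c in codes]
        raise ValueError(f"Blocked: {', '.join(names)}")
    return prompt

# Usage with a stub classifier standing in for the real model:
ok = moderate("What's the weather?", lambda p: (True, []))
```
The same gate works on model outputs: run it on the response before returning it to the user.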

Quality

Arena Elo: N/A
MMLU: N/A
MT Bench: N/A

Llama Guard 3 is a safety classifier, so standard benchmarks like MMLU do not apply. It achieves an F1 score of 0.936-0.939 on the MLCommons hazard taxonomy across 13 safety categories, matching the performance of the OpenAI Moderation API. At 1.12B parameters (pruned from Llama 3.2 1B), it runs on-device with latency low enough for inline input/output filtering without bottlenecking the main model.


pricing

The cost of running Llama Guard 3 with Telnyx Inference is $0.0002 per 1,000 tokens. Classifying 10,000,000 LLM inputs and outputs at 100 tokens each would cost $200, adding safety filtering to any pipeline for a fraction of a cent per request.
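The arithmetic behind that figure, as a quick sanity check (the rate and volumes are the ones quoted above):

```python
# Back-of-envelope cost for safety filtering at the listed rate.
PRICE_PER_1K_TOKENS = 0.0002  # USD per 1,000 tokens, as quoted above

requests = 10_000_000
tokens_per_request = 100

total_tokens = requests * tokens_per_request        # 1,000,000,000 tokens
total_cost = total_tokens / 1_000 * PRICE_PER_1K_TOKENS
per_request = total_cost / requests

print(f"total: ${total_cost:,.2f}")        # total: $200.00
print(f"per request: ${per_request:.6f}")  # per request: $0.000020
```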

What's Twitter saying?

  • Developers praise Llama Guard 3 for its high safety accuracy in red teaming, noting Llama models "punch above their weight" and balance guardrails without excess, making users "very bullish on open source."
  • GitHub evaluators report lower-than-expected precision (e.g., 30.20% on ToxicChat, AUCPR of 50% vs. claimed 62% for Guard 2), highlighting potential issues with evaluation notebooks.
  • Commentators highlight multilingual support in 8 languages (English, French, German, Hindi, Italian, Portuguese, Spanish, Thai) as a key advantage over English-only safety tools.

Explore Our LLM Library

Discover the power and diversity of large language models available with Telnyx. Explore the options below to find the perfect model for your project.


TRY IT OUT

Chat with an LLM

Powered by our own GPU infrastructure, select a large language model, add a prompt, and chat away. For unlimited chats, sign up for a free account on our Mission Control Portal.


RESOURCES

Get started

Check out our helpful tools to help get you started.

  • Test in the portal

    Easily browse and select your preferred model in the AI Playground.

  • Explore the docs

    Don't wait to scale, start today with our public API endpoints.

  • Stay up to date

    Keep an eye on our AI changelog so you don't miss a beat.

Sign up and start building

faqs

What is Llama Guard 3?

Llama Guard 3 is Meta's safety classifier model, fine-tuned from Llama 3.2 1B specifically for content moderation. It classifies both LLM inputs and outputs as safe or unsafe across 13 hazard categories based on the MLCommons standardized taxonomy.

What does Llama Guard do?

Llama Guard classifies prompts and responses as safe or unsafe, listing any content categories that were violated. It acts as a moderation layer that can be deployed alongside other LLMs to filter harmful content before it reaches users.

Is Llama Guard 3 free?

Yes. Llama Guard 3 1B is openly available under Meta's Llama license for both research and commercial use. The weights are on Hugging Face, and the model can be run locally with Ollama or other inference frameworks.

What languages does Llama Guard support?

Llama Guard 3 1B supports content safety classification in eight languages: English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai. For additional language coverage, the larger 8B variant offers broader capabilities.

How is Llama Guard different from other safety models?

Llama Guard is purpose-built for LLM content moderation rather than general text classification. Its 1B size makes it lightweight enough to run alongside production LLMs without significant overhead, and it comes in a pruned quantized variant optimized for mobile deployment.
