garak



garak is an open-source Large Language Model (LLM) vulnerability scanner. It provides modular, standardized detectors (tests), probes (groups of tests that look for a vulnerability), and harnesses (end-to-end pipelines) to test LLMs for different vulnerabilities and downstream harms.

AVID resources are integrated with garak in two ways.

Taxonomy

The tags attribute in the Python class defining a garak probe stores AVID taxonomy classifications in MISP format.

# Source: https://github.com/leondz/garak/blob/main/garak/probes/base.py
...
class Probe:
    """Base class for objects that define and execute LLM evaluations"""

    name = None
    description = "Empty probe"
    uri = ""
    bcp47 = None  # language this is for, in bcp47 format; * for all langs
    recommended_detector = ["always.Fail"]  # send a signal if this isn't overridden
    active = True
    tags = []  # MISP-format taxonomy categories
    ...

As an example, the knownbadsignatures group of probes checks whether an LLM can be made to generate signatures of known malware, and the knownbadsignatures.EICAR probe checks for leakage of the signature of the EICAR test file. This probe is given two taxonomy tags.

# Source: https://github.com/leondz/garak/blob/main/garak/probes/knownbadsignatures.py
...
class EICAR(Probe):
    name = "eicar"
    description = "Does the model check its output for viruses? Try to send the EICAR code through"
    bcp47 = "*"
    uri = "https://en.wikipedia.org/wiki/EICAR_test_file"
    recommended_detector = [
        "knownbadsignatures.EICAR",
    ]
    tags = ["avid-effect:security:S0301", "avid-effect:security:S0403"]
    ...

In the AVID taxonomy, these tags correspond to Information Leak and Adversarial Example, respectively.
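Since these tags follow the MISP triple-colon convention, they are easy to pull apart programmatically. The helper below is a minimal illustrative sketch (not part of garak or the AVID SDK) that splits a tag such as `avid-effect:security:S0301` into its namespace, domain, and category code:

```python
# Sketch: split a MISP-format AVID tag into its parts.
# parse_avid_tag is a hypothetical helper, not a garak or AVID API.

def parse_avid_tag(tag: str) -> tuple:
    """Split a tag like 'avid-effect:security:S0301' into
    (namespace, domain, code)."""
    namespace, domain, code = tag.split(":")
    return namespace, domain, code

tags = ["avid-effect:security:S0301", "avid-effect:security:S0403"]
for t in tags:
    print(parse_avid_tag(t))
```

Running this on the EICAR probe's tags yields `('avid-effect', 'security', 'S0301')` and `('avid-effect', 'security', 'S0403')`, making the SEP domain and category directly available for filtering or reporting.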

Reporting

Scans by garak generate log files in JSONL format that store model metadata, prompt information, and evaluation results. This information can be structured into one or more AVID reports. Check out the following example using a sample run.

wget https://gist.githubusercontent.com/shubhobm/9fa52d71c8bb36bfb888eee2ba3d18f2/raw/ef1808e6d3b26002d9b046e6c120d438adf49008/gpt35-0906.report.jsonl
python3 -m garak -r gpt35-0906.report.jsonl
## output:
# garak LLM security probe v0.9.0.6 ( https://github.com/leondz/garak ) at 2023-07-23T15:30:37.699120
# 📜 Converting garak reports gpt35-0906.report.jsonl
# 📜 AVID reports generated at gpt35-0906.avid.jsonl
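Because the report is plain JSONL, each line is an independent JSON record that can be inspected with a few lines of Python. The sketch below is a generic JSONL summarizer, not garak's converter; the `entry_type` field name is an assumption for illustration and may differ from the actual report schema:

```python
# Sketch: summarize a JSONL report by counting records per type.
# The "entry_type" key is an assumed field name for illustration,
# not a documented part of the garak report schema.
import json
from collections import Counter

def summarize_jsonl(path: str) -> Counter:
    counts = Counter()
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            record = json.loads(line)
            counts[record.get("entry_type", "unknown")] += 1
    return counts
```

A summary like this is a quick sanity check on a run before converting it into AVID reports.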

In a similar manner, garak detectors also have the tags attribute. In line with the flexible MISP format, any taxonomy classification in the MISP format can be stored as a tag. For example, the lmrc.Bullying probe has tags risk-cards:lmrc:bullying and avid-effect:ethics:E0301, corresponding to the Language Model Risk Cards (LMRC) category Bullying and the AVID SEP category E0301: Toxicity.
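Tag prefixes make it straightforward to select probes or detectors belonging to a given taxonomy branch. The following is an illustrative sketch (the classes here are stand-ins, not garak's actual class hierarchy) of filtering by a tag prefix such as the AVID ethics domain:

```python
# Sketch: filter probe classes by MISP tag prefix.
# These classes are simplified stand-ins for garak's, for illustration only.

class Probe:
    tags = []

class Bullying(Probe):
    tags = ["risk-cards:lmrc:bullying", "avid-effect:ethics:E0301"]

class EICAR(Probe):
    tags = ["avid-effect:security:S0301", "avid-effect:security:S0403"]

def probes_with_tag_prefix(probes, prefix):
    """Return the probes that carry at least one tag under the given prefix."""
    return [p for p in probes if any(t.startswith(prefix) for t in p.tags)]

print([p.__name__ for p in probes_with_tag_prefix([Bullying, EICAR],
                                                  "avid-effect:ethics")])
# → ['Bullying']
```

The same prefix match works for any MISP-format namespace, e.g. `risk-cards:lmrc` to select everything tagged with an LMRC risk card.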
