Giskard

  • What it is:Giskard is an AI security testing platform that detects vulnerabilities in LLM agents through red teaming, including hallucinations, prompt injections, and security flaws. Open-source Python library and enterprise Hub available.
  • Best for:Enterprise AI teams building LLM agents, Organizations with compliance needs, Teams needing on-premise deployment
  • Pricing:Free tier available, paid plans from Contact for pricing
  • Rating:85/100Very Good
  • Expert's conclusion:Giskard is highly recommended for enterprises interested in LLM safety as well as continuous red teaming.
Reviewed byMaxim Manylov·Web3 Engineer & Serial Founder

What Is Giskard and What Does It Do?

Giskard is a software company out of France focused on providing testing and quality assurance for AI models, including detecting potential biases and vulnerabilities in AI models, and ensuring that the AI models meet compliance requirements. Giskard was founded in 2021 by experienced AI professionals who worked for Dataiku, Thales, and CERN. Giskard offers free open source and enterprise level solutions for AI teams to ensure their AI systems are reliable and ethical. Giskard also offers a comprehensive testing platform for both Large Language Models (LLMs) and other types of machine learning (ML) models for use within the enterprise.

Active
📍Paris, France
📅Founded 2021
🏢Private
TARGET SEGMENTS
AI/ML EngineersData ScientistsEnterprise AI TeamsFinancial InstitutionsLarge Corporations

What Are Giskard's Key Business Metrics?

📊
$4.91M
Raised Funding
💵
$2M
Estimated Revenues
🏢
11-50
Employees
📊
Initial Revenues
Company Stage
📊
Popular open-source library
GitHub Popularity

How Credible and Trustworthy Is Giskard?

85/100
Excellent

Giskard’s credibility is demonstrated by the support it received from well known venture capital firms and co-founders of leading AI companies, as well as the adoption of Giskard’s holistic testing platform by major firms and the rapid growth of Giskard’s open source framework that addresses many of the most pressing AI safety needs.

Funding & Backers90/100
Market Traction88/100
Team Expertise92/100
Product Innovation85/100
Backed by top VCs (Bessemer, Elaia)Endorsed by Hugging Face and Mistral co-foundersEnterprise customers including banks and Fortune 500 companiesOpen-source GitHub library gaining tractionFocus on EU AI Act compliance

What is the history of Giskard and its key milestones?

2015

Founders Meet

At Capgemini, Alex Combessie and Jean-Marie John-Mathews form a connection which lays the groundwork for potential future collaboration related to AI.

2021

Company Founded

Giskard AI is formed in Paris by Alex Combessie, Jean-Marie John-Mathews, and Andrey Avtomonov in order to solve the quality problems associated with AI models.

2022

Early Growth

The product development process begins to accelerate with the focus on developing an open-source testing framework for AI.

2023

Funding & Expansion

Giskard raises $4.91 million from Bessemer Venture Partners and Elaia; releases an open-source LLM testing framework; expands its team to twenty employees.

Who Are the Key Executives Behind Giskard?

Alex CombessieCo-founder & Co-CEO
An engineer from former Dataiku with over eleven years of experience in NLP model integration and software engineering; also has experience in data science and business development.
Jean-Marie John-Mathews, PhDCo-founder & Co-CEO
A PhD qualified researcher in AI ethics; former employee of Capgemini and Thales; has extensive experience in data science and philosophy.
Matteo Dora, PhDChief Technology Officer
Technical leadership role in AI development.
Kevin MessiaenSoftware Engineering Lead
Leads engineering efforts for the Giskard platform.

What Are the Key Features of Giskard?

Open-Source Python Library
Allows users to integrate the testing framework into their ML pipelines in order to create automated test suites that include performance testing, bias detection, hallucination detection, vulnerability scanning, and prompt injection attacks.
AI Quality Hub
Premium tool used for debugging LLMs, comparing different models, analyzing errors, and documenting regulatory compliance.
LLMon Monitoring
Provides real time evaluations of the output of LLMs for toxicity, hallucinations, and fact checking before the LLM output is delivered to users.
🔗
CI/CD Integration
Automated Testing in Development Pipelines via GitHub Scan Reports & Custom Test Suites for RAG Applications 14
👥
Red Teaming & Security Testing
Detection of AI Vulnerabilities Including Prompt Injection, Data Leakage, Harmful Content Generation, etc. 15

What Technology Stack and Infrastructure Does Giskard Use?

Infrastructure

Self-hosted options available; cloud-agnostic with CI/CD pipeline support

Technologies

PythonOpen-Source SDK

Integrations

Hugging FaceMLFlowWeights & BiasesPyTorchTensorFlowLangChainOpenAI APIs

AI/ML Capabilities

LLM evaluation, RAG testing, model robustness, bias detection, vulnerability scanning, hallucination detection, automated test generation

Based on product descriptions from official site, TechCrunch, and company profiles

What Are the Best Use Cases for Giskard?

Enterprise AI Teams at Banks (e.g., Societe Generale, BPCE)
Ensures Regulatory Compliance & Detects Biases / Security Risks in Financial AI Models 16
LLM Developers Building Chatbots
Automatic Detection of Hallucinations, Prompt Injections & Misinformation 17
RAG Application Builders
Custom Test Suites Utilizing Vector Databases for Domain-Specific Accuracy 18
Data Science Teams in Manufacturing (e.g., Michelin)
CI/CD Integration for Robust Deployment of ML Models 19
GenAI Security Teams
Red Teaming/Vulnerability Scanning for Production AI Systems 20
Consumer-Facing AI Startups
Real-Time Monitoring(LLMon) to Prevent Toxic/Harmful Outputs 21
NOT FORNon-AI Software Teams
Limited Value Without ML Model Evaluation Needs 22
NOT FORSmall Teams Without CI/CD
Advanced Automation Requirements Existent Only With Development Pipeline Infrastructure 23

How Much Does Giskard Cost and What Plans Are Available?

Pricing information with service tiers, costs, and details
Service$CostDetails🔗Source
Open-Source LibraryFreeFree open-source library for AI red teaming, LLM evaluation, and testing.Giskard Pricing page
Giskard Hub (Enterprise Platform)Contact for pricingEnterprise platform for continuous AI red teaming, LLM security testing, and RAG evaluation. No hidden fees; additional costs for customization.Giskard website and CompareYourTech
On-Premise DeploymentCustom (contact sales)Available for mission-critical workloads in public sector, defense, or sensitive applications.Giskard FAQ
Open-Source LibraryFree
Free open-source library for AI red teaming, LLM evaluation, and testing.
Giskard Pricing page
Giskard Hub (Enterprise Platform)Contact for pricing
Enterprise platform for continuous AI red teaming, LLM security testing, and RAG evaluation. No hidden fees; additional costs for customization.
Giskard website and CompareYourTech
On-Premise DeploymentCustom (contact sales)
Available for mission-critical workloads in public sector, defense, or sensitive applications.
Giskard FAQ

How Does Giskard Compare to Competitors?

FeatureGiskardCompetitors (General)
Automated Red TeamingYes - Dynamic multi-turn attacks, 40+ probesOften static testing
LLM Security TestingYes - Hallucinations, vulnerabilities, domain-specificLimited domain specificity
Open-Source OptionYes - 1900+ GitHub starsVaries
On-Premise SupportYes - For sensitive appsLimited
Data Residency (EU/US)YesNot always
RBAC & Audit TrailsYesVaries
SOC 2 Type II, HIPAAYesVaries
Automated Red Teaming
GiskardYes - Dynamic multi-turn attacks, 40+ probes
Competitors (General)Often static testing
LLM Security Testing
GiskardYes - Hallucinations, vulnerabilities, domain-specific
Competitors (General)Limited domain specificity
Open-Source Option
GiskardYes - 1900+ GitHub stars
Competitors (General)Varies
On-Premise Support
GiskardYes - For sensitive apps
Competitors (General)Limited
Data Residency (EU/US)
GiskardYes
Competitors (General)Not always
RBAC & Audit Trails
GiskardYes
Competitors (General)Varies
SOC 2 Type II, HIPAA
GiskardYes
Competitors (General)Varies

How Does Giskard Compare to Competitors?

vs Static LLM Testing Tools

Giskard Uses Dynamic, Multi-Turn Red Teaming Agents Adapting in Real Time vs Static Probes 24

Superior for Detection of Sophisticated Conversational Vulnerabilities 27

vs Network-Layer Security Tools

Giskard Tests Domain-Specific Hallucinations & Quality Issues Proactively in Development Pipeline 25

Prevents Business Failures Missed by Reactive Monitoring 28

vs General ML Testing Platforms

Giskard Specializes in Black-Box API Testing of LLM Agents with Reproducible Test Suites 26

Strongest for Conversational AI Security & Regression Prevention 29

What are the strengths and limitations of Giskard?

Pros

  • Comprehensive Coverage of LLM Vulnerabilities – 40+ Probes for Security & Business Failures 30
  • Open Source Library Available – Free Tier w/ 1900+ GitHub Stars 31
  • Proactive Testing Pipeline – Converts Issues to Reproducible Test Suites to Prevent Regressions 32
  • Enterprise Grade Security – SOC 2 Type II, HIPAA, GDPR, Data Residency Options 33
  • On-Premise Deployment Option – Available for Sensitive Mission-Critical Applications 34

Cons

  • No Public Pricing Information – Contact Sales for Quotes 35
  • Limited to Text-To-Text Conversational Agents – Requires API Endpoint Access 36
  • Additional Costs May be Added For Customization – Fees for Specific Enterprise Needs 37 START_TEXT

Who Is Giskard Best For?

Best For

  • Enterprise AI teams building LLM agentsThe company provides continuous Red Teaming, Security Testing & Regression Prevention in Development Pipeline.
  • Organizations with compliance needsCompany offers GDPR, SOC 2 Type II, HIPAA Compliance along with Data Residency & Role Based Access Control.
  • Teams needing on-premise deploymentCompany supports Sensitive Applications in Public Sector or Defense.

Not Suitable For

  • Small teams with simple ML modelsPricing is Enterprise Focused but Open-Source is suitable for Basic Needs.
  • Non-API accessible LLM agentsBlack-Box testing just requires API Endpoint Access Only.
  • Real-time only guardrail usersCompany focuses on Batch Evaluations and Proactive Testing rather than Pure Runtime Protection.

Are There Usage Limits or Geographic Restrictions for Giskard?

Supported Agents
Conversational AI agents in text-to-text mode via API endpoint (black-box)
Open-Source Scope
Core evaluation and testing; enterprise features require Hub
Deployment Options
Cloud (EU/US residency), on-premise for qualified customers
Data Policy
0-training policy; no model training on customer data

Is Giskard Secure and Compliant?

SOC 2 Type IIAchieved compliance certification.
HIPAACompliance for handling protected health information.
GDPRNative adherence as European entity.
Data Residency & IsolationEU or US processing with isolation guarantees.
RBAC & Audit TrailsRole-based access control, audit logging, IP provider integration.
EncryptionEnd-to-end encryption at rest and in transit.
0-Training PolicyNo training on customer data or IP.

What Customer Support Options Does Giskard Offer?

Channels
https://www.giskard.ai/contactprivacy@giskard.aiCustomer engineers via call and email for Enterprise subscribers24x7x365 technical support through AWS infrastructure
Hours
Enterprise support: Not specified; AWS support: 24x7x365
Response Time
Fast-response for AWS support; enterprise details not specified
Satisfaction
Not publicly available
Specialized
Technical consulting from AI security team available after Hub subscription
Business Tier
Giskard Enterprise subscription provides dedicated customer engineers via call and email
Support Limitations
Basic support via contact form; full customer engineer access requires Enterprise subscription
No mention of live chat or phone support for standard users

What APIs and Integrations Does Giskard Support?

Black-box API Testing
Supports LLM agents accessible via API endpoint without needing internal model details
Hugging Face Integration
Compatible with Hugging Face Hub models and API
LangChain Embeddings
Supports LangChain for knowledge base integration in evaluations
RetrievalQA Chains
Integrates with LangChain RetrievalQA for RAG-based LLM testing
AWS Marketplace
Deployable via AWS with API access

What Are Common Questions About Giskard?

For Support Customers have contact form at giskard.ai/contact, Email Support and Enterprise Subscribers also receive Customer Engineers via Call/Email. If you are an AWS user then you will receive 24x7 support.

Yes Giskard has specialization in Evaluation of Conversational AI Agents in Text-To-Text Mode through Black Box API Testing.

Company has Data Residency (EU/US), RBAC, Audit Trails, IP Controls & 0 Training Policy to Protect Customer Data.

No, Giskard works as Black-Box Testing - Company Just Requires Access to Your LLM Agent's API End Point.

Following Scanning Enterprise Users Can Get Technical Consulting from Giskard's AI Security Team for Mitigation.

AXA, BNP Paribas, Michelin & Societe Generale Trust Giskard for LLM Evaluation & Security.

Yes, Designed for Business Stakeholders with Collaborative Red-Teaming Playground & Annotation Tools.

AWS Marketplace Charges Per Project Initialization; Enterprise Subscription Available with Dedicated Support.

Is Giskard Worth It?

Giskard is Specialized AI Safety Platform That Excels In LLM Red Teaming & Vulnerability Detection Through Black-Box API Testing. It Has Enterprise Grade Security Features & Integration Capabilities Making It Ideal for Production Deployments of AI. Customer Validation From Major Enterprises Confirm Its Effectiveness.

Recommended For

  • Teams of Developers Building LLM Applications.
  • Companies Prioritizing AI Security & Compliance.
  • ML Engineers Needing Robust Model Evaluation Pipelines.
  • Companies Using Hugging Face or LangChain Ecosystems.

!
Use With Caution

  • Small Teams Wanting Simple Open-Source Testing
  • Users requiring the ability to inspect their white box models of LLMs
  • Start-ups with budgetary constraints that do not have the need for an enterprise version of a product

Not Recommended For

  • Testing of non-LLM machine learning (ML) models
  • Only real time inference monitoring
Expert's Conclusion

Giskard is highly recommended for enterprises interested in LLM safety as well as continuous red teaming.

Best For
Teams of Developers Building LLM Applications.Companies Prioritizing AI Security & Compliance.ML Engineers Needing Robust Model Evaluation Pipelines.

What do expert reviews and research say about Giskard?

Key Findings

Giskard provides black-box security testing of LLMs through API's, and has strong adoption from enterprise clients (AXA, BNP Paribas), EU/US data residency options, and dedicated customer engineering support for enterprise clients.

Data Quality

High - Direct from official site, AWS Marketplace, and customer case studies

Risk Factors

!
Enterprise-level support requires paid subscription
!
The limited information provided regarding the free tier does not provide enough detail
!
Quality of support can vary based on subscription level
Last updated: January 2026

What Additional Information Is Available for Giskard?

Enterprise Customers

Giskard is trusted by AXA, BNP Paribas, Michelin, and Societe Generale for production evaluation pipelines of LLMs.

Security & Compliance

Data residency options are available within the EU and US, Role Based Access Control (RBAC), Audit Trails, 0-training policy protects customer and client IP/data.

Deployment Options

Giskard is available on the AWS Marketplace and includes infrastructure support, and will support any LLM agent accessible via API.

Use Cases

Chatbots providing customer service, conversational AI agents, etc.; Giskard excels in detecting hallucinations, robustness testing.

Technical Focus

Black-box testing does not require knowledge of internal model structure; Giskard integrates with Hugging Face and LangChain ecosystems.

Business Accessibility

Red-teaming collaborative environment and annotation tools for domain experts and Product Managers.

What Are the Best Alternatives to Giskard?

  • AgentBench: Benchmarking platform for AI agents; more research-focused than Giskard's production security focused approach. Better suited for academic evaluation.
  • Sendbird AI Agent Platform: Automation of customer support across SMS/web/mobile; channel-focused rather than Giskard's focus on model security testing. Better suited for multi-channel deployment.
  • LangSmith: LangChain tracing/debugging tool; developer-centric rather than Giskard's focus on security and red teaming. Better suited for LangChain-specific workflows.
  • Weights & Biases (W&B): Platform for tracking experiments of ML; broad ML ops rather than Giskard's specific focus on LLM security. Better suited for model training workflows.
  • Honeycomb: Observability platform; runtime monitoring rather than Giskard's focus on pre-deployment vulnerability testing. Better suited for production inference.

What Are Giskard's Evaluation Metrics?

Ratio of correctly predicted events
Accuracy
Harmonic mean of precision and recall
F1-score
Plots false positives vs true positives
AUC (Area Under Curve)
Prediction uncertainty measure
Log Loss
True positives over predicted positives
Precision
True positives over actual positives
Recall
String comparison to expected output
Exact Match

What Testing Capabilities Does Giskard Offer?

Automated Vulnerability Scanning

Finds hidden vulnerabilities using scan reports.

Performance Bias Detection

Examines the bias that is present in different subgroups by slicing the metrics.

Subgroup Fairness Testing

Evaluates how well a model performs on protected or proxy variables.

Custom Test Suites

Provides a way to consistently compare the same metrics across multiple models.

RAG System Evaluation

Metrics for evaluating the performance of a model in which the retrieval-augmentation of generation is being used.

Cross-Validation Support

Multiple data sets for measuring robustness when evaluating efficiency.

How Does Giskard's Benchmark Support Compare?

BenchmarkDescriptionSupported
LLM EvaluationFairness, robustness, explainability metricsYes
RAG EvaluationCorrectness, context precision by topicYes
Classification BenchmarksAccuracy, F1, AUC on standard datasetsYes
Fairness BenchmarksDemographic parity across subgroupsYes

What Model Compatibility Does Giskard Support?

Scikit-learnMLflowLarge Language ModelsHugging FaceRAG SystemsTabular Models

What Is Giskard's Evaluation Modes?

Holdout Technique
Separate training/testing data for simple evaluation
Cross Validation
Multiple data samples for comprehensive testing
Automated Scanning
Comprehensive vulnerability detection
LLM-as-Judge
Uses LLMs to evaluate answer correctness
Subgroup Slicing
Fairness evaluation across demographics

How Does Giskard Ensure Safety Through Testing?

Performance Bias Detection

Finds differences in performance among subgroups.

Fairness Metrics

Evaluation of demographic parity and equality of opportunity.

Robustness Testing

Measures the resistance to adversarial attacks.

Underconfidence Detection

Flags low confidence predictions.

Bias Across Protected Groups

Tests for patterns of discrimination.

What Is Giskard's Ci Cd Integration?

MLflow Integration
Logs scan reports, metrics, and JSON results
Automated Scanning
CI/CD compatible vulnerability detection
Test Suite Generation
Standardized evaluation for pipeline automation
HTML/JSON Reports
Machine-readable outputs for CI systems

Expert Reviews

📝

No reviews yet

Be the first to review Giskard!

Write a Review

Similar Products