Cisco AI Defense

Secure your Foundation Models

Ensure the foundation models at the heart of your applications are secure and safe.

Foundation models accelerate AI application adoption

The introduction of foundation models has been a significant breakthrough in the field of artificial intelligence. They are trained on vast datasets and offer advanced capabilities such as natural language fluency and complex reasoning, making them applicable to a wide range of general purposes. These can be anything from providing customer support and virtual assistance to creating new bodies of copy, images, and video.

Creating a foundation model from scratch is a challenging and resource intensive undertaking. For this reason, most companies will elect to build their AI applications using a subset of readily available, state-of-the-art frontier models. Their flexibility, utility, and cost effectiveness has radically transformed machine learning and AI application development.

Every foundation model introduces inherent risks

Organizations working on new foundation models or leveraging existing ones should be aware of their inherent risks in order to build AI applications that are not only broadly capable, but also fair, robust, safe, and secure.

If the data used to train or fine-tune a foundation model contains toxic content, bias, personally identifiable information (PII), or other malicious inclusions, it can manifest as outputs that are skewed, harmful, or which violate privacy standards. Even without explicit data poisoning, model results are susceptible to being dangerous, factually inaccurate, intentionally misinformed, or hallucinated.

Most organizations that leverage an existing foundation model will fine-tune it to improve contextual relevance for a specific AI application. However, even the slightest modifications can break base model alignment and introduce new safety and security vulnerabilities. Research demonstrates that this can occur inadvertently even when fine-tuning data is entirely benign, meaning that model alignment must be reevaluated after every instance of fine-tuning.

Once deployed in an application, these models become prone to prompt injection, model extraction, denial of service, toxic responses, and any number of other adversarial techniques targeting production systems.

Mitigate foundation model risks with Cisco

Cisco AI Defense provides an automated, end-to-end solution to protect your AI applications from safety and security threats. A comprehensive suite of algorithmically generated tests assess model vulnerabilities, recommend guardrails necessary for safe deployment, and measure adherence to relevant security standards and regulations. Our guardrails protect applications by detecting and mitigating threats in real time.

Validate models with rigorous algorithmic testing

Model testing is necessary to identify issues such as toxicity, bias, and susceptibility to various attacker techniques.

AI Defense’s algorithmic testing continuously evaluates how models fare in a broad and diverse set of scenarios to promptly surface safety and security risks. These results enable us to measure compliance with relevant standards and suggest guardrails to safeguard your AI applications.

Address new vulnerabilities introduced in fine-tuning or production

The need to validate models extends beyond development, as new vulnerabilities can emerge after every instance of fine-tuning or in production.

By operating tests in a continuous and discreet manner, AI Defense continuously reevaluates model safety and security alignment. Timely identification of new risks facilitates timely investigation and rapid resolution.

Protect AI applications from threats in real time

Even well aligned models remain susceptible to prompt injections, denial of service attacks, and various other techniques when they are deployed in AI applications.

AI Defense Runtime Protection addresses the vulnerabilities surfaced during model testing by protecting AI applications in real time. Inputs and outputs are examined to intercept malicious prompts and prevent the model from relaying toxic, private, or otherwise harmful content.