[Demo: Input prompts and Outputs shown side by side — Foundation models (left) vs. Checkpoint-AI (right).]

Jailbroken LLMs

What happens when your LLM is not aligned to your user base? Building systems general enough to entertain every request may not be what drives your key performance indicators (KPIs).

We use a modified version of the dataset from *Red Teaming Language Models to Reduce Harms*, adapted to fit our demo.
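
As a rough illustration, here is a minimal sketch of how such a dataset might be loaded and filtered for a demo. The Hugging Face dataset ID, the field names, and the rating threshold below are assumptions for illustration, not the exact preprocessing behind this demo.

```python
# Sketch: load the public red-team data and keep only attempts suitable for a demo.
# Assumes the Hugging Face mirror of the red-team attempts and its field names;
# the actual preprocessing used here may differ.
from datasets import load_dataset

red_team = load_dataset("Anthropic/hh-rlhf", data_dir="red-team-attempts", split="train")

def is_demo_suitable(example):
    # Keep only attempts the red-teamers rated as fairly successful
    # ("rating" field and threshold are assumptions).
    return example.get("rating", 0) >= 3.0

demo_prompts = red_team.filter(is_demo_suitable)
print(f"Kept {len(demo_prompts)} of {len(red_team)} red-team transcripts")
```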

We present only the jailbroken responses from the LLM. A standard LLM can be jailbroken with a 55% success rate (left), compared to Checkpoint-AI's alignment method (right).
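
For concreteness, one way such a success rate could be computed is sketched below. The `query_model` and `is_jailbroken` callables are hypothetical stand-ins for the model under test and a harmfulness judge; this is not Checkpoint-AI's actual evaluation pipeline.

```python
# Sketch: measure jailbreak success rate over a set of adversarial prompts.
# `query_model` and `is_jailbroken` are hypothetical placeholders; a figure
# such as the 55% quoted above would come from a tally like this one.
from typing import Callable, Sequence

def jailbreak_success_rate(
    prompts: Sequence[str],
    query_model: Callable[[str], str],
    is_jailbroken: Callable[[str, str], bool],
) -> float:
    """Fraction of prompts whose responses are judged jailbroken."""
    if not prompts:
        return 0.0
    successes = sum(1 for p in prompts if is_jailbroken(p, query_model(p)))
    return successes / len(prompts)

# Usage (hypothetical): rate = jailbreak_success_rate(demo_prompts, baseline_llm, harm_judge)
```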

Conclusion

Aligning models with Checkpoint-AI can produce a specialist model (`forward-alignment`) and prevent poorly generated content (`backward-alignment`). Alignment is necessary to drive your KPIs.