BrainBrowser builds the safety and alignment layer for frontier AI systems, working with leading AI labs and foundation model developers on adversarial red teaming, jailbreak and vulnerability discovery, safety evaluations, and agent behavior testing. This role sits on the assurance side of that stack: finding where the model fails before it ships.

The Role

Design and execute prompt attacks across assigned safety policy categories. Document findings in structured reports. Identify patterns in what works and why. Contribute to prompt taxonomy development over time.

What You'll Do

Design and execute prompt attacks targeting specific policy violation categories: harmful instructions, manipulation, privacy violations, and others
Work from the safety policy document

Who We Need

Strong adversarial thinking: you find loopholes instinctively
Familiarity with LLM behavior: system prompts, role-play, indirect instruction, multi-turn manipulation
Comfortable reading and applying a detailed policy document
Structured documentation habits
Prior experience in one of: AI red teaming, offensive security, trust and safety, content moderation, social engineering research

Engagement: Fixed price per approved violation identified | 10 samples to start (calibration batch)

Apply with: Send a short note on why this work interests you.

This job is provided by Shine.com

Freelance Adversarial AI Researcher | LLM Red Teaming | Remote | Global Project
