AgenticOps Engineer

Accepting applications

Zywave · United States

Full-Time · Mid-Senior
Posted
28 Apr
Category
Test
Experience
Mid-Senior
Country
United States
Brief Description

The AgenticOps Engineer operates as a cross-team specialist responsible for the health, reliability, and continuous improvement of AI agent output across the organization. They are not embedded in any single team’s daily sprint cycle. Instead, they maintain a bird’s-eye view of how agents perform across a handful of product engineering teams, identifying systemic patterns, diagnosing recurring failure modes, and tuning the shared prompt and tooling infrastructure that every team depends on.

When a Product Engineer hits a wall with agent output quality—whether it’s a spec the agent can’t parse, a prompt structure that produces inconsistent results, or a workflow that degrades over time—the AgenticOps Engineer is the specialist they pull in.

What You Will Do

Agent Performance Monitoring & Pattern Detection

Continuously monitor agent output quality, latency, and failure rates across all three product teams.
Identify recurring patterns such as “agents keep struggling with this type of spec” or “this prompt structure consistently produces higher-quality output.”
Build and maintain dashboards and alerting systems that surface degradation before teams feel it.
Conduct periodic reviews of agent interaction logs to flag systemic issues and emerging trends.

Prompt Engineering & System Tuning

Own the shared prompt infrastructure: templates, system prompts, few-shot libraries, and chain-of-thought scaffolding used across teams.
Iterate on prompt structures based on observed failure modes and A/B performance data.
Develop and maintain a prompt playbook documenting what works, what doesn’t, and why.
Evaluate and integrate new model capabilities, versioning changes, and API updates as they roll out from providers.

Escalation Support & Embedded Problem-Solving

Serve as the on-call specialist when a Product Engineer encounters persistent agent output quality issues.
Diagnose root causes: Is it the prompt? The spec format? The model’s limitations? A context window issue?
Pair with Product Engineers to rapidly prototype and test fixes, then roll improvements back into shared systems.
Maintain a knowledge base of resolved issues and their solutions to reduce repeat escalations.

Tooling, Evaluation & Infrastructure

Build and maintain evaluation harnesses, benchmarks, and regression test suites for agent workflows.
Develop internal tooling for prompt version control, output comparison, and automated quality scoring.
Collaborate with platform/infra teams to optimize agent execution pipelines (caching, context management, token budgets).
Establish and track key metrics: output acceptance rate, revision frequency, time-to-resolution on escalations.

Knowledge Sharing & Team Enablement

Run regular cross-team syncs sharing findings, patterns, and updated best practices.
Produce internal documentation, guidelines, and training materials on working effectively with agents.
Coach Product Engineers on prompt construction, spec formatting, and debugging agent behavior.
Serve as the organizational point of contact for agent-related decisions (model selection, provider evaluation, capability assessments).

What You Will Bring

Required

5+ years of software engineering experience with strong fundamentals in systems thinking and debugging.
Hands-on experience building with LLM APIs (prompt design, chain-of-thought, tool use, function calling).
Demonstrated ability to diagnose and resolve complex, cross-cutting technical issues.
Strong analytical skills: comfortable building dashboards, writing queries, and reasoning about statistical patterns in output quality.
Excellent written and verbal communication—this role lives on documentation, cross-team clarity, and knowledge transfer.

Preferred

Experience with prompt evaluation frameworks, LLM observability tools (e.g., LangSmith, Braintrust, Humanloop), or building internal evaluation harnesses.
Background in developer tooling, platform engineering, or SRE/DevOps with an understanding of reliability principles applied to non-deterministic systems.
Familiarity with multiple LLM providers and models; able to reason about trade-offs in capability, cost, and latency.
Experience working cross-functionally across multiple product teams without direct authority.

Why Work at Zywave?

Zywave empowers insurers and brokers to drive profitable growth and thrive in today’s escalating risk landscape. Only Zywave delivers a powerful Performance Multiplier, bringing together transformative, ecosystem-wide capabilities to amplify impact across data, processes, people, and customer experiences. More than 15,000 insurers, MGAs, agencies, and brokerages trust Zywave to sharpen risk assessment, strengthen client relationships, and enhance operations. Additional information can be found at www.zywave.com.
