Others Require Alert Thresholds
You have to instrument, tune thresholds for each data stream, and anticipate every failure mode worth watching. Miss one, and you're blind to it.
Herald predicts issues before alert thresholds fire, investigates without runbooks, and resolves novel incidents.
Found 2 likely root causes. Pick one to investigate:
Real investigation on Herald production; sensitive details redacted.
Trusted in Production
Why other AI SREs don't work
The metric stays below the alert threshold for most of the chart, then crosses it. Marker 1 identifies the threshold crossing. The area above the threshold after that crossing is labeled no runbook coverage, and marker 2 identifies the uncovered alert territory.
You document every investigation workflow before it's needed. Maintain them as your stack evolves. When something novel breaks, there's no runbook and no investigation.
Every other AI SRE is purely reactive and handles only the failures someone already anticipated.
THE HERALD APPROACH
Before any alerts fire, Herald builds a context graph from your observability data, codebase, CI/CD, docs, and dependencies, so it knows what normal looks like and can work to solve any problem.
context: Jira tool 65 · config read path · CUST-8291-X
No thresholds to set. Herald builds a custom anomaly detection model for each data stream and surfaces validated issues before your customers notice.
validated signal: HTTP 500s rose to 18–26% over 22 minutes while other tenants stayed flat
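To make the contrast with fixed thresholds concrete, here is a minimal sketch of a per-stream baseline check. It is a generic rolling z-score detector, not Herald's actual model. The function name, window size, cutoff, sigma floor, and sample error rates are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def rolling_zscore_anomalies(values, window=5, cutoff=3.0, min_sigma=0.25):
    """Return indices of points that deviate from the stream's own recent
    baseline by more than `cutoff` standard deviations.

    Hypothetical helper for illustration only. Flagged points are kept out
    of the baseline so a sustained incident does not become the new normal.
    """
    baseline = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(values):
        if len(baseline) == window:
            mu = mean(baseline)
            # Floor sigma so a very flat baseline does not flag tiny wiggles.
            sigma = max(pstdev(baseline), min_sigma)
            if abs(value - mu) / sigma > cutoff:
                anomalies.append(i)
                continue  # do not learn from anomalous points
        baseline.append(value)
    return anomalies


# Illustrative HTTP 500 rates (percent) for two tenants; made-up numbers.
tenant_a = [1.0, 1.2, 0.9, 1.1, 1.0, 1.1, 18.0, 22.0, 26.0]
tenant_b = [1.0, 1.1, 0.9, 1.0, 1.2, 1.0, 1.1, 0.9, 1.0]

print(rolling_zscore_anomalies(tenant_a))
print(rolling_zscore_anomalies(tenant_b))
```

In this sketch a static alert set at, say, 30% would never fire for the first tenant, while the baseline check flags the jump as soon as it departs from that stream's own history and leaves the flat tenant alone.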
Never write another runbook. Herald evaluates multiple hypotheses simultaneously, each against the right data source, and delivers RCAs in minutes.
RCA: Schema Drift · legacy Vault keys rejected after PR #4275
In Production
Herald's agent onboards and adapts quickly. Gartner's 2026 AI SRE Market Guide identifies proactive incident prevention and contextual awareness as next-generation capabilities. Herald already does both.
Herald was founded by PhDs and Professors from UC Berkeley's innovation center, RISELab, combining expertise in AI, LLMs, data systems, and scalable infrastructure.
When your AI-powered RCA tool floods Slack with hallucinated walls of text, the problem isn't the model. It's the missing data engineering underneath it.
When everyone can build software, someone still has to keep it running. A reliability leader helped me understand how engineering organizations are facing a new influx of code from all sides.
Ben Sigelman argues that AI-generated code is a reliability crisis in slow motion, and what it means for how we observe production systems.