
We just shipped a new Baz coding agent, SRE Agent, that listens to traces and logs from your APM (Application Performance Monitoring) tooling, maps them back to the exact code that caused them, and opens concise fix pull requests.
SRE Agent builds on technology we have been developing for nearly two years to close the loop between observability and Git. Instead of handing a stack trace to a chat assistant and hoping it has enough context, Baz uses the same semantic model that powers our agentic reviewers to reason about traces, span metadata, and logs. The agent produces an explanation and a short fixing prompt, and when appropriate it triggers our Fixer agent to create a sandboxed session and open a PR.
That means going from alert to PR in one shot:
* It finds real production issues in APM, links them to the exact file and function in your repo, and explains what is wrong.
* It generates an actionable fixing prompt that a developer or a fixing agent can run. The prompt names the file, line range, and the specific change to make.
* It can trigger a sandboxed fixer run that produces a branch and opens a PR, complete with a curated PR description and the Baz review artifacts.
* It surfaces the findings in your Slack channel with structured blocks that include the explanation, suggested fix, questions, and issue id.
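To make the outputs above concrete, here is a minimal sketch of what a finding, its fixing prompt, and the Slack message could look like. The `Finding` fields, the prompt wording, and the helper names are all hypothetical and chosen for illustration; the post does not specify Baz's actual schema. The block layout uses standard Slack Block Kit `section` and `context` blocks.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """Hypothetical shape of one SRE Agent finding (field names are illustrative)."""
    issue_id: str
    file_path: str
    function: str
    line_start: int
    line_end: int
    explanation: str
    suggested_fix: str
    questions: list[str] = field(default_factory=list)


def fixing_prompt(f: Finding) -> str:
    """Render a prompt that names the file, line range, and the specific change."""
    return (
        f"In {f.file_path}, function {f.function}, "
        f"lines {f.line_start}-{f.line_end}: {f.suggested_fix}"
    )


def slack_blocks(f: Finding) -> list[dict]:
    """Build Block Kit blocks with the explanation, fix, questions, and issue id."""

    def section(title: str, body: str) -> dict:
        return {
            "type": "section",
            "text": {"type": "mrkdwn", "text": f"*{title}*\n{body}"},
        }

    blocks = [
        section("Explanation", f.explanation),
        section("Suggested fix", fixing_prompt(f)),
    ]
    if f.questions:
        bullets = "\n".join(f"• {q}" for q in f.questions)
        blocks.append(section("Questions", bullets))
    # The issue id goes in a low-emphasis context block at the bottom.
    blocks.append(
        {
            "type": "context",
            "elements": [{"type": "mrkdwn", "text": f"Issue id: {f.issue_id}"}],
        }
    )
    return blocks
```

The same `fixing_prompt` text is reused in the Slack message so that a developer and a fixing agent see identical instructions; that reuse is a design choice of this sketch, not something the post states.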
SRE Agent is built on three engineering truths we believe will power every future codebase:
1. Semantic code understanding. Baz already extracts structured facts from code and generates reviewer findings. The SRE agent reuses that capability to reason about what code the trace is pointing at and why it fails.
2. Observability mapping. The agent ingests traces, spans, and logs, extracts git metadata from spans, and prepares a local repo snapshot. That mapping of trace to repository to file to function is the semantic bridge that makes any automated remediation safe and targeted.
3. Remediation plumbing. We built Fixer, a sandboxed runtime and PR pipeline that runs fixes, produces logs and branches, and pushes PRs through the same review machinery Baz already uses. That lets the SRE agent not only recommend a fix but actually deliver it as a properly formatted PR.
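The trace-to-repository-to-file-to-function mapping in point 2 can be sketched as a lookup over span attributes. This is only an illustration under stated assumptions: the attribute keys below are placeholders (the `code.*` keys are modeled on OpenTelemetry-style code attributes, the `git.*` keys on common APM git tags), and the post does not say which keys Baz actually reads. `CodeLocation` and `locate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed attribute keys; the real keys depend on your instrumentation.
REPO_KEY = "git.repository_url"
COMMIT_KEY = "git.commit.sha"
FILE_KEY = "code.filepath"
FUNC_KEY = "code.function"
LINE_KEY = "code.lineno"


@dataclass(frozen=True)
class CodeLocation:
    """Where in which repository snapshot a span points (hypothetical type)."""
    repo_url: str
    commit_sha: str
    file_path: str
    function: str
    line: Optional[int]


def locate(span_attributes: dict) -> Optional[CodeLocation]:
    """Map one span's attributes to a code location.

    Returns None when any required attribute is missing or empty, so callers
    never act on a partial mapping.
    """
    required = (REPO_KEY, COMMIT_KEY, FILE_KEY, FUNC_KEY)
    if any(not span_attributes.get(key) for key in required):
        return None
    line = span_attributes.get(LINE_KEY)
    return CodeLocation(
        repo_url=str(span_attributes[REPO_KEY]),
        commit_sha=str(span_attributes[COMMIT_KEY]),
        file_path=str(span_attributes[FILE_KEY]),
        function=str(span_attributes[FUNC_KEY]),
        line=int(line) if line is not None else None,
    )
```

Pinning the location to a commit SHA rather than a branch name is what lets a local snapshot match the code that was actually running when the span was emitted. Refusing to return a partial location is one way to keep automated remediation targeted.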
During internal trials, the SRE agent posted findings in our SRE channel. One was a clear, concise explanation that a middleware was calling the session validator with an empty token, along with a suggested fix: short-circuit with a 401 when the Authorization header is missing. The message included a suggested fix prompt, and the team merged an agent-created PR that implemented the change. That thread and the PR show the full loop: from trace, to code, to fix PR.
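For readers who want to picture the kind of change involved, here is a minimal sketch of that short-circuit. It is not the merged PR: the function name `authenticate`, the `validate_session` callback, and the simplified `(status, body)` response are assumptions made for illustration, and header lookup is treated as a plain case-sensitive mapping for brevity.

```python
from typing import Callable, Mapping

# Simplified response: (HTTP status code, body text).
Response = tuple[int, str]


def authenticate(
    headers: Mapping[str, str],
    validate_session: Callable[[str], bool],
) -> Response:
    """Reject requests without a token before the session validator runs."""
    token = headers.get("Authorization", "").strip()
    if not token:
        # Short-circuit: the validator is never called with an empty token.
        return 401, "missing Authorization header"
    if not validate_session(token):
        return 401, "invalid session"
    return 200, "ok"
```

Before a change like this, an empty token would reach the validator and fail there, which is the failure the trace surfaced. With the guard in place, the missing-header case is handled at the edge of the middleware.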
Code review has always been about asking whether a change is good before it lands. Baz now extends that promise to running systems. We can reason about runtime failures with code context, synthesize an explanation and a precise remediation, and use the same review and fixer infrastructure to deliver the change. That means Baz can take on more SDLC ownership: triage, fault diagnosis, fix generation, sandboxed validation, PR creation, and audit. We are not replacing engineers. We are removing the busy work that eats time between an incident and a durable fix.
SRE Agent runs under org-level configuration. Slack posting and automatic fixer runs are controlled by toggles so teams can choose visibility and automation. APM access is scoped and audited. Fixer runs happen in a sandboxed environment and report full session logs back into the PR and into Baz’s UI so teams can validate what ran. We designed the flow to be deliberate and inspectable at every step.
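As a rough illustration of how two org-level toggles could gate behavior, consider the sketch below. The setting names, the off-by-default values, and the action labels are all hypothetical; the post says only that Slack posting and automatic fixer runs are controlled by toggles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SreAgentConfig:
    """Hypothetical org-level settings; both default to off in this sketch."""
    slack_posting: bool = False
    auto_fixer_runs: bool = False


def planned_actions(config: SreAgentConfig, fix_is_appropriate: bool) -> list[str]:
    """List what the agent would do for one finding under this configuration."""
    # Assumption: the finding itself is always recorded, whatever the toggles say.
    actions = ["record_finding"]
    if config.slack_posting:
        actions.append("post_to_slack")
    # A fixer run needs both the org toggle and a finding that warrants a fix.
    if config.auto_fixer_runs and fix_is_appropriate:
        actions.append("trigger_fixer")
    return actions
```

For example, `planned_actions(SreAgentConfig(slack_posting=True), fix_is_appropriate=True)` yields a recorded finding and a Slack post but no fixer run, which corresponds to a team that wants visibility without automation.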
What’s next
* Broadening observability sources beyond APM.
* Enriching fixes with cross-repo context so fixes understand shared modules and system-level contracts.
* More control surfaces for customers: fine-grained toggles, audit trails, and integration guides for observability scopes.
SRE Agent starts rolling out today to all Baz customers. Not using Baz yet? Sign up here: baz.co/login
If you want to learn more about bridging the gap between observability and code, check out these resources on the challenges and solutions for better trace-to-code mapping in Python and Go.
https://baz.co/resources/extending-opentelemetry-to-pinpoint-code-elements-our-journey-to-close-the-gap
https://github.com/baz-scm/falken-trace-py
https://github.com/baz-scm/falken-trace-go


