Consistent and effective code reviews are one of the hardest coding tasks to get right. Every review feels like starting from scratch, especially when changes might impact or break existing APIs. It’s a constant challenge to piece together the right context and perform the detailed analysis needed—something that’s nearly impossible to articulate fully, review after review.
Imagine a partner in code review that never gets tired, never misses a detail, and always applies the same high standard, no matter how complex or repetitive the task. It processes changes with unwavering focus, analyzing every potential impact on APIs and ensuring conventions are applied consistently. This reviewer doesn’t need breaks, doesn’t lose context, stays aligned with your team’s conventions, and never grows frustrated with tedious fixes. It’s always ready, delivering fast, reliable feedback that helps developers focus on what they do best: building great software.
From raw diff limitations to agentic reviews
Performing a great code review is not an easy task - it requires navigating missing context under high cognitive load. Reviewers are expected to understand the system's architecture, historical decisions, constraints, and the intended goals of the change—all while piecing together fragmented information to make informed decisions.
2024 marked a significant leap forward in LLM-driven coding and code comprehension. We've experienced this firsthand, experimenting with coding workflows that integrate commercial and open-source models alongside code indexing tools. Additionally, we drew inspiration from early implementations of AI code review agents on GitHub, showcasing the use of reasoning models with dynamic, evolving prompts.
While there are numerous open-source AI reviewers available (our recent GitHub search uncovered over ten actively maintained open-source projects[1]), many of these implementations operate as encapsulated solutions, assuming that AI can independently retrieve the necessary context to perform its tasks. However, based on our research, LLMs inherently lack the ability to traverse ASTs effectively. As a result, we believe these approaches overestimate the current capabilities of LLMs in accurately parsing and understanding code context. We deeply respect the efforts of the open-source community in advancing these tools and view them as valuable stepping stones toward more robust solutions.
So, why can’t we simply delegate code reviews to AI by feeding the PR diff to an LLM? The answer isn’t as straightforward as it may seem. While LLMs excel at analyzing patterns and generating natural language, they lack the deeper context required for meaningful code reviews. In fact, they even struggle with trivial tasks like generating correct line numbers. A raw diff alone doesn’t tell the whole story - it lacks the context of the change, its purpose, and its impact on the broader system. Without this understanding, the feedback is often inaccurate, superficial, or irrelevant.
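To make this concrete, consider a contrived hunk (the code and line numbers here are hypothetical, purely for illustration):

```diff
@@ -41,7 +41,7 @@ function handleRequest(req) {
-  if (user.active) {
+  if (user.active && user.verified) {
     grantAccess(user);
   }
```

The hunk shows that a condition changed, but nothing about what grantAccess does, who calls this function, whether verified is always populated, or why the change was made. None of that context travels with the raw diff.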
The features that make AI Reviewers work for Developers
Baz Reviewer identifies the bugs you need to catch during review, helps make your code more concise and clean, ensures adherence to best practices, and even suggests better variable names - naming is always the hardest part. Behind these capabilities, we’ve invested in experiences that not only make them possible but go beyond the raw diff.
1. Enriching the Original Git Diff with Context
A plain git diff highlights added or removed lines of code - that’s usually not enough to understand what changed. When we approach a code review, we draw on knowledge about the code that is not explicitly written down - the goal of the PR, the best practices and conventions in use, and even the broader scope of the code affected by the diff. Baz Reviewer bridges this gap by enriching the plain diff with the context required to understand the scope of the change.
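A minimal sketch of the idea - not Baz's actual pipeline, and every name and shape below is hypothetical - is to attach the PR's stated goal, the enclosing symbol, and a wider window of surrounding code to each hunk before it reaches the model:

```typescript
// Hypothetical sketch: enriching a raw diff hunk with surrounding context
// before handing it to an LLM. Names and data shapes are illustrative only.

interface Hunk {
  file: string;
  startLine: number; // 1-based line where the hunk begins
  patch: string;     // the raw unified-diff text
}

interface EnrichedHunk extends Hunk {
  prTitle: string;         // the stated goal of the change
  enclosingSymbol: string; // the function or class the hunk lives in
  surroundingCode: string; // lines around the hunk, beyond the diff's context
}

function enrich(
  hunk: Hunk,
  prTitle: string,
  fileContents: string,
  contextLines = 20
): EnrichedHunk {
  const lines = fileContents.split("\n");
  const start = Math.max(0, hunk.startLine - 1 - contextLines);
  const end = Math.min(lines.length, hunk.startLine - 1 + contextLines);
  return {
    ...hunk,
    prTitle,
    enclosingSymbol: findEnclosingSymbol(lines, hunk.startLine),
    surroundingCode: lines.slice(start, end).join("\n"),
  };
}

// Naive placeholder: scan upward for the nearest function/class declaration.
// A real implementation would use a proper parser rather than a regex.
function findEnclosingSymbol(lines: string[], line: number): string {
  for (let i = Math.min(line, lines.length) - 1; i >= 0; i--) {
    const m = lines[i].match(/^\s*(?:function|class)\s+(\w+)/);
    if (m) return m[1];
  }
  return "(top level)";
}
```

The enriched payload, rather than the bare patch, is what gives the model enough grounding to reason about intent and blast radius.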
2. Multiple Specialized Review Flows
We run the code through multiple review flows, each focusing on a specific aspect of programming: idiomatic usage for the language, code conciseness, test quality, or adherence to best practices and organization conventions. Every flow emphasizes a unique dimension of what makes great code.
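In skeleton form, this fan-out might look like the following (a sketch under our own assumptions - the flow names, prompts, and `callModel` client are invented for illustration):

```typescript
// Hypothetical sketch: fanning a change out to several specialized review
// flows, each asking the model about one dimension of code quality.
// `callModel` stands in for any LLM client that returns review comments.

type Flow = { name: string; instruction: string };

const flows: Flow[] = [
  { name: "idioms",      instruction: "Flag non-idiomatic usage for this language." },
  { name: "conciseness", instruction: "Suggest simpler, more concise equivalents." },
  { name: "tests",       instruction: "Assess the coverage and quality of the tests." },
  { name: "conventions", instruction: "Check adherence to the team's conventions." },
];

async function review(
  diff: string,
  callModel: (prompt: string) => Promise<string[]>
): Promise<{ flow: string; comments: string[] }[]> {
  // Run all flows concurrently; each sees the same diff through its own lens.
  return Promise.all(
    flows.map(async (f) => ({
      flow: f.name,
      comments: await callModel(`${f.instruction}\n\n${diff}`),
    }))
  );
}
```

Keeping each flow narrow lets its prompt stay focused, at the cost of producing overlapping comments - which is exactly what the next step addresses.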
3. Reducing Noise
Once the LLM generates feedback from the specialized flows, we take it a step further by deduplicating and merging related comments and validating that each holds meaningful and actionable insight, reducing the overall noise of the system.
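A simplified version of that step (hypothetical types and logic, not our production code) might merge comments landing on the same location and drop any that fail an actionability check:

```typescript
// Hypothetical sketch of the noise-reduction step: deduplicate comments that
// target the same location and drop ones that fail a usefulness check.

interface Comment {
  file: string;
  line: number;
  body: string;
}

function reduceNoise(
  comments: Comment[],
  isActionable: (c: Comment) => boolean
): Comment[] {
  const byLocation = new Map<string, Comment>();
  for (const c of comments) {
    if (!isActionable(c)) continue; // discard vague or trivial feedback
    const key = `${c.file}:${c.line}`;
    const existing = byLocation.get(key);
    if (existing) {
      // Merge overlapping comments on the same line instead of stacking
      // several near-duplicates on the developer.
      existing.body += `\n\n${c.body}`;
    } else {
      byLocation.set(key, { ...c });
    }
  }
  return [...byLocation.values()];
}
```

The actionability check here is a stand-in; in practice that validation is itself a judgment the system has to make per comment.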
4. Customization through effective feedback loops
Baz Reviewer adjusts its findings based on user feedback and on the rate at which findings are addressed within the organization. In addition, you can upload configuration files to customize Baz Reviewer to your specific review guidelines.
Examples
Detecting Incomplete Access Control Updates
In this example, Baz Reviewer pinpointed an oversight where the resolve_review_status method didn't fully account for access control logic. It highlighted that the has_right_access field was not being tested correctly, preventing potential false negatives that could compromise the approval process. This precise feedback allowed the team to address the issue immediately, ensuring the access logic was robust and accurate.
Following Up on Addressed Comments
Baz Reviewer demonstrated its ability to track changes and validate fixes. When a developer updated the code to use has_right_access instead of is_right_access, the reviewer acknowledged the improvement and confirmed alignment with the team’s conventions. This seamless interaction showed how Baz Reviewer maintains context and ensures meaningful feedback is acted upon.
Suggesting Concise Component Logic
While reviewing a front-end component, Baz Reviewer identified an opportunity to streamline code by using optional chaining and nullish coalescing operators. This suggestion not only improved readability but also adhered to best practices, enabling the team to deliver clean, maintainable code.
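The shape of that suggestion looks like this (the component and field names below are invented for illustration, not the actual reviewed code):

```typescript
// Illustrative before/after for the optional-chaining suggestion.

interface Props {
  user?: { profile?: { displayName?: string } };
}

// Before: nested guards with a manual fallback.
function nameVerbose(props: Props): string {
  if (props.user && props.user.profile && props.user.profile.displayName) {
    return props.user.profile.displayName;
  }
  return "Anonymous";
}

// After: optional chaining (?.) short-circuits to undefined when any link
// is null/undefined, and nullish coalescing (??) supplies the fallback
// only for null/undefined (unlike ||, it does not treat "" or 0 as missing).
function nameConcise(props: Props): string {
  return props.user?.profile?.displayName ?? "Anonymous";
}
```

One behavioral nuance worth noting in reviews like this: the truthiness check in the verbose version would also fall back on an empty string, while `??` would not - usually the more correct behavior, but a difference worth confirming.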
Go beyond the “LGTM” review
In a world where code reviews are essential yet increasingly complex, Baz Reviewer offers a transformative solution. By enriching raw diffs with critical context, running multiple specialized review flows, and minimizing noise, it ensures reviews are not just thorough but highly actionable. With its ability to adapt to team-specific guidelines and learn from feedback, Baz Reviewer isn’t just an AI tool—it’s a dependable partner in delivering high-quality, clean, and maintainable code.
Ready to redefine your code review process? Request early access
Want to learn more? Read the docs