
cyber.md: AI-native posture that speaks agent

cyber.md is a proposed Markdown-based security posture file that helps coding agents preserve security intent during normal development workflows. Rather than acting as a vulnerability report or attack playbook, cyber.md captures protected assets, trust boundaries, coding invariants, defensive patterns, and testing expectations in a format agents can consume safely and contextually.

May 10, 2026
Guy Eisenkot

Agents have changed how I think about secure development [1], and specifically the way security knowledge moves through a repository. Today, developers and agents at Baz are generating more code than ever, and the amount of code being written is outpacing our SOC2-mandated security processes.

The classic security review was designed for a world where humans opened pull requests, humans reviewed them, humans remembered the threat model, humans knew which parts of the codebase were dangerous to touch, and humans eventually exploited the weaknesses. That is no longer the case.

Agents are independently refactoring API routes, modifying deployment logic, adding integrations, and editing database queries. Simultaneously, bad-actor-operated agents [14, 15] are leveraging cyber-capable models to conduct reconnaissance and enumeration at unprecedented scale and intelligence. They do it quickly, repeatedly, and often with a very narrow view of the task in front of them.

We think there’s a need for a new primitive: versioned security posture that agents can read, diff, and apply during normal development.

We’re calling this cyber.md

The file itself is intentionally boring. It names the repo or service it applies to. It describes the protected assets. It records the trust boundaries. It captures the invariants that should survive refactors. It explains which tests should be suggested when certain kinds of code change. Commit the advice, not the exploit.

But there is an important constraint: cyber.md should not become a vulnerability report or risk register. The checked-in version should contain safe guidance, invariants, patterns, and tests. Exact exploit chains, private topology, customer-specific incidents, sensitive findings, and bypass instructions should not live in plaintext next to the code. If the review produces sensitive details, those belong behind some kind of private context broker [2, 3, 4, 5].
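To make this concrete, a safe checked-in entry might look like the sketch below. The service, paths, invariant, and owner are invented for illustration; the point is that the entry records intent and tests, not exploit detail:

```markdown
## Trust boundary: public API -> billing worker

- Invariant: the billing worker must never trust a tenant ID taken from
  the request body; it must come from the verified session.
- When changed: suggest a regression test that rejects cross-tenant IDs.
- Confidence: high. Owner: @billing-team.
- Review trigger: any change under services/billing/.
```

Note what is absent: no topology, no incident history, no bypass steps.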

Why Markdown? What’s a posture file?

Agents know Markdown. The happy path and the edge cases are well known to agents, and repo-aware tools are particularly good at using Markdown files as durable context.

Security posture should work the same way, with more care.

A posture file is not a pentest output. It is not a list of bugs to fix. It is not a secret notebook of attack paths. It is the durable security memory of a repository [6, 7, 8, 9, 10]. It answers questions like: what assets are protected here, where are the trust boundaries, which invariants must code preserve, which modules need extra care, what tests should be suggested when certain areas change, what guidance is safe to show every agent, and what guidance must stay private.

The goal is not to tell an agent how to break the codebase. The goal is to tell the agent how not to accidentally weaken it.

Not just for security review

cyber.md might make you think it’s just for scheduled security scans, but the more interesting use case is normal coding. Inside an agent-native workflow [13], cyber.md becomes a source of opportunistic hardening. When a developer asks Claude Code to add a feature, change an endpoint, refactor a worker, update a dependency, or adjust logging, a lightweight context layer can inspect the task and changed files, select the relevant posture guidance, and inject a small amount of advice before implementation. That advice should be intentionally small: just enough context to bias the implementation toward the safer path.

Under the hood

The simplest version of the cyber.md loop has three parts.

First, a posture generator reads the repository and creates or updates cyber.md. This can run manually, on a schedule, after major architecture changes, or as part of a security review. It uses a reasoning model to summarize the codebase into assets, boundaries, invariants, risk categories, preferred patterns, and test expectations.

Second, a posture context layer reads cyber.md during coding tasks. In Claude Code, this can be implemented with hooks, commands, or an MCP-backed broker. In CI, it can be implemented as a GitHub Action or a pull request bot. In a custom agent harness, it can be another context retrieval step [11].

Third, an agent receives only the relevant guidance. The agent does not need the entire posture file every time. It needs the small slice that applies to the current task.

A typical flow is straightforward: the developer asks the agent to change code, the task and touched files are classified, relevant posture guidance is selected, private context is fetched only if needed, a small advice packet is injected, the agent writes normal product code, and the diff is checked against the relevant invariants.
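The selection step of that flow can be sketched in a few lines. The posture entries, path globs, and advice strings below are invented for illustration; a real implementation would parse them out of cyber.md:

```python
# Minimal sketch: map the files touched by a task to posture guidance
# and emit a small advice packet. All entries here are hypothetical.
from fnmatch import fnmatch

POSTURE = [
    {"area": "auth", "paths": ["src/auth/*"],
     "advice": "Preserve the session-validation invariant; suggest a "
               "test that rejects expired tokens."},
    {"area": "billing", "paths": ["services/billing/*"],
     "advice": "Tenant IDs must come from the verified session, never "
               "the request body."},
]

def select_advice(touched_files, posture=POSTURE, limit=2):
    """Return at most `limit` guidance snippets relevant to the change."""
    packet = []
    for entry in posture:
        if any(fnmatch(f, glob)
               for f in touched_files for glob in entry["paths"]):
            packet.append(entry["advice"])
    return packet[:limit]  # keep the injected packet intentionally small

packet = select_advice(["src/auth/session.py", "README.md"])
```

The `limit` cap is the important design choice: the agent gets a slice, never the whole file.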

The important part is that cyber.md remains the durable, reviewable source of security guidance. The hooks and brokers are just delivery mechanisms.

Attackers can read markdown

There is an obvious problem with putting security guidance in a repository: attackers can read Markdown too. A repo-visible posture file can leak what the team cares about protecting. Even if it avoids exploit details, it may still reveal trust boundaries, sensitive surfaces, data flow assumptions, and areas that deserve extra scrutiny. Public forks, cloned repositories, archived copies, and old commit history can make those assumptions hard to remove later.

A repo-visible file should include durable security assumptions, safe implementation advice, trust boundaries, protected assets, risk categories, preferred patterns, test expectations, confidence levels, and human review triggers.

It should not include secrets, credentials, exploit chains, exact bypass steps, customer-specific incidents, sensitive production topology, internal attacker paths, or instructions that make exploitation easier.

There is another risk: Markdown files that agents treat as instructions can become an attack surface. A small change to an instruction file can alter how an agent behaves. A malicious dependency, compromised branch, unsafe prompt injection, or poisoned context source [16] can turn helpful guidance into harmful guidance. In other words, don’t import cyber.md from anywhere.

It should be reviewed like code. It should be protected by ownership rules. Changes should be diffed carefully. Sensitive versions should be encrypted or permissioned. The agent should treat guidance as advisory and should still defer to tests, reviewers, policies, and explicit developer intent.

Optional: the split model

We are also experimenting with a split model for sensitive but useful information. In this scenario we keep the “safe posture guide” described previously in Git, alongside the regular code. Sensitive versions that include things like vulnerability scan reports, private architecture notes, and sensitive service context are saved in a permissioned broker and/or as an encrypted file. The broker (in our case a tightly controlled MCP server) should return the minimum useful advice for the task, not the whole private threat model.
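The broker side of the split model reduces to a narrow lookup that never returns the raw private record. The store contents, area names, and authorization check below are invented for illustration:

```python
# Sketch of the split model's broker: given a task classification,
# return only the minimum useful slice of the private posture store.
# Store contents and area names are hypothetical.
PRIVATE_STORE = {
    "payments": {
        "safe_advice": "Require idempotency keys on retried charge calls.",
        "sensitive_detail": "(redacted; never leaves the broker)",
    },
}

def broker_lookup(task_area, caller_is_authorized):
    """Return minimal advice for the task, never the full private record."""
    if not caller_is_authorized:
        return None
    record = PRIVATE_STORE.get(task_area)
    return record["safe_advice"] if record else None
```

In our setup the authorization check and the store would sit behind the MCP server; the agent only ever sees the `safe_advice` string.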

What can go wrong?

There are several ways to build this badly.

A file named cyber.md is easy to find. If it contains detailed threat-modeling notes [23, 24], it becomes free reconnaissance. Realistically, an attacker who already has access to your code can produce this material without you. We believe the mitigation is separation: safe guidance in Git, sensitive detail in a brokered store elsewhere (see the split model above).

The second failure mode is instruction poisoning. Any file that an agent reads as guidance can be modified, misread, over-weighted, or combined with malicious context. This is especially important for repo-level instruction files. Treat them as part of the trusted development surface: Protect them with code owners, review rules, signing where practical, and conservative runtime handling.

The third failure mode is the broker. A context broker that classifies tasks, retrieves posture guidance, filters sensitive details, and injects advice could become a target.

The fourth failure mode is pull request noise. A model that is not cyber-capable will likely produce generic or bad advice, and noisy suggestions train reviewers to tune the whole loop out.

The fifth failure mode is stale guidance. Threat models, like any context, rot as the architecture evolves. A stale cyber.md can make agents preserve rules that no longer match reality. This file needs owners, a review cadence, and confidence levels.

For now, cyber-capable models and a posture file are not a replacement for SAST, DAST, SCA, secret scanning, dependency review, human threat modeling, architecture review, incident response, or security ownership [22]. We’re suggesting a context primitive that makes agents more likely to preserve known security properties during coding. It does not prove the code is secure.

What should go in cyber.md?

While we are still refining the best format, we have identified several effective patterns.

A useful cyber.md begins by defining its scope, specifying the relevant repositories, languages, frameworks, and data sensitivity levels [18, 20, 21]. It must explicitly outline protected assets that require careful handling, such as user identities, secrets, admin privileges, and integration credentials. Furthermore, it should identify trust boundaries, providing a mandatory invariant and specific coding guidance for each.

We strongly believe the cyber.md should be a defensive AppSec guidance layer [17], not an attack playbook. Its purpose is to help a cyber-capable model review code safely by describing security intent, expected controls, ownership and trust boundaries without exploit recipes or sensitive operational details.

An exposure layer could define where any public surface interacts with users and agents. It should guide the model to check whether inputs are authenticated, validated, scoped, rate-limited, and handled according to the intended trust level. It should not enumerate attacker tactics or provide step-by-step abuse paths.

The reachability layer exists to help the model understand whether a risky code path is relevant to the reviewed change. It should describe service relationships, ownership, and expected control handoffs at a high level so the model can avoid false positives and identify missing protections. It should avoid disclosing unnecessary topology, secrets, or operational internals.

The privilege layer exists to clarify what authorization model should apply. It should describe expected identity boundaries, tenant scoping, role checks, and policy enforcement responsibilities so the model can verify that sensitive operations are protected before execution.

The data-flow layer exists to protect sensitive data. It should describe data classes, allowed movement, required validation, storage expectations, logging rules, and encryption requirements. The goal is to help the model verify safe handling, not to reveal where valuable data can be targeted.

The control layer exists to encode defensive standards. It should capture required controls such as authn/authz, input validation, output encoding, secrets handling, dependency hygiene [19], audit logging, isolation, and secure defaults. Controls should be tied to owners and source evidence so findings are actionable.

The evidence layer exists to keep the model grounded. Claims should point to code, config, policy, tests, or scan results, but only at the level needed for remediation. The output should answer: what defensive expectation applies, whether the change preserves it, what evidence supports the conclusion, and what remediation is needed.
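Tying the control and evidence layers together, a single entry in the checked-in file might read as follows; the control, paths, and owner are hypothetical:

```markdown
## Control: audit logging for admin actions

- Expectation: every admin mutation emits an audit event.
- Evidence: src/admin/audit.py, tests/test_audit_events.py
- Owner: @platform-security
- Review trigger: any change that removes or bypasses this path
  should be flagged for human review.
```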

We hate scan reports

Security teams often struggle because review happens after the shape of the code is already set. They scan a given state with little ability to change what is already running in production, which is where great code scan reports go to die. cyber.md is an opportunity to move posture guidance earlier. Not to a separate planning meeting. Not to a quarterly review. To the exact moment the agent is about to write code.

The improvement is not dramatic on any single pull request. That’s the point. The codebase gets harder to break because small improvements keep showing up where the code is already changing. But compounding only works if the advice is good. Bad advice compounds too.

That is why the loop needs feedback [12]. Did the suggestion result in a useful test? Did the reviewer accept it? Did it prevent a real class of issue? Did it create churn? Did the same warning appear five times without changing the code? These questions matter more than whether the agent sounded security-aware.
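One way to make those questions operational is simple bookkeeping per piece of advice, so guidance that is repeatedly ignored can be demoted. The advice IDs and outcome labels below are assumptions, not a fixed schema:

```python
# Sketch of feedback bookkeeping: net acceptance score per advice ID.
# A persistently negative score suggests the advice is churn, not help.
from collections import Counter

def score(outcomes):
    """outcomes: list of (advice_id, 'accepted' | 'ignored') events."""
    c = Counter(outcomes)
    return {aid: c[(aid, "accepted")] - c[(aid, "ignored")]
            for aid, _ in c}

events = [("auth-1", "accepted"), ("auth-1", "ignored"),
          ("auth-1", "ignored"), ("billing-2", "accepted")]
scores = score(events)
```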

What’s coming?

The first version of the loop is intentionally simple: a Markdown file, a posture generator, and task-specific injection into coding sessions.

There are a few possible next steps. We’re exploring them but we’re open to feedback if you think there’s a better way:

  • Posture diffs should become first-class. When cyber.md changes, developers and agents should be able to see whether the repository added a new trust boundary, improved an invariant, downgraded confidence, or introduced a new watch area.
  • Agent hooks should get better at selecting only the relevant posture context. Large files are useful for humans, but agents need concise packets.
  • Private posture brokers should become easier to run. Teams should be able to keep sanitized guidance in Git while storing sensitive findings in encrypted or permissioned backends.
  • Test generation should become more reliable. The most valuable output is often not advice, but a small regression test that locks in the invariant.
  • Posture maturity should become measurable. Over time, teams should be able to see whether more high-risk flows have invariants, whether more changes include security tests, whether repeated advice is turning into durable code patterns, and whether developer trust in the suggestions is improving or declining.
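As a sketch of the first bullet, a posture diff over parsed invariants can be as small as a set comparison; the entry keys and invariant strings here are invented:

```python
# Sketch of a first-class posture diff: compare two parsed versions of
# cyber.md entries and report added, removed, and changed invariants.
def posture_diff(old, new):
    old_keys, new_keys = set(old), set(new)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "changed": sorted(k for k in old_keys & new_keys
                          if old[k] != new[k]),
    }

old = {"auth/session": "reject expired tokens"}
new = {"auth/session": "reject expired or revoked tokens",
       "billing/tenant": "tenant ID from verified session only"}
diff = posture_diff(old, new)
```

A reviewer or agent reading this diff can tell at a glance that a boundary was added and an invariant was strengthened.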

What will it cost?

Not much. A cyber.md file is just Markdown. You can write the first version manually. You can ask a reasoning model to draft it from the repository. You can review it like any other documentation change. The runtime cost depends on how often you regenerate posture and how much context you inject. The practical version does not need to run a full codebase scan on every pull request. Generate or refresh the posture file occasionally, then use cheap task-level selection during everyday coding.

There is also an attention cost. Developers have a limited appetite for AI-generated review comments. If the loop creates noisy, generic, or low-confidence suggestions, it will train people to ignore it. That is worse than not having it at all.

Where do I start?

Create cyber.md at the repo root or in the service directory. Add scope, protected assets, trust boundaries, security invariants, area-specific guidance, preferred patterns, and required tests by change type.

For Claude Code, start with a hook that reads the current prompt and touched files, selects a small relevant slice, and injects a compact advice packet. For pull requests, start with a CI job that comments with two or three posture suggestions only when the diff touches a relevant area.
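A minimal version of such a hook might look like the sketch below, assuming a harness that passes the event as JSON. The field names (`touched_files`) and path prefixes are assumptions for illustration, not a real hook schema:

```python
# Sketch of a prompt-time hook body: take an event describing the task,
# pick a small advice slice, and return it as injected context.
import json

ADVICE = {
    "migrations/": "Schema changes need a rollback note and a "
                   "data-loss check.",
    "src/api/": "New endpoints must declare an auth scope; suggest a "
                "401/403 test.",
}

def advise(event):
    files = event.get("touched_files", [])
    hits = [msg for prefix, msg in ADVICE.items()
            if any(f.startswith(prefix) for f in files)]
    return "\n".join(hits[:2])  # compact packet, two items at most

# Example event, as a hook harness might deliver it on stdin:
event = json.loads('{"touched_files": ["src/api/users.py"]}')
packet = advise(event)
```

The same `advise` function can back the CI job: comment only when the packet is non-empty.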

References

  1. Debois, Patrick. “Context Is the New Code.” AI Engineer Europe, 2024. AI Engineer Europe.
  2. Lowin, Jeremiah. “Your MCP Server is Bad (and You Should Feel Bad).” AI Engineer, 2026. AI Engineer.
  3. OX Security. “Anthropic’s Model Context Protocol Has Critical Security Flaw Exposed.” Tom’s Hardware, 2026. Tom’s Hardware.
  4. Cyata. “Anthropic’s Official Git MCP Server Had Some Worrying Security Flaws.” TechRadar, 2026. TechRadar.
  5. Koi Security. “A Malicious MCP Server Is Silently Stealing User Emails.” ITPro, 2025. ITPro.
  6. Packer, Charles, Sarah Wooders, Vivian Fang, and Ion Stoica. MemGPT: Towards LLMs as Operating Systems. Berkeley, CA: University of California, Berkeley, 2023. arXiv:2310.08560.
  7. MemGPT Project. “MemGPT.” Accessed May 10, 2026. MemGPT Project.
  8. Monigatti, Leonie. “MemGPT: Towards LLMs as Operating Systems.” 2025. Leonie Monigatti.
  9. Rasmussen, Preston, et al. Zep: A Temporal Knowledge Graph Architecture for Agent Memory. 2025. arXiv:2501.13956.
  10. Pawar, Tejas, et al. IMDMR: An Intelligent Multi-Dimensional Memory Retrieval System for Enhanced Conversational AI. 2025. arXiv:2511.05495.
  11. Lewis, Patrick, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, et al. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” Advances in Neural Information Processing Systems 33 (2020). arXiv:2005.11401.
  12. Shinn, Noah, Federico Cassano, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. “Reflexion: Language Agents with Verbal Reinforcement Learning.” 2023. arXiv:2303.11366.
  13. Schick, Timo, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. “Toolformer: Language Models Can Teach Themselves to Use Tools.” 2023. arXiv:2302.04761.
  14. Park, Joon Sung, Joseph O’Brien, Carrie Cai, Meredith Ringel Morris, Percy Liang, and Michael Bernstein. “Generative Agents: Interactive Simulacra of Human Behavior.” Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 2023. arXiv:2304.03442.
  15. Wang, Lei, et al. “Voyager: An Open-Ended Embodied Agent with Large Language Models.” 2023. arXiv:2305.16291.
  16. Anthropic. “Constitutional AI: Harmlessness from AI Feedback.” 2022. arXiv:2212.08073.
  17. OWASP Foundation. OWASP Application Security Verification Standard 4.0.3. 2021. OWASP.
  18. National Institute of Standards and Technology. Secure Software Development Framework (SSDF) Version 1.1. Gaithersburg, MD: NIST, 2022. NIST SP 800-218.
  19. Google. Supply-chain Levels for Software Artifacts (SLSA). Accessed May 10, 2026. SLSA.
  20. CISA. Secure by Design. Accessed May 10, 2026. CISA.
  21. Google. Building Secure and Reliable Systems. Sebastopol, CA: O’Reilly Media, 2020. Google SRE.
  22. Dowd, Mark, John McDonald, and Justin Schuh. The Art of Software Security Assessment: Identifying and Preventing Software Vulnerabilities. Boston: Addison-Wesley, 2006.
  23. Threat Modeling Manifesto Working Group. Threat Modeling Manifesto. Accessed May 10, 2026. Threat Modeling Manifesto.
  24. Microsoft. “The STRIDE Threat Model.” Accessed May 10, 2026. Microsoft Learn.
