Time to move on from Git. It served us well.
Git was designed for humans collaborating on the Linux kernel. It was built for a world where a handful of very smart people make deliberate, carefully considered changes, push them up, and argue about them on a mailing list. That world made a lot of sense in 2005. It makes less sense in 2026, when half the code being written is coming from AI agents that don’t sleep, don’t get bored, and often brute-force a solution by generating more text than is strictly necessary (even if it does fix the issue).
We need to talk about what version control should actually look like now. I’ve been building toward an answer with Flock, and I want to walk through the ideas.
The VCS Should Understand Your Code
The root problem with git is that it has no idea what it’s storing. It’s a content-addressed blob store. It knows bytes. It doesn’t know that you just changed a function signature, or that the class you deleted had 12 downstream callers, or that the code you’re writing is structurally identical to something that already exists three directories over.
Flock uses tree-sitter grammars to build an AST-aware semantic layer on top of the version control system. This means it actually parses your code into an abstract syntax tree and understands the structure — functions, classes, interfaces, dependencies, call graphs.
This unlocks basically everything else I’m going to talk about. Once your VCS understands code as code instead of as text, the entire experience changes.
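To make "code as code" concrete, here is a minimal sketch of the idea. It uses Python's built-in ast module as a stand-in for tree-sitter, so it only handles Python source; the function name outline and the sample source are made up for illustration and are not Flock's API. It extracts each function's parameters and the plain names it calls, which is the raw material for a call graph.

```python
import ast

SOURCE = '''
def validate(payment, currency):
    check_amount(payment)
    return lookup_rate(currency)

def check_amount(payment):
    return payment > 0
'''

def outline(source: str) -> dict[str, dict[str, list[str]]]:
    """Map each function name to its parameters and the plain names it calls."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            # Collect direct calls such as foo(...); method calls are ignored here.
            calls = sorted({
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            })
            result[node.name] = {
                "params": [arg.arg for arg in node.args.args],
                "calls": calls,
            }
    return result

symbols = outline(SOURCE)
print(symbols["validate"])
# {'params': ['payment', 'currency'], 'calls': ['check_amount', 'lookup_rate']}
```

A text diff of this file would report changed lines. The outline reports changed symbols, which is the unit everything later in this post works with.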
For example: conflict detection. Git sees two people edited line 47 and panics. An AST-aware system sees “these two changes modified different parameters of the same function signature” and can tell you whether that’s actually a conflict or if it merges cleanly. It can classify changes by risk — a renamed variable is low risk, a changed public interface with 12 callers is high risk. That distinction matters enormously when you’re reviewing agent output at scale.
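A rough sketch of both ideas, assuming changes have already been reduced to semantic records. The SemanticChange fields, the change kinds, and the risk thresholds are all illustrative assumptions, not Flock's actual model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SemanticChange:
    kind: str            # hypothetical kinds: "rename_local", "signature_change", "body_change"
    symbol: str
    is_public: bool
    caller_count: int

def classify_risk(change: SemanticChange) -> str:
    """Return "low", "medium", or "high" using illustrative thresholds."""
    if change.kind == "rename_local":
        return "low"     # invisible outside the function
    if change.kind == "signature_change" and change.is_public:
        # A public interface change ripples out to every caller.
        return "high" if change.caller_count > 0 else "medium"
    return "medium"

def conflicting_params(a_params: set[str], b_params: set[str]) -> set[str]:
    """Two edits to one signature collide only on parameters both of them touched."""
    return a_params & b_params

rename = SemanticChange("rename_local", "validate.tmp", False, 0)
api = SemanticChange("signature_change", "PaymentProcessor.process", True, 12)
print(classify_risk(rename), classify_risk(api))        # low high
print(conflicting_params({"currency"}, {"retries"}))    # set()
```

An empty set from conflicting_params is the "merges cleanly" case: both edits hit the same line of text, but not the same parameter.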
Real-Time Awareness Over WebSockets
Here’s a fun exercise. Ask yourself: what does your git server actually know about what’s happening in your repo right now?
Nothing. You push bytes, you pull bytes, and in between the server is completely in the dark. It doesn’t know that two agents are about to create a merge conflict. It doesn’t know that the function you’re rewriting has someone else actively working on it.
Flock maintains persistent WebSocket connections between every client — human editors, agent CLIs, whatever — and the server. Events stream continuously. Not raw diffs or character streams, but structured semantic updates classified by the AST layer. When an agent checkpoints some work, everyone subscribed to that file, symbol, or module gets a notification immediately. No push, no pull. Changes just flow.
This means the server has a live map of who’s working where, on what, right now. And that enables some really fun things.
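The routing part can be sketched without any networking. Below is an in-memory stand-in for the server's event hub: no WebSockets, just the subscribe-by-symbol logic. The class names, event fields, and the rule that enclosing scopes also get notified are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class SemanticEvent:
    actor: str
    symbol: str      # dotted path, e.g. "PaymentProcessor.validate"
    summary: str

class EventHub:
    """In-memory stand-in for server-side event routing."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, event):
        # Deliver to subscribers of the symbol and of every enclosing scope:
        # "PaymentProcessor.validate" also notifies "PaymentProcessor".
        parts = event.symbol.split(".")
        delivered = 0
        for i in range(1, len(parts) + 1):
            topic = ".".join(parts[:i])
            for callback in self._subscribers.get(topic, []):
                callback(event)
                delivered += 1
        return delivered

hub = EventHub()
inbox = []
hub.subscribe("PaymentProcessor", inbox.append)
hub.subscribe("PaymentProcessor.validate", inbox.append)
hub.subscribe("CurrencyService", inbox.append)

event = SemanticEvent("agent-a", "PaymentProcessor.validate", "+2 params, body rewritten")
count = hub.publish(event)
print(count, len(inbox))   # 2 2
```

In a real deployment each callback would write to a client connection instead of appending to a list.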
Conflict Forecasting
Because the server sees all active explorations and their semantic change sets in real time, it can predict conflicts before they happen.
Picture this: Agent A is rewriting PaymentProcessor.validate(). Agent C just claimed a task that touches the same function. The server sees the overlap and sends a heads-up: “Agent A is actively modifying validate() via task tk-a1b2 — coordinate or continue?”
This is advisory, not blocking. Nobody gets locked out. But instead of discovering the conflict three hours later at merge time, you get a five-minute early warning. That’s the difference between a painful merge and a quick conversation.
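At its simplest, forecasting is a set intersection over the symbols each piece of active work touches. Here is a minimal sketch under that assumption; ActiveWork, forecast, and the task ID tk-c3d4 are hypothetical names for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ActiveWork:
    agent: str
    task_id: str
    symbols: set[str] = field(default_factory=set)

def forecast(active: list[ActiveWork], claim: ActiveWork) -> list[str]:
    """Return advisory warnings for symbols the new claim shares with active work."""
    warnings = []
    for work in active:
        if work.agent == claim.agent:
            continue   # overlapping with yourself is not a conflict
        for symbol in sorted(work.symbols & claim.symbols):
            warnings.append(
                f"{work.agent} is actively modifying {symbol} "
                f"via task {work.task_id}: coordinate or continue?"
            )
    return warnings

active = [ActiveWork("Agent A", "tk-a1b2", {"PaymentProcessor.validate"})]
claim = ActiveWork("Agent C", "tk-c3d4", {"PaymentProcessor.validate", "PaymentProcessor.refund"})
warnings = forecast(active, claim)
for line in warnings:
    print(line)
```

The function only returns strings. It never rejects the claim, which keeps the advisory-not-blocking property.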
Ghost Text
This one’s my favorite. Ghost text is faint editor overlays showing what other developers or agents are currently changing — at the symbol level, not raw keystrokes.
So you’re working on PaymentController and you see a faint annotation: “Agent B is modifying validate(): +2 params, body rewritten, ~15 lines.” You haven’t left your editor. You didn’t check a dashboard. You just… know.
It’s semantic presence. Not Google Docs-style character-by-character co-editing (please, no). Just enough awareness to stay coordinated without being distracted. Co-aware, not co-editing.
Continuous Semantic Review
This is maybe the thing I’m most excited about killing: the batch PR.
We’ve all just… accepted that code review means staring at a massive diff after the work is already done. You open a PR, you see 47 files changed, your eyes glaze over, you leave a comment about a missing semicolon on line 312, and you click approve. We’ve confused the mechanism with the goal. The goal is quality. The mechanism — batching into a PR — is just how git forces you to work.
Flock replaces this with continuous review. When an agent finishes a semantic unit of work — say it’s done refactoring validate() — the server generates a review notification for just that change. You review it. Approve or comment. The agent keeps working on the next piece. Repeat.
And the reviews themselves are organized by semantic change, not by file. Not “payment.ts changed.” Instead: “Signature change: PaymentProcessor.process() — added currency parameter. 12 callers affected, all updated.” You can expand to the actual diff for any change, but you start from meaning. You see an impact graph showing downstream consumers and whether they’ve been handled. You never scroll through thousands of lines hoping to find what matters.
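The "12 callers affected, all updated" line falls out of a call graph plus a record of which callers the change set touched. A small sketch, assuming the call graph is a plain mapping from caller to callees; the function name and the sample graph are invented for the example.

```python
def impact_summary(symbol: str, change: str,
                   call_graph: dict[str, set[str]], updated: set[str]) -> str:
    """Summarise a change by who calls the symbol and which callers were updated."""
    callers = sorted(caller for caller, callees in call_graph.items() if symbol in callees)
    pending = [caller for caller in callers if caller not in updated]
    if pending:
        status = f"{len(pending)} not yet updated: {', '.join(pending)}"
    else:
        status = "all updated"
    return f"{change}: {symbol}. {len(callers)} callers affected, {status}."

call_graph = {
    "CheckoutController.pay": {"PaymentProcessor.process"},
    "RefundJob.run": {"PaymentProcessor.process", "Ledger.record"},
    "Ledger.record": set(),
}
summary = impact_summary(
    "PaymentProcessor.process", "Signature change", call_graph, {"CheckoutController.pay"}
)
print(summary)
# Signature change: PaymentProcessor.process. 2 callers affected, 1 not yet updated: RefundJob.run.
```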
The Exploration Tree
Right now, when an agent solves a problem, you see the final result. You don’t see the three approaches it tried and abandoned. You don’t know that it spent 20 minutes on an async approach before deciding it added complexity without benefit, or that it considered extracting a service class before realizing it would break dependency injection.
Flock tracks all of this in an exploration tree. Agents work in isolated workspaces called explorations. They can try an approach, checkpoint it, abandon it, start a new one. The full history is preserved:
Task: Fix payment validation bug
├─ Exploration 1: "try-async-approach"
│  ├─ Modified 3 functions in payment.ts
│  ├─ Checkpointed
│  └─ Abandoned: "async added complexity without benefit"
├─ Exploration 2: "extract-validation-service"
│  ├─ Created PaymentValidator class
│  ├─ Checkpointed
│  └─ Abandoned: "breaks DI container registration"
└─ Exploration 3: "targeted-fix" ← promoted
   ├─ Modified validate() directly
   ├─ Added test coverage
   └─ Promoted to main
You can spectate an agent working in real time through its semantic trail — not terminal output, but a structured view of what it’s doing and why. When the promoted approach has issues, you can look at the alternatives and ask, “what about that second approach? What if we fix the DI issue?”
That’s institutional memory for agent decision-making.
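As a data structure, the tree is small. Here is a sketch of one way to model it; the class names, the status values, and the rule that a task has at most one promoted exploration are my assumptions for the example.

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    ACTIVE = "active"
    ABANDONED = "abandoned"
    PROMOTED = "promoted"

@dataclass
class Exploration:
    name: str
    checkpoints: list[str] = field(default_factory=list)
    status: Status = Status.ACTIVE
    reason: str = ""

    def checkpoint(self, note: str) -> None:
        self.checkpoints.append(note)

    def abandon(self, reason: str) -> None:
        # Abandoning keeps the checkpoints; nothing is deleted.
        self.status, self.reason = Status.ABANDONED, reason

@dataclass
class Task:
    title: str
    explorations: list[Exploration] = field(default_factory=list)

    def explore(self, name: str) -> Exploration:
        exploration = Exploration(name)
        self.explorations.append(exploration)
        return exploration

    def promote(self, exploration: Exploration) -> None:
        if any(e.status is Status.PROMOTED for e in self.explorations):
            raise ValueError("task already has a promoted exploration")
        exploration.status = Status.PROMOTED

    def alternatives(self) -> list[tuple[str, str]]:
        """Abandoned approaches and why, kept for later review."""
        return [(e.name, e.reason) for e in self.explorations if e.status is Status.ABANDONED]

task = Task("Fix payment validation bug")
first = task.explore("try-async-approach")
first.checkpoint("Modified 3 functions in payment.ts")
first.abandon("async added complexity without benefit")
fix = task.explore("targeted-fix")
fix.checkpoint("Modified validate() directly")
task.promote(fix)
print(task.alternatives())
# [('try-async-approach', 'async added complexity without benefit')]
```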
Policy Enforcement at Creation Time
This is where the governance stuff comes in, and honestly I think it’s the most practically impactful piece.
Architectural decisions today live in wikis and ADRs that nobody reads. Especially not agents. So agents merrily violate your architecture while generating perfectly functional code that will slowly rot your codebase from the inside.
Flock has a policy engine that intercepts operations at five points: file writes, checkpoints, exploration promotion, merges, and task lifecycle events. Every operation gets one of three verdicts: Allow, Gate (pause for human review), or Block (rejected with explanation).
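Here is a minimal sketch of that shape: rules registered per interception point, each returning a verdict and an explanation. The hook names, the two sample rules, and the choice that the strictest verdict wins when several rules fire are assumptions for illustration.

```python
from enum import IntEnum

class Verdict(IntEnum):
    ALLOW = 0
    GATE = 1     # pause for human review
    BLOCK = 2    # rejected with explanation

# Hypothetical names for the five interception points.
HOOKS = ("file_write", "checkpoint", "promotion", "merge", "task_lifecycle")

class PolicyEngine:
    def __init__(self):
        self._rules = {hook: [] for hook in HOOKS}

    def register(self, hook, rule):
        """rule(operation) returns a (Verdict, explanation) pair."""
        self._rules[hook].append(rule)

    def evaluate(self, hook, operation):
        # Assumption: the strictest verdict wins when several rules apply.
        verdict, explanation = Verdict.ALLOW, ""
        for rule in self._rules[hook]:
            candidate, why = rule(operation)
            if candidate > verdict:
                verdict, explanation = candidate, why
        return verdict, explanation

def gate_migrations(op):
    if op["path"].startswith("migrations/"):
        return Verdict.GATE, "schema changes need human review"
    return Verdict.ALLOW, ""

def block_env_files(op):
    if op["path"].endswith(".env"):
        return Verdict.BLOCK, "secrets files are not writable by agents"
    return Verdict.ALLOW, ""

engine = PolicyEngine()
engine.register("file_write", gate_migrations)
engine.register("file_write", block_env_files)

verdict, why = engine.evaluate("file_write", {"path": "migrations/001.sql"})
print(verdict.name, "-", why)   # GATE - schema changes need human review
```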
Some examples of what this enables:
Scope enforcement. The agent gets a task. The policy engine knows what files and symbols are in scope. If the agent starts wandering — and it will, because agents love to “helpfully” refactor things you didn’t ask them to touch — the system catches it immediately. In the default “split” mode, out-of-scope observations get automatically extracted into discovery tasks. The insight is captured, the scope stays clean.
DRY detection with three-layer analysis. This one’s cool. Layer 1 is signature matching — fast comparison of return types, parameter types, name similarity. Layer 2 is body analysis — AST structural comparison that catches methods with different names but identical operations. Layer 3 is pattern conformance — it checks whether you’re implementing a standalone class when there’s an existing interface with three implementations you should be conforming to. Before the agent even writes code, the system surfaces relevant existing implementations from a reuse index. “Hey, CurrencyService already does this. Use that.”
Anti-pattern detection via AST. Not grep-based string matching that catches variable names and comments. Actual AST rules that match type declarations. You can define domain-specific rules — no floating-point for currency calculations, no hardcoded interest rates, no PII fields without encryption annotations — and they’re enforced at write time with structured explanations of why it’s wrong.
Architecture rules. No direct database calls from the API layer. No public methods without interfaces in Core. No circular dependencies. Because Flock has the AST and dependency graph, it checks these in real time as code is written. Not in a PR review. Not in a linter that runs in CI twenty minutes later. Right now, as the agent types.
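To show why AST rules beat string matching, here is a sketch of the "no floating-point for currency" rule. It uses Python's built-in ast module as a stand-in for tree-sitter, and the list of money-like names is an assumption. Note that the comment mentioning float in the sample does not trigger the rule, because comments never reach the AST.

```python
import ast

MONEY_WORDS = ("price", "amount", "balance", "total")   # illustrative heuristic

def float_currency_violations(source: str) -> list[str]:
    """Flag parameters and annotated variables that look like money but are typed float."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.arg):
            name, annotation = node.arg, node.annotation
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            name, annotation = node.target.id, node.annotation
        else:
            continue
        is_float = isinstance(annotation, ast.Name) and annotation.id == "float"
        if is_float and any(word in name.lower() for word in MONEY_WORDS):
            violations.append(
                f"line {node.lineno}: '{name}' is typed float; use Decimal for currency"
            )
    return violations

SAMPLE = '''
from decimal import Decimal

# float is fine for ratios, never for an amount
def charge(amount: float, retries: int) -> None:
    ratio: float = 0.5
    total: Decimal = Decimal("0")
'''

violations = float_currency_violations(SAMPLE)
for violation in violations:
    print(violation)
# line 5: 'amount' is typed float; use Decimal for currency
```

A grep for "float" would have matched three lines here, two of them harmless.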
All of these policies live in .flock/policies.toml, versioned with your code. Different branches can have different policies. The whole team explicitly agrees on what’s enforced. It’s not some invisible linter config that one person set up three years ago.
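The body-analysis layer of DRY detection described above can be approximated by normalizing identifiers away and comparing what is left. This sketch again uses Python's ast module in place of tree-sitter, and the function names are made up. One known limitation: called function names are normalized too, so two bodies with the same shape but different callees would match. A real implementation would need to be smarter about that.

```python
import ast

class _Normalize(ast.NodeTransformer):
    """Rename identifiers to positional placeholders so only the shape remains."""

    def __init__(self):
        self._names = {}

    def _alias(self, name):
        # First distinct name becomes v0, the next v1, and so on.
        return self._names.setdefault(name, f"v{len(self._names)}")

    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id=self._alias(node.id), ctx=node.ctx), node)

    def visit_arg(self, node):
        # Parameter names are normalized; their annotations are left untouched.
        node.arg = self._alias(node.arg)
        return node

def body_fingerprint(source: str) -> str:
    """Structural fingerprint of the first function defined in `source`."""
    func = ast.parse(source).body[0]
    func.name = "_"   # the function's own name is irrelevant to its structure
    return ast.dump(_Normalize().visit(func))

A = "def to_cents(amount, rate):\n    converted = amount * rate\n    return round(converted * 100)\n"
B = "def scale(value, factor):\n    result = value * factor\n    return round(result * 100)\n"
C = "def add_fee(amount, fee):\n    return amount + fee\n"

print(body_fingerprint(A) == body_fingerprint(B))   # True
print(body_fingerprint(A) == body_fingerprint(C))   # False
```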
Event Sourcing and Cryptographic Trust
Under the hood, Flock is built on an append-only event log instead of git’s snapshot-plus-delta model. Every operation — checkpoint, exploration, promotion, merge — generates an immutable event. Each event includes a BLAKE3 hash of the previous event (forming a chain) and an Ed25519 cryptographic signature. Snapshots get Merkle root verification.
What this means in practice: you can’t rewrite history without it being detected. You can’t silently alter a checkpoint, because changing any event invalidates the hash of every event after it. If an agent goes rogue, you have a complete, cryptographically verified trail of exactly what it did, when, and in what order. Reverting a bad change is a new event (a revert), not history rewriting.
This matters more than you might think in an agentic world. When a human writes bad code, you can ask them what they were thinking. When an agent writes bad code, you need the system to answer that question for you.
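The chaining is simple enough to sketch with the standard library. Two substitutions to be aware of: this uses BLAKE2b from hashlib because BLAKE3 is not in Python's standard library, and it leaves out the Ed25519 signatures and Merkle roots entirely. The event fields are invented for the example.

```python
import hashlib
import json

GENESIS = "0" * 64   # placeholder "previous hash" for the first event

def _digest(prev_hash: str, payload: dict) -> str:
    body = json.dumps(payload, sort_keys=True).encode()
    # BLAKE2b stands in for BLAKE3 here (stdlib only).
    return hashlib.blake2b(prev_hash.encode() + body, digest_size=32).hexdigest()

def append(log: list[dict], payload: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)})

def verify(log: list[dict]) -> bool:
    """Recompute every hash from the start; any edited event breaks the chain."""
    prev_hash = GENESIS
    for event in log:
        if event["prev"] != prev_hash or event["hash"] != _digest(prev_hash, event["payload"]):
            return False
        prev_hash = event["hash"]
    return True

log = []
append(log, {"type": "checkpoint", "exploration": "targeted-fix"})
append(log, {"type": "promotion", "exploration": "targeted-fix"})
append(log, {"type": "revert", "target": 1})   # undoing is a new event, not an edit

intact = verify(log)
log[0]["payload"]["exploration"] = "something-else"   # tamper with history
tampered = verify(log)
print(intact, tampered)   # True False
```

A hash chain alone only proves the log is internally consistent. Signatures are what tie each event to whoever produced it.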
What I’m Actually Building
Flock is the CLI and core engine. It’s written in Rust. The event log, semantic layer, exploration model, and policy engine all live here. It works fully offline — local-first — with real-time features layering on when you connect to a server.
The server side handles WebSocket connections, event routing, task brokering, presence tracking, conflict forecasting, and the expensive semantic computations (AST parsing, dependency graphs, impact analysis) that get shared across all clients so agents aren’t each doing redundant work.
Is it done? No. Is it a lot of work? Yes. Am I going to build it anyway? Obviously.
We’re heading into a world where most code is written by agents. The tools we use to manage that code should understand it, coordinate it in real time, enforce quality at the moment of creation, and give humans the visibility they need to stay in control. Git doesn’t do any of that. It was never designed to.
Time to build something that is.