This post was originally published on Medium.
With tons of code now created with the “help” of AI assistants, version control tools probably need to adapt and precisely track which lines, functions, or changes are AI-generated vs human-written. Going one step further, they should also track the prompts that produced that code, either directly, or via a hashed/sanitized representation.
Storing prompt provenance inside the history would provide context for blame/annotate, improve long-term understanding, and open new dimensions for audit, compliance, refactoring, and risk assessment.
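As a minimal sketch of what the hashed/sanitized representation could look like, the snippet below builds hypothetical commit-message trailers (the trailer names `AI-Assisted`, `AI-Tool`, `AI-Model`, `AI-Prompt-SHA256` are illustrative, not an established convention):

```python
import hashlib


def prompt_trailers(prompt: str, model: str, tool: str) -> str:
    """Build hypothetical commit-message trailers recording AI provenance."""
    # Store only a hash, so the raw prompt (which may contain secrets or
    # proprietary context) never enters the repository history.
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return "\n".join([
        "AI-Assisted: true",
        f"AI-Tool: {tool}",
        f"AI-Model: {model}",
        f"AI-Prompt-SHA256: {digest}",
    ])


print(prompt_trailers("Write a DTO for the user record", "gpt-4", "copilot"))
```

Trailers are attractive because existing tooling (e.g. `git interpret-trailers`) already parses them, but the same metadata could equally live in commit notes or a sidecar store.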
1. Why provenance matters
Today, code generated by AI is indistinguishable from human code at the version-control level. But teams still need to answer:
- Where did this function come from?
- Was it written by a human or an AI assistant?
- Which model was involved? Under what prompt or intention?
- Do sensitive components contain unreviewed AI logic?
This is relevant for:
- security
- legal/IP exposure
- regulation
- software quality
- maintainability
Eventually, provenance will be as fundamental as authorship.
2. Improved annotate/blame
Traditional annotate/blame shows who last modified a line and when, but not how or why that code came to be.
Prompt provenance allows richer insight:
“This function originated from AI suggestion X, edited twice by humans, and regenerated using model Y last month.”
This turns annotate into a timeline of intent, not just a timestamped diff.
3. Compliance and governance
In regulated environments or contractual scenarios, organizations may need:
- proof of human review for sensitive modules,
- detection of code generated by non-approved models,
- controls on where AI can be used,
- tracking of repeated automated changes.
A provenance-aware VCS could enforce policies like:
“AI-generated changes touching security code must require two reviewers and full test coverage.”
Version control becomes a policy engine, not just a storage mechanism.
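A policy check of this kind could be sketched as a table of path patterns mapped to required human reviewers, evaluated against each incoming change. The policy table and the shape of the `change` record are assumptions for illustration:

```python
from fnmatch import fnmatch

# Hypothetical policy table: path glob -> minimum human reviewers
# required for AI-generated changes touching that path. Order matters:
# the first matching pattern wins, with "*" as the catch-all.
POLICIES = {
    "src/security/*": 2,
    "src/auth/*": 2,
    "*": 1,
}


def violations(change: dict) -> list[str]:
    """Return policy violations for a change record with keys
    'files', 'ai_generated', 'human_reviewers' (all illustrative)."""
    if not change["ai_generated"]:
        return []  # policies here only constrain AI-generated changes
    problems = []
    for path in change["files"]:
        required = next(n for pat, n in POLICIES.items() if fnmatch(path, pat))
        if change["human_reviewers"] < required:
            problems.append(f"{path}: needs {required} reviewers, "
                            f"has {change['human_reviewers']}")
    return problems
```

Wired into a pre-receive hook or CI gate, such a check would reject pushes instead of merely reporting them.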
4. Code regeneration with newer models
Some code is effectively derivative from a prompt or high-level specification — boilerplate, glue code, DTOs, adapters, generated configs, etc.
If the VCS stores provenance and prompt metadata, you can:
- re-run the same intent against a newer model,
- generate a candidate patch,
- diff it with the current implementation,
- validate correctness via tests,
- and treat the regeneration as a structured refactor.
This is a major step forward: code becomes re-materializable from intent, not a static artifact frozen in time.
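The regeneration loop above can be sketched as a small driver where the model call and the test runner are injected callables (both hypothetical; a real implementation would shell out to an assistant API and the project's test suite):

```python
import difflib


def regenerate(stored_prompt: str, current_code: str, generate, run_tests):
    """Re-run a stored intent against a newer model and propose a patch.

    `generate(prompt) -> code` and `run_tests(code) -> bool` are
    injected stand-ins for the model call and the validation step.
    Returns a unified diff to review, or None if there is nothing to do.
    """
    candidate = generate(stored_prompt)
    patch = "".join(difflib.unified_diff(
        current_code.splitlines(keepends=True),
        candidate.splitlines(keepends=True),
        fromfile="current", tofile="regenerated"))
    if not patch:
        return None  # newer model reproduces the current code; nothing to do
    if run_tests(candidate):
        return patch  # propose as a structured refactor for human review
    return None  # candidate fails validation; keep the current implementation
```

Returning a diff rather than overwriting files keeps the human in the loop: regeneration produces a reviewable change, not a silent rewrite.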
5. RAG over version history
Once provenance is tracked, version control becomes a knowledge base:
- prompts
- model versions
- surrounding code
- refactor history
- human edits
With retrieval-augmented generation (RAG), you can ask questions like:
- “Show me all AI-generated logic in the authentication pipeline.”
- “Which files have seen repeated AI regeneration?”
- “Explain the intent behind this module and its prompt history.”
- “Identify risky regions where AI-generated code has been changed frequently without human review.”
- “In the last two weeks I modified some code, replacing a few string operations with stream access. Can you show me where?”
RAG turns commit history into semantic retrieval, not just text search.
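The retrieval half of this can be sketched with naive keyword overlap over provenance-rich commit records; a real system would use embeddings and a vector index, but the shape of the query is the same. The commit record fields (`id`, `message`, `prompt`) are assumptions:

```python
def search_history(commits: list[dict], query: str) -> list[str]:
    """Rank commits by keyword overlap between the query and each
    commit's message plus stored prompt. Naive stand-in for embedding
    search, just to show what a provenance index makes queryable."""
    terms = set(query.lower().split())
    scored = []
    for c in commits:
        haystack = set(f"{c['message']} {c.get('prompt', '')}".lower().split())
        score = len(terms & haystack)
        if score:
            scored.append((score, c["id"]))
    # Highest overlap first; ties broken by commit id for determinism.
    return [cid for _, cid in sorted(scored, reverse=True)]
```

Because prompts capture intent that commit messages often omit, indexing both answers questions like the "strings to stream access" query above even when the diff itself is hard to search.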
6. Developer experience benefit
Provenance also helps individual contributors:
- understand why code looks the way it does
- see how AI suggestions have evolved
- know when regeneration is safe
- detect when repeated AI patches indicate deeper design issues
Over time, provenance becomes part of the shared memory of the codebase.