On Trust and AI — Applied

The Verification Gap

AI generates the meeting notes. Humans verify them. So why does no one know the notes have been reviewed?

The meeting ends. The AI writes it up.

Every week, millions of meetings end the same way. The AI produces a summary. The project manager reads it, corrects a misattributed action item, fixes a decision that the model hallucinated, and moves on. Ten minutes of careful, skilled verification work — and no one will ever know it happened.

Tools like Microsoft Copilot have made AI note-taking ubiquitous. They transcribe, summarize, extract action items, and tag decisions. For many teams, the meeting recap is the first thing people check the next morning. The generation problem is solved. What remains unsolved is the trust problem.

Consider what happens when a typical meeting ends. Here is the output Copilot produces — a clean, structured recap that looks authoritative from the moment it appears.

It looks comprehensive. It looks correct. Most people will never question it.

[Example: what Copilot produces today]

The invisible value of review

When a project manager reviews AI-generated meeting notes, their value isn't in producing the summary — the model already did that. The value is in confirming that what was captured is correct and aligned with what actually happened. They catch the hallucinated deadline. They notice the AI merged two separate conversations into a single action item. They flag the decision that was discussed but never actually agreed to.

This is verification work, and it's the most important human contribution in an AI-assisted workflow. But the tools don't capture it. They don't surface it. They don't share it.

Citations are not verification

Today, Copilot adds small superscript citations linking summary claims to transcript timestamps. This looks like accountability. It isn't.

Citations are a fundamentally human trust technique. When a person writes a report and adds a citation, they are signalling: I'm a professional, this happened, and here is where you can go check it yourself. That signal works because it's backed by a human level of care — the author took the time to get it right, and the citation is an invitation to verify their diligence.

AI co-opts the form of citation without the substance behind it. A model can and frequently does pass errors through, then confidently cite them. The citation says “I found this in the transcript.” It does not say “I got this right.”

Every person who opens the document faces the same question: can I trust these notes? And every person must answer it independently. If no one checks the citations, they are worthless. The superscript numbers are decoration — they tell you the AI can point to its source material, but they tell you nothing about whether anyone with actual context has confirmed the result.

So why don't we log when citations are checked? Why don't we make it easy for humans to correct the AI when it gets something wrong, and surface that correction to everyone else? This is the missed productivity opportunity. Not the generation of notes — that's already automated. The missed opportunity is that the verification work your people are already doing is treated as a private, disposable task rather than a shared organizational asset.
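As a concrete sketch of what "logging the check" could mean: every time a human verifies a citation or corrects the AI, the tool records an event instead of discarding the work. Everything below is hypothetical — the event fields, the document names, and the `log_event` helper are illustrative, not any product's actual API.

```python
import json
from datetime import datetime, timezone

# A minimal, hypothetical event log: each human check or correction
# becomes a durable record attached to the document.
events: list[dict] = []

def log_event(document: str, section: str, action: str,
              person: str, detail: str = "") -> None:
    """Record one act of human verification against an AI-generated document."""
    events.append({
        "document": document,
        "section": section,
        "action": action,   # "citation_checked" | "corrected" | "confirmed"
        "person": person,
        "detail": detail,
        "at": datetime.now(timezone.utc).isoformat(),
    })

# The PM's post-meeting review, captured instead of evaporating:
log_event("Sprint Review notes", "Action items", "citation_checked", "PM")
log_event("Sprint Review notes", "Decisions", "corrected", "PM",
          "deadline was misattributed in the summary")

# Anyone opening the document later can see that review happened:
print(json.dumps(events, indent=2))
```

The point is not the storage format; it is that the verification act leaves a trace the next reader can see.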

[Example: the same notes with Copilot's citation references]

Trust doesn't scale when verification is invisible

Consider a typical week. A project manager verifies the meeting recap for the Monday sprint review, the Wednesday architecture sync, and the Friday stakeholder update. That's perhaps 30 minutes of focused review work — checking facts against memory, comparing action items to what was discussed, flagging sections where the AI got it wrong.

On Thursday, a teammate opens the Monday notes to check a commitment. What do they see?

There is no signal that the PM already reviewed these notes. No indication of which sections were confirmed, which were corrected, or which the PM hadn't gotten to yet. The teammate must either re-verify independently or choose to trust the model on faith.

Multiply this by every person, every meeting, every week. The organization is paying for verification work that evaporates the moment it's done.

[Example: the teammate's experience, with no signal of human review]

What verification capture could look like

The fix is not complicated. If the tool knows that a human reviewed a section, it should say so. If a reviewer flagged an item as inaccurate, that flag should be visible to the next person who opens the document. If three sections were verified and one wasn't, the document should communicate that at a glance.

This is not a new concept. In On Trust and AI, I describe a pattern for AI-assisted legal research where every citation carries a verification state — green for confirmed, amber for flagged, grey for unreviewed. The reviewer's name and timestamp are attached. A partner opening the brief can see immediately which parts have been checked and by whom.

Sarah, the project manager, spent 10 minutes after the meeting verifying the recap. That work is now visible to everyone on the team. Green means confirmed. Amber means something was flagged. Grey means no one has looked at it yet. Trust moves from the model to the team.

Notice what changed. The content is identical — the AI still generated these notes. But now each section carries a human signal. The flagged action item includes Sarah's note that the AI captured three regression issues but only two were actually discussed in detail. The risks section is grey because she hasn't reviewed it yet. Anyone opening this document instantly knows what's been checked and what hasn't.
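The three-state model described above can be sketched as a small data structure attached to each recap section. This is a hypothetical sketch, not any product's schema: the section titles, the `Review` fields, and the `confirm`/`flag` helpers are all invented to illustrate the pattern of confirmed / flagged / unreviewed states carrying a reviewer name and timestamp.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional

class ReviewState(Enum):
    UNREVIEWED = "grey"    # no human has looked at this section yet
    CONFIRMED = "green"    # a reviewer checked it against the meeting
    FLAGGED = "amber"      # a reviewer found a problem and left a note

@dataclass
class Review:
    state: ReviewState = ReviewState.UNREVIEWED
    reviewer: Optional[str] = None
    timestamp: Optional[datetime] = None
    note: Optional[str] = None  # reviewer's comment when flagging

@dataclass
class RecapSection:
    title: str
    body: str                   # AI-generated content, unchanged by review
    review: Review = field(default_factory=Review)

    def confirm(self, reviewer: str) -> None:
        self.review = Review(ReviewState.CONFIRMED, reviewer,
                             datetime.now(timezone.utc))

    def flag(self, reviewer: str, note: str) -> None:
        self.review = Review(ReviewState.FLAGGED, reviewer,
                             datetime.now(timezone.utc), note)

# Sarah's ten minutes of review, made durable:
recap = [
    RecapSection("Decisions", "..."),
    RecapSection("Action items", "..."),
    RecapSection("Risks", "..."),
]
recap[0].confirm("Sarah")
recap[1].flag("Sarah", "AI listed three regression issues; only two were discussed")
# recap[2] stays grey: no one has reviewed the risks section yet
```

Note the design choice: review state wraps the AI output rather than editing it, so the generated content and the human signal stay separable.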

[Example: the same notes with verification state visible]

Scaling trust across the organization

The real power emerges when you zoom out. If every meeting recap carries verification state, you can answer a question that's currently invisible: across all our recent meetings, which AI-generated outputs has someone actually reviewed?

The Client Onboarding Kickoff notes? No one's looked at them. The Security Audit Readiness Sync? Fully verified by David. The verification work that project managers and leads do every day is now a visible, shareable asset — and the team can make informed decisions about which AI outputs to rely on.
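The team-level view described above amounts to a simple aggregation over per-section review states. A minimal sketch, assuming each recap exposes a list of section states (the meeting names and the `coverage` function are hypothetical, echoing the examples in this section):

```python
from collections import Counter

# Hypothetical per-section review states for each recent meeting recap.
meetings = {
    "Client Onboarding Kickoff": ["unreviewed", "unreviewed", "unreviewed"],
    "Security Audit Readiness Sync": ["confirmed", "confirmed", "confirmed"],
    "Monday Sprint Review": ["confirmed", "flagged", "unreviewed"],
}

def coverage(states: list[str]) -> str:
    """Collapse per-section states into one label for the team view."""
    counts = Counter(states)
    if counts["unreviewed"] == len(states):
        return "unreviewed"
    if counts["confirmed"] == len(states):
        return "fully verified"
    return "partially reviewed"

for title, states in meetings.items():
    print(f"{title}: {coverage(states)}")
# Client Onboarding Kickoff: unreviewed
# Security Audit Readiness Sync: fully verified
# Monday Sprint Review: partially reviewed
```

The answer to "which AI outputs has someone actually reviewed?" falls out of data the team is already producing — it just needs to be kept.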

Trust is no longer invisible. Verification becomes a shared resource, not a private chore.

[Example: a team-level view of verification coverage]

Surface the work. Scale the trust.

We are not yet in an era of fully aligned, reliably correct AI. Models hallucinate. Transcripts miss nuance. Summaries compress away context. This isn't a failure of any particular product — it's the nature of the technology as it exists today.

What is a failure is the decision to treat human verification as invisible. When a project manager reviews the AI's output, they are performing the most trust-critical task in the workflow. They are the reason the rest of the team can rely on the notes. To bury that work — to leave no trace that it happened — is to waste the very thing that makes AI outputs safe to use at scale.

Generation is abundant. Belief is scarce. The tools that win will be the ones that make it easy to trust the team reviewing the AI — not just the AI itself.
