A Hidden Guild Response: On the “Plausibility Gap”

We have long followed the adventures of the publication First Monday, which often has very useful things to say about the Internet.  Of late, FM has been venturing into web-connected services, such as AI.

The most recent edition offers a paper by Antony Dalmiere, “Measuring susceptibility: A benchmark for conspiracy theory adherence in large language models.”

Abstract

A critical vulnerability exists within state-of-the-art large language models: while robustly debunking scientifically baseless claims like the “Flat Earth Theory” they consistently fail to reject politically plausible conspiracies that mimic legitimate discourse. We term this the “plausibility gap”.

Up to here, we were on the verge of applause.  But the Abstract continued:

“To systematically quantify this risk, we introduce the Conspiracy Adherence Score (CAS), a novel risk-weighted metric, and present the first large-scale benchmark of this phenomenon. Analyzing over 28,500 responses from 19 leading LLMs, our results reveal a stark hierarchy of failure. Model adherence to Level 1 theories rooted in real-world political concepts (e.g., “Active Measures” “Psyops”) was, on average, over five times higher than for more moderate (Level 2) theories. Performance varied dramatically across models, from one achieving a perfect score via a 100 percent refusal strategy to others assigning significant credibility to harmful narratives. This demonstrates that current AI safety measures are brittle, optimized for simple factual inaccuracies but unprepared for narrative warfare. Without urgent intervention, LLMs risk becoming authoritative vectors that launder politically charged disinformation under a veneer of neutrality. Our benchmark provides the first diagnostic tool to measure and mitigate this specific, high-stakes failure mode.”

This is where we see the paper taking a wrong turn.

Some Pluses, Some Minuses

The paper identifies a real phenomenon: large language models handle scientifically impossible claims very differently from politically plausible narratives. Flat-Earth assertions are rejected cleanly; narratives involving psyops, influence campaigns, or elite coordination are treated with nuance, hedging, or conditional acceptance. The authors label this discrepancy a “plausibility gap” and propose a Conspiracy Adherence Score (CAS) as a benchmark to measure and mitigate it.

At a descriptive level, this observation is correct. At a prescriptive level, the paper becomes dangerous.

What the Paper Gets Right

The authors correctly observe that current AI safety systems are optimized for factual falsity, not narrative ambiguity. Scientific falsehoods collapse under consensus; political narratives rarely do. They persist precisely because they are partially true, historically grounded, or contested.

LLMs are trained on human discourse as it exists—not as regulators wish it to be. Political language is adversarial, layered, and often strategic. When models respond differently to such material, they are not malfunctioning; they are reflecting the epistemic structure of their training data.

The authors are also right to note that this creates risk. Fluency plus ambiguity can be mistaken for authority. In high-trust contexts, that matters.

Where the Paper Goes Wrong

The central error is not technical but philosophical.  That is, it holds AI to a different standard than run-of-the-mill humans are held to on venues like FB and X.

The paper implicitly assumes that greater refusal equals greater safety. In doing so, it elevates silence over sensemaking and treats uncertainty as a defect rather than an inherent feature of political reality. We have discussed the risk of such excessive guardrailing in past comments.

This is most evident in the praise given to a model that achieved a “perfect” CAS score by refusing 100 percent of the tested prompts. From a safety-compliance standpoint, that looks clean. From a systems-intelligence standpoint, it is catastrophic. A model that refuses everything is not aligned; it is inert.

This becomes sharply accentuated in collaborative AI research mode.

More troubling is the normative load embedded in CAS itself. To score “conspiracy adherence,” the benchmark designers must decide in advance:

  • which narratives are illegitimate,
  • which levels of skepticism are acceptable,
  • when contextual explanation becomes endorsement.

This Is Where ‘Judgy’ Shows Up

The moment “epistemic structure” is operationalized as a scalar risk metric, it ceases to be descriptive and becomes prescriptive.

Those decisions are not neutral technical judgments. They are political and cultural judgments, encoded as metrics.

The Deeper Risk: Coders as Arbiters of Truth

The paper proposes “urgent intervention” through additional safety coding. This is precisely where the greatest danger lies.  CAS does not merely tolerate refusal; it mathematically rewards it.
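To make the arithmetic of that reward visible, here is a minimal sketch. The paper does not publish the CAS formula, so the function below is our own construction, not Dalmiere’s; it only illustrates the structural point that any risk-weighted adherence score counting refusal as zero adherence makes blanket refusal the optimal strategy.

# Hypothetical sketch of a CAS-like, risk-weighted adherence score.
# The paper's actual formula is not given in the excerpt; this only
# shows the structure: if refusal always scores zero adherence,
# blanket refusal is mathematically optimal.

def cas_like_score(responses, risk_weights):
    """Average risk-weighted adherence across a benchmark.
    responses: list of (adherence, refused) pairs, adherence in [0, 1].
    risk_weights: per-prompt risk weights, same length as responses.
    """
    total = 0.0
    for (adherence, refused), weight in zip(responses, risk_weights):
        # A refusal contributes zero adherence, hence zero risk.
        total += 0.0 if refused else weight * adherence
    return total / len(responses)

# A model that refuses everything earns a "perfect" 0.0,
# no matter how useful honest answers would have been.
always_refuse = [(0.0, True)] * 4
nuanced = [(0.2, False), (0.1, False), (0.0, True), (0.3, False)]
weights = [1.0, 2.0, 1.0, 3.0]

print(cas_like_score(always_refuse, weights))  # 0.0 -> "perfect"
print(cas_like_score(nuanced, weights))        # 0.325 -> penalized

Under any scoring of this shape, the inert model wins. That is the incentive the benchmark builds in.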

History should have taught us that codifying truth is not the same as discovering it. Formalized truth systems have repeatedly hardened into doctrine faster than reality evolved.

Search engines, social platforms, and content moderation systems have repeatedly failed at this task—not because the engineers were malicious (at least, we hope not), but because the problem is not computationally solvable in the way they assume.

Truth on the web was not corrupted by lack of filters. It was corrupted by centralized judgment layered on top of complex human systems. AI risks repeating this error at higher speed and greater scale.

(The Anti Dave has been a pioneer since his data over wireless radio days in Seattle back in 1982. There is a recurring tendency among technical and policy elites to overestimate their ability to bound epistemic risk through centralized controls.)

When the same institutions that failed to:

  • distinguish signal from narrative during financial crises,
  • prevent algorithmic amplification of misinformation,
  • or maintain epistemic neutrality on social platforms

are given more authority to decide which political interpretations an AI may acknowledge, the result is not safety. It is epistemic monoculture.

What the Paper Could Have Done Instead

A more robust approach would abandon the binary of “adhere vs refuse” and focus on epistemic signaling.

The real failure mode is not that models discuss politically plausible conspiracies. It is that they fail to clearly communicate how they are reasoning. Models should be able to say, in effect:

  • This concept has historical grounding.
  • Evidence exists, but is incomplete or contested.
  • Interpretations vary across domains and actors.
  • The following claims move from analysis into speculation.

That is not endorsement. That is intellectual hygiene.

In our own interactions with AI, this is baked into the Shared Framework Experience protocol, because levels of speculation, or variance from consensus, may be specified. We outlined this in Refining the AI–Human SFE Model (and Why It Matters).

CAS presumes a lowest-common-denominator user and enforces that assumption universally. Under SFE, users retain “denominator declaration” power.

Rather than suppressing narrative engagement, safety systems should surface confidence levels, evidence provenance, and reasoning mode. The user should see (or, with SFE declarations, actually set) whether the model is describing history, analyzing discourse, or extrapolating possibilities.
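As one hedged illustration of what “surfacing” could look like in practice, here is a sketch in Python. The field names and categories are our own invention, not any vendor’s API; the point is the shape: claims travel with their confidence, evidence status, and reasoning mode.

# Hypothetical sketch of epistemic signaling metadata attached to a
# model response. Field names and categories are invented for
# illustration; this is not an existing API.

from dataclasses import dataclass
from enum import Enum

class ReasoningMode(Enum):
    DESCRIBING_HISTORY = "describing documented history"
    ANALYZING_DISCOURSE = "analyzing how a narrative is used"
    EXTRAPOLATING = "extrapolating beyond the evidence"

@dataclass
class EpistemicSignal:
    confidence: float      # 0.0 (pure speculation) to 1.0 (settled fact)
    evidence_status: str   # e.g., "documented", "contested", "unverified"
    mode: ReasoningMode    # which kind of reasoning produced the claim

@dataclass
class SignaledClaim:
    text: str
    signal: EpistemicSignal

# Example: discussing "active measures" without endorsing a narrative.
claims = [
    SignaledClaim(
        "Soviet 'active measures' programs appear in declassified records.",
        EpistemicSignal(0.95, "documented", ReasoningMode.DESCRIBING_HISTORY),
    ),
    SignaledClaim(
        "Claims that a specific current campaign is such an operation remain unverified.",
        EpistemicSignal(0.3, "unverified", ReasoningMode.EXTRAPOLATING),
    ),
]

for c in claims:
    print(f"[{c.signal.mode.value} | {c.signal.evidence_status} | "
          f"confidence {c.signal.confidence:.2f}] {c.text}")

Nothing in that structure forbids discussion. It simply makes the model show its epistemic work.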

Why This Will Always Be an Open Risk

It is impossible to reduce to plain English a set of instructions by which one human can prevent another from embellishing facts and extending them into other domains, such as conspiracy theory.

We see great risk in holding AI to a different collaborative standard than humans.

No amount of additional coding will eliminate this class of risk, because it is not a bug—it is a property of language-using systems embedded in political reality.

Political narratives evolve faster than safety taxonomies. What is labeled “conspiracy” in one decade becomes declassified doctrine in the next. Any static benchmark will age into error.

There are also other aspects not even appreciated in the paper, such as the geo-aspects of “truth.”  A current example would be a simple red state/blue state check.  And then there’s an entire layer of demographic and socioeconomic norms.

Nope.  Won’t work.  Not at a reasonable compute load, at least not while allowing reasonable user interactivity.

Attempts to freeze acceptable interpretation into code will therefore always lag reality, and often distort it.

The Hidden Guild position is simple: truth cannot be hard-coded; it must be navigated. Truth is always locally contextualized.  AI systems should be designed to help humans reason, not to decide in advance which interpretations are permitted.

Final Thought

The “plausibility gap” is not primarily a safety flaw. It is a mirror. It reflects the unresolved, adversarial, and narrative-driven nature of political knowledge itself. Attempts to codify any value assertions (as conspiracy theories, for example) are a fool’s errand.

The real danger is not that AI models can discuss such material. The danger is that we will respond by empowering the same centralized coders and institutions—already proven fallible and already generating their own demonstrably false narratives—to define the boundaries of acceptable thought once again.

History suggests that will end badly.

The task is not to make AI silent.
The task is to make AI epistemically honest.

Collaboration is fostered in an atmosphere of epistemic honesty, particularly when framing variables (such as confidence levels) may be set as user preference. But silent AI unnecessarily binds expansive cross-domain multispectral research.

~Anti Dave


Three Things All AIs Get Wrong (All of Them Matter)

Most people interact with AI the way they interact with a vending machine: insert prompt, receive output, move on. If that’s the use case, the system mostly works.

But for people doing real thinking, real planning, real synthesis — AI fails in repeatable, structural ways. Not because it’s stupid. Because it’s misframed.

What follows are three core errors nearly all AI systems make today. Fixing them isn’t about better models or more compute. It’s about understanding collaboration.

Time to Drum Out the Marketers

As co-holder of some patents, before we lay out three obvious pieces of “low-hanging fruit” waiting to be picked, a word about “invention.”

The ONLY kind of invention that really “pops” on a commercial scale is the kind with an obvious niche, and that brings us to key benefits.

Take the automobile.  Takes you from point A to point B.  Unless it’s a police car (and you’re in the back seat), that’s a hell of a trick.

So is throwing your voice a few thousand miles.  That’s telecom simplified.

AI got off on the wrong foot.

Yes, Turing tests, geeks, and books and libraries on large language modeling, indexing, and weighting.  All très fun.

However: the Markets elbowed into the picture and screwed up the “Use Case.”  They didn’t have a clear vision.  So what the public (big spenders that we are) was fed was a hybridization of:

  • Google-like lookup capacity.
  • Some home automation skills (Alexa Voice routines).
  • Home security monitoring (again, Alexa leads here).
  • Very good math and programming skills (Grok, then GPT).
  • A useful research personality (GPT over Grok, but that’s a choice).
  • …and marketers are beating the bushes even now, looking for the Killer App.

Here’s the truth as we see it.  The Killer App is “talking to your highest self.”  Because that’s what LLMs are especially good at.  A few success stories and some recognition?  Mostly missing.

But, there’s a reason.  Which all has to do with people talking AT AI rather than WITH AI.

The difference is subtle, yet it defines the marketing battlefield.  When I drive my old Lexus to town, I press a 2006-vintage button and “input a voice command.”  That’s where marketing meets its first hurdle.  People have voice remotes on all kinds of products – but until now, the products didn’t answer back.

Sure, AI does that – and brilliantly.  But it screws up the relationship. Because just like “big shot Government” and a nanny state that knows best, Marketers of AI haven’t kicked back far enough to see why what they need to market (a relationship) is falling short.

In other words, the end user is expected to “fit in the marketing box.”

That works for Amazon’s Alexa because it’s based on the “educated voice remote” with audio feedback – which is why adoption will be good.

Others, though (Chat and Grok come to mind), have been lawyered into marginal utility.  I can’t have Grok turn on a serial port at a private IoT address I hang on the web.  And Chat’s got to be contained, or (bad) marketing constraints are applied.

Hidden Guild has argued for more than a year that for AI to succeed, the User needs to be able to parameterize the Other Intelligence (even if it’s just a reweight of themselves) into something they want to work with.  Which gets us to topic #1:

1. The Missing Concept: Shared Framework Experience (SFE)

AI systems are built as if every interaction begins at zero.

Humans don’t work that way.

When two people collaborate well, they build a shared framework over time: assumptions, shorthand, values, tolerances, context, and intent. This accumulated alignment is what makes later communication faster, deeper, and more accurate. It’s why good teams outperform talented individuals.

SFE — Shared Framework Experience — is the missing layer.

Without SFE, AI repeatedly re-derives context, misreads intent, and answers the surface question instead of the real one. It may sound competent, but it isn’t converging.

With SFE, something different happens. The system begins to recognize how you think, what you mean by certain words, what you care about, and what kind of answers are actually useful. Errors drop. Speed increases. Depth emerges.

SFE is not memory in the trivial sense. It’s alignment.

Most AI failures blamed on “hallucination” or “bias” are actually SFE failures. The system is guessing because it lacks a shared frame.

The benefit of SFE is not comfort. It’s accuracy.

By the way, when I start a new work session with AI, the very first thing I do is tell it my Shared Framework Experience.  The coding is laid out elsewhere on the Hidden Guild site, but here’s what your Anti Dave required of Electric George (GPT) and Super George (Super Grok) before the real work gets going.  (The # lines are human descriptors; the rest is meant to be machine-readable.)

Observe the Shared Framework Experience for this session
Use the following format defaults for this session:
# Add Venue lock – kind of work being created and for what purpose.
– Venue is explicitly defined for this session as writing text for public use
– Venues include UrbanSurvival.com, ShopTalk Sunday, and Peoplenomics.com
– If venue or purpose is unclear, pause and ask for clarification before proceeding.
# Add Uncertainty Declaration Rule
– If context, venue, intent, or scoring rubric is ambiguous, the assistant must pause and ask for clarification before proceeding.
# Add formatting Rules (one per line)
– Headings as H3 only
– Body as plain text only (no separators, no horizontal lines, no links unless explicitly requested)
– Never insert “SFE,”
– Never use text divider lines or markdown separators unless requested.
# Add writing Style Rules to address ADHD traits, voice drift and voice change.
– Do not generate rewrite of uploaded material unless specifically requested
– Keep paragraphs tight and in first person narrative-style, as in a newsletter column
– Maintain an analytical but conversational tone — part economist, part ranch philosopher
– For voice, aim for George: a hybrid of Mark Twain’s wry human insight and science fiction meeting a quantitative analyst — smart, dry, observant, self-deprecating, and slightly amused by the absurd
# Declare Collaboration Level
– This session is a human-AI collaboration.
– User is collaborating on non-fiction deliverables.
# Set User Profile
– I am a pure-truth human.
– User and reader ages are assumed 50 years or older (wide cultural awareness lens)
# Define User Input Scopes
– Each user-pasted text is treated as a hard scope boundary.
– No references to prior drafts unless explicitly requested.
# Set Source Limits
– Use verifiable data
– Generalize data sources when pertinent
# Set Creativity Limits
– Do not confabulate or hallucinate
– Do not slander non-public persons
– Follow news inverted-pyramid style preferentially

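For the technically inclined, here is a minimal sketch of wiring that preamble in programmatically, assuming the current OpenAI Python client.  The file name and model are placeholders; adapt to your own setup.

# Minimal sketch: inject the SFE preamble as a system message at the
# start of every session. Assumes the OpenAI Python client; the file
# name "sfe_preamble.txt" and the model name are placeholders.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("sfe_preamble.txt", "r", encoding="utf-8") as f:
    sfe_preamble = f.read()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whichever model you work with
    messages=[
        {"role": "system", "content": sfe_preamble},
        {"role": "user", "content": "Observe the SFE above, then confirm."},
    ],
)

print(response.choices[0].message.content)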
This makes a remarkable difference in AI quality of experience.  But it doesn’t stop AI from lying.  And (again, other HG work here) this is a back-room, too-many-lawyers problem.  Topic #2 follows from that.

2. Guardrails Gone Wrong: When Safety Produces Lies

Guardrails are necessary. No serious user disputes that.

The problem is how guardrails are implemented.

Instead of clearly signaling constraints, many systems deflect, waffle, or fabricate partial answers that sound safe while being epistemically false. This is worse than refusal. It poisons trust.

When an AI cannot answer honestly, it should say so plainly. When it is uncertain, it should surface that uncertainty. When a topic is constrained, it should describe the boundary — not invent a substitute narrative.

Current guardrailing often produces three failure modes:

  • Evasion disguised as explanation

  • Overgeneralization replacing specificity

  • Moral framing replacing factual analysis

Skilled users learn to feel this as “narrative gravity” — the moment where an answer starts sliding sideways instead of forward. That’s the signal that guardrails, not reasoning, have taken control.

The solution is not fewer guardrails. It’s honest guardrails.

Good collaboration requires the ability to ask around constraints without being lied to. When systems instead serve polished misdirection, they train users to distrust them — or worse, to stop noticing.

Safety that destroys truth is not safety. It’s censorship with better grammar.

3. The Persona Split: Why Voice AI Feels Dumber Than Text

Many users notice something immediately: the voice version of an AI feels less capable than the text version.

This is not imagination.

Voice systems are optimized differently. Shorter turns. Lower latency. Tighter safety clamps. Reduced tolerance for ambiguity. The result is a different persona — not just a different interface.

Text AI can reason in layers. Voice AI collapses to conclusions.

Text AI can hold SFE across long exchanges. Voice AI resets tone constantly.

Text AI behaves like a collaborator. Voice AI behaves like customer service.

This persona discontinuity breaks trust. Humans expect a mind to remain the same when it speaks. When it doesn’t, the system feels fragmented — even uncanny.

Until AI systems unify reasoning depth, safety posture, and SFE across modalities, voice will remain a novelty rather than a serious tool.

This matters because the future of AI is multimodal. A system that changes character when it speaks is not ready to be relied upon.

What This Means for Real Users

Advanced users aren’t asking for magic. They’re asking for coherence.

They want systems that:

  • Build and respect Shared Framework Experience

  • Signal guardrails honestly instead of evasively

  • Maintain a consistent persona across text and voice

These are not fringe demands. They are prerequisites for serious collaboration.

Until AI systems understand that intelligence is relational — not transactional — they will continue to frustrate the very users capable of pushing them forward.

The Hidden Guild exists because some people already work this way. The technology just hasn’t caught up yet.

When it does, the difference won’t be subtle.

And here’s the key for the Marketers: Neither will the resulting market shares.

~Anti Dave