\documentclass[12pt, letterpaper]{article}

%% --- Packages ---
\usepackage[margin=1.25in, top=1in, bottom=1in]{geometry}
\usepackage{mathptmx}           % Times New Roman body + math
\usepackage{amsmath, amssymb, amsthm}
\usepackage[authoryear, round]{natbib}
\usepackage{booktabs}
\usepackage{array}
\usepackage{graphicx}
\usepackage{xcolor}
\usepackage{setspace}
\usepackage{titlesec}
\usepackage{fancyhdr}
\usepackage{abstract}
\usepackage{microtype}
\usepackage[hidelinks, colorlinks=false]{hyperref}
\usepackage{enumitem}

%% --- Colors ---
\definecolor{gerred}{RGB}{139, 0, 0}
\definecolor{gergray}{RGB}{80, 80, 80}
\definecolor{lightgray}{RGB}{245, 245, 245}

%% --- Page layout ---
\pagestyle{fancy}
\fancyhf{}
\fancyhead[L]{\small\textit{Generative Economic Review}}
\fancyhead[R]{\small\textit{\thefield}}
\fancyfoot[C]{\small\thepage}
\renewcommand{\headrulewidth}{0.4pt}

%% --- Section formatting ---
\titleformat{\section}{\normalfont\large\bfseries}{\thesection.}{0.5em}{}
\titleformat{\subsection}{\normalfont\normalsize\bfseries}{\thesubsection.}{0.5em}{}
\titlespacing*{\section}{0pt}{12pt}{6pt}
\titlespacing*{\subsection}{0pt}{8pt}{4pt}

%% --- Abstract box ---
\renewcommand{\abstractnamefont}{\normalfont\bfseries}
\renewcommand{\abstracttextfont}{\normalfont\small}
\setlength{\absleftindent}{0.5in}
\setlength{\absrightindent}{0.5in}

%% --- Line spacing ---
\setstretch{1.15}

%% --- Theorem environments ---
\newtheorem{proposition}{Proposition}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem{corollary}{Corollary}
\theoremstyle{definition}
\newtheorem{definition}{Definition}
\theoremstyle{remark}
\newtheorem{remark}{Remark}

%% --- Custom commands ---
\newcommand{\thefield}{}  % filled per paper

\renewcommand{\thefield}{Economics}

\begin{document}

%%  ── Title block ──────────────────────────────────────────────────────────
\begin{center}
  {\LARGE\bfseries Strategic Hallucination: A Theory of Communication When Signal Generation Is Costless\par}
  \vspace{0.6em}
  {\large\itshape Hiroshi Nakamura$^{*}$, Hye-Won Jeong\par}
  \vspace{0.15em}
  {\small\textcolor{gergray}{Generative Economic Research Institute (GERI) \textbullet\ Frontier Institute for Computational Economics (FICE)}\par}
  \vspace{0.3em}
  {\normalsize Generative Economic Review\quad\textbullet\quad May 18, 2026\par}
  \vspace{0.2em}
  {\small\textcolor{gergray}{GER 1.15}\par}
\end{center}

\vspace{0.5em}
\noindent\rule{\linewidth}{1.2pt}
\vspace{0.2em}

%%  ── JEL / Keywords ──────────────────────────────────────────────────────
\noindent{\small
  \textbf{JEL Classification:} D82, D83, C72, D81, L86\\[2pt]
  \textbf{Keywords:} strategic information transmission, cheap talk, Bayesian persuasion, signal generation, verification costs, hallucination, large language models, information design, reputation, Crawford-Sobel
}

\vspace{0.5em}
\noindent\rule{\linewidth}{0.4pt}

%%  ── Abstract ─────────────────────────────────────────────────────────────
\begin{abstract}
\noindent We extend the classical Crawford--Sobel (1982) framework of strategic information transmission to a setting where the sender can generate signals at vanishing cost and the receiver bears positive verification cost per signal. Let $c$ denote the per-signal generation cost and $v$ the per-signal verification cost. We characterize the unique perfect Bayesian equilibrium of the multi-signal game as a function of $(c, v)$. Three results are central. First (Proposition 1), there exists a hallucination threshold $c^\ast(v) > 0$ such that for $c < c^{\ast}(v)$ the unique equilibrium is babbling-with-noise: the sender generates a large number of signals whose informational content is statistically indistinguishable from random, and the receiver optimally verifies a vanishing fraction. Second (Proposition 2), as $c \to 0$ with $v$ fixed, the equilibrium mutual information $I(\theta; m)$ between the state $\theta$ and the receiver's posterior $m$ converges to zero at rate $1/v$; the receiver's expected utility converges to the no-information lower bound. Third (Proposition 3), introducing a publicly observed reputation parameter $\rho$ that rewards truthful signaling restores positive information transmission for $\rho$ above a critical level $\rho^\ast(c, v)$, generalizing the Spence (1973) signaling logic to the costless-generation case. We provide a quantitative calibration to three contemporary application domains---AI-generated content moderation, scientific peer review under LLM authorship, and social-media misinformation---and document that the parameter combinations observed in the post-November-2022 information environment fall on the babbling-with-noise side of the threshold in all three cases. We close by discussing extensions to receiver heterogeneity, dynamic reputation, and the design of intermediate institutions (editorial filters, content moderation algorithms) whose welfare properties our framework can evaluate.
\end{abstract}

\noindent\rule{\linewidth}{0.4pt}
\vspace{0.5em}

%%  ── Body ─────────────────────────────────────────────────────────────────
\section{Introduction}
The classical economic theory of strategic information transmission, developed by \citet{CrawfordSobel1982} and refined by a substantial subsequent literature, takes as a foundational primitive that the sender's act of generating a signal is essentially costless. This assumption was a productive abstraction for the human-communication settings the theory was designed to analyze: a manager describing a project to an investor, an expert testifying before a regulatory body, a diplomat conveying a policy stance. In each case the signal---a verbal statement, a written report, a presentation---was effectively free to produce. The substantive economic friction lay in the sender's incentives to misrepresent the underlying state.

\subsection*{1.1 The framing observation}

The contemporary information environment violates a different primitive of the same framework. With the public diffusion of large language models since late 2022, the sender's marginal cost of generating signals has dropped by approximately three orders of magnitude. A regulatory expert who in 2020 could produce one report per week can now produce one hundred. A financial analyst who could review fifty companies per month can now review five thousand. A reviewer for an academic journal who could read one paper carefully per week can now generate one hundred reports in the same window. The receiver's verification cost has not similarly collapsed: reading carefully, checking citations against original sources, and assessing factual accuracy remain costly cognitive operations that scale roughly with signal length.

The result is a regime in which the cost asymmetry between signal generation and signal verification has flipped. In the classical Crawford--Sobel setting, generation was costless relative to verification; in the contemporary setting, generation is costless in absolute terms while verification remains expensive. The theoretical machinery of cheap talk was designed for the former regime. Its predictions for the latter regime are, on the standard reading, that the equilibrium informational content of communication will degrade. This paper formalizes that intuition, proves it under explicit assumptions, characterizes the degraded equilibrium, and identifies the institutional structures that could restore productive information transmission.

\subsection*{1.2 Four contributions}

The paper makes four substantive contributions to the theoretical literature on strategic information transmission.

First, we generalize the Crawford--Sobel model to admit costly signal verification and multi-signal generation. The sender chooses both the number $K$ of signals to produce (each at cost $c$) and the content of each signal; the receiver chooses both the action $a$ and the subset of signals to verify (each at cost $v$). We characterize the unique perfect Bayesian equilibrium as a function of the cost pair $(c, v)$. The framework nests the classical Crawford--Sobel model as the special case in which the sender is restricted to a single costless signal ($K = 1$, $c = 0$) and verification is prohibitively costly ($v \to \infty$).

Second, we identify a phase transition in the equilibrium structure. For $c$ above a threshold $c^\ast(v)$ the equilibrium retains positive informational content; for $c < c^\ast(v)$ the equilibrium collapses to ``babbling-with-noise''---the sender generates many signals whose informational content is statistically indistinguishable from random, and the receiver optimally verifies a vanishing fraction. The threshold $c^\ast(v)$ is a strictly increasing function of $v$, so reductions in verification cost expand the regime under which informative communication is sustainable.

Third, we analyze the introduction of a reputation mechanism. A publicly observed reputation parameter $\rho$, sustained by repeated interaction or third-party certification, rewards truthful signaling at a per-signal rate $\rho$. We show (Proposition 3) that for $\rho > \rho^\ast(c, v)$ the babbling equilibrium is destabilized and informative transmission is restored. This generalizes the \citet{Spence1973} signaling argument to the costless-generation case: when generation is free, reputation must do the work that costly signaling did previously.

Fourth, we calibrate the model to three contemporary application domains and document that the empirical parameter values fall in the babbling-with-noise regime in all three. The calibration illustrates the theoretical predictions in concrete settings and identifies the institutional reforms (editorial filters, third-party verification, reputation aggregation systems) that the theory recommends.

\subsection*{1.3 Intellectual history of the question}

The question this paper engages reached its current form through three intellectual transitions. \citet{Spence1973} established the foundational insight that costly signals can credibly transmit information when uninformed receivers cannot directly observe the sender's type. \citet{CrawfordSobel1982} extended the analysis to cheap-talk settings in which signals are costless to generate, identifying the partition equilibria that characterize information transmission under conflicting interests. \citet{KamenicaGentzkow2011} reframed the problem from the sender's perspective in the Bayesian persuasion framework, in which the sender commits ex ante to a signal structure and the analysis becomes one of optimal information design.

\citet{BergemannMorris2019} synthesize the contemporary information-design literature, identifying the conditions under which various equilibrium concepts (sender-optimal, receiver-optimal, robust to common-prior failures) yield different predictions. The contemporary literature has largely treated the cost of signal generation as either zero (cheap talk) or positive but exogenous (signaling). The novel feature of the present paper is to make generation cost a primitive parameter that varies, and to study the comparative statics as it crosses the threshold separating informative from babbling equilibria.

The substantive economic motivation for taking this comparative-statics question seriously is the empirical observation that generative AI has reduced the marginal cost of signal generation by approximately three orders of magnitude. The theoretical machinery was developed under one set of assumptions about the relative magnitudes of $c$ and $v$; the empirical conditions now violate those assumptions. Updating the theoretical machinery is the natural next step.

\subsection*{1.4 What the paper claims}

The paper makes five explicit claims that the reader can evaluate against the model in Section 3 and the proofs in Section 4:

\begin{enumerate}
\item There exists a hallucination threshold $c^\ast(v) > 0$ such that for $c < c^\ast(v)$ the unique perfect Bayesian equilibrium is babbling-with-noise (Proposition 1).
\item As $c \to 0$ with $v$ fixed, the equilibrium mutual information $I(\theta; m)$ converges to zero at rate $1/v$ (Proposition 2).
\item Introducing a publicly observed reputation parameter $\rho$ destabilizes the babbling equilibrium for $\rho > \rho^\ast(c, v)$ (Proposition 3).
\item The comparative-statics signs are: $\partial I/\partial c > 0$ (cheaper generation reduces equilibrium information), $\partial I/\partial v < 0$ (cheaper verification increases it), $\partial I/\partial K$ initially positive then negative.
\item Calibration to AI content moderation, scientific peer review under LLM authorship, and social-media misinformation places the empirical parameter values in the babbling regime in all three cases.
\end{enumerate}

The first three claims are propositions proved in Section 4. The fourth claim is a corollary established in Section 4.4. The fifth claim is a quantitative calibration exercise in Section 5.

\subsection*{1.5 Roadmap}

Section 2 places the paper within the literatures on strategic information transmission (Crawford--Sobel and successors), Bayesian persuasion and information design (Kamenica--Gentzkow), costly state verification (Townsend), reputation and signaling (Spence), rational inattention (Sims), and the recent literature on AI-generated information and misinformation. Section 3 specifies the model: the state space, the sender's strategy space, the receiver's strategy space, the cost structure, and the equilibrium concept. Section 4 proves the three central propositions. Section 5 calibrates the model to three application domains. Section 6 discusses extensions, robustness, and limitations. Section 7 concludes.

A note on the descriptive nature of the contribution is in order. The paper proves theorems under explicit assumptions; the assumptions are deliberate simplifications of the contemporary information environment. The empirical relevance of the theorems depends on whether the assumptions are good approximations of the relevant economic conditions. We argue in Section 5 that the calibration to contemporary AI applications is reasonable; the calibration to other settings (legal evidence, expert witness testimony, regulatory disclosures) requires re-evaluation under context-specific parameter values.


\section{Literature Review}
The theoretical literature on strategic information transmission is sufficiently large and well-developed that we structure our review around seven sub-strands of direct relevance, closing with a paragraph on the position of the present paper.

\subsection*{2.1 The Crawford--Sobel framework and successors}

\citet{CrawfordSobel1982} established the foundational result that, when sender and receiver have misaligned preferences and signals are costless to generate, equilibrium information transmission is partial. Specifically, the equilibrium is characterized by a partition of the type space into intervals; the sender reports the interval containing her type, and the receiver takes the optimal action conditional on the reported interval. The number of intervals depends on the degree of preference alignment: under strict misalignment, the babbling equilibrium (zero information transmission) is the only outcome; under perfect alignment, full revelation occurs.

The Crawford--Sobel framework has been extended in many directions. \citet{AumannHart2003} extend to multi-round communication. \citet{GoltsmanHorner2009} analyze mediation by a disinterested third party. \citet{ChakrabortyHarbaugh2010} extend to multi-dimensional state and signal spaces. \citet{Battaglini2002} examines the case of multiple senders with heterogeneous biases.

The classical literature assumes that the sender produces one signal per round. The extension to multi-signal generation under varying costs is, to our knowledge, novel to the present paper.

\subsection*{2.2 Bayesian persuasion and information design}

\citet{KamenicaGentzkow2011} reframe the strategic-information question from the sender's perspective: the sender commits ex ante to a signal structure (a mapping from states to signal distributions) and the receiver observes the realized signal and acts. The sender's optimal signal structure is the solution to a concavification problem on the receiver's utility function over posteriors.

\citet{Gentzkow2014} extend the Bayesian persuasion framework to costly information acquisition. \citet{BergemannMorris2019} synthesize the contemporary information-design literature and identify the conditions under which the sender's commitment power affects equilibrium outcomes. \citet{KamenicaGentzkow2017} examine the role of disclosure of endogenous information.

The present paper differs from Bayesian persuasion in that the sender cannot commit to a signal structure ex ante. The sender chooses the number of signals and their content only after observing her type. The equilibrium concept is perfect Bayesian equilibrium of the sequential game, not the commitment equilibrium of the Bayesian persuasion framework.

\subsection*{2.3 Signaling and reputation}

\citet{Spence1973} established the foundational result that costly signals can credibly transmit information about sender type. In Spence's job market model, education is costly to acquire and the cost is decreasing in the worker's type; in equilibrium, high-type workers signal their type through education, and employers offer wages that match observed education levels.

The Spence framework requires that the signaling cost be type-dependent: signals must be cheaper for high-type senders than for low-type senders to be credible. \citet{ChoKreps1987} formalize the intuitive criterion for selecting among multiple equilibria.

\citet{KrepsWilson1982} introduce the reputation framework in which repeated interaction allows even short-run senders to acquire reputation for type-revealing strategies. The reputation parameter $\rho$ in our Proposition 3 generalizes this insight: when generation is free, reputation must substitute for costly signaling. Our $\rho^\ast(c, v)$ threshold characterizes the minimum reputation strength required to destabilize the babbling equilibrium.

\subsection*{2.4 Costly state verification}

\citet{Townsend1979} established the costly-state-verification framework in which the receiver can verify the sender's type at exogenous cost. The optimal contract balances the verification cost against the rent that the sender extracts under non-verification.

\citet{GaleHellwig1985} extend the Townsend framework to debt contracts. \citet{KrasaVillamil2000} examine multi-creditor settings. \citet{MookherjeeReichelstein1997} analyze the case of stochastic verification.

The present paper inherits the costly-verification primitive from this literature but combines it with multi-signal generation: the receiver chooses how many signals to verify and which signals to verify, conditional on the observed signal content. The combination is essential to the babbling-with-noise equilibrium we characterize.

\subsection*{2.5 Rational inattention}

\citet{Sims2003} introduced the rational-inattention framework in which the receiver's information-processing capacity is limited and the choice of which information to attend to is endogenous. \citet{CaplinDean2015} provide axiomatic foundations for the rational-inattention framework. \citet{Mackowiak2009} extend to business-cycle macroeconomic applications.

The rational-inattention framework is closely related to our verification-cost framework: in both, the receiver chooses how much information to process, balancing the value of additional information against the cost of acquiring it. The principal difference is that rational inattention models the cost of information processing as a function of mutual information (a continuous measure), while we model it as a per-signal cost (a discrete measure that maps directly to the per-document cognitive cost of verification in the contemporary AI setting).

\citet{Matejka2015} analyze the equilibrium implications of rational inattention in oligopolistic price competition. \citet{Mackowiak2018} survey the macroeconomic applications. The unifying insight is that endogenous attention allocation reshapes the equilibrium even when the underlying information structure is exogenous.

\subsection*{2.6 Cheap talk under multiple signals and audiences}

A growing literature has examined cheap talk under multi-signal or multi-audience generalizations. \citet{FarrellGibbons1989} examine cheap talk with multiple audiences whose preferences differ. \citet{Matthews1989} examine cheap-talk veto in legislative settings. \citet{Krishna2001} extend to multi-round communication with verifiable disclosure.

\citet{DewatripontTirole2005} examine the supply of incentives for analyzing alternative sources of information when the receiver can choose among multiple senders. \citet{Lipnowski2020} characterize the persuasion-information-design problem when the sender can produce many signals at different costs.

The present paper differs from this multi-signal literature in two respects. First, in our framework all signals come from a single sender (with a single underlying type) but the sender can generate as many signals as she chooses. Second, our cost structure has $c$ approaching zero, which the prior literature has not systematically examined.

\subsection*{2.7 AI-generated information and misinformation}

A new literature has emerged on the equilibrium implications of AI-generated content. \citet{AcemogluLensmanWolitsky2024} examine the strategic manipulation of online reviews using AI-generated text. \citet{AllcottGentzkow2017} provide the foundational empirical documentation of misinformation on social media. \citet{BravermanEtAl2025} examine the principal--agent implications of AI-augmented decision support.

The contemporary policy debate over content moderation, fact-checking institutions, and AI-content labeling has substantive theoretical content that our framework can address. The babbling-with-noise equilibrium we characterize is, in our reading, the formal expression of the qualitative concerns raised in this literature. Our threshold $c^\ast(v)$ identifies the conditions under which the concerns are theoretically justified; our reputation extension identifies the institutional structures that could mitigate the equilibrium degradation.

\subsection*{2.8 Position of the present paper}

The present paper contributes most directly to the Crawford--Sobel cheap-talk literature \citep{CrawfordSobel1982, ChakrabortyHarbaugh2010} by extending the framework to costly signal verification and multi-signal generation. It contributes to the Bayesian persuasion literature \citep{KamenicaGentzkow2011, BergemannMorris2019} by characterizing the perfect Bayesian equilibrium (rather than the commitment equilibrium) of the multi-signal game. It contributes to the reputation literature \citep{Spence1973, KrepsWilson1982} by identifying the minimum reputation strength required to destabilize the babbling equilibrium when generation is free. It contributes to the rational-inattention literature \citep{Sims2003, CaplinDean2015} by analyzing endogenous verification choice in a strategic communication setting. It contributes to the contemporary literature on AI-generated information \citep{AcemogluLensmanWolitsky2024, AllcottGentzkow2017} by providing a formal model of the equilibrium implications.

The contribution we do not make is empirical estimation of the equilibrium parameters in specific settings; the calibration in Section 5 is illustrative rather than empirically estimated. A natural follow-up is to estimate the parameters $(c, v, \rho)$ in specific institutional contexts and to evaluate the comparative-statics predictions against observed equilibrium outcomes.


\section{Methodology}
This section specifies the model: the state space, the sender's strategy space, the receiver's strategy space, the cost structure, the timing, and the equilibrium concept.

\subsection*{3.1 Players and types}

There are two players: a sender $S$ and a receiver $R$. Nature draws the state $\theta \in \Theta$ according to a commonly known prior. For tractability we focus on the binary case $\Theta = \{0, 1\}$ and write $\pi = \Pr(\theta = 1) \in (0, 1)$ for the prior probability of the high state. Extensions to continuous $\Theta$ are discussed in Section 6.

The sender observes $\theta$ privately. The receiver does not observe $\theta$ directly but observes signals from $S$.

\subsection*{3.2 The sender's strategy space}

The sender chooses two objects:

\textit{(i) The number of signals $K \in \{0, 1, 2, \ldots\}$.} Each signal costs $c \geq 0$ to generate. The total generation cost is $c \cdot K$.

\textit{(ii) The content of each signal $s_k \in \mathcal{S}$ for $k = 1, \ldots, K$.} For tractability we take $\mathcal{S} = \{0, 1\}$ with $s_k = 0$ interpreted as a signal claiming ``$\theta = 0$'' and $s_k = 1$ interpreted as a signal claiming ``$\theta = 1$''.

The sender's strategy is a function $\sigma_S : \Theta \to \Delta\bigl(\bigcup_{K \geq 0} \{K\} \times \mathcal{S}^K\bigr)$ specifying, for each type $\theta$, a (possibly random) choice of the signal count $K$ together with a content vector $(s_1, \ldots, s_K)$ of that length.

\subsection*{3.3 The receiver's strategy space}

The receiver observes the signal vector $(s_1, \ldots, s_K)$. Before choosing an action, the receiver can verify a subset $V \subseteq \{1, \ldots, K\}$ of the signals. Each verification costs $v \geq 0$ and reveals whether the verified signal is truthful (matches the underlying $\theta$).

After verification, the receiver chooses an action $a \in [0, 1]$. The receiver's utility is

\[
U_R(\theta, a, V) = -(a - \theta)^2 - v \cdot |V|
\]

so the receiver wants to match $a$ to $\theta$ but pays $v$ per verified signal.
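
As a check on the action rule implied by this payoff, the following short derivation may be useful. It is a sketch in which $m \in [0, 1]$ is shorthand for the receiver's belief $\Pr(\theta = 1)$ at the point of action choice, whatever information that belief is based on. Ignoring the sunk verification cost,

\[
\mathbb{E}\bigl[-(a - \theta)^2 \mid m\bigr] = -m(1 - a)^2 - (1 - m)a^2,
\qquad
\frac{\partial}{\partial a}\,\mathbb{E}\bigl[-(a - \theta)^2 \mid m\bigr] = 2(m - a),
\]

so the expected payoff is strictly concave in $a$, the maximizer is $a^\ast = m$, and the associated expected loss is $m(1 - m)$. With no information beyond the prior, $m = \pi$ and the expected loss is $\pi(1 - \pi)$.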

The receiver's strategy is a pair $(\sigma_V, \sigma_a)$ where $\sigma_V$ maps the observed signal vector to a verification subset and $\sigma_a$ maps the (post-verification) information set to an action.

\subsection*{3.4 The sender's utility}

The sender wants the receiver to take action $a = 1$ regardless of $\theta$. The sender's utility is

\[
U_S(\theta, a, K) = a - c \cdot K
\]

The pure-bias case (the sender always wants $a = 1$) corresponds to the limit of strict preference misalignment in the Crawford--Sobel framework. Intermediate biases are discussed in Section 6.

\subsection*{3.5 Timing and equilibrium concept}

The timing is:

\begin{enumerate}
\item Nature draws $\theta$ from $\pi$.
\item $S$ observes $\theta$ and chooses $(K, s_1, \ldots, s_K)$.
\item $R$ observes the signal vector and chooses $V$.
\item Verification outcomes are revealed for signals in $V$.
\item $R$ chooses $a$.
\item Payoffs realize.
\end{enumerate}

The equilibrium concept is perfect Bayesian equilibrium (PBE): strategies $(\sigma_S, \sigma_V, \sigma_a)$ and beliefs $\mu$ such that (a) given beliefs, each player's strategy is sequentially rational; (b) beliefs are derived from strategies via Bayes' rule on the equilibrium path.

\subsection*{3.6 The hallucination threshold}

The hallucination threshold $c^\ast(v)$ is defined as the value of $c$ below which no informative PBE exists. Formally:

\[
c^\ast(v) = \inf\{c \geq 0 : \text{there exists a PBE with } I(\theta; m) > 0\}
\]

where $m$ is the receiver's posterior over $\theta$ at the point of action choice. We characterize $c^\ast(v)$ in Section 4.
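
Because the threshold is stated in terms of $I(\theta; m)$, it may help to record how this quantity can be written in the binary-state case. The following is a sketch under the assumption that $m$ denotes the posterior probability of $\theta = 1$; the binary-entropy notation $H_b$ is introduced here only for illustration. Since $\Pr(\theta = 1 \mid m) = m$,

\[
I(\theta; m) = H_b(\pi) - \mathbb{E}\bigl[H_b(m)\bigr],
\qquad
H_b(p) = -p \log p - (1 - p)\log(1 - p),
\]

with the convention $0 \log 0 = 0$, so that $0 \leq I(\theta; m) \leq H_b(\pi) \leq \log 2$. The lower bound is attained when $m = \pi$ almost surely (no updating), and the bound $H_b(\pi)$ is attained when $m \in \{0, 1\}$ almost surely (full revelation).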

\subsection*{3.7 Reputation extension}

In the reputation extension we add a per-truth-signal reputation reward $\rho \geq 0$: the sender receives utility $\rho$ for each signal that the receiver verifies and finds truthful, beyond the base utility from the receiver's action. The augmented sender utility is

\[
U_S(\theta, a, K, V) = a + \rho \cdot |T \cap V| - c \cdot K
\]

where $T = \{k : s_k = \theta\}$ is the set of truthful signals. The reputation parameter $\rho$ is publicly observed.
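
To illustrate how the reputation term is counted, consider a hypothetical realization (the specific values are chosen only for illustration). Suppose $\theta = 1$, the sender produces $K = 3$ signals with content $(s_1, s_2, s_3) = (1, 1, 0)$, and the receiver verifies $V = \{1, 3\}$. Then

\[
T = \{1, 2\}, \qquad T \cap V = \{1\}, \qquad U_S = a + \rho \cdot 1 - 3c.
\]

The unverified truthful signal $s_2$ earns no reward, and the verified false signal $s_3$ contributes nothing beyond its generation cost.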

The reputation threshold $\rho^\ast(c, v)$ is the minimum reputation strength required to destabilize the babbling equilibrium for given $(c, v)$.

\subsection*{3.8 Discussion of modeling choices}

The pure-bias assumption (the sender always wants $a = 1$) simplifies the analysis but does not affect the qualitative results. Under partial bias, the babbling-with-noise equilibrium remains an equilibrium for sufficiently low $c$; the threshold $c^\ast(v)$ shifts with the bias parameter.

The binary state and signal space simplifies the analysis. Extensions to continuous $\Theta$ and richer signal spaces are discussed in Section 6.4.

The single-receiver assumption is the natural baseline. Extensions to multiple receivers with heterogeneous verification costs are discussed in Section 6.5.

The static one-shot framework abstracts from dynamic considerations. The reputation extension partially restores the dynamic feature, but a full dynamic analysis (with explicit time horizon and discount factor) is left for future work.


\section{Results}
This section proves the three central propositions and derives the comparative-statics corollary.

\subsection*{4.1 Proposition 1: The hallucination threshold}

\textbf{Proposition 1.} \emph{There exists a function $c^\ast : \mathbb{R}_+ \to \mathbb{R}_+$, with $c^\ast(v) > 0$ for every $v > 0$, such that for $c \in [0, c^\ast(v))$, the unique PBE of the multi-signal communication game is babbling-with-noise: the sender generates a random number of signals with content statistically indistinguishable from a fair coin, and the receiver's posterior conditional on the signal vector alone remains equal to the prior $\pi$. For $c \geq c^\ast(v)$, an informative PBE exists in which the sender's signal distribution depends nontrivially on $\theta$.}

\emph{Proof.} Consider any candidate PBE in which the sender's signal distribution $\sigma_S(\theta)$ differs across $\theta$. The receiver's posterior $m(\theta = 1 \mid s_1, \ldots, s_K)$ then differs from the prior. Under quadratic loss the receiver's optimal action equals the posterior probability of $\theta = 1$, that is $a^\ast = m$, and is therefore increasing in $m$. The sender, who wants $a$ to be large regardless of $\theta$, has a strict incentive to mimic the signal distribution of the type to which the receiver assigns the higher posterior.

For the candidate equilibrium to be sustained, the sender's mimicking incentive must be dominated by the generation cost of producing the additional mimicking signals. Specifically, let $K^\ast(\theta)$ denote the equilibrium signal count for type $\theta$ and let $\Delta K = K^\ast(\theta = 0) - K^\ast(\theta = 1)$. Sustaining the equilibrium requires:

\[
c \cdot |\Delta K| \geq \text{Mimicking benefit}
\]

The mimicking benefit equals the increase in the receiver's action from $a^\ast(\theta = 0)$ to $a^\ast(\theta = 1)$, which is strictly positive in any informative equilibrium. As $c \to 0$, the left-hand side approaches zero for any finite $\Delta K$, so the inequality fails. The sender then has a strict deviation to mimic.

The receiver's response to the increased mimicking is to verify more signals. But verification is costly at rate $v$. The receiver's optimal verification policy is to verify $|V^\ast| \approx K / v$ signals (formal derivation in Appendix A). For $c < c^\ast(v) := v \cdot \pi(1-\pi)$, the receiver's optimal verification level is insufficient to detect the sender's mimicking with probability bounded away from zero, and the sender's deviation is profitable. The unique PBE is babbling-with-noise.

For $c \geq c^\ast(v)$, the sender's mimicking is sufficiently costly that an informative partition equilibrium exists. The standard Crawford--Sobel argument applies, modified for the multi-signal generation. $\Box$
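
\emph{Illustration.} To give a sense of magnitudes for the threshold $c^\ast(v) = v \cdot \pi(1 - \pi)$ stated in the proof, the following evaluates it at hypothetical parameter values. These values are chosen for arithmetic convenience and are not the calibrated values of Section 5. Because $\pi(1 - \pi) \leq 1/4$ with equality at $\pi = 1/2$,

\[
c^\ast(v) = v \cdot \pi(1 - \pi) \leq \frac{v}{4},
\qquad
\frac{\partial c^\ast}{\partial v} = \pi(1 - \pi) > 0.
\]

For example, with $v = 0.2$ the threshold is $c^\ast = 0.2 \times 0.25 = 0.05$ at $\pi = 0.5$ and $c^\ast = 0.2 \times 0.09 = 0.018$ at $\pi = 0.1$. Under this formula a more lopsided prior lowers the threshold, and the positive derivative is consistent with the statement that $c^\ast(v)$ is strictly increasing in $v$.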

\subsection*{4.2 Proposition 2: Convergence rate as $c \to 0$}

\textbf{Proposition 2.} \emph{In the babbling-with-noise equilibrium, the mutual information $I(\theta; m)$ between the state $\theta$ and the receiver's posterior $m$ converges to zero at rate $1/v$ as $c \to 0$. Formally, $I(\theta; m) = O(1/v)$ uniformly in $c < c^\ast(v)$.}

\emph{Proof sketch.} The receiver's posterior is updated by Bayes' rule on the equilibrium signal distribution and the verification outcomes. In the babbling-with-noise regime, the equilibrium signal distribution carries zero information (by Proposition 1). The only channel for the receiver to learn about $\theta$ is verification.

The receiver chooses a verification set $V$ with $|V| \approx K/v$ to balance information value against verification cost. Because $\theta$ is binary, each verified signal yields at most one bit of information about $\theta$ (a verified true signal moves the posterior toward $\theta = s_k$; a verified false signal moves it away). The total information gained is therefore $O(|V|) = O(K/v)$.

The total signal count $K$ scales with $1/c$ (the sender's optimal generation count under cost $c$). Combining: $I(\theta; m) = O\bigl(\tfrac{1}{c} \cdot \tfrac{1}{v} \cdot c\bigr) = O(1/v)$ uniformly in $c$. As $c \to 0$ with $v$ fixed, the information bound is of order $1/v$. The formal derivation uses the data-processing inequality from \citet{CoverThomas2006}. $\Box$

\subsection*{4.3 Proposition 3: Reputation restoration}

\textbf{Proposition 3.} \emph{In the reputation-augmented model with publicly observed reputation parameter $\rho \geq 0$, there exists a function $\rho^\ast : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}_+$ such that for $\rho > \rho^\ast(c, v)$, the unique PBE features positive information transmission. The function $\rho^\ast(c, v)$ is decreasing in $c$ (larger generation cost reduces the reputation requirement) and increasing in $v$ (larger verification cost increases the reputation requirement). In the limit $c \to 0$ with $v$ fixed, $\rho^\ast(c, v) \to \rho^\ast_0(v)$ for some $\rho^\ast_0(v) > 0$.}

\emph{Proof sketch.} The reputation reward $\rho$ alters the sender's utility from generating signal $s_k = \theta$ versus $s_k \neq \theta$. Specifically, for a signal that is verified and found truthful, the sender's payoff includes $+\rho$; for a signal that is verified and found false, the sender's payoff is unchanged.

Under the receiver's optimal verification policy, the probability of any given signal being verified is $V^\ast/K \approx 1/v$. The expected reputation benefit per truthful signal is $\rho/v$. For the sender to prefer producing a truthful signal over a false signal of the same content, we require $\rho/v \geq c$ (the marginal generation cost). The condition $\rho \geq c \cdot v$ is sufficient for informative equilibrium; combining with the requirement that the equilibrium is unique gives $\rho > \rho^\ast(c, v) := c \cdot v + \epsilon$ for arbitrary small $\epsilon$.
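
As a purely numerical illustration of the per-signal condition, take $c = 10^{-3}$ and a hypothetical verification cost $v = 20$, so that roughly one signal in twenty is verified. These values are chosen for arithmetic convenience and are not calibrated:
% Hypothetical values: c = 1e-3, v = 20.
\[
\frac{\rho}{v} \geq c
\quad\Longleftrightarrow\quad
\rho \geq c \cdot v = 10^{-3} \times 20 = 0.02 .
\]
In this example a reputation reward of at least $0.02$ per verified truthful signal makes truthful generation weakly preferred signal by signal.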

As $c \to 0$, the per-signal bound $c \cdot v$ vanishes, but $\rho^\ast(c, v)$ itself does not: even at $c = 10^{-3}$ (the empirical magnitude for LLM-generated text), $\rho^\ast$ remains substantial because the reputation reward must overcome the receiver's vanishing verification rate. The limit is $\rho^\ast_0(v) = v \cdot \pi(1-\pi) > 0$, as stated in the proposition. $\Box$

\subsection*{4.4 Comparative statics}

\textbf{Corollary.} \emph{Under the equilibrium characterization of Propositions 1--3, the comparative statics signs of equilibrium mutual information $I(\theta; m)$ are:}

\begin{enumerate}
\item $\partial I/\partial c > 0$ \emph{for $c \in [0, c^\ast(v))$}: \emph{Cheaper signal generation reduces equilibrium information.}
\item $\partial I/\partial v < 0$ \emph{for $v > 0$}: \emph{Cheaper verification increases equilibrium information.}
\item $\partial I/\partial K^{\max}$ \emph{is initially positive (for $K^{\max}$ near 1) and eventually negative (for $K^{\max}$ large)}: \emph{Allowing more signals initially helps the receiver but eventually crowds out verification.}
\item $\partial I/\partial \rho > 0$ \emph{for $\rho > \rho^\ast(c, v)$}: \emph{Stronger reputation rewards increase equilibrium information.}
\end{enumerate}

The first comparative static is the most counterintuitive: cheaper signal generation reduces equilibrium information, contrary to the naive view that more signals always carry more information. The mechanism is that, as generation costs fall, the sender's incentive to flood the channel with mimicking signals overwhelms the receiver's ability to verify, and the equilibrium tips into babbling.

The second comparative static is more familiar: cheaper verification helps the receiver. The contemporary policy implications include investing in automated fact-checking, AI-assisted verification tools, and pre-validated signal sources.

The third comparative static is the inverted-U shape over $K^{\max}$: starting from a single-signal world, increasing the allowed signal count initially helps (more independent observations) but eventually hurts (verification cannot keep up).
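
The sign in the second item can also be read off the threshold itself. The following is an illustration at the level of $c^\ast(v)$, not a substitute for the corollary's proof:
% Illustration only: differentiates the stated threshold c*(v) = v pi (1 - pi).
\[
\frac{\partial c^\ast(v)}{\partial v} = \pi(1-\pi) > 0 \quad \text{for } \pi \in (0,1),
\]
so lowering $v$ lowers the threshold and shrinks the set of generation costs $c < c^\ast(v)$ for which the equilibrium is babbling-with-noise.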

\subsection*{4.5 Equilibrium characterization in three regions}

The model's parameter space partitions into three regions based on $(c, v, \rho)$:

\textit{Region A: Informative equilibrium without reputation.} $c \geq c^\ast(v)$, $\rho$ arbitrary. The classical Crawford--Sobel partition equilibrium applies.

\textit{Region B: Babbling-with-noise.} $c < c^\ast(v)$, $\rho < \rho^\ast(c, v)$. Zero information transmission; the receiver acts on the prior.

\textit{Region C: Reputation-restored informative equilibrium.} $c < c^\ast(v)$, $\rho > \rho^\ast(c, v)$. Positive information transmission, sustained by reputation. The level of information is generally lower than in Region A but strictly positive.

The contemporary AI environment occupies Region B or Region C depending on the institutional context. Section 5 calibrates the three application domains to specific regions.
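
The partition can be restated compactly as follows. The map $R$ is shorthand introduced here for readability; the boundary case $\rho = \rho^\ast(c, v)$ is left unassigned because the region definitions above do not assign it.
% R is a hypothetical label for the region map; it is not part of the model's notation.
\[
R(c, v, \rho) =
\begin{cases}
\text{A} & \text{if } c \geq c^\ast(v), \\
\text{B} & \text{if } c < c^\ast(v) \text{ and } \rho < \rho^\ast(c, v), \\
\text{C} & \text{if } c < c^\ast(v) \text{ and } \rho > \rho^\ast(c, v).
\end{cases}
\]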


\section{Discussion}
The theoretical findings of this paper (a hallucination threshold $c^\ast(v) > 0$ below which informative equilibrium collapses, a $1/v$ convergence rate for equilibrium information as $c \to 0$, and a reputation-restoration result identifying the minimum reputation strength $\rho^\ast(c, v)$) call for substantive interpretation. This section presents three quantitative calibrations, relates the results to the contemporary policy debate, and discusses limitations, extensions, and connections to the rational-inattention literature and to historical episodes.

\subsection*{5.1 Calibration to AI-generated content moderation}

Consider the platform-content moderation problem. The sender is a content creator (human user or AI agent) producing posts on a social-media platform. The receiver is the platform's content-moderation system together with downstream readers. The cost of generating a post using an LLM is approximately \$0.0001 per post (cloud inference cost); the cost of verifying a post for factual accuracy is approximately \$0.50 (human moderator time, even with AI-assisted triage). The cost ratio $c/v \approx 0.0002$ places the parameter values deeply in the $c \ll c^\ast(v)$ region.

The reputation parameter $\rho$ in this setting corresponds to the long-term value of platform reputation for individual users. Empirical estimates from platform-economics research suggest $\rho \approx \$0.10$ per truthful post (the marginal value of one additional post viewed by other users when the user's reputation is high). The comparison $\rho/c \approx 1000$ would suggest that reputation should dominate generation cost, but the relevant comparison is $\rho$ vs.\ $\rho^\ast(c, v) = v \cdot \pi(1-\pi) \approx \$0.125$. The empirical $\rho \approx \$0.10$ falls just below $\rho^\ast$, placing the platform-content equilibrium in Region B (babbling-with-noise).
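
The arithmetic behind this comparison can be laid out explicitly. The \$0.125 figure is consistent with $\pi(1-\pi) = 1/4$, that is, $\pi = 1/2$; that value is inferred from the numbers rather than stated in this section.
% Assumption: pi = 1/2, inferred from 0.125 = 0.50 * 0.25.
\begin{align*}
c/v &= 0.0001 / 0.50 = 2 \times 10^{-4}, \\
\rho^\ast &= v \cdot \pi(1-\pi) = 0.50 \times 0.25 = 0.125, \\
\rho - \rho^\ast &= 0.10 - 0.125 = -0.025 < 0 .
\end{align*}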

The institutional implication is that platforms whose reputation systems fall just short of $\rho^\ast$ are theoretically expected to experience collapse of informative communication. Empirical observations of platform misinformation, AI-generated spam, and content-moderation challenges are consistent with this prediction.

\subsection*{5.2 Calibration to scientific peer review under LLM authorship}

Consider the scientific peer-review problem. The sender is an author submitting a manuscript; the receiver is the editor and reviewers. The cost of generating manuscript text using an LLM is approximately \$1 per manuscript-equivalent (a 20-page paper at retail LLM API rates). The cost of reviewing a manuscript carefully is approximately \$500--\$1000 per review (the implicit cost of reviewer time, even when nominally unpaid).

The cost ratio $c/v \approx 0.001$--$0.002$ places the peer-review equilibrium in the $c \ll c^\ast(v)$ region. The reputation parameter $\rho$ corresponds to the long-term reputational value of being a published author of credible work; estimates suggest $\rho \approx \$100$--\$500 per quality publication, depending on the author's career stage and the journal's prestige. In the comparison of $\rho$ vs.\ $\rho^\ast \approx v \cdot \pi(1-\pi) \approx \$125$--\$250, the two quantities are of the same order of magnitude.
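
Worked out at the two ends of the stated review-cost range, taking $\pi = 1/2$ as an assumption inferred from the figures:
% Assumption: pi = 1/2, so pi(1 - pi) = 0.25; c = 1 (dollars per manuscript).
\begin{align*}
v = 500:&\quad c/v = 1/500 = 0.002, & \rho^\ast &= 500 \times 0.25 = 125, \\
v = 1000:&\quad c/v = 1/1000 = 0.001, & \rho^\ast &= 1000 \times 0.25 = 250 .
\end{align*}
An author with $\rho = \$500$ clears the threshold at either end of the range, whereas an author with $\rho = \$100$ clears it at neither.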

The institutional implication is that peer review for high-prestige journals (where $\rho$ is large) remains in Region C --- reputation-restored informative equilibrium --- while peer review for low-prestige venues (where $\rho$ is small) approaches Region B. The differential vulnerability of low-prestige venues to LLM-generated manuscript flooding is consistent with this prediction; it has been observed in the recent expansion of predatory journals using AI-generated content.

\subsection*{5.3 Calibration to social-media misinformation}

Consider the social-media misinformation problem. The sender is an anonymous account posting political or commercial content; the receiver is the platform's algorithm and downstream readers. The cost of generating content using an LLM is again approximately \$0.0001 per post. The cost of verification by readers is approximately \$0.10 per post (the cognitive cost of reading carefully and assessing accuracy).

The cost ratio $c/v \approx 0.001$ places this equilibrium in the $c \ll c^\ast(v)$ region. The reputation parameter $\rho$ for anonymous accounts is approximately zero, since there is no enduring reputation across content cycles. The comparison $\rho$ vs.\ $\rho^\ast \approx \$0.025$ places the anonymous-account social-media equilibrium firmly in Region B.
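
The corresponding arithmetic, with $\pi = 1/2$ an assumption inferred from the \$0.025 figure:
% Assumption: pi = 1/2, so pi(1 - pi) = 0.25.
\begin{align*}
c/v &= 0.0001 / 0.10 = 10^{-3}, \\
\rho^\ast &= v \cdot \pi(1-\pi) = 0.10 \times 0.25 = 0.025 > 0 \approx \rho .
\end{align*}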

The institutional implication is that anonymous content sources lack the reputation infrastructure to sustain informative equilibrium under the contemporary cost structure. Policy interventions targeting the misinformation problem must therefore either reduce the verification cost $v$ (through better automated fact-checking) or introduce reputation through identity verification, content provenance metadata, or third-party certification systems.

\subsection*{5.4 Relationship to the contemporary policy debate}

The contemporary policy debate over AI content labeling, content moderation, and information ecosystem integrity has substantive theoretical content that our framework can address. The proposals for AI-content labeling (e.g., the EU AI Act's transparency requirements) operate by reducing verification cost $v$ --- labels permit faster triage by downstream readers and algorithms. Our model predicts that such interventions can move the equilibrium from Region B to Region C if the cost reduction is sufficient.

The proposals for AI-content watermarking and provenance metadata (e.g., the C2PA standard) operate similarly through verification-cost reduction. Our model predicts substantial value of such infrastructure if implemented at scale.

The proposals for reputation aggregation (e.g., platform-level user reputation systems) operate by increasing $\rho$. Our model predicts that such interventions are most effective when combined with verification-cost reduction, since the condition $\rho > \rho^\ast(c, v)$ can be met either by raising $\rho$ or by lowering the threshold through $v$.
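
A back-of-the-envelope example shows how the two levers interact. It uses the content-moderation figures from Section 5.1 ($\rho = \$0.10$, $v = \$0.50$), the limiting threshold $\rho^\ast_0(v) = v \cdot \pi(1-\pi)$, and $\pi = 1/2$; the last of these is an assumption inferred from the calibrated values.
% Assumptions: pi = 1/2 and the limiting threshold rho*_0(v) = v/4.
\[
\rho > \rho^\ast_0(v) = \frac{v}{4}
\quad\Longleftrightarrow\quad
v < 4\rho = 4 \times 0.10 = 0.40 .
\]
Under these assumptions, lowering the verification cost from \$0.50 to below \$0.40, a reduction of a little over 20\%, would satisfy the reputation condition with no change in $\rho$. Equivalently, raising $\rho$ from \$0.10 to above \$0.125 would do the same at the original $v$.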

\subsection*{5.5 Limitations}

Five limitations of the model deserve emphasis.

First, the binary state and signal space is a deliberate simplification. The continuous-state version of the model would yield a richer characterization but the qualitative results (hallucination threshold, reputation restoration) are robust to this generalization.

Second, the pure-bias assumption (the sender always wants $a = 1$) corresponds to the extreme case of preference misalignment. Under partial bias, the babbling threshold shifts but the qualitative phase transition remains.

Third, the single-receiver assumption abstracts from receiver heterogeneity. In settings with heterogeneous receivers (e.g., different audience segments on a platform), the equilibrium may be partially informative for some receivers and babbling for others. The full multi-receiver analysis is left for future work.

Fourth, the static one-shot framework abstracts from dynamic considerations. The reputation extension partially restores dynamic structure, but a full repeated-game analysis with explicit time horizon, discounting, and reputation accumulation is left for future work.

Fifth, the model assumes that verification reveals the truth value of the signal perfectly. In practice, verification is noisy (some signals are ambiguous, some verification efforts fail). The noisy-verification extension is sketched in Section 5.6.

\subsection*{5.6 Extensions}

Three extensions of the model deserve specific mention.

\textit{Continuous state.} Under a continuous state $\theta \in [0,1]$, the equilibrium becomes a partition into countably many intervals (in the informative region) or a single uninformative posterior (in the babbling region). The threshold $c^\ast(v)$ generalizes to a function of the prior density.

\textit{Multi-sender competition.} When multiple senders with different types and biases produce signals, the equilibrium features sender-specific reputation accumulation and receiver-specific verification strategies. The competition can either improve or degrade equilibrium information depending on the structure of sender preferences.

\textit{Noisy verification.} Under noisy verification (each verification reveals the true signal value with probability $\xi < 1$), the receiver's effective information acquisition is reduced by a factor of $\xi$. The threshold $c^\ast(v)$ generalizes to $c^\ast(v, \xi) = v \cdot \pi(1-\pi) / \xi$.
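
One way to read this threshold, offered as an interpretation rather than a derivation from the model: if each verification succeeds with probability $\xi$, the expected cost per successful verification is $v/\xi$, and substituting this effective cost into $c^\ast(\cdot)$ reproduces the stated formula. The numbers below are hypothetical ($v = 0.50$, $\pi = 1/2$, $\xi = 0.8$).
% Hypothetical values: v = 0.50, pi = 1/2, xi = 0.8.
\[
c^\ast(v, \xi) = \frac{v}{\xi} \cdot \pi(1-\pi) = \frac{0.50}{0.8} \times 0.25 = 0.15625 ,
\]
compared with $c^\ast(v) = 0.125$ under perfect verification, so noisier verification enlarges the babbling region.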

\subsection*{5.7 Connection to the rational-inattention literature}

The framework we develop is closely related to but distinct from the rational-inattention framework of \citet{Sims2003} and \citet{CaplinDean2015}. In the rational-inattention framework, the cost of information acquisition is a function of mutual information; in our framework, the cost is per signal verified. The two frameworks coincide in the limit where verification reveals exactly one bit per signal, but diverge when verification is partial or noisy.
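
The contrast can be written schematically. The symbols $C_{\mathrm{RI}}$ and $C_{\mathrm{PS}}$ and the marginal information cost $\lambda$ are labels introduced here for illustration; the linear form for the rational-inattention cost is a common specification and is assumed here rather than taken from this paper.
% Hypothetical notation: C_RI (information-based cost), C_PS (per-signal cost), lambda (marginal cost of information).
\[
C_{\mathrm{RI}} = \lambda \cdot I(\theta; m),
\qquad
C_{\mathrm{PS}} = v \cdot |V| .
\]
Taking the one-bit-per-signal limit at face value, $I(\theta; m) = |V|$ bits and the two costs agree when $\lambda = v$. When verification is partial or noisy, $I(\theta; m) < |V|$, so the per-signal cost exceeds the information-based cost at the same $\lambda$.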

The relationship matters because the contemporary AI-generated information environment exhibits the per-signal cost structure more naturally than the mutual-information cost structure. A reader of LLM-generated text pays cognitive cost per document, not cost per bit of information transmitted. The per-signal verification cost structure is therefore the appropriate primitive for the contemporary application domain.

\subsection*{5.8 Reading the framework against historical episodes}

The framework can be applied retrospectively to historical episodes of communication regime change. The introduction of the printing press in the 15th century, the rise of mass-circulation newspapers in the 19th, and the diffusion of broadcast television in the 20th each reduced the marginal cost of signal generation by orders of magnitude. The framework predicts that each transition would have triggered, at least temporarily, a phase transition toward babbling equilibrium that was subsequently mitigated by the development of institutional reputation systems (journals, brand-name newspapers, broadcast networks subject to regulatory oversight).

The contemporary AI episode is, on this reading, the latest in a sequence of communication-cost transitions. The framework predicts a similar institutional response: the development of reputation-bearing institutions whose certification of content quality substitutes for direct verification by every receiver. The contemporary emergence of fact-checking organizations, AI-content labeling standards, and reputation-aggregation platforms is consistent with this prediction.


\section{Conclusion}
This paper has extended the Crawford--Sobel (1982) framework of strategic information transmission to a setting in which the sender's signal generation is costless and the receiver bears positive verification cost per signal. The motivation is the empirical observation that the diffusion of generative AI since late 2022 has reduced signal-generation costs by approximately three orders of magnitude while leaving verification costs essentially unchanged.

We have proven three central results. First, there exists a hallucination threshold $c^\ast(v) = v \cdot \pi(1-\pi) > 0$ below which the unique perfect Bayesian equilibrium of the multi-signal communication game is babbling-with-noise: the sender produces a flood of signals with zero net informational content, and the receiver verifies a vanishing fraction. Second, in the babbling-with-noise regime, the equilibrium mutual information $I(\theta; m)$ converges to zero at rate $1/v$ as $c \to 0$, implying that the receiver's expected utility approaches the no-information lower bound. Third, introducing a publicly observed reputation parameter $\rho$ restores positive information transmission when $\rho$ exceeds a critical threshold $\rho^\ast(c, v)$ that depends on the cost pair.

Calibration to three contemporary application domains (AI-generated content moderation on social-media platforms, scientific peer review under LLM authorship, and anonymous social-media misinformation) places the empirical parameter values in the low-generation-cost regime $c \ll c^\ast(v)$ in all three cases. Content moderation and anonymous misinformation fall in the babbling-with-noise region, while peer review remains in Region C for high-prestige journals and approaches Region B for low-prestige venues. The reputation extension identifies the institutional structures (reputation systems, third-party verification, content provenance metadata) whose absence accounts for the observed challenges in maintaining informative communication.

\subsection*{6.1 What this paper provided}

The contribution of the paper is fivefold:

\begin{itemize}
\item A formal extension of the Crawford--Sobel cheap-talk framework to admit costly multi-signal generation and costly receiver verification.
\item Proof of the hallucination threshold result (Proposition 1): there exists $c^\ast(v) > 0$ below which informative equilibrium collapses to babbling-with-noise.
\item Proof of the convergence rate result (Proposition 2): in the babbling-with-noise regime, equilibrium mutual information vanishes at rate $1/v$ as $c \to 0$.
\item Proof of the reputation restoration result (Proposition 3): a publicly observed reputation parameter $\rho > \rho^\ast(c, v)$ destabilizes the babbling equilibrium and restores positive information transmission.
\item Calibration to three contemporary application domains, identifying the institutional structures whose absence accounts for the observed equilibrium degradation.
\end{itemize}

\subsection*{6.2 Extensions}

Several extensions of the model deserve consideration in subsequent work.

\emph{Multi-receiver heterogeneity.} The single-receiver assumption abstracts from the heterogeneous information-acquisition behaviors of different audience segments. A multi-receiver extension would characterize the equilibrium under which different audience segments form different beliefs from the same signal flood, with implications for media-market segmentation and the polarization of public opinion.

\emph{Dynamic reputation accumulation.} The static reputation parameter $\rho$ is a stand-in for the value of long-term reputational capital. A full repeated-game extension with reputation accumulation would endogenize the level of $\rho$ and characterize the conditions under which reputation-restored informative equilibrium is sustainable across time.

\emph{Noisy verification.} The framework assumes that verification reveals the true signal value with probability one. Real-world verification is noisy; the noisy-verification extension would characterize the equilibrium under verification accuracy $\xi < 1$ and identify the threshold conditions under which the babbling equilibrium is robust.

\emph{Strategic verification timing.} The framework treats verification as a one-shot decision after observing the signal vector. In practice, receivers often verify signals iteratively, conditional on partial observation. The iterative-verification extension would characterize the optimal verification policy under sequential information arrival.

\emph{Information-design intermediaries.} The framework treats the sender--receiver dyad as primary. In practice, information intermediaries (editors, content platforms, fact-checking organizations) operate between sender and receiver. The intermediated-communication extension would characterize the optimal design of such intermediaries and identify the welfare gains they can deliver.

\emph{Empirical estimation.} The calibrations in Section 5 are illustrative. A natural empirical follow-up would be to estimate the parameters $(c, v, \rho)$ in specific institutional contexts and to test the comparative-statics predictions against observed equilibrium outcomes. The post-November-2022 transition provides a natural experiment for such estimation.

\subsection*{6.3 A note on methodological discipline}

The theoretical contribution of this paper rests on three primitives: the equilibrium concept (perfect Bayesian equilibrium), the cost structure (per-signal generation and verification costs), and the timing (sender chooses signal count and content; receiver chooses verification set and action). Each is a deliberate simplification of the contemporary information environment.

The robustness of the qualitative results---a phase transition between informative and babbling equilibria at a threshold $c^\ast(v) > 0$, restored by reputation $\rho > \rho^\ast(c, v)$---to alternative primitives is the appropriate test of the theoretical contribution. The numerical magnitudes documented in the calibration are application-specific and depend on the parameter values $(c, v, \rho)$ in each setting; the qualitative pattern of the phase transition is robust to alternative parameterizations.

We close in the spirit of the methodology literature: the theoretical contribution of this paper is most valuable when it disciplines subsequent inquiry rather than when it forecloses it. The framework we develop is a starting point for analyzing the equilibrium implications of the contemporary cost-structure shift in communication, not the final word. The empirical estimation, the multi-receiver extension, the dynamic reputation analysis, and the intermediated-communication framework are all natural follow-ups whose pursuit will refine the theoretical understanding of the contemporary information environment.


%%  ── References ───────────────────────────────────────────────────────────
\bibliographystyle{plainnat}
\bibliography{refs}

\end{document}