PromptSplit: Revealing Prompt-Level Disagreement in Generative Models

Notice: This research summary and analysis were generated automatically using AI. For full accuracy, please refer to the original arXiv source.

Prompt-guided generative AI models have rapidly expanded across vision and language domains, producing realistic and diverse outputs from textual inputs. The growing variety of such models, trained with different data and architectures, calls for principled methods to identify which types of prompts lead to distinct model behaviors. In this work, we propose PromptSplit, a kernel-based framework for detecting and analyzing prompt-dependent disagreement between generative models. For each compared model pair, PromptSplit constructs a joint prompt–output representation by forming tensor-product embeddings of the prompt and image (or text) features, and then computes the corresponding kernel covariance matrix. We utilize the eigenspace of the weighted difference between these matrices to identify the main directions of behavioral difference across prompts. To ensure scalability, we employ a random-projection approximation that reduces computational complexity to $O(nr^2 + r^3)$ for projection dimension $r$. We further provide a theoretical analysis showing that this approximation yields an eigenstructure estimate whose expected deviation from the full-dimensional result is bounded by $O(1/r^2)$. Experiments across text-to-image, text-to-text, and image-captioning settings demonstrate that PromptSplit accurately detects ground-truth behavioral differences and isolates the prompts responsible, offering an interpretable tool for detecting where generative models disagree.


💡 Research Summary

PromptSplit addresses a pressing need in the rapidly expanding landscape of prompt‑conditioned generative AI: identifying which textual prompts cause divergent behavior across different models. Traditional evaluation metrics such as FID, IS, or CLIP‑Score provide scalar assessments of fidelity or diversity but collapse the rich, prompt‑dependent variations into a single number, obscuring where models truly differ. PromptSplit proposes a principled, kernel‑based framework that directly incorporates the prompt into the analysis, enabling the detection of prompt‑level disagreement in a scalable and interpretable manner.

The core idea is to construct a joint "prompt‑output" representation by taking the tensor product of prompt embeddings (e.g., CLIP text embeddings, LLM token embeddings) and output embeddings (e.g., CLIP image embeddings, LLM response embeddings). Formally, for a prompt t and a generated output x, the feature is ϕ_T(t)⊗ϕ_X(x). This tensor product captures multiplicative interactions between the two modalities, allowing the joint kernel to factor as k_T(t,t′)·k_X(x,x′). For two generative systems A and B, empirical joint kernel covariance operators C_{T⊗X} and C_{T⊗Y} are computed from paired (prompt, output) samples. The disagreement operator is defined as Λ_{A,B|T} = C_{T⊗X} − η·C_{T⊗Y}, where η balances differing sample sizes. The eigenvectors of Λ reveal the directions in the joint space along which the two models' behavior differs most; prompts that load heavily on these eigenvectors constitute the categories where the models behave differently.
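As a concrete illustration, this construction can be sketched with plain linear kernels and small synthetic embeddings. All dimensions, the data, the balancing weight `eta`, and the helper `joint_features` are assumptions of this sketch, not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 150            # samples from systems A and B
d_t, d_x = 8, 10           # prompt / output embedding dims (tiny for illustration)

# Hypothetical embeddings: row i holds phi_T(t_i) or phi_X(x_i).
T_a, X_a = rng.normal(size=(n, d_t)), rng.normal(size=(n, d_x))
T_b, Y_b = rng.normal(size=(m, d_t)), rng.normal(size=(m, d_x))

def joint_features(T, X):
    """Row-wise tensor product phi_T(t) ⊗ phi_X(x), shape (n, d_t * d_x)."""
    return np.einsum("ni,nj->nij", T, X).reshape(len(T), -1)

Z_a, Z_b = joint_features(T_a, X_a), joint_features(T_b, Y_b)

# Empirical joint covariance operators and their weighted difference.
C_a = Z_a.T @ Z_a / n
C_b = Z_b.T @ Z_b / m
eta = 1.0                  # sample-size balancing weight (assumed value)
Lam = C_a - eta * C_b

# Eigenvectors with the largest positive eigenvalues point at directions in
# the joint space where model A's behavior dominates model B's.
evals, evecs = np.linalg.eigh(Lam)
top = evecs[:, np.argsort(evals)[::-1][:3]]

# Prompt loadings on the leading direction: high-loading prompts are the
# candidate categories where the two models disagree.
loadings = Z_a @ top[:, 0]
```

With linear base kernels, the inner product of two joint features factors exactly as k_T(t,t′)·k_X(x,x′), which is the multiplicative interaction the paragraph describes.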

Direct eigendecomposition of the joint covariance is infeasible because the tensor‑product space has dimensionality d_T·d_X, often reaching hundreds of thousands. PromptSplit circumvents this by applying the kernel trick: a 2n×2n block matrix K_Δ is constructed that shares the same non‑zero eigenvalues as Λ. To further reduce computational burden, a random‑projection (RP) scheme is introduced. Gaussian matrices R_T∈ℝ^{d_T×r} and R_X∈ℝ^{d_X×r} are sampled, and the joint features are projected via (R_T⊗R_X)·(ϕ_T⊗ϕ_X), yielding an r‑dimensional representation. Theoretical analysis shows that the expected deviation of the approximated eigenspace from the full‑dimensional one decays as O(1/r²), providing a rigorous guarantee on accuracy while keeping the cost to O((n+m)r²+r³).
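The paper's exact projection scheme is not reproduced here, but one standard way to obtain an r-dimensional sketch of ϕ_T⊗ϕ_X from independent Gaussian matrices is the coordinate-wise product of the two projections, whose inner product is an unbiased estimate of the factored kernel k_T·k_X. Everything below (the helper `rp_joint`, the dimensions, the deliberately large r) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
d_t, d_x, r = 64, 128, 50_000   # r is large only to make concentration visible

# Independent Gaussian projection matrices for the prompt and output spaces.
R_T = rng.normal(size=(d_t, r))
R_X = rng.normal(size=(d_x, r))

def rp_joint(t, x):
    """r-dim sketch of phi_T(t) ⊗ phi_X(x): coordinate-wise product of two
    independent random projections; E[<z, z'> / r] = (t·t')(x·x')."""
    return (R_T.T @ t) * (R_X.T @ x)

# Random unit vectors standing in for two (prompt, output) pairs.
t1, t2 = (v / np.linalg.norm(v) for v in rng.normal(size=(2, d_t)))
x1, x2 = (v / np.linalg.norm(v) for v in rng.normal(size=(2, d_x)))

est = rp_joint(t1, x1) @ rp_joint(t2, x2) / r   # sketched joint kernel
exact = (t1 @ t2) * (x1 @ x2)                   # k_T * k_X computed directly
```

The paper's O(1/r²) eigenspace guarantee applies to its own construction; the sketch above only illustrates the generic "project each factor, then multiply" idea.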

Algorithm 1 outlines the full pipeline: (1) build kernel blocks for prompts and outputs within each dataset and cross‑blocks between datasets; (2) assemble K_Δ according to the weighted difference; (3) compute the top R positive eigenpairs; (4) reconstruct eigenfunctions as linear combinations of the original joint features. Algorithm 2 replaces the full kernel matrices with RP‑compressed features, dramatically lowering memory and runtime requirements.
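A hedged sketch of these four steps, using linear base kernels and synthetic embeddings (the weight `eta`, the diagonal encoding of the weighted difference, and all helper names are assumptions of this illustration, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, d_t, d_x = 60, 40, 5, 6
T_a, X_a = rng.normal(size=(n, d_t)), rng.normal(size=(n, d_x))
T_b, Y_b = rng.normal(size=(m, d_t)), rng.normal(size=(m, d_x))
eta = n / m                        # hypothetical sample-size balancing weight

# Step 1: kernel blocks. With linear base kernels the joint kernel factors,
# so every entry is k_T(t_i, t_j) * k_X(x_i, x_j) (within- and cross-blocks).
T_all, X_all = np.vstack([T_a, T_b]), np.vstack([X_a, Y_b])
K = (T_all @ T_all.T) * (X_all @ X_all.T)          # (n+m) x (n+m)

# Step 2: encode the weighted difference as a signed diagonal: the
# disagreement operator equals Psi @ diag(d) @ Psi.T for joint features Psi.
d = np.concatenate([np.full(n, 1.0 / n), np.full(m, -eta / m)])
K_delta = K * d[None, :]                           # = K @ diag(d)

# Step 3: K_delta shares its nonzero eigenvalues with the disagreement
# operator, so its top positive eigenpairs stand in for the full ones.
evals, alphas = np.linalg.eig(K_delta)
order = np.argsort(evals.real)[::-1]
top_vals = evals.real[order[:3]]
top_alphas = alphas.real[:, order[:3]]

# Step 4: an eigenfunction is a linear combination of joint features, so
# evaluating it on a new (t, x) pair needs only kernel evaluations.
def eigenfunction(t, x, alpha):
    k_new = (T_all @ t) * (X_all @ x)   # joint kernel to every training pair
    return k_new @ (d * alpha)
```

Step 3 relies on the standard identity that AB and BA share nonzero eigenvalues: with A = Ψᵀ and B = Ψ·diag(d), the (n+m)×(n+m) matrix K·diag(d) has the same nonzero spectrum as the d_T·d_X-dimensional operator Ψ·diag(d)·Ψᵀ.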

Empirical evaluation proceeds in two stages. In synthetic experiments, the authors embed known prompt clusters that induce distinct output distributions for two models. PromptSplit perfectly recovers these clusters, confirming that the leading eigenvectors correspond to the ground‑truth sources of disagreement. In real‑world settings, the method is applied to several text‑to‑image diffusion models (Stable Diffusion, Kandinsky, PixArt) and to large language models (Qwen‑3 vs. Gemma‑3). The analysis uncovers coherent prompt families—such as “night sky”, “hyper‑realistic portrait”, “watercolor style”—where the models diverge in style, composition, or alignment. For LLMs, categories like “policy questions”, “historical figures”, and “math problems” emerge as points of disagreement. Visualizations of the high‑loading prompts together with representative outputs illustrate the nature of the divergence, offering actionable insights for model selection, fine‑tuning, or bias mitigation.

The paper’s contributions are fourfold: (i) formulation of prompt‑conditioned model comparison as a joint kernel spectral problem; (ii) introduction of the tensor‑product joint representation that captures prompt‑output interactions; (iii) scalable random‑projection approximation with provable O(1/r²) error bounds; and (iv) extensive experiments demonstrating that PromptSplit reliably isolates prompt‑level disagreements across multimodal domains. By providing an interpretable map of where generative models disagree, PromptSplit opens new avenues for systematic model auditing, targeted improvement, and responsible deployment of AI systems.

