What If Moderation Didn't Mean Suppression? A Case for Personalized Content Transformation
The centralized content moderation paradigm both falls short and over-reaches: 1) it fails to account for the subjective nature of harm, and 2) it responds to content deemed harmful with blunt suppression, even when that content can be salvaged. We first investigate this through formative interviews, documenting how seemingly benign content becomes harmful due to individual life experiences. Based on these insights, we develop DIY-MOD, a browser extension that operationalizes a new paradigm: personalized content transformation. Operating on a user's own definition of harm, DIY-MOD transforms sensitive elements within content in real time instead of suppressing the content itself. The system selects the most appropriate transformation for a piece of content from a diverse palette, ranging from obfuscation to artistic stylization, to match the user's specific needs while preserving the content's informational value. Our two user studies demonstrate that this approach increases users' sense of agency and safety, enabling them to engage with content and communities they previously needed to avoid.
💡 Research Summary
The paper critiques the prevailing centralized content moderation model for two fundamental shortcomings: it assumes a universal definition of “harm” and it responds to harmful content by bluntly suppressing entire posts or accounts. Through qualitative interviews with twelve participants who experience personal sensitivities such as pregnancy loss, eating‑disorder triggers, PTSD, and specific phobias, the authors demonstrate that seemingly innocuous content can be deeply distressing depending on an individual’s lived experience.
To address these issues, the authors introduce DIY‑MOD (Do‑It‑Yourself Moderation), a browser extension that enables personalized content transformation. Users create “filters” in natural language or with keyword/visual specifications that describe what they find harmful. DIY‑MOD scans webpages in real time, detects matching elements with the help of large language models (LLMs) and vision‑language models (VLMs), and applies the most appropriate transformation from a diverse palette: semantic inpainting to remove or replace objects, style transfer to render images in an abstract or artistic manner, blurring or rewriting of text, and color/contrast adjustments. Each transformed element is marked with a transparency indicator so users know exactly what has been altered.
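To make the pipeline concrete, here is a minimal sketch of how such a browser-extension content script might be structured. This is an illustration only, not DIY‑MOD's actual code: the `Filter` and `Detection` shapes, the `/detect` backend endpoint, and all function names are hypothetical assumptions.

```typescript
// Hypothetical content-script sketch: scan the page for elements matching
// user-defined filters and swap in a transformed version. The types, names,
// and backend endpoint below are illustrative, not DIY-MOD's real API.

interface Filter {
  id: string;
  description: string;        // user's natural-language definition of harm
  modality: "text" | "image";
}

interface Detection {
  filterId: string;
  transformedHtml: string;    // server-rendered replacement (e.g., blurred image)
}

// Ask a hypothetical moderation backend whether this element matches any filter.
async function detectTriggers(el: Element, filters: Filter[]): Promise<Detection[]> {
  const res = await fetch("https://example.com/detect", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ html: el.outerHTML, filters }),
  });
  return res.json();
}

// Replace the matched element and attach a transparency indicator so the
// user knows the content was altered, as the paper describes.
function applyTransformation(el: Element, det: Detection): void {
  el.innerHTML = det.transformedHtml;
  const badge = document.createElement("span");
  badge.textContent = "transformed";
  badge.title = `Altered by filter ${det.filterId}`;
  el.prepend(badge);
}

// Watch for new posts as the user scrolls (typical for infinite feeds).
function watchFeed(filters: Filter[]): void {
  const observer = new MutationObserver((mutations) => {
    for (const m of mutations) {
      for (const node of m.addedNodes) {
        if (!(node instanceof Element)) continue;
        detectTriggers(node, filters).then((dets) =>
          dets.forEach((d) => applyTransformation(node, d))
        );
      }
    }
  });
  observer.observe(document.body, { childList: true, subtree: true });
}
```

A `MutationObserver` is the standard way for extensions to keep up with infinitely scrolling feeds; the real system presumably also batches requests and caches results, which this sketch omits.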
The system leverages recent multimodal AI advances. LLMs parse user‑defined triggers and generate structured representations, while VLMs locate visual triggers within images. For transformation selection, the authors adopt an “LLM‑as‑judge” approach: the model receives the original content, candidate transformations, and detailed user context (e.g., sensitivity level, current emotional state) and predicts which transformation best balances harm reduction with information preservation.
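The selection step can be pictured as a single judge call. The sketch below assumes an OpenAI-style chat-completions endpoint and a flat prompt format; the paper's actual prompts, candidate set, and model are not specified here, so treat every name as a placeholder.

```typescript
// Hypothetical "LLM-as-judge" transformation selection. The prompt wording,
// candidate list, and model choice are assumptions for illustration.

interface UserContext {
  sensitivityLevel: "low" | "medium" | "high";
  emotionalState?: string;
}

const CANDIDATES = ["inpaint", "stylize", "blur", "rewrite"] as const;
type Transformation = (typeof CANDIDATES)[number];

async function chooseTransformation(
  content: string,
  ctx: UserContext,
  apiKey: string
): Promise<Transformation> {
  // Give the judge the content, the candidates, and the user context,
  // and ask it to balance harm reduction against information preservation.
  const prompt =
    `A user flagged this content as sensitive.\n` +
    `Content: ${content}\n` +
    `Sensitivity: ${ctx.sensitivityLevel}; state: ${ctx.emotionalState ?? "unknown"}\n` +
    `Candidate transformations: ${CANDIDATES.join(", ")}.\n` +
    `Pick the candidate that best reduces harm while preserving the content's ` +
    `informational value. Answer with one candidate name only.`;

  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: prompt }],
      temperature: 0,
    }),
  });
  const data = await res.json();
  const answer = (data.choices?.[0]?.message?.content ?? "").trim().toLowerCase();
  // Fall back to a conservative default if the judge's answer is unparseable.
  return (CANDIDATES as readonly string[]).includes(answer)
    ? (answer as Transformation)
    : "blur";
}
```

Constraining the judge to a closed candidate set, and falling back to the most conservative option on a malformed answer, keeps the selection step predictable even when the model's free-form output drifts.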
Two user studies evaluate DIY‑MOD. An in‑situ study on Reddit with 20 participants over a week shows that personalized transformation markedly increases participants’ sense of agency, perceived safety, and willingness to engage with previously avoided communities. A controlled laboratory study presents multiple transformation variants (blur, stylization, abstraction, etc.) for the same stimuli and elicits preferences. Results reveal a principle of “cognitive closure”: users favor transformations that obscure triggering details while still conveying the overall meaning, whereas overly aggressive alterations lead to frustration and perceived loss of information.
The contributions are: (1) an empirical account of subjective online harm, (2) the design and implementation of a functional DIY‑MOD system, (3) evidence that personalized transformation improves user agency and safety in real‑world browsing, and (4) design guidelines for effective transformations, especially the need for cognitive closure.
Limitations include dependence on the quality of underlying generative models—semantic inpainting can produce artifacts in complex scenes, and text rewriting may unintentionally change nuance. Ethical concerns arise around self‑imposed censorship, potential misuse of transformed content, and the need for transparent disclosure to other users who might encounter altered media.
Future work should develop objective metrics for transformation quality, incorporate adaptive learning to track evolving user sensitivities over time, and design platform‑level mechanisms for indicating and consenting to transformed content, thereby enhancing transparency and accountability. Overall, the paper proposes a novel paradigm—personalized content transformation—that shifts moderation from suppression toward safe, context‑preserving engagement, offering a valuable intersection for HCI, AI, and digital well‑being research.