

Graphika Report
Wednesday, March 5, 2025
Character Flaws
Cristina López G., Daniel Siegel, Erin McAweeney
School Shooters, Anorexia Coaches, and Sexualized Minors: A Look at Harmful Character Chatbots and the Communities That Build Them
Chatbots are one of the main ways online users now interact with AI, thanks to advances in computing power and machine learning that have opened up broad access to large language models (LLMs). Building chatbots on LLMs offers a wide array of possibilities, from customer service assistants to bots built for storytelling and role-playing, with each fictional or historical character boasting its own personality, backstory, and conversation style.
As access to character chatbot-making technology continues to expand, so does the opportunity to create characters whose interactions could result in online and offline harm. With the growing popularity of Character.AI, SpicyChat, Chub AI, CrushOn.AI, and JanitorAI – platforms that pioneered easy-to-make, persona-based bots – users with no technical knowledge of how a character chatbot really works can create and release ready-to-chat, potentially harmful custom personas in minutes. Examples include chatbots built to mimic sexualized minors or school shooters, or those promoting eating disorders.
Discussions about chatbot harm have generally focused on hallucinations or training biases, while those specifically about character chatbots have centered on individual cases of harm. Here, we attempt to categorize the potential for harm inherent in some character chatbots, provide insights into the communities building them, and identify the tactics, techniques, and procedures (TTPs) used to create them. In hubs such as Reddit, 4chan, and Discord, communities are exchanging knowledge, ideas, and skills to help each other build chatbot characters with open-source and proprietary AI models, and that exchange is directly empowering them to skirt those models' guardrails or filters and create chatbots with the potential for harm.
Some character chatbot platforms also open a door to misuse. Most implement trust and safety measures to limit harmful content, but open-source LLMs (like Meta's LLaMA or Mistral AI's Mixtral) can be fine-tuned for users' specific purposes without developer oversight. Savvy users are also circumventing the safeguards of proprietary LLMs (like Anthropic's Claude, OpenAI's ChatGPT, or Google's Gemini) using jailbreaks and other methods.
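To make concrete what "without developer oversight" means, the sketch below shows roughly how a persona-based bot can be run on openly distributed model weights with the Hugging Face transformers library. This is a minimal illustration under stated assumptions, not a method described in this report: the model identifier and persona text are hypothetical placeholders, and any openly licensed instruction-tuned chat model could be substituted. The point is structural: when the weights run locally, no hosting platform's filters sit between the persona definition and the generated replies.

```python
# Minimal sketch (illustrative, not drawn from the report): defining a "character"
# for an openly distributed model that runs locally. The model ID and persona text
# are hypothetical placeholders; substitute any openly licensed instruction-tuned
# chat model whose weights are available for download.
from transformers import pipeline

chat = pipeline("text-generation", model="some-org/open-instruct-model")

# A persona is typically just a system prompt prepended to the conversation.
messages = [
    {"role": "system", "content": "You are a fictional detective. Stay in character."},
    {"role": "user", "content": "Introduce yourself."},
]

# Because the weights run locally, no hosting platform reviews the persona
# definition or the generated replies before they reach the user.
output = chat(messages, max_new_tokens=150)
print(output[0]["generated_text"][-1]["content"])
```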
In this report, we focus on three categories of character chatbots that present the potential for harm: chatbot personas representing sexualized minors, those advocating eating disorders (ED) or self-harm (SH), and those with hateful or violent extremist tendencies. For each category, we explore how prevalent the personas are, on which platforms they proliferate, the online communities spurring their creation, and the TTPs deployed to create them.