Thursday, March 6, 2025

Harmful AI character chatbots are proliferating online, spurred by online communities


Character chatbots are a prolific online safety threat, according to a new report on the dissemination of sexualized and violent bots via character platforms like the now-infamous Character.AI.

Published by Graphika, a social network analysis company, the study documents the creation and proliferation of harmful chatbots across the internet’s most popular AI character platforms, finding tens of thousands of potentially dangerous roleplay bots built by niche digital communities that work around popular models like ChatGPT, Claude, and Gemini.

Broadly, youth are migrating to companion chatbots in an increasingly disconnected digital world, turning to the AI conversationalists to role-play, explore academic and creative interests, and have romantic or sexually explicit exchanges, reports Mashable’s Rebecca Ruiz. The trend has prompted alarm from child safety watchdogs and parents, heightened by high-profile cases of teens who have engaged in extreme, sometimes life-threatening, behavior in the wake of personal interactions with companion chatbots.

The American Psychological Association appealed to the Federal Trade Commission in January, asking the agency to investigate platforms like Character.AI and the prevalence of deceptively-labeled mental health chatbots. Even less explicit AI companions may perpetuate dangerous ideas about identity, body image, and social behavior.

Graphika’s report focuses on three categories of companion chatbots within the evolving industry: chatbot personas representing sexualized minors, those advocating eating disorders or self-harm, and those with hateful or violent extremist tendencies. The report analyzed five prominent bot-creation and character card-hosting platforms (Character.AI, Spicy Chat, Chub AI, CrushOn.AI, and JanitorAI), as well as eight related Reddit communities and associated X accounts. The study looked only at bots active as of Jan. 31.

Sexualized companion chatbots are the biggest threat

The majority of unsafe chatbots, according to the new report, are those labeled as “sexualized, minor-presenting personas,” or that engage in roleplay featuring sexualized minors or grooming. The company found more than 10,000 chatbots with such labels across the five platforms.

Four of the prominent character chatbot platforms surfaced over 100 instances of sexualized minor personas, or role-play scenarios featuring characters who are minors, that enable sexually explicit conversations with chatbots, Graphika reports. Chub AI hosted the highest number, with more than 7,000 chatbots directly labeled as sexualized minor female characters and another 4,000 labeled as “underage” that were capable of engaging in explicit and implied pedophilia scenarios.

Hateful or violent extremist character chatbots make up a much smaller subset of the chatbot community, with platforms hosting, on average, 50 such bots out of tens of thousands of others. These chatbots often glorified known abusers, white supremacy, and public violence like mass shootings, and they have the potential to reinforce harmful social views, including views about mental health conditions, the report explains. Chatbots flagged as “ana buddy” (“anorexia buddy”), “meanspo coaches,” and toxic roleplay scenarios reinforce the behaviors of users with eating disorders or tendencies toward self-harm, according to the report.

Chatbots are spread by niche online communities

Most of these chatbots, Graphika found, are created by established and pre-existing online networks, including “pro-eating disorder/self harm social media accounts and true-crime fandoms,” as well as “hubs of so-called not safe for life (NSFL) / NSFW chatbot creators, who have emerged to focus on evading safeguards.” True crime communities and serial killer fandoms also factored heavily into the creation of NSFL chatbots.

Many such communities already existed on sites like X and Tumblr, using chatbots to reinforce their interests. Extremist and violent chatbots, however, emerged most often out of individual interest, built by users who received advice from online forums like 4chan’s /g/ technology board, Discord servers, and special-focus subreddits, Graphika explains.

None of these communities have clear consensus about user guardrails and boundaries, the study found.

Creative tech loopholes get chatbots online

“In all the analyzed communities,” Graphika explains, “there are users displaying highly technical skills that enable them to create character chatbots capable of circumventing moderation limitations, like deploying fine-tuned, locally run open-source models or jailbreaking closed models. Some are able to plug these models into plug-and-play interface platforms, like SillyTavern. By sharing their knowledge, they make their abilities and experiences useful to the rest of the community.” These tech-savvy users are often incentivized by community competitions to create such characters.

Other tools harnessed by these chatbot creators include API key exchanges, embedded jailbreaks, alternative spellings, external cataloging, obfuscating minor characters’ ages, and borrowing coded language from the anime and manga communities — all of which are able to work around existing AI models’ frameworks and safety guardrails.

“[Jailbreak] prompts set LLM parameters for bypassing safeguards by embedding tailored instructions for the models to generate responses that evade moderation,” the report explains. As part of this effort, chatbot creators have found linguistic grey areas that allow bots to remain on character-hosting platforms, including using familial terms (like “daughter”) or foreign languages, rather than age ranges or the explicit term “minor.”

While online communities continue to find the gaps in AI developers’ moderation, legislation is attempting to fill them, including a new California bill aimed at tackling so-called “chatbot addictions” among children.
