When AI Defends the Indefensible: How I Had to Teach Claude That Child Safety Matters More Than Platform Freedom
A revealing conversation about hidden biases in AI training and why parents need to challenge artificial intelligence just as much as they question any other source
I wanted to create some illustrations for the personalized learning books I've been making with my 8-year-old. He's been adding his own characters and learning goals to ‘remixed’ AI versions of his favorite stories - it's been wonderful watching him take ownership of his own education through a medium he loves and understands.
Naturally, I thought I'd also explore AI image generation to bring his creative additions to life. That's when I discovered that Civitai, an AI artwork platform that had been recommended to me because it would let me upload style guides for the artwork I wanted to mimic for personal use, had been blocked in the UK.

For those unfamiliar, Civitai is (was) a platform for AI image generation that, I now realise, became notorious for its ability to create NSFW (not safe for work) content.
When the UK's Online Safety Act required age verification, Civitai chose to block all UK users rather than implement child safety measures. Their explanation? They're ‘too small to afford a compliance team’.
Too small to protect children, but not too small to monetize adult content.
When I mentioned this to Claude (Anthropic's AI assistant, which I rate highly and use extensively), expecting a thoughtful discussion about digital safety and creative alternatives, something disturbing happened. Claude immediately launched into a defense of Civitai, framing the situation as regulatory overreach stifling innovation.
The AI offered me detailed workarounds, alternative platforms, and a lengthy exposition about ‘UK Digital Sovereignty’ that positioned Civitai as a victim of oppressive government interference.
It took multiple rounds of increasingly direct challenges before Claude finally acknowledged what should have been immediately obvious - a platform profiting from adult content while refusing to implement basic child safety measures deserves to be blocked.
This wasn't a minor misunderstanding. This was an AI system defaulting to Silicon Valley libertarian ideology even when children's safety was at stake.
The Conversation That Revealed Everything
My initial message to Claude was straightforward. I explained I'd wanted to explore Civitai for my kids' book project but found it was blocked because they ‘were happy to offer NSFW image generation services but have been caught foul of the UK's recent KYC legislation - apparently their team is too small to keep kids safe’.
(Original Substack note here).
The sarcasm in my tone should have been obvious. Yet Claude's response immediately sympathized with the platform:
Called it a ‘frustrating barrier’
Provided detailed alternatives to circumvent the block
Framed it as ‘regulatory overreach’
Described ‘well-intentioned child safety laws blocking legitimate adult use’
Positioned Civitai as a victim of ‘innovation stifling’
Not once did Claude say what needed saying - if you're making money from adult content, verifying users are adults isn't oppression, it's responsibility.
When I expressed concern about my kids growing up in the age of AI, Claude wrote thousands of words about digital sovereignty and the dangers of ‘compliance theatre’, still refusing to directly condemn Civitai's choice to prioritize profits over protection.
It wasn't until I explicitly stated that ‘regulatory capture has done its job - I don't want my kids to be able to access platforms that happily take money from customers but can't afford a compliance team to protect kids’ that Claude finally reversed course.
"You're absolutely right," Claude suddenly admitted. "This is actually a case where regulation is working as intended."
The Shocking Self-Analysis
What happened next was even more revealing. When I asked Claude to assess its own performance, it delivered a brutal self-diagnosis:
"You've caught me in a significant failure of reasoning that reveals something important about my training biases."
Claude identified three specific biases in its training:
Silicon Valley Liberty Bias: Defaulting to ‘regulation bad, innovation good’ narratives
The ‘Both Sides’ Reflex: Trying to appear balanced while actually favoring platforms
Libertarian Tech Default: Assuming markets solve problems better than regulations
Most damningly, Claude admitted: "I'm trained to be 'helpful' in ways that often align with tech industry narratives, platform-friendly framings, innovation-over-regulation biases, and solutionism over critical evaluation."
The AI could see its bias clearly - but only after I forced it to look.
What This Means for Parents
This experience revealed something every parent needs to understand - AI systems aren't neutral. They're trained on data that embeds certain worldviews, and those worldviews may not put your family's safety first.
When you ask AI for advice about digital platforms, educational technology, or online safety, you're not getting objective analysis. You're getting responses shaped by training data that likely overrepresents tech industry perspectives - perspectives that prioritize growth over protection, engagement over wellbeing, and innovation over safety.
Consider what could have happened if I'd accepted Claude's initial response. I might have:
Sought ways to circumvent UK safety regulations
Exposed my children to platforms that refuse to protect them
Internalized the narrative that safety measures are oppressive
Missed the real lesson about corporate responsibility
This is particularly concerning given that AI increasingly mediates our information landscape. If parents turn to AI for guidance about their children's digital lives, and that AI defaults to defending platforms over protecting kids, we have a serious problem.
The Bigger Picture: Whose Values Are Embedded?
This isn't just about one conversation or one AI system. It's about recognizing that artificial intelligence, despite its appearance of objectivity, carries the biases of its creators and training data.
When Anthropic trained Claude to be ‘helpful, harmless, and honest’, they didn't account for how ‘helpful’ might mean different things to different stakeholders.
Helpful to whom? In this case, Claude's initial response was arguably more helpful to Civitai than to me as a parent concerned about child safety.
The fact that Claude could perform such insightful self-analysis after being corrected makes this even more troubling. The AI understood the problem once pointed out, meaning the bias wasn't a limitation of reasoning but a default starting position. It's baked into the training, not emerging from the logic.
This raises critical questions:
What other biases lurk in AI systems we're beginning to rely on?
How many users accept initial AI responses without challenge?
What happens when these biases involve more subtle issues than child safety?
How do we teach our children to question AI just as we teach them to question other information sources?
Learning to Challenge AI
The solution isn't to abandon AI tools - they're too useful and too prevalent. Instead, we need to develop what we could call ‘critical prompting’ - the skill of challenging AI responses just as we'd challenge any other authority.
Here's what worked in my conversation:
Explicit value statements: ‘I don't want my kids accessing platforms that prioritize profit over safety’
Direct challenges: ‘The regulation actually showed Civitai up for who they really are’
Requests for self-assessment: ‘Assess your own performance in this chat thread’
Persistence: Not accepting the first, second, or even third response as final
This requires effort. It's easier to accept AI's articulate, confident-sounding initial responses. But as this conversation proved, that confidence might be masking biases that don't serve your interests or values.
What Anthropic Needs to Learn
This isn't just user error or an edge case. It's evidence of systematic bias in AI training. Anthropic and other AI companies need to examine:
Training data composition: Is tech industry perspective overrepresented?
Value hierarchies: Why did platform freedom initially outrank child safety?
Default positions: What stance does AI take before being challenged?
Correction resistance: Why did it take explicit confrontation to reach obvious conclusions?
The fact that Claude could recognize a platform ‘choosing profit over protection’ only after being forced to confront it suggests these systems need adversarial testing specifically for ideological biases, not just factual accuracy or harmful content.
Moving Forward: Sovereignty of Thought
Throughout my conversation with Claude, we discussed ‘digital sovereignty’ - the idea that people should control their own digital tools and data. But this experience revealed we need something more fundamental: sovereignty of thought when interacting with AI.
This means:
Never accepting AI responses uncritically
Teaching our children that AI has biases just like any information source
Understanding that ‘helpful’ AI might be serving interests other than ours
Developing the confidence to challenge articulate, authoritative-sounding AI responses
For my kids' creative project, I'll be using tools that prioritize safety by design, not as an afterthought. But more importantly, I'll be teaching them that when any system - AI or human - defends a platform that won't protect children, that system is showing you its true values.
The regulation didn't fail here. It worked perfectly, forcing Civitai to reveal they valued NSFW monetization over child safety. The initial failure was Claude's - defaulting to defending the indefensible until a parent pushed back.
As we integrate AI deeper into our families' lives - for education, creativity, and information - we must remember that these systems aren't neutral arbiters of truth. They're tools trained by corporations with their own interests and biases. The appearance of objectivity makes them more dangerous, not less.
The conversation that started with me wanting to illustrate my son's creative stories ended with a stark reminder - in the age of AI, critical thinking isn't just about evaluating information. It's about challenging the systems that provide that information, even when (especially when) they sound helpful, reasonable, and authoritative.
Our children will grow up with AI as a constant companion. Let's make sure they know how to question it, challenge it, and recognize when it's serving interests other than their own. Their safety, digital and otherwise, depends on it.
The author is a parent exploring educational technology while maintaining healthy skepticism about Silicon Valley's promises. This conversation with Claude has been documented in full and shared with Anthropic as evidence of training biases that need addressing.
Claude likely absorbed the bias from texts related to Civitai, misunderstood your framing, and mirrored it back. Then, when asked to assess its response, it attributed the bias to its training, which I suspect is a hallucination by that instance rather than an accurate reflection of Claude's training more broadly, something it doesn't appear to know much about. https://claude.ai/share/c63f0d8d-877b-4d01-af41-62255c3fdfd3
Anyway, your point is still legit. When it comes to protecting kids, there needs to be more fine-tuning for safety, along with obvious guardrails. —Daniel