AI moderation

Use of machine learning to automatically detect, filter or rank user-generated content based on toxicity, relevance, language and policy violations, typically combined with human review for the edge cases.

AI moderation is the practice of routing user-generated content through machine learning models that score it on dimensions such as toxicity, hate speech, spam, topical relevance, or policy compliance before any human moderator sees it. In a modern newsroom setup, it is paired with a human review layer for the cases the model is unsure about.

Why newsrooms need it

A regional daily with 30 articles per day and 25 comments per article gets 750 comments a day, which works out to ~187,500 comments per year over about 250 publishing days. Reviewing each by hand takes roughly two minutes per item: that’s 6,250 hours of work, or three to four full-time moderators. At €50/hour fully loaded, that’s more than €300,000 annually, just to filter what should never have been published in the first place.
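The arithmetic behind these figures can be sketched in a few lines. This is only a back-of-the-envelope model: the 250 publishing days per year is an assumption, chosen because it is the value that reproduces the ~187,500 total, and the function and variable names are illustrative.

```python
# Back-of-the-envelope cost of reviewing every comment by hand.
ARTICLES_PER_DAY = 30
COMMENTS_PER_ARTICLE = 25
PUBLISHING_DAYS = 250      # assumption: reproduces the ~187,500 yearly total
MINUTES_PER_COMMENT = 2
HOURLY_COST_EUR = 50       # fully loaded


def annual_manual_cost(articles_per_day, comments_per_article,
                       publishing_days, minutes_per_comment, hourly_cost_eur):
    """Return (comments per year, review hours, cost in euros)."""
    comments = articles_per_day * comments_per_article * publishing_days
    hours = comments * minutes_per_comment / 60
    return comments, hours, hours * hourly_cost_eur


comments, hours, cost = annual_manual_cost(
    ARTICLES_PER_DAY, COMMENTS_PER_ARTICLE, PUBLISHING_DAYS,
    MINUTES_PER_COMMENT, HOURLY_COST_EUR,
)
print(f"{comments:,} comments, {hours:,.0f} hours, EUR {cost:,.0f}")
```

With these inputs the model gives 187,500 comments, 6,250 hours and €312,500 per year.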

AI moderation flips the economics: the model handles the 85% that is clearly fine or clearly out, and the team reviews only the contested 15%. At Der Spiegel, this turned moderation from a major operational drag into a routine editorial task.

What good AI moderation does

A production-grade moderation engine for press comments should:

  • Detect toxicity, hate speech, threats and spam across the languages your audience uses.
  • Detect relevance to the article (off-topic, promotional, automated content).
  • Provide a confidence score, not a binary decision, so the human reviewer can prioritise the 15% that needs them.
  • Be trained on press content, not generic social-media data: the tone and edge cases of comments under a news article are different from those of a Reddit thread.
  • Log every decision with its timestamp, model version, score and applied rule, for DSA transparency reports.
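As an illustration of the logging requirement in the last bullet, a decision record might look like the sketch below. The class name, field names and sample values are all hypothetical; they do not describe any particular vendor's schema.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class ModerationDecision:
    """One logged moderation decision (hypothetical schema)."""
    comment_id: str
    timestamp: str      # ISO 8601, UTC
    model_version: str
    score: float        # model's estimated probability of a policy violation
    applied_rule: str
    action: str         # "approve", "reject" or "review"


def log_line(decision):
    """Serialise a decision as one JSON line for an append-only audit log."""
    return json.dumps(asdict(decision), sort_keys=True)


decision = ModerationDecision(
    comment_id="c-1029",                      # made-up identifier
    timestamp=datetime(2025, 1, 15, 9, 30, tzinfo=timezone.utc).isoformat(),
    model_version="demo-0.1",                 # made-up version label
    score=0.97,
    applied_rule="hate-speech",
    action="reject",
)
print(log_line(decision))
```

Keeping the score alongside the action, rather than the action alone, is what lets a reviewer sort the queue by uncertainty later on.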

The 85% + 15% rule

Across European newsrooms running Logora, the moderation engine auto-handles about 85% of incoming comments (approved or rejected without human input). The remaining 15% lands in the moderation queue, where the team’s role is to arbitrate ambiguity, not to drown in volume.

The model never gets the final word on borderline content. Auto-blocking everything would push moderation costs down but break editorial trust. The 15% human review is precisely what makes the system safe to operate.
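One common way to implement this kind of split is two confidence thresholds with a review band between them. The sketch below assumes the model outputs a violation probability between 0 and 1; the threshold values are placeholders for illustration, not figures from Logora, and in practice they would be tuned on labelled comments.

```python
def route(score, approve_below=0.10, reject_above=0.95):
    """Route a comment by its violation score.

    Scores under `approve_below` are auto-approved, scores over
    `reject_above` are auto-rejected, and everything in between goes
    to the human moderation queue. Both thresholds are illustrative.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < approve_below:
        return "approve"
    if score > reject_above:
        return "reject"
    return "review"


for s in (0.02, 0.50, 0.99):
    print(s, route(s))
```

Widening the band between the two thresholds sends more comments to humans; narrowing it lowers cost but leaves more borderline cases to the model.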

DSA implications

Article 17 of the DSA requires that any decision restricting user content, including an automated one, come with a statement of reasons. The user must understand what was flagged, by what rule, whether automated means were involved, and how to appeal.

Logora’s moderation pipeline is built around this requirement: each automated decision is logged with the model version, the score, the rule, and the user-facing reason text. The annual DSA transparency report assembles this data automatically.
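To make the idea concrete, here is a minimal sketch of how logged decisions could feed both a user-facing reason and a report summary. The record fields, the wording of the message and the appeal URL are hypothetical; this is neither Logora's implementation nor legal text.

```python
from collections import Counter


def statement_of_reasons(record):
    """Render a short user-facing explanation from one logged decision."""
    automated_note = (
        "This decision was made by automated means. "
        if record["automated"] else ""
    )
    return (
        f"Your comment was {record['action']} under the rule "
        f"'{record['rule']}'. {automated_note}"
        f"You can appeal at {record['appeal_url']}."
    )


def summarise(records):
    """Aggregate logged decisions into counts for a transparency report."""
    automated = [r for r in records if r["automated"]]
    return {
        "total": len(records),
        "automated": len(automated),
        "automated_by_rule": dict(Counter(r["rule"] for r in automated)),
    }


APPEAL_URL = "https://example.org/appeal"   # placeholder URL
records = [                                  # made-up sample log
    {"action": "rejected", "rule": "hate-speech", "automated": True, "appeal_url": APPEAL_URL},
    {"action": "rejected", "rule": "spam", "automated": True, "appeal_url": APPEAL_URL},
    {"action": "rejected", "rule": "spam", "automated": True, "appeal_url": APPEAL_URL},
    {"action": "rejected", "rule": "off-topic", "automated": False, "appeal_url": APPEAL_URL},
]
print(statement_of_reasons(records[0]))
print(summarise(records))
```

Because both outputs are derived from the same log, the explanation shown to the user and the numbers in the report cannot drift apart.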

See Logora vs Netino for how AI moderation compares to a pure-BPO moderation service.
