AI moderation
Use of machine learning to automatically detect, filter or rank user-generated content based on toxicity, relevance, language and policy violations, typically combined with human review for the edge cases.
AI moderation is the practice of routing user-generated content through machine learning models that score it on dimensions such as toxicity, hate speech, spam, relevance to the article, and policy compliance, before any human moderator sees it. In a modern newsroom setup, it is paired with a human review layer for the cases the model is unsure about.
Why newsrooms need it
A regional daily publishing 30 articles per day with 25 comments per article receives about 187,500 comments per year, a figure that corresponds to roughly 250 publishing days. Reviewing each by hand takes roughly two minutes per item, which adds up to 6,250 hours of work, or three to four full-time moderators. At €50/hour fully loaded, that is about €312,500 annually, just to filter what should never have been published in the first place.
AI moderation flips the economics: the model handles the 85% that is clearly acceptable or clearly unacceptable, and the team reviews only the contested 15%. At Der Spiegel, this turned moderation from a major operational drag into a routine editorial task.
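The arithmetic above can be checked with a short calculation. This is only a back-of-the-envelope sketch: the number of publishing days and the working hours per full-time moderator are assumptions chosen to match the figures in the text, and the function name is hypothetical.

```python
# Figures taken from the text above.
ARTICLES_PER_DAY = 30
COMMENTS_PER_ARTICLE = 25
MINUTES_PER_COMMENT = 2
HOURLY_COST_EUR = 50

# Assumptions, not stated in the text.
PUBLISHING_DAYS = 250      # reproduces the ~187,500 comments/year figure
HOURS_PER_FTE = 1_700      # rough annual working hours for one moderator


def yearly_workload(review_share: float = 1.0) -> dict:
    """Estimate yearly review effort when `review_share` of comments needs a human."""
    comments = ARTICLES_PER_DAY * COMMENTS_PER_ARTICLE * PUBLISHING_DAYS
    hours = comments * review_share * MINUTES_PER_COMMENT / 60
    return {
        "comments": comments,
        "hours": hours,
        "cost_eur": hours * HOURLY_COST_EUR,
        "fte": hours / HOURS_PER_FTE,
    }


if __name__ == "__main__":
    scenarios = [("manual review of everything", 1.0), ("human review of 15%", 0.15)]
    for label, share in scenarios:
        w = yearly_workload(share)
        print(f"{label}: {w['hours']:,.0f} h, EUR {w['cost_eur']:,.0f}, {w['fte']:.1f} FTE")
```

Under these assumptions, reviewing everything by hand comes to 6,250 hours and €312,500 per year, while reviewing only the contested 15% comes to roughly 940 hours and about €47,000.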
What good AI moderation does
A production-grade moderation engine for press comments should:
- Detect toxicity, hate speech, threats and spam across the languages your audience uses.
- Detect relevance to the article (off-topic, promotional, automated content).
- Provide a confidence score, not a binary decision, so the human reviewer can prioritise the 15% that needs them.
- Be trained on press content, not generic social-media data: the tone and edge cases of comments under a news article differ from those of a Reddit thread.
- Log every decision with its timestamp, model version, score and applied rule, for DSA transparency reports.
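The logging requirement in the last bullet could look something like the sketch below. The field names, the action labels and the `log_decision` helper are illustrative assumptions, not Logora's actual schema; the only elements taken from the text are the logged items themselves (timestamp, model version, score, applied rule).

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ModerationDecision:
    """One log entry per automated decision (hypothetical schema)."""
    comment_id: str
    timestamp: str       # ISO 8601, UTC
    model_version: str
    scores: dict         # e.g. {"toxicity": 0.93, "spam": 0.02}
    applied_rule: str
    action: str          # "approve", "reject" or "review"


def log_decision(comment_id: str, model_version: str, scores: dict,
                 applied_rule: str, action: str,
                 now: Optional[datetime] = None) -> str:
    """Serialise a decision as one JSON line, ready to append to an audit log."""
    now = now or datetime.now(timezone.utc)
    record = ModerationDecision(
        comment_id=comment_id,
        timestamp=now.isoformat(),
        model_version=model_version,
        scores=scores,
        applied_rule=applied_rule,
        action=action,
    )
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    print(log_decision("c-1042", "example-model-v1", {"toxicity": 0.93},
                       "toxicity_above_threshold", "reject"))
```

Keeping the scores as numbers rather than a single yes/no flag is what lets a reviewer sort the queue by uncertainty later on.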
The 85% + 15% rule
Across European newsrooms running Logora, the moderation engine auto-handles about 85% of incoming comments (approved or rejected without human input). The remaining 15% lands in the moderation queue, where the team’s role is to arbitrate ambiguity, not to drown in volume.
The model never gets the final word on borderline content. Fully automating every decision would push moderation costs down but break editorial trust. The 15% human review is precisely what makes the system safe to operate.
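One common way to implement this split is a pair of confidence thresholds: scores that are clearly low are approved, scores that are clearly high are rejected, and everything in between goes to the human queue. The sketch below assumes a single violation score between 0 and 1; the threshold values and all names are hypothetical, and in practice the thresholds would be tuned until the auto-handled share and the error rate are acceptable.

```python
from enum import Enum


class Route(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    HUMAN_REVIEW = "human_review"


# Hypothetical thresholds, for illustration only.
APPROVE_BELOW = 0.10
REJECT_ABOVE = 0.95


def route(violation_score: float) -> Route:
    """Auto-handle clear cases; send everything ambiguous to a moderator."""
    if not 0.0 <= violation_score <= 1.0:
        raise ValueError("violation_score must be between 0 and 1")
    if violation_score < APPROVE_BELOW:
        return Route.APPROVE
    if violation_score > REJECT_ABOVE:
        return Route.REJECT
    return Route.HUMAN_REVIEW


def auto_handled_share(scores: list) -> float:
    """Fraction of comments decided without human input."""
    if not scores:
        return 0.0
    auto = sum(1 for s in scores if route(s) is not Route.HUMAN_REVIEW)
    return auto / len(scores)


if __name__ == "__main__":
    sample = [0.01, 0.03, 0.50, 0.99]
    print([route(s).value for s in sample])
    print(auto_handled_share(sample))
```

Widening the gap between the two thresholds sends more comments to human review; narrowing it raises the auto-handled share at the cost of more automated mistakes.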
DSA implications
Article 17 of the DSA requires that a decision restricting user content come with a statement of reasons, including whether the decision was taken by automated means. The user must understand what was flagged, by what rule, and how to appeal.
Logora’s moderation pipeline is built around this requirement: each automated decision is logged with the model version, the score, the rule, and the user-facing reason text. The annual DSA transparency report assembles this data automatically.
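To illustrate how logged decisions can feed a transparency report, here is a minimal aggregation sketch. The record fields (`action`, `applied_rule`, `automated`) and the shape of the summary are assumptions made for the example; they are neither Logora's real data model nor an official DSA reporting template.

```python
from collections import Counter


def summarise(decisions: list) -> dict:
    """Aggregate logged decisions into counts a yearly report could cite."""
    by_action = Counter(d["action"] for d in decisions)
    rejected_by_rule = Counter(
        d["applied_rule"] for d in decisions if d["action"] == "reject"
    )
    automated = sum(1 for d in decisions if d["automated"])
    return {
        "total": len(decisions),
        "automated": automated,
        "human_reviewed": len(decisions) - automated,
        "by_action": dict(by_action),
        "rejected_by_rule": dict(rejected_by_rule),
    }


if __name__ == "__main__":
    # Hypothetical log entries.
    log = [
        {"action": "approve", "applied_rule": "none", "automated": True},
        {"action": "reject", "applied_rule": "hate_speech", "automated": True},
        {"action": "reject", "applied_rule": "spam", "automated": False},
        {"action": "approve", "applied_rule": "none", "automated": True},
    ]
    print(summarise(log))
```

Because every entry already carries its rule and whether it was automated, the report becomes a query over the log rather than a separate manual exercise.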
Related concepts
- Content moderation, the broader practice
- Toxicity detection, one of the model’s signals
- Multilingual moderation, across the languages your audience uses
- DSA Article 17, statement of reasons for moderation decisions, including automated ones
See Logora vs Netino for how AI moderation compares to a pure-BPO moderation service.