Moderation & safety
Blocklist (and profanity filter)
A list of terms that automatically reject or flag a contribution, distinct from a watchlist of suspicious words that routes content to human review based on context.
A blocklist (or profanity filter) is a list of terms whose presence automatically rejects or flags a contribution. It is the oldest and simplest moderation tool: a slur or a banned word triggers a fixed action without any judgment of context.
Blocklist vs watchlist of suspicious words
These two lists do very different jobs.
- Blocklist (auto-reject): terms so rarely acceptable that their presence justifies blocking the contribution outright. The action is automatic.
- Watchlist of suspicious words (route to humans): terms that are ambiguous and depend on context. A word like “victim” should not be blocked: it can signal a sensitive testimony as easily as an insult, so it routes the contribution to the human queue for a closer read.
The distinction matters: auto-rejecting an ambiguous word silences legitimate speech, while sending every banned slur to a human wastes moderator time.
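The split between the two lists can be sketched in a few lines of Python. This is an illustration only, not Logora's implementation: the list contents, the `route` function, and the three outcome labels are all assumptions made for the example.

```python
import re

# Hypothetical example lists; a real deployment would load editor-defined terms.
BLOCKLIST = {"slur1", "slur2"}
WATCHLIST = {"victim"}

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of whole words."""
    return set(re.findall(r"\w+", text.lower()))

def route(text: str) -> str:
    """Return 'reject', 'review', or 'publish' for a contribution."""
    words = tokenize(text)
    if words & BLOCKLIST:
        # Blocklisted term present: fixed automatic action, no context check.
        return "reject"
    if words & WATCHLIST:
        # Ambiguous term present: send to the human moderation queue.
        return "review"
    return "publish"
```

The blocklist is checked first, so a contribution that contains both a blocklisted and a watchlisted term is rejected without taking up moderator time.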
Limits
Keyword lists are easy to defeat. Users bypass them with spacing, accents, leetspeak, or homoglyphs, in effect writing around the filter. They also produce false positives: the classic case is a banned substring hidden inside an innocent longer word. A blocklist is therefore a first filter, not a complete moderation strategy.
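Some of these weaknesses can be reduced, though not removed, by normalizing the text before matching and by matching whole words instead of substrings. The sketch below is a minimal example under stated assumptions: the `LEET` substitution table, the function names, and the mild term "hell" are hypothetical choices for illustration.

```python
import re
import unicodedata

# Hypothetical leetspeak map; real tables are longer and language-specific.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                      "5": "s", "7": "t", "@": "a", "$": "s"})

def normalize(text: str) -> str:
    """Lowercase, strip accents, and undo common leetspeak substitutions."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text.lower())
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.translate(LEET)

def contains_term(text: str, term: str) -> bool:
    """Whole-word match: \\b keeps the term from matching inside a longer word."""
    pattern = rf"\b{re.escape(term)}\b"
    return re.search(pattern, normalize(text)) is not None
```

With this sketch, `contains_term("go to h3ll!", "hell")` and `contains_term("Héll", "hell")` both match, while `contains_term("open a shell", "hell")` does not. It still misses letters separated by spaces and homoglyphs borrowed from other scripts, and the digit substitutions also rewrite genuine numbers, which is one reason a keyword list remains only a first filter.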
How Logora handles it
Logora gives each editor an editable blocklist: the newsroom decides which terms auto-reject and can adapt the list to its audience and editorial line. Alongside it, Logora maintains a contextual watchlist of suspicious words that does not block but routes flagged contributions to the moderation queue, where a human reads them in context. Both lists complement the AI moderation models rather than replacing them.
See spam detection, content moderation, toxicity detection, and moderation queue.