Moderation & safety
Spam detection
A moderation sub-task focused on identifying promotional, automated, link-laden, or off-topic contributions. It is separate from toxicity detection but typically handled by the same pipeline.
Spam detection is the moderation sub-task that catches contributions that aren’t toxic in the hate-speech sense but still don’t belong: promotional content, copy-paste advertising, link farms, AI-generated filler text, and off-topic political payloads.
What spam looks like on a press site
- Promotional: “Buy from XYZ.com!!” with affiliate links.
- Link farms: the same URL posted under 50 articles by a fresh account.
- Coordinated: the same paragraph reposted across dozens of articles within minutes (often automated).
- AI-generated filler: a recent uptick in bland LLM-generated comments designed to age accounts before more aggressive abuse.
How Logora handles it
Logora’s spam model runs on the same pipeline as toxicity detection but uses different signals: URL reputation, posting velocity per account, similarity across recent contributions, account age, and language fingerprint. Contributions that score above a set threshold are auto-blocked. Borderline cases land in the moderation queue.
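To make the threshold routing concrete, here is a minimal sketch of how signals like these could be combined into a single score and mapped to an outcome. It is not Logora's actual model: the weights, the two thresholds, the `Signals` fields, and the reading of "language fingerprint" as a simple language-mismatch flag are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values would be tuned on labelled data.
BLOCK_THRESHOLD = 0.8   # at or above: auto-block
REVIEW_THRESHOLD = 0.5  # at or above (but below block): moderation queue


@dataclass
class Signals:
    """Illustrative per-contribution signals (field names are assumptions)."""
    url_reputation: float    # 0.0 (trusted) to 1.0 (known spam domain)
    posts_last_hour: int     # posting velocity for the account
    max_similarity: float    # 0.0 to 1.0, against the account's recent posts
    account_age_days: float
    language_mismatch: bool  # contribution language differs from the site's


def spam_score(s: Signals) -> float:
    """Weighted sum of normalised signals, in the range 0.0 to 1.0."""
    velocity = min(s.posts_last_hour / 20.0, 1.0)  # saturate at 20 posts/hour
    if s.account_age_days < 1:
        newness = 1.0
    elif s.account_age_days < 7:
        newness = 0.5
    else:
        newness = 0.0
    return (
        0.35 * s.url_reputation
        + 0.20 * velocity
        + 0.25 * s.max_similarity
        + 0.15 * newness
        + 0.05 * (1.0 if s.language_mismatch else 0.0)
    )


def route(s: Signals) -> str:
    """Map a score to one of three outcomes: block, queue, or publish."""
    score = spam_score(s)
    if score >= BLOCK_THRESHOLD:
        return "block"
    if score >= REVIEW_THRESHOLD:
        return "queue"
    return "publish"
```

A fresh account posting a low-reputation link at high velocity would route to `"block"`, while an established account with no links would route to `"publish"`. A production system would more likely learn the weights from data than hand-set them.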
For coordinated attacks (a flood of similar content across multiple accounts), Logora’s rate limiting and similarity scoring catch the pattern within the first 5 to 10 posts and apply a temporary cooling period to the accounts involved.
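One way to picture flood detection is a sliding window of recent posts compared by text similarity. The sketch below uses Jaccard similarity over word shingles as a stand-in. The source does not say which similarity measure Logora uses, and the window length, similarity threshold, flood size, cooling duration, and the `FloodDetector` class are all hypothetical.

```python
from collections import deque

# All constants are illustrative assumptions, not Logora's settings.
WINDOW_SECONDS = 600   # look-back window for recent posts
SIMILARITY_MIN = 0.8   # Jaccard score at which two posts count as similar
FLOOD_SIZE = 5         # similar posts in the window that trigger cooling
COOLING_SECONDS = 900  # length of the temporary cooling period


def shingles(text, n=3):
    """Return the set of n-word shingles in a lowercased text."""
    words = text.lower().split()
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class FloodDetector:
    def __init__(self):
        self.recent = deque()    # (timestamp, account_id, shingle_set)
        self.cooling_until = {}  # account_id -> timestamp when cooling ends

    def is_cooling(self, account_id, now):
        return self.cooling_until.get(account_id, 0.0) > now

    def observe(self, account_id, text, now):
        """Record a post; return True if it completes or extends a flood."""
        # Drop posts that have fallen out of the look-back window.
        while self.recent and now - self.recent[0][0] > WINDOW_SECONDS:
            self.recent.popleft()
        sh = shingles(text)
        matches = [acc for _, acc, other in self.recent
                   if jaccard(sh, other) >= SIMILARITY_MIN]
        self.recent.append((now, account_id, sh))
        if len(matches) + 1 >= FLOOD_SIZE:
            # Cool down every account that contributed a similar post.
            for acc in set(matches) | {account_id}:
                self.cooling_until[acc] = now + COOLING_SECONDS
            return True
        return False
```

With these settings, the fifth near-identical post inside ten minutes returns `True` and places every participating account in cooling, which is consistent with catching a pattern within the first 5 to 10 posts. Pairwise comparison is fine for a sketch; at scale, a technique such as MinHash would avoid comparing each post against the entire window.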
See also AI moderation, content moderation, and toxicity detection.