Create safer online communities with intelligent, automated content moderation. social.plus leverages advanced AI to scan and filter inappropriate content across text, images, and video, ensuring community standards are maintained without constant manual oversight.

Pre-Moderation

Block inappropriate content before it’s published with proactive AI scanning

Post-Moderation

Monitor and review published content with intelligent flagging and automated actions

Overview

social.plus offers two complementary AI moderation approaches:
Proactive Content Filtering
  • Content is scanned before publication
  • AI generates confidence scores for detected violations
  • Content blocked if confidence exceeds configured threshold
  • User must modify content to proceed with posting
Reactive Content Review
  • Content is scanned after publication
  • Uses flagConfidence and blockConfidence thresholds
  • Automatically flags content for review or removes violations
  • Maintains community safety without blocking legitimate content

Getting Started

1. Enable AI Moderation
Contact our support team to enable AI content moderation for your application.

2. Configure Settings
Set up confidence levels and moderation categories through the social.plus Console.

3. Test & Monitor
Test with sample content and monitor moderation effectiveness through analytics.

AI Pre-Moderation

Prevent inappropriate content from reaching your community with proactive AI scanning. Pre-moderation ensures all content meets your standards before publication.
Current Availability: Pre-moderation is currently available for image content, with text and video support coming soon.

Image Content Detection

Our AI pre-moderation scans all uploaded images for inappropriate content across four key categories:
  • Nudity: Detection of explicit or inappropriate nudity
  • Suggestive Content: Sexually suggestive or provocative imagery
  • Violence: Violent or graphic content detection
  • Disturbing Content: Content that may be psychologically disturbing

Configuration

1. Enable Image Moderation
Navigate to Moderation > Image Moderation in your social.plus Console and toggle “Enable image moderation” to ON.

2. Set Confidence Levels
Configure confidence thresholds for each category based on your community standards.

3. Test Configuration
Upload test images to verify your confidence settings work as expected.

Understanding Confidence Levels

Important: Confidence levels significantly impact moderation accuracy. Default settings may produce false positives.
Confidence levels represent the AI’s certainty in detecting specific content types:
  • Low Confidence (0-30): High sensitivity, may block legitimate content
  • Medium Confidence (40-70): Balanced approach for most communities
  • High Confidence (80-100): Conservative filtering, may miss some violations
Recommendation: Start with medium confidence levels (40-60) and adjust based on your community’s needs and false positive rates.
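The per-category decision described above can be sketched in a few lines. This is a hypothetical illustration, not the social.plus SDK: the category names, dict shapes, and function name are our own, and we assume an image is rejected when the AI's confidence for any category meets or exceeds that category's configured threshold.

```python
def pre_moderate_image(scores: dict, thresholds: dict) -> dict:
    """Return the pre-moderation decision for one uploaded image.

    scores     -- AI confidence (0-100) per detected category
    thresholds -- configured confidence level (0-100) per category
    """
    violations = [
        category
        for category, score in scores.items()
        if score >= thresholds.get(category, 100)  # no threshold => never block
    ]
    return {"blocked": bool(violations), "violations": violations}

# Example: medium thresholds, per the recommended 40-60 starting point
thresholds = {"nudity": 50, "suggestive": 60, "violence": 50, "disturbing": 50}
decision = pre_moderate_image({"nudity": 12, "suggestive": 71}, thresholds)
# "suggestive" (71) meets its 60 threshold, so the upload is rejected
```

With this shape, lowering a category's threshold makes it more sensitive, matching the "Low Confidence = high sensitivity" behavior described above.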

AI Post-Moderation

Monitor and moderate published content with intelligent detection and automated response workflows. Post-moderation provides comprehensive scanning across all content types while maintaining user experience. All AI post-moderation results are surfaced through the Moderation Feed in the social.plus Console, located under Moderation > Moderation feed.

Text Moderation

Detect inappropriate language, hate speech, and harmful text content

Image & Video

Scan visual content for policy violations and harmful imagery

User Profile Moderation

AI moderation for display names, avatars, and descriptions — see dedicated page

Moderation Feed

The Moderation Feed is the central hub for reviewing AI-flagged content. It is organized into two main workflow tabs:
The To review tab displays all content that requires moderator attention, organized into sub-tabs:
  • Posts and comments — Flagged posts and comments from communities and user timelines
  • Messages — Flagged messages from channels and direct conversations
  • Users — Flagged user profiles — see AI User Profile Moderation
Each flagged item displays:
  • The AI moderation label and detected categories (e.g., “AI Mod: Harassment or bullying”)
  • PII detection results when applicable (e.g., URLs, person types)
  • The number of user reports (e.g., “1 user”, “4 users”)
  • The last flagged timestamp
  • Available moderation actions
Available actions for posts and messages:
  • Delete post / Delete message — Remove the content
  • Clear flag — Dismiss the flag and approve the content
Use the filter dropdowns (All feeds / All Channels, Select creator / Select sender) to narrow down the moderation queue by community, channel, or content creator.

Content Coverage

Posts & Comments
  • Text, images, videos, clips, files, and livestream content
  • Posts across all communities and user timelines
  • Comments and reply chains on posts
  • Filter by specific community feed or content creator
Messages
  • Text, image, video, audio, and file messages
  • Messages from group channels, direct messages, and live chat
  • Filter by specific channel or message sender

Text Content Detection

The AI text moderation identifies and handles various types of inappropriate text content:
  • Harassment or Bullying: Targeted abuse, intimidation, or bullying behavior
  • Sexual Content or Nudity: Adult content and explicit sexual references
  • Violence or Threatening Content: Violent threats, graphic descriptions, or dangerous activities
  • Hate: Hate speech targeting protected groups
  • Fraudulent Intent and Scam Promotion: Scam tactics, phishing, and deceptive content
  • Self Harm or Suicide: Content related to self-harm or suicidal ideation
In addition to content policy categories, the AI performs Personally Identifiable Information (PII) detection to flag sensitive data:
  • URL — Links and web addresses embedded in content
  • PersonType — References to specific person types or identities
Content that passes all AI checks displays “AI Mod: Passed” in the moderation feed. Content with detected violations shows the specific category labels.
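The labeling convention above (“AI Mod: Passed” for clean content, category labels otherwise) can be mirrored with a tiny helper. The label text is taken from the examples in this section; the function itself is purely illustrative and not part of any social.plus API.

```python
def feed_label(detected_categories: list) -> str:
    """Build the moderation-feed label for a piece of text content."""
    if not detected_categories:
        return "AI Mod: Passed"          # passed all AI checks
    return "AI Mod: " + ", ".join(detected_categories)

label_clean = feed_label([])
label_flagged = feed_label(["Harassment or bullying"])
```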

Multimedia Content Detection

Comprehensive Scanning: Our AI analyzes both static images and video content frame-by-frame for maximum protection.
Advanced visual content analysis covers extensive categories:
  • Adult Toys, Explicit Nudity, Graphic Nudity
  • Sexual Activity, Sexual Situations, Suggestive Content
  • Female Swimwear or Underwear, Swimwear or Underwear
  • Non-Explicit Nudity of Intimate Parts and Kissing
  • Partial Nudity, Illustrated Explicit Nudity, Revealing Clothes
  • Violence, Graphic Violence, Gore, Physical Violence
  • Weapons, Weapon Violence, Explosions
  • Self Injury, Hanging, Corpses
  • Emaciated Bodies, Visually Disturbing Content
  • Hate, Extremist, Nazi Party, White Supremacy
  • Hate Symbols, Rude Gestures, Middle Finger
  • Gambling, Air Crash, Disasters
  • Bare-chested Male (context-dependent)
  • Other contextually inappropriate content
User Profile Moderation: AI moderation for user profiles (display names, avatars, descriptions) is covered in a dedicated page. See AI User Profile Moderation for setup, admin reset workflows, blocklist configuration, and moderation feed details.

Understanding Confidence Scores

Flag Confidence (Default: 40)
  • Content scoring at or above this level is flagged for review
  • Lower values = more content flagged (higher sensitivity)
  • Recommended range: 30-60 depending on community standards
Block Confidence (Default: 80)
  • Content scoring at or above this level is automatically removed
  • Higher values = fewer false positives
  • Recommended range: 70-90 for balanced protection
With the default thresholds, scores map to outcomes as follows:
  • 0-39: Content passes moderation (approved)
  • 40-79: Content flagged for human review
  • 80-100: Content automatically blocked/removed
Note: These ranges reflect the default thresholds and can be customized.
Default Configuration: All categories start with flagConfidence: 40 and blockConfidence: 80. Monitor your community’s content patterns and adjust these values to optimize for your specific needs.
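The two-threshold outcome bands can be expressed as a small classifier. This is a sketch using the documented defaults (flagConfidence: 40, blockConfidence: 80); the function name and return strings are our own.

```python
def post_moderation_outcome(score: float,
                            flag_confidence: int = 40,
                            block_confidence: int = 80) -> str:
    """Map an AI confidence score (0-100) to a moderation outcome."""
    if score >= block_confidence:
        return "blocked"    # automatically removed
    if score >= flag_confidence:
        return "flagged"    # queued for human review
    return "passed"         # approved

# With defaults: 0-39 passes, 40-79 is flagged, 80-100 is blocked
outcomes = [post_moderation_outcome(s) for s in (25, 55, 90)]
```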

Configuration Parameters

Parameter        Type    Description
category         String  Name of the moderation category
flagConfidence   Number  Threshold for flagging content (0-100)
blockConfidence  Number  Threshold for blocking content (0-100)
moderationType   String  Type of content: “text” or “media”
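A category configuration built from these parameters might look like the following. The JSON-like shape and the validation rules are assumptions for illustration (in particular, we treat a flag threshold above the block threshold as invalid, since flagging would then be unreachable); consult the API reference for the exact request format.

```python
# Assumed configuration shape using the documented parameter names
configs = [
    {"category": "Harassment or Bullying", "flagConfidence": 40,
     "blockConfidence": 80, "moderationType": "text"},
    {"category": "Explicit Nudity", "flagConfidence": 30,
     "blockConfidence": 70, "moderationType": "media"},
]

def validate_config(cfg: dict) -> bool:
    """Check one category config against the documented constraints."""
    return (
        isinstance(cfg["category"], str)
        and cfg["moderationType"] in ("text", "media")
        and 0 <= cfg["flagConfidence"] <= 100
        and 0 <= cfg["blockConfidence"] <= 100
        # assumption: flag must not exceed block, or flagging never occurs
        and cfg["flagConfidence"] <= cfg["blockConfidence"]
    )

all_valid = all(validate_config(c) for c in configs)
```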

API Configuration

Select the appropriate API endpoint for your region to ensure optimal performance:
Region         API Endpoint
Europe         https://api-eu.social.plus/
Singapore      https://api-sg.social.plus/
United States  https://api-us.social.plus/
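Selecting the regional base URL can be done with a simple lookup. The base URLs are taken from the table above; the region codes and function name are our own convention, and any path appended to a base URL would be an assumption.

```python
REGIONAL_ENDPOINTS = {
    "eu": "https://api-eu.social.plus/",
    "sg": "https://api-sg.social.plus/",
    "us": "https://api-us.social.plus/",
}

def base_url(region: str) -> str:
    """Return the API base URL for a region code ('eu', 'sg', or 'us')."""
    try:
        return REGIONAL_ENDPOINTS[region]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None

eu = base_url("eu")
```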

API Reference

For detailed administration workflows, see the Moderation Overview and analytics export documentation.

Best Practices

Configuration
  • Start Conservative: Begin with moderate confidence levels and adjust based on results
  • Monitor Performance: Track false positive and false negative rates
  • Community-Specific: Tailor settings to your community’s content standards
  • Regular Review: Periodically review and update thresholds as your community evolves
Moderation Operations
  • Review Queue Management: Ensure consistent review of flagged content
  • Moderator Training: Train team on community standards and edge cases
  • Appeal Process: Provide clear paths for users to contest moderation decisions
  • Transparency: Communicate moderation policies clearly to users
Technical Integration
  • Batch Processing: Handle high-volume content efficiently
  • Regional APIs: Use geographically appropriate endpoints
  • Webhook Integration: Implement real-time event handling for flagged content
  • Monitoring: Set up alerts for unusual moderation patterns
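A webhook handler for flagged content might route events like this. This is a hypothetical sketch: the event names ("content.flagged", "content.blocked") and payload fields are assumptions, not a documented social.plus webhook schema.

```python
def handle_moderation_event(payload: dict, alerts: list) -> str:
    """Route one moderation webhook payload; append alert messages to `alerts`."""
    event = payload.get("event")
    if event == "content.flagged":
        # e.g. push the item onto the moderator review queue
        return "queued_for_review"
    if event == "content.blocked":
        # e.g. raise an alert so unusual auto-block patterns are noticed
        alerts.append(f"auto-blocked (score={payload.get('score')})")
        return "blocked"
    return "ignored"

alerts: list = []
result = handle_moderation_event({"event": "content.blocked", "score": 92}, alerts)
```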