Open Notes uses a two-tier moderation model. Tier 1 is fully automated — the platform takes action without human review. Tier 2 sends the content to community raters and only acts once consensus is reached. Tiers are configurable per content category so you can apply stricter automation to egregious content while leaving nuanced cases to the community.

The two tiers at a glance

Tier 1 — Auto action

The server applies the moderation action immediately when its score crosses the configured threshold. No rater involvement is required.
Best for: spam, illegal content, slurs — categories where missed violations are costly and the occasional false positive is acceptable.

Tier 2 — Community review

Content is queued for community raters. The platform waits until a statistically significant consensus is reached before acting.
Best for: nuanced policy violations, context-dependent content, high-profile users.

How scoring works

When your integration calls POST /api/public/v1/requests, the server:
  1. Scores the content with its AI model.
  2. Compares the score to the Tier 1 threshold for each matched category.
  3. If the score exceeds the Tier 1 threshold → auto-action (Tier 1).
  4. If the score exceeds the Tier 2 threshold but not Tier 1 → queue for community review.
  5. Once raters reach consensus, the server delivers the decision via webhook.
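The routing logic in steps 2–4 can be sketched as follows. This is an illustrative model of the server-side decision, not platform code; the function name and return values are assumptions:

```python
def route_decision(score, tier1_threshold, tier2_threshold):
    """Route a model confidence score per the scoring flow above.

    Either threshold may be None (e.g. a Tier 2-only category has no
    Tier 1 threshold). When both are set, tier2 <= tier1 is expected.
    """
    if tier1_threshold is not None and score > tier1_threshold:
        return "auto_action"       # Tier 1: act immediately, no raters
    if tier2_threshold is not None and score > tier2_threshold:
        return "community_review"  # Tier 2: queue for rater consensus
    return "no_action"             # below both thresholds

# Thresholds from the example configuration below (spam: 0.70 / 0.95)
print(route_decision(0.97, 0.95, 0.70))  # auto_action
print(route_decision(0.80, 0.95, 0.70))  # community_review
print(route_decision(0.50, 0.95, 0.70))  # no_action
```

Note that a score above the Tier 1 threshold never reaches raters: the two ranges are checked in order, so Tier 2 only sees the band between the two thresholds.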

Configuring tiers

Tier configuration is currently managed through the Open Notes team for new communities. Self-serve tier configuration will be available in a future platform release. Contact the team via the Discourse community to update your thresholds.
When requesting configuration, specify:
  • Category names — e.g., spam, harassment, misinformation
  • Tier 1 threshold (0.0–1.0) — the auto-action confidence floor
  • Tier 2 threshold (0.0–1.0) — the community-review confidence floor (must be ≤ Tier 1 threshold)
  • Action type — hide, delete, flag, or notify (integration-defined)

Example configuration

Category         Tier 2 threshold   Tier 1 threshold   Tier 1 action
spam             0.70               0.95               delete
harassment       0.65               0.90               hide
misinformation   0.60               (Tier 2 only)      (none)
In the example above, misinformation never auto-acts — all flagged content goes to community review.
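The example configuration could be captured as data and checked against the threshold constraint stated earlier (Tier 2 ≤ Tier 1). The dictionary shape below is purely illustrative, since tier configuration is not yet self-serve:

```python
# Illustrative shape for a tier configuration request -- not a platform API.
# A None tier1_threshold means the category is Tier 2 only (never auto-acts).
TIER_CONFIG = {
    "spam":           {"tier2_threshold": 0.70, "tier1_threshold": 0.95, "tier1_action": "delete"},
    "harassment":     {"tier2_threshold": 0.65, "tier1_threshold": 0.90, "tier1_action": "hide"},
    "misinformation": {"tier2_threshold": 0.60, "tier1_threshold": None, "tier1_action": None},
}

def validate(config):
    """Check each category: thresholds in 0.0-1.0, and tier2 <= tier1 when set."""
    for category, c in config.items():
        t1, t2 = c["tier1_threshold"], c["tier2_threshold"]
        if not 0.0 <= t2 <= 1.0:
            raise ValueError(f"{category}: tier2_threshold out of range")
        if t1 is not None and not t2 <= t1 <= 1.0:
            raise ValueError(f"{category}: thresholds must satisfy tier2 <= tier1 <= 1.0")

validate(TIER_CONFIG)
```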

Integration responsibilities

Your integration must handle both outcomes:
  • Tier 1 auto-action — the server calls your registered webhook with a moderation.decision event immediately. Your integration should apply the action without waiting for community consensus.
  • Tier 2 community review — the server queues the content. Your integration can optionally mark the content as “under review” in the UI. When consensus is reached, the server sends the same moderation.decision webhook event.
curl -X POST https://api.opennotes.ai/api/public/v1/moderation-actions \
  -H "Authorization: Bearer <api_key>" \
  -H "X-Adapter-Platform: discourse" \
  -H "X-Adapter-User-Id: 1" \
  -H "X-Adapter-Username: system" \
  -H "X-Adapter-Trust-Level: 4" \
  -H "X-Adapter-Admin: true" \
  -H "X-Adapter-Scope: <community_server_id>" \
  -H "Content-Type: application/json" \
  -d '{
    "action_type": "hide",
    "request_id": "<request_id>",
    "community_server_id": "<community_server_id>",
    "reason": "Tier 1 auto-action: spam score 0.97"
  }'
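Because both tiers deliver the same moderation.decision event, a single webhook handler covers both outcomes. The sketch below assumes a payload whose fields mirror the moderation-actions request above (action_type, request_id); check the actual webhook schema against the API reference, as the handler and field names here are assumptions:

```python
def handle_webhook(event):
    """Dispatch a moderation.decision webhook event.

    The same event type arrives for Tier 1 auto-actions and Tier 2
    community-review decisions, so one code path handles both.
    Payload field names are illustrative, not the documented schema.
    """
    if event.get("type") != "moderation.decision":
        return "ignored"
    action = event["data"]["action_type"]      # e.g. "hide", "delete", "flag", "notify"
    request_id = event["data"]["request_id"]
    # Apply the action in the host platform (placeholder for real integration code).
    print(f"applying {action} for request {request_id}")
    return action

handle_webhook({
    "type": "moderation.decision",
    "data": {"action_type": "hide", "request_id": "req_123"},
})
```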

Discourse reference

The Discourse plugin reads the moderation decision from the webhook payload and calls the Discourse Silence/Delete API accordingly. See the Discourse plugin architecture for how the plugin maps Open Notes decisions to Discourse actions.

See also