🇬🇧 English — Full Documentation v2.6.2

⚡ Feature Overview

🔍 Automatic URL Filtering
The bot reads all text messages in monitored rooms. If a message contains a URL or domain, the bot immediately checks it against the blacklist and whitelist.

🔀 Three-Way Routing
Each recognized domain falls into one of three categories: Whitelist (allowed), Blacklist (blocked), or Unknown (for review). Specificity determines priority: Exact matches > Wildcard matches > Apex matches.

⚙️ Moderation Workflow
Unknown domains are automatically routed to a configured moderation room. Moderators decide via emoji reaction (✅ / ❌) or text command.

💾 Database-Backed Persistence
All runtime moderation decisions and open moderation requests are stored in Maubot's native database, so the bot keeps its full state across restarts.

🔐 GDPR / Privacy by Design
Matrix user IDs are stored exclusively as SHA-256(secret_salt:user_id) hashes. A background retention loop automatically deletes violation data after 24 hours.

🎯 Command Room Restriction
The command_rooms key restricts the rooms in which the bot responds to commands. The moderation room and DMs are always allowed.

🌐 Wildcard Support
Entries like *.evil-site.com block all subdomains at once. Wildcards work the same way in the blacklist and the whitelist.

🏔️ Apex Domain Matching
If evil.com is listed directly in the blacklist, subdomains like sub.evil.com are blocked automatically, without a separate wildcard entry.

🔗 URL Shortener Resolution
Known URL shorteners (bit.ly, t.co, tinyurl.com, etc.) are resolved via HEAD request. The final target domain is checked instead of the shortener host.

🛡️ Spam Protection & Auto-Mute
A configurable warning cooldown prevents notification flooding. Users can optionally be muted automatically (power level -1) once they reach a violation threshold.

🖼️ Link Previews
For shared URLs, the bot can optionally fetch Open Graph metadata and post a preview (title + description) in the room.

🔄 Event ID Deduplication
An internal LRU cache (1,000 entries) prevents double processing during Matrix sync replays after bot restarts.
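
The event ID deduplication can be pictured as a small least-recently-used cache. The following is a minimal sketch, not the plugin's actual code: the class name EventDeduplicator and its method are hypothetical, and only the 1,000-entry limit comes from this documentation.

```python
from collections import OrderedDict


class EventDeduplicator:
    """Remember recently seen event IDs, evicting the least recently used."""

    def __init__(self, max_entries: int = 1000) -> None:
        self.max_entries = max_entries
        self._seen: OrderedDict[str, None] = OrderedDict()

    def is_duplicate(self, event_id: str) -> bool:
        """Return True if event_id was seen before; otherwise record it."""
        if event_id in self._seen:
            # Refresh recency so replayed events stay in the cache.
            self._seen.move_to_end(event_id)
            return True
        self._seen[event_id] = None
        if len(self._seen) > self.max_entries:
            # Drop the least recently used entry.
            self._seen.popitem(last=False)
        return False
```

A message handler would call is_duplicate(event_id) first and return early when it yields True, so a replayed sync batch does not trigger a second warning or moderation alert.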

âŒ¨ī¸ Available Commands

Public Commands (for all users)

Command | Description
!urlstatus <domain> | Shows whether a domain is whitelisted, blacklisted, or unknown, including wildcard and apex matches. Also accepts full URLs.
!stats | Outputs the number of loaded domains, wildcards, and open reviews.
!hilfe | Shows the full command overview, but only via direct message (DM). In group rooms, the bot stays completely silent.
!status | Shows current bot status, including database connection, uptime, and version.

Moderation Commands (require permission)

ℹ️
Permissions

A user is considered a moderator if their power level in the moderation room is at least mod_permissions.min_power_level (default: 50) or if their user ID is listed in mod_permissions.allowed_users.
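
The permission rule is a simple either/or check. As a minimal sketch (the function name is_moderator and its signature are hypothetical; the plugin may structure this differently):

```python
def is_moderator(
    user_id: str,
    power_level: int,
    min_power_level: int = 50,
    allowed_users: frozenset[str] = frozenset(),
) -> bool:
    """Return True if the user may perform moderation actions.

    power_level is the user's power level in the moderation room.
    """
    # Either condition is sufficient on its own.
    return power_level >= min_power_level or user_id in allowed_users
```

With the defaults, a user at power level 50 qualifies, while a user at power level 0 qualifies only if their ID appears in the allowed-users list.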

Command | Description
!allow <domain> | Immediately whitelist a domain. Supports wildcards: !allow *.trusted-site.de
!unallow <domain> | Remove a domain or wildcard entry from the whitelist.
!block <domain> | Immediately blacklist a domain. Supports wildcards: !block *.evil-site.com
!unblock <domain> | Remove a domain or wildcard entry from the blacklist.
!reloadlists | Reload all .txt files without restarting the bot.
!pending | Shows all domains currently awaiting a moderation decision.
!sendpending | Resends all open review alerts to the moderation room.
!mute <@user:server> [-t min.] | Manually mute a user (power level -1). The optional -t parameter sets the duration in minutes.
!unmute <@user:server> | Manually unmute a user.
!ignore <domain> | Add a domain to the preview ignore list.
!unignore <domain> | Remove a domain from the preview ignore list.
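
To illustrate the !mute syntax, here is a rough sketch of how such a command could be parsed. The regular expression and the names MuteRequest and parse_mute_command are assumptions made for this example, not the plugin's real parser.

```python
import re
from typing import NamedTuple, Optional

# "!mute @user:server" with an optional "-t <minutes>" suffix.
MUTE_PATTERN = re.compile(
    r"^!mute\s+(?P<user>@[^\s:]+:\S+)(?:\s+-t\s+(?P<minutes>\d+))?\s*$"
)


class MuteRequest(NamedTuple):
    user_id: str
    minutes: Optional[int]  # None means no explicit duration was given


def parse_mute_command(text: str) -> Optional[MuteRequest]:
    """Parse a !mute command; return None if the text is not well formed."""
    match = MUTE_PATTERN.match(text.strip())
    if match is None:
        return None
    minutes = match.group("minutes")
    return MuteRequest(match.group("user"), int(minutes) if minutes else None)
```

For example, parse_mute_command("!mute @spammer:example.org -t 30") yields the user ID together with a duration of 30 minutes, while omitting -t leaves the duration as None.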

Emoji Reactions in Moderation Room

When an unknown domain is detected, the bot posts an alert message in the moderation room with two reaction buttons:

  • ✅ — Add domain to whitelist and approve the original message.
  • ❌ — Add domain to blacklist.
⚠️
Note

Reactions from users without sufficient permission are silently ignored.

🔎 How the Bot Recognizes URLs

The bot uses four stages applied sequentially to each message:

0ī¸âƒŖ
Stage 0 — Matrix ID Cleanup Matrix user IDs (@user:homeserver) and room IDs (!room:homeserver) are removed from the message text before URL detection kicks in. Prevents false positives.
1ī¸âƒŖ
Stage 1 — HTML Links Matrix clients send formatted messages with <a href="...">. These links are evaluated first as the most reliable source.
2ī¸âƒŖ
Stage 2 — Classic URLs Explicit URLs with http://, https://, or www. prefix are recognized via regex.
3ī¸âƒŖ
Stage 3 — Bare Domains Domains without protocol like example.com are recognized if their TLD belongs to a known set of ~230 valid top-level domains. Prevents false positives.
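
The four stages can be sketched roughly as follows. This is an illustrative approximation, not the plugin's implementation: the regular expressions, the function names, and the tiny KNOWN_TLDS set (standing in for the real list of about 230 TLDs) are all assumptions.

```python
import re
from urllib.parse import urlsplit

KNOWN_TLDS = {"com", "org", "net", "de", "io"}  # illustrative subset only

MATRIX_ID_RE = re.compile(r"[@!][^\s:]+:[A-Za-z0-9.-]+(?::\d+)?")
HREF_RE = re.compile(r'<a\s[^>]*href="([^"]+)"', re.IGNORECASE)
URL_RE = re.compile(r"(?:https?://|www\.)[^\s<>\"']+", re.IGNORECASE)
BARE_RE = re.compile(r"\b((?:[a-z0-9-]+\.)+([a-z]{2,}))\b", re.IGNORECASE)


def _host(url: str) -> str:
    """Return the lowercase hostname of a URL, or '' if it cannot be parsed."""
    url = url.rstrip(".,;:!?)")
    if "://" not in url:
        url = "http://" + url
    try:
        return (urlsplit(url).hostname or "").lower()
    except ValueError:
        return ""


def extract_domains(body: str, formatted_body: str = "") -> set[str]:
    domains: set[str] = set()
    # Stage 0: drop Matrix user and room IDs so their server part is ignored.
    text = MATRIX_ID_RE.sub(" ", body)
    # Stage 1: links from the HTML-formatted message.
    for href in HREF_RE.findall(formatted_body):
        domains.add(_host(href))
    # Stage 2: explicit URLs in the plain text.
    for url in URL_RE.findall(text):
        domains.add(_host(url))
    text = URL_RE.sub(" ", text)
    # Stage 3: bare domains, accepted only if the TLD is known.
    for domain, tld in BARE_RE.findall(text):
        if tld.lower() in KNOWN_TLDS:
            domains.add(domain.lower())
    domains.discard("")
    return domains
```

In this sketch a file name such as notes.txt is skipped in Stage 3 because txt is not in the TLD set, which is the kind of false positive the TLD check is meant to avoid.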
🔐
Unicode Normalization

All extracted domains are normalized before list comparison: full-width characters (like google.com) and Unicode domains are converted to their Punycode form via IDNA.
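
A minimal sketch of such a normalization step, using only the Python standard library. The function name normalize_domain is hypothetical, and Python's built-in idna codec implements IDNA 2003, which may differ in edge cases from whatever the plugin actually uses.

```python
import unicodedata


def normalize_domain(domain: str) -> str:
    """Fold compatibility characters and return the ASCII (Punycode) form."""
    # NFKC maps full-width letters to their plain ASCII equivalents.
    folded = unicodedata.normalize("NFKC", domain).strip().rstrip(".").lower()
    # The idna codec converts non-ASCII labels to "xn--" Punycode labels.
    # It raises UnicodeError for invalid names (for example empty labels).
    return folded.encode("idna").decode("ascii")
```

With this sketch, a full-width spelling of google.com and the plain ASCII spelling normalize to the same string, so both hit the same list entry.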

🎭 Wildcard Entries

With the *. prefix, entire domain families can be captured at once:

!block *.evil-site.com     → blocks sub.evil-site.com, api.evil-site.com, ...
!allow *.trusted-site.de → whitelists all subdomains
Checked Domain | Entry | Match?
sub.banned.com | *.banned.com | ✅ Yes (wildcard)
api.banned.com | *.banned.com | ✅ Yes (wildcard)
banned.com | *.banned.com | ❌ No (a wildcard only covers subdomains)
sub.banned.com | banned.com (exact) | ✅ Yes (apex match)
a.b.banned.com | banned.com (exact) | ✅ Yes (apex match)
💡
Tip

A wildcard entry alone does not cover the main domain. To block banned.com itself, add it as a separate entry; through apex matching, that entry also covers its subdomains.
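
The matching rules from the table, combined with the priority order Exact > Wildcard > Apex, can be sketched as below. The function names are hypothetical, and the tie-break (the blacklist wins when both lists match with equal specificity) is an assumption of this example, since the documentation does not specify it.

```python
def match_specificity(domain: str, entries: set[str]) -> int:
    """Return 3 for an exact match, 2 for wildcard, 1 for apex, 0 for none."""
    if domain in entries:
        return 3
    labels = domain.split(".")
    # Parent domains of the checked domain, excluding the bare TLD.
    parents = [".".join(labels[i:]) for i in range(1, len(labels) - 1)]
    if any(f"*.{parent}" in entries for parent in parents):
        return 2
    if any(parent in entries for parent in parents):
        return 1
    return 0


def classify(domain: str, whitelist: set[str], blacklist: set[str]) -> str:
    """Route a domain to 'whitelist', 'blacklist', or 'unknown'."""
    allow = match_specificity(domain, whitelist)
    block = match_specificity(domain, blacklist)
    if allow == 0 and block == 0:
        return "unknown"
    # Assumption: on equal specificity the blacklist takes precedence.
    return "whitelist" if allow > block else "blacklist"
```

Under these assumptions, an exact whitelist entry for docs.evil.com overrides a blacklist entry for evil.com, because an exact match is more specific than an apex match.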

âš™ī¸ Configuration Options

These options are defined in base-config.yaml and can be overridden per instance in the Maubot dashboard.

Basic Configuration

Parameter | Default | Description
blacklist_dir | /data/blacklists/ | Directory containing blacklist .txt files.
whitelist_dir | /data/whitelists/ | Directory containing whitelist .txt files.
mod_room_id | (Required) | Matrix room ID of the moderation room (format: !xxx:homeserver).
mod_permissions.min_power_level | 50 | Minimum power level for moderation actions. 100 = admin only, 0 = everyone.
mod_permissions.allowed_users | [] | YAML array of user IDs that always have moderation permission, regardless of power level.
command_rooms | [] | List of room IDs where the bot responds to commands.
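
How command_rooms interacts with the moderation room and DMs can be sketched as a small predicate. The function name is hypothetical, and treating an empty list as "no restriction" is an assumption of this example; check the plugin's behavior before relying on it.

```python
def commands_allowed(
    room_id: str,
    *,
    command_rooms: list[str],
    mod_room_id: str,
    is_dm: bool,
) -> bool:
    """Decide whether the bot should react to a command in this room."""
    # The moderation room and direct messages are always allowed.
    if is_dm or room_id == mod_room_id:
        return True
    # Assumption: an empty list means commands are allowed everywhere.
    if not command_rooms:
        return True
    return room_id in command_rooms
```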

Privacy / GDPR

Parameter | Default | Description
secret_salt | (Required) | Random secret key for SHA-256 user hashing. Generate with: python3 -c "import secrets; print(secrets.token_hex(32))"
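
The documented hash format SHA-256(secret_salt:user_id) can be reproduced with the standard library. In this minimal sketch the function name hash_user_id, the UTF-8 encoding, and the hexadecimal output are assumptions; only the salt:user_id layout comes from this documentation.

```python
import hashlib


def hash_user_id(user_id: str, secret_salt: str) -> str:
    """Return the salted SHA-256 digest that stands in for a Matrix user ID."""
    payload = f"{secret_salt}:{user_id}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Because the salt is part of the input, changing secret_salt later makes previously stored hashes unmatchable, so it should be set once and kept secret.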

Link Previews

Parameter | Default | Description
enable_link_previews | true | Enable link previews for whitelisted URLs.
link_preview_timeout | 5 | HTTP timeout in seconds for preview fetches.
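
Once a page has been fetched, the preview itself comes from its Open Graph meta tags. The sketch below shows one way to read og:title and og:description with the standard library; the class and function names are hypothetical, and the HTTP fetch (with its timeout) is deliberately left out.

```python
from html.parser import HTMLParser
from typing import Optional


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> values."""

    def __init__(self) -> None:
        super().__init__()
        self.properties: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content:
            # Keep the first occurrence of each property.
            self.properties.setdefault(prop, content)


def extract_preview(html: str) -> tuple[Optional[str], Optional[str]]:
    """Return (title, description) from Open Graph tags, or None for each."""
    parser = OpenGraphParser()
    parser.feed(html)
    return (
        parser.properties.get("og:title"),
        parser.properties.get("og:description"),
    )
```

A caller would post a preview only when at least the title is present and the domain is not on the preview ignore list.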

Spam Protection & Auto-Mute

Parameter | Default | Description
warn_cooldown | 60 | Cooldown between warning messages, in seconds.
mute_enabled | false | Enable automatic muting.
mute_threshold | 5 | Number of violations within the observation window that triggers an automatic mute.
mute_window_minutes | 5 | Observation window in minutes.
mute_duration_minutes | 60 | Duration of an automatic mute in minutes (0 = unlimited).
mute_commands_enabled | true | Enable the manual !mute and !unmute commands.
global_mute | true | Apply mutes across all rooms the bot is in.
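
The interplay of mute_threshold and mute_window_minutes amounts to a sliding-window counter. The following is a minimal in-memory sketch; the class name ViolationTracker is hypothetical, and the plugin itself stores violation data in the database rather than in a Python dict.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class ViolationTracker:
    """Count violations per hashed user inside a sliding time window."""

    def __init__(self, threshold: int = 5, window_minutes: int = 5) -> None:
        self.threshold = threshold
        self.window_seconds = window_minutes * 60
        self._events: defaultdict[str, deque[float]] = defaultdict(deque)

    def record(self, user_hash: str, now: Optional[float] = None) -> bool:
        """Record one violation; return True once the threshold is reached."""
        now = time.monotonic() if now is None else now
        events = self._events[user_hash]
        events.append(now)
        # Forget violations that fell out of the observation window.
        while events and now - events[0] > self.window_seconds:
            events.popleft()
        return len(events) >= self.threshold
```

With the defaults, the fifth violation inside five minutes returns True, at which point the bot would apply a mute if mute_enabled is set.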

🚀 Docker Installation

Prerequisites

  • Maubot running in Docker (standard image: dock.mau.dev/maubot/maubot)
  • Maubot runs as UID/GID 1337 in the container by default

Step 1 — Package Plugin

cd /path/to/plugin
zip -r url_filter.mbp \
    maubot.yaml base-config.yaml main.py \
    blacklists/custom.txt blacklists/ignore.txt whitelists/custom.txt

Step 2 — Upload Plugin

  1. Open Maubot dashboard: https://your-server/_matrix/maubot/#/plugins
  2. Click "Upload plugin" and upload the .mbp file
  3. Create new instance, assign bot client, and save

Step 3 — Create Directory Structure

mkdir -p ./data/blacklists ./data/whitelists
touch ./data/blacklists/custom.txt
touch ./data/whitelists/custom.txt

Step 4 — Place Blacklist Files

Place hostfile-formatted .txt files in ./data/blacklists/.

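Suitable sources are hostfile-format lists for categories such as malware, phishing, and scam domains. Example layout: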

./data/blacklists/
├── malware.txt
├── phishing.txt
├── scam.txt
└── custom.txt      ← managed by bot
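
For orientation, here is a sketch of how hostfile-formatted lines could be turned into a set of domains. It assumes that "hostfile-formatted" means the common hosts syntax (an IP address followed by one or more hostnames, with # comments) and that plain domain or wildcard lines are also accepted; the function names are hypothetical.

```python
import ipaddress
from typing import Iterable

# Entries that commonly appear in hosts files but are not real targets.
SKIPPED_NAMES = {"localhost", "localhost.localdomain", "broadcasthost"}


def _is_ip(token: str) -> bool:
    try:
        ipaddress.ip_address(token)
        return True
    except ValueError:
        return False


def parse_hosts_lines(lines: Iterable[str]) -> set[str]:
    """Extract lowercase domain entries from hosts-style lines."""
    domains: set[str] = set()
    for raw in lines:
        # Strip comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # "0.0.0.0 evil.example" -> drop the leading IP address.
        if _is_ip(fields[0]):
            fields = fields[1:]
        for name in fields:
            name = name.lower()
            if name not in SKIPPED_NAMES:
                domains.add(name)
    return domains
```

Reading a file then reduces to passing the open file object to parse_hosts_lines, since iterating over a text file yields its lines.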

Step 5 — Set Permissions ⚠️

⚠️
Important

Since Maubot in the container runs as UID 1337, the directories must be owned by this user. Without correct permissions, the bot cannot write to custom.txt.

chown -R 1337:1337 ./data/blacklists ./data/whitelists

Step 6 — Configure Instance

In the Maubot dashboard, set at least the following values:

blacklist_dir: /data/blacklists/
whitelist_dir: /data/whitelists/
mod_room_id: "!YOUR_MOD_ROOM_ID:homeserver.example"
secret_salt: "your-random-salt-here"

Step 7 — Invite Bot to Rooms

  1. Invite bot to all rooms to monitor
  2. Set bot to Powerlevel 50 (Moderator) in these rooms
  3. Invite bot to the moderation room and grant write permissions
  4. If using Auto-Mute: set bot to Powerlevel 100 (Admin)

🔧 Troubleshooting

  • Bot doesn't write to custom.txt
    Correct permissions on the host: chown -R 1337:1337 ./data/blacklists ./data/whitelists
  • Bot can't delete messages
    Set bot to powerlevel 50 in the affected room.
  • Auto-Mute doesn't work
    Bot needs higher power level than the target user (recommended: PL 100).
  • Bot doesn't send to moderation room
    Check if the bot was invited and has write permissions.
  • No .txt files found at startup
    Create directory and files (Step 3), set permissions (Step 5).
  • !botstatus reports DB errors
    Ensure database: true and database_type: asyncpg are set in maubot.yaml.
  • Warning "secret_salt is not set"
    Set the secret_salt in the instance configuration to a secure random value.

📦 Packaging & Deployment

Repackage after source file changes:

zip -r url_filter.mbp \
    maubot.yaml base-config.yaml main.py \
    blacklists/custom.txt blacklists/ignore.txt whitelists/custom.txt

Upload the updated .mbp in the Maubot dashboard and restart the instance.

💡
Manually Update Lists

Place new .txt files in ./data/blacklists/, check permissions, then enter !reloadlists in the moderation room. No restart is needed.

📋 Requirements

Component | Minimum Version
Maubot | >= 0.4.0
mautrix-python | >= 0.20.0
Python | >= 3.10

Keywords

Matrix Maubot URL filtering Blacklist Whitelist Phishing Anti-Spam Moderation Link-Filter Security GDPR Auto-Mute URL-Shortener Wildcard Apex-Domain