Morpheus Link Bot — Advanced URL Filtering for Matrix

⚡ Feature Overview

🔍

Automatic URL Filtering

The bot monitors all text messages in monitored rooms. If a message contains a URL or domain, the bot immediately checks it against the blacklist and the whitelist.

🔀

Three-Way Routing

Each recognized domain falls into one of three categories: Whitelist (allowed), Blacklist (blocked), or Unknown (for review). Specificity determines priority: exact matches > wildcard matches > apex matches.

⚙️

Moderation Workflow

Unknown domains are automatically routed to a configured moderation room. Moderators decide via emoji reaction (✅ / ❌) or text command.

💾

Database-Backed Persistence

All runtime moderation decisions and open moderation requests are stored in Maubot's native database, so this state survives bot restarts.

🔐

GDPR / Privacy by Design

Matrix user IDs are stored exclusively as SHA-256(secret_salt:user_id) hashes. A background retention loop automatically deletes violation data after 24 hours.

🎯

Command Room Restriction

The command_rooms key restricts the rooms in which the bot responds to commands. The moderation room and DMs are always allowed.

🌐

Wildcard Support

Entries like *.evil-site.com block all subdomains at once. Wildcards work the same way in the blacklist and the whitelist.

🏔️

Apex Domain Matching

If evil.com itself is on the blacklist, subdomains like sub.evil.com are blocked automatically, without a separate wildcard entry.

🔗

URL Shortener Resolution

Known URL shorteners (bit.ly, t.co, tinyurl.com, etc.) are resolved via HEAD request. The final target domain is checked instead of the shortener host.

🛡️

Spam Protection & Auto-Mute

A configurable warning cooldown prevents notification flooding. Optionally, users are muted automatically (power level -1) once they reach a violation threshold.

🖼️

Link Previews

For shared URLs, the bot can optionally fetch Open Graph metadata and post a preview (title + description) in the room.

🔄

Event ID Deduplication

An internal LRU cache (1,000 entries) prevents double processing during Matrix sync replays after bot restarts.
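The event ID deduplication described above can be pictured as a small bounded cache. The following is a minimal sketch, not the plugin's actual code: the class name `SeenEvents` and the method `check_and_add` are hypothetical, and only the size of 1,000 entries comes from the feature description.

```python
from collections import OrderedDict


class SeenEvents:
    """Bounded set of recently seen event IDs with LRU eviction (illustrative)."""

    def __init__(self, max_size: int = 1000) -> None:
        self.max_size = max_size
        self._seen: OrderedDict[str, None] = OrderedDict()

    def check_and_add(self, event_id: str) -> bool:
        """Return True if event_id was seen before; otherwise record it."""
        if event_id in self._seen:
            # Refresh recency so replayed IDs stay in the cache.
            self._seen.move_to_end(event_id)
            return True
        self._seen[event_id] = None
        if len(self._seen) > self.max_size:
            # Evict the least recently used entry.
            self._seen.popitem(last=False)
        return False
```

A message handler would call `check_and_add(event_id)` first and return early when it yields `True`.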

⌨️ Available Commands

Public Commands (for all users)

Command	Description
`!urlstatus <domain>`	Shows whether a domain is whitelisted, blacklisted, or unknown — including wildcard and apex matches. Also accepts full URLs.
`!stats`	Outputs the number of loaded domains, wildcards, and open reviews.
`!hilfe`	Shows the full command overview, but only via direct message (DM). In group rooms, the bot does not respond to this command.
`!status`	Shows current bot status including database connection, uptime, and version.

Moderation Commands (require permission)

ℹ️

Permissions

A user is considered a moderator if their power level in the moderation room is at least min_power_level (default: 50) or their user ID is listed in allowed_users.
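The permission rule reduces to a single boolean expression. A minimal sketch, assuming the caller has already looked up the user's power level in the moderation room; the function name `is_moderator` and its signature are hypothetical:

```python
from collections.abc import Collection


def is_moderator(
    user_id: str,
    power_level: int,
    min_power_level: int = 50,
    allowed_users: Collection[str] = (),
) -> bool:
    """True if the power level is high enough or the user is explicitly listed."""
    return power_level >= min_power_level or user_id in allowed_users
```

For example, a user with power level 0 still counts as a moderator when their ID appears in `allowed_users`.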

Command	Description
`!allow <domain>`	Immediately whitelist a domain. Supports wildcards: `!allow *.trusted-site.de`
`!unallow <domain>`	Remove a domain or wildcard entry from the whitelist.
`!block <domain>`	Immediately blacklist a domain. Supports wildcards: `!block *.evil-site.com`
`!unblock <domain>`	Remove a domain or wildcard entry from the blacklist.
`!reloadlists`	Reload all .txt files — no restart needed.
`!pending`	Shows all domains currently awaiting a moderation decision.
`!sendpending`	Resends all open review alerts to the mod room.
`!mute <@user:server> [-t min.]`	Manually mute a user (powerlevel -1). Optional -t parameter sets duration in minutes.
`!unmute <@user:server>`	Manually unmute a user.
`!ignore <domain>`	Add a domain to the preview ignore list.
`!unignore <domain>`	Remove a domain from the preview ignore list.

Emoji Reactions in Moderation Room

When an unknown domain is detected, the bot posts an alert message in the moderation room with two reaction buttons:

✅ — Add domain to whitelist and approve the original message.
❌ — Add domain to blacklist.

⚠️

Note

Reactions from users without sufficient permission are silently ignored.

🔎 How the Bot Recognizes URLs

The bot uses four stages applied sequentially to each message:

0️⃣

Stage 0: Matrix ID Cleanup

Matrix user IDs (@user:homeserver) and room IDs (!room:homeserver) are removed from the message text before URL detection starts. This prevents the homeserver part of an ID from being detected as a domain.

1️⃣

Stage 1: HTML Links

Matrix clients send links in formatted messages as <a href="..."> elements. These links are evaluated first because they are the most reliable source.

2️⃣

Stage 2: Classic URLs

Explicit URLs with an http://, https://, or www. prefix are recognized via regex.

3️⃣

Stage 3: Bare Domains

Domains without a protocol, like example.com, are recognized only if their TLD belongs to a known set of ~230 valid top-level domains. This restriction prevents false positives.

🔐

Unicode Normalization

All extracted domains are normalized before list comparison — full-width characters (like ｇｏｏｇｌｅ.com) and Unicode domains are converted to their Punycode form via IDNA.
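A minimal sketch of such a normalization step, using only the Python standard library. The function name `normalize_domain` is hypothetical. Python's built-in `idna` codec implements IDNA 2003; whether the bot uses this codec or a third-party IDNA 2008 library is not stated here, so treat the choice as an assumption.

```python
import unicodedata


def normalize_domain(domain: str) -> str:
    """Fold compatibility characters, lowercase, and convert to Punycode."""
    # NFKC maps full-width letters such as "ｇ" to their ASCII equivalents.
    folded = unicodedata.normalize("NFKC", domain).lower().rstrip(".")
    # The built-in codec encodes each non-ASCII label as "xn--...".
    return folded.encode("idna").decode("ascii")
```

With this, both `ｇｏｏｇｌｅ.com` and `google.com` compare equal against a list entry for `google.com`.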

🎭 Wildcard Entries

With the *. prefix, entire domain families can be captured at once:

!block *.evil-site.com     → blocks sub.evil-site.com, api.evil-site.com, ...
!allow *.trusted-site.de → whitelists all subdomains

Checked Domain	Entry	Match?
`sub.banned.com`	`*.banned.com`	✅ Yes — Wildcard
`api.banned.com`	`*.banned.com`	✅ Yes — Wildcard
`banned.com`	`*.banned.com`	❌ No — Wildcard only covers subdomains
`sub.banned.com`	`banned.com` (exact)	✅ Yes — Apex Match
`a.b.banned.com`	`banned.com` (exact)	✅ Yes — Apex Match

💡

Tip

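A wildcard entry does not cover the main domain itself. To block banned.com as well, add banned.com as a separate entry next to *.banned.com. Because of apex matching, the exact banned.com entry also covers all subdomains.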

⚙️ Configuration Options

These options are defined in base-config.yaml and can be overridden per instance in the Maubot dashboard.

Basic Configuration

Parameter	Default	Description
`blacklist_dir`	`/data/blacklists/`	Directory containing blacklist .txt files.
`whitelist_dir`	`/data/whitelists/`	Directory containing whitelist .txt files.
`mod_room_id`	(Required)	Matrix room ID of the moderation room (format: `!xxx:homeserver`).
`mod_permissions.min_power_level`	`50`	Minimum power level for moderation actions. 100 = admin only, 0 = everyone.
`mod_permissions.allowed_users`	`[]`	YAML array of user IDs with fixed moderation permission.
`command_rooms`	`[]`	List of room IDs where the bot responds to commands.

Privacy / GDPR

Parameter	Default	Description
`secret_salt`	(Required)	Random secret key for SHA-256 user hashing. Generate with: `python3 -c "import secrets; print(secrets.token_hex(32))"`
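The hashing scheme SHA-256(secret_salt:user_id) can be reproduced with the standard library. A minimal sketch; the function name `hash_user_id`, the UTF-8 encoding, and the hex output format are assumptions:

```python
import hashlib


def hash_user_id(secret_salt: str, user_id: str) -> str:
    """Return the hex SHA-256 digest of 'secret_salt:user_id'."""
    payload = f"{secret_salt}:{user_id}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Changing `secret_salt` changes every hash, so previously stored records can no longer be associated with a user.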

Link Previews

Parameter	Default	Description
`enable_link_previews`	`true`	Enable link previews for whitelisted URLs.
`link_preview_timeout`	`5`	HTTP timeout in seconds for preview fetches.

Spam Protection & Auto-Mute

Parameter	Default	Description
`warn_cooldown`	`60`	Warning cooldown in seconds.
`mute_enabled`	`false`	Enable automatic muting.
`mute_threshold`	`5`	Number of violations within the observation window that triggers an automatic mute.
`mute_window_minutes`	`5`	Observation window in minutes.
`mute_duration_minutes`	`60`	Duration of automatic mute (0 = unlimited).
`mute_commands_enabled`	`true`	Enable manual !mute and !unmute commands.
`global_mute`	`true`	Apply a mute in all rooms the bot is in, not only in the room where the violation occurred.

🚀 Docker Installation

Prerequisites

Maubot running in Docker (standard image: dock.mau.dev/maubot/maubot)
Maubot runs as UID/GID 1337 in the container by default

Step 1 — Package Plugin

cd /path/to/plugin
zip -r url_filter.mbp \
    maubot.yaml base-config.yaml main.py \
    blacklists/custom.txt blacklists/ignore.txt whitelists/custom.txt

Step 2 — Upload Plugin

Open Maubot dashboard: https://your-server/_matrix/maubot/#/plugins
Click "Upload plugin" and upload the .mbp file
Create new instance, assign bot client, and save

Step 3 — Create Directory Structure

mkdir -p ./data/blacklists ./data/whitelists
touch ./data/blacklists/custom.txt
touch ./data/whitelists/custom.txt

Step 4 — Place Blacklist Files

Place .txt files in hosts-file format in ./data/blacklists/.

Suitable sources:

oisd.nl — various categories
StevenBlack/hosts — consolidated lists
The Block List Project — sorted by category

./data/blacklists/
├── malware.txt
├── phishing.txt
├── scam.txt
└── custom.txt      ← managed by bot

Step 5 — Set Permissions ⚠️

⚠️

Important

Since Maubot in the container runs as UID 1337, the directories must be owned by this user. Without correct permissions, the bot cannot write to custom.txt.

chown -R 1337:1337 ./data/blacklists ./data/whitelists

Step 6 — Configure Instance

In the Maubot dashboard, set at least the following values:

blacklist_dir: /data/blacklists/
whitelist_dir: /data/whitelists/
mod_room_id: "!YOUR_MOD_ROOM_ID:homeserver.example"
secret_salt: "your-random-salt-here"

Step 7 — Invite Bot to Rooms

Invite the bot to every room it should monitor
Set the bot to power level 50 (Moderator) in these rooms
Invite the bot to the moderation room and grant it write permissions
If using Auto-Mute: set the bot to power level 100 (Admin)

🔧 Troubleshooting

Bot doesn't write to custom.txt

Correct permissions on the host: chown -R 1337:1337 ./data/blacklists ./data/whitelists

Bot can't delete messages

Set bot to powerlevel 50 in the affected room.

Auto-Mute doesn't work

Bot needs higher power level than the target user (recommended: PL 100).

Bot doesn't send to moderation room

Check if the bot was invited and has write permissions.

No .txt files found at startup

Create directory and files (Step 3), set permissions (Step 5).

!status reports DB errors

Ensure database: true and database_type: asyncpg are set in maubot.yaml.

Warning "secret_salt is not set"

Set the secret_salt in the instance configuration to a secure random value.

📦 Packaging & Deployment

Repackage after source file changes:

zip -r url_filter.mbp \
    maubot.yaml base-config.yaml main.py \
    blacklists/custom.txt blacklists/ignore.txt whitelists/custom.txt

Upload the updated .mbp in the Maubot dashboard and restart the instance.

💡

Manually Update Lists

Place new .txt files in ./data/blacklists/, check permissions, then enter !reloadlists in the moderation room — no restart needed.

📋 Requirements

Component	Minimum Version
Maubot	>= 0.4.0
mautrix-python	>= 0.20.0
Python	>= 3.10

Keywords

Matrix, Maubot, URL filtering, Blacklist, Whitelist, Phishing, Anti-Spam, Moderation, Link-Filter, Security, GDPR, Auto-Mute, URL-Shortener, Wildcard, Apex-Domain