This page documents the technical protections built into Guardian Bot following the security audit of March 26, 2026. It is aimed at administrators who want to understand the bot’s internal workings.
Overview
Guardian Bot implements defense-in-depth with multiple independent layers. If one layer is bypassed, the next takes over.
Incoming message
│
▼
┌─────────────────────┐
│ Whitelist check │ ◄── Trusted domains/IPs
└──────┬──────────────┘
│ not whitelisted
▼
┌─────────────────────┐
│ Safe decompression │ ◄── Decompression bomb prevention
└──────┬──────────────┘
│
▼
┌─────────────────────┐
│ Recursive resolution│ ◄── Redirect chain resolution (max 3 hops)
└──────┬──────────────┘
│
▼
┌─────────────────────┐
│ Multi-source scan │ ◄── PhishTank + GSB + VirusTotal
└──────┬──────────────┘
│ threat detected
▼
┌─────────────────────┐
│ Trust Score & Action│ ◄── Score-adapted sanction
└─────────────────────┘
1. Recursive Anti-Phishing
Problem addressed
Attackers use redirect chains to hide malicious URLs. A naïve scanner checks bit.ly/xyz (harmless) and never sees the evil-phishing.com destination it ultimately redirects to.
Implementation
Guardian recursively resolves each URL by following HTTP redirects until the final destination.
```python
# Resolution with cycle detection and depth limit
MAX_HOPS = 3
TIMEOUT_PER_HOP = 5  # seconds

async def resolve_redirects(url: str) -> tuple[str, list[str]]:
    visited = set()
    chain = [url]
    current = url
    for _ in range(MAX_HOPS):
        if current in visited:
            break  # Cycle detected
        visited.add(current)
        response = await http_head(current, timeout=TIMEOUT_PER_HOP,
                                   allow_redirects=False)
        if response.status not in (301, 302, 303, 307, 308):
            break
        next_url = response.headers.get("Location")
        if not next_url:
            break
        current = normalize_url(next_url, base=current)
        chain.append(current)
    return current, chain
```
Opaque domain handling
Some redirect domains do not reveal the destination URL without user interaction (e.g., get-qr.com, captcha gates). Guardian handles this specifically:
- Resolution successful → scan the final URL against phishing databases
- Resolution impossible (opaque domain) → warning generated without automatic blocking to avoid false positives
False positives are the priority to avoid. A warning without blocking on an opaque domain is preferable to unjustifiably blocking a legitimate URL.
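The warn-don't-block branching can be sketched as follows; `classify_resolution`, `Verdict`, and the `OPAQUE_DOMAINS` set are illustrative names, not Guardian's actual API:

```python
# Sketch of the opaque-domain decision; names are illustrative only.
from dataclasses import dataclass

# Domains known to hide their destination behind user interaction.
OPAQUE_DOMAINS = {"get-qr.com"}

@dataclass
class Verdict:
    action: str   # "scan" or "warn"
    reason: str

def classify_resolution(final_url: str, resolved: bool) -> Verdict:
    """Decide what to do once redirect resolution finishes (or fails)."""
    host = final_url.split("/")[2] if "://" in final_url else final_url
    if not resolved or host in OPAQUE_DOMAINS:
        # Opaque endpoint: warn, never auto-block, to avoid false positives.
        return Verdict("warn", f"opaque redirect domain: {host}")
    return Verdict("scan", "final destination reachable")
```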
Cycle detection
A visited set is maintained for each resolution. If a URL appears twice in the chain, resolution stops immediately to prevent infinite loops.
2. Decompression Bomb Prevention
Problem addressed
A decompression bomb is a small compressed file (a few KB) that expands to several gigabytes. If an attacker sends a malicious .zip archive or encoded file, a naïve scanner might attempt to decompress the content into memory and crash the bot.
Implementation
Guardian enforces strict limits when processing attachments:
```python
MAX_FILE_SIZE = 10 * 1024 * 1024          # 10 MB maximum
MAX_DECOMPRESSED_SIZE = 50 * 1024 * 1024  # 50 MB decompression limit
MAX_IMAGE_PIXELS = 4096 * 4096            # 16 MP rendering limit
```
Applied controls:
- Size check before download: the Content-Length header is verified; if the size exceeds MAX_FILE_SIZE, the file is ignored without downloading.
- Pixel limit for QR images: dimensions are checked before an image is passed to the QR decoder; an image of 1 px × 4 billion px would be rejected.
- Decompression timeout: each decompression operation runs in a context with a timeout (asyncio.wait_for); if decompression exceeds the time limit, the operation is cancelled.
- Error isolation: decompression exceptions are caught locally and logged without crashing the main worker.
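The controls above can be combined into a single guard; this is a minimal sketch using `zipfile`, not Guardian's actual attachment pipeline, and `safe_unzip` is an illustrative name. The timeout wrapper shows the asyncio.wait_for pattern mentioned above:

```python
# Sketch of the pre-decompression checks; limits mirror the constants above.
import asyncio
import io
import zipfile

MAX_FILE_SIZE = 10 * 1024 * 1024          # 10 MB download cap
MAX_DECOMPRESSED_SIZE = 50 * 1024 * 1024  # 50 MB expansion cap
DECOMPRESS_TIMEOUT = 5.0                  # seconds

def safe_unzip(data: bytes) -> dict[str, bytes]:
    """Extract a zip archive, refusing bombs before reading any member."""
    if len(data) > MAX_FILE_SIZE:
        raise ValueError("archive exceeds download limit")
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # Declared uncompressed sizes are checked *before* extraction;
        # a hardened version would also cap bytes actually read per member.
        total = sum(info.file_size for info in zf.infolist())
        if total > MAX_DECOMPRESSED_SIZE:
            raise ValueError("declared expansion exceeds limit")
        return {info.filename: zf.read(info) for info in zf.infolist()}

async def safe_unzip_with_timeout(data: bytes) -> dict[str, bytes]:
    # asyncio.wait_for cancels the operation if it runs too long.
    return await asyncio.wait_for(asyncio.to_thread(safe_unzip, data),
                                  timeout=DECOMPRESS_TIMEOUT)
```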
3. API Anti-Spam Protection
Problem addressed
Guardian queries up to 3 external APIs (PhishTank, Google Safe Browsing, VirusTotal) per scanned URL. Without limiting, an attacker could send thousands of messages containing URLs to exhaust API quotas or overload the bot.
Implementation
An asyncio.Semaphore limits the number of simultaneous API requests:
```python
# Limited concurrent scanning
api_semaphore = asyncio.Semaphore(5)  # Max 5 parallel requests

async def scan_url_with_ratelimit(url: str) -> ScanResult:
    async with api_semaphore:
        return await _scan_url_internal(url)
```
Combined mechanisms:
| Mechanism | Implementation | Goal |
|---|---|---|
| Global semaphore | Semaphore(5) | Limit simultaneous API calls |
| Result cache | TTL 1h per URL | Avoid scanning the same URL twice |
| Probabilistic Trust Score | Score × 0.3 | Reduce scans for trusted members |
| Preemptive whitelist | Check before network | Short-circuits all scans |
| Per-request timeout | 5s max per hop | Prevents infinite waits |
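The result-cache row above can be sketched as a small TTL store; `ScanCache` is an illustrative name, and a real deployment might use an LRU or an external store such as Redis:

```python
# Illustrative 1-hour TTL cache for scan results; the injectable clock
# exists only so the expiry logic is testable.
import time

CACHE_TTL = 3600  # seconds

class ScanCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries: dict[str, tuple[float, object]] = {}

    def get(self, url: str):
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at > CACHE_TTL:
            del self._entries[url]  # Expired: drop and rescan.
            return None
        return result

    def put(self, url: str, result) -> None:
        self._entries[url] = (self._clock(), result)
```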
Probabilistic scanning
To avoid scanning every URL sent by a trusted member, Guardian applies probabilistic sampling based on Trust Score:
Scan probability = max(0.1, 1.0 - (trust_score / 100) * 0.7)
A member with score 90 has only a 37% chance of being scanned on each message. A member with score 10 is scanned 93% of the time.
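The sampling rule can be expressed directly; `should_scan` is an illustrative helper, and the standard `random` module suffices here because sampling is a rate-limiting measure, not a security boundary:

```python
# Sketch of Trust-Score-based probabilistic sampling.
import random

def scan_probability(trust_score: int) -> float:
    """Probability of scanning a URL, floored at 10%."""
    return max(0.1, 1.0 - (trust_score / 100) * 0.7)

def should_scan(trust_score: int, rng=random) -> bool:
    """Sample one scan decision; `rng` is injectable for testing."""
    return rng.random() < scan_probability(trust_score)
```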
4. Memory Leak Prevention
Problem addressed
Long-running sessions (captcha, anti-spam) accumulate data in memory if not cleaned up. An attacker can create thousands of unfinished captcha sessions to exhaust the bot’s RAM.
Implementation
Periodic cleanup tasks run in the background for each affected cog:
```python
# Example in captcha.py
@tasks.loop(minutes=10)
async def cleanup_expired_sessions(self):
    now = asyncio.get_event_loop().time()
    expired = [
        session_id
        for session_id, session in self.active_sessions.items()
        if now - session.created_at > SESSION_TIMEOUT
    ]
    for session_id in expired:
        del self.active_sessions[session_id]
```
Cogs with automatic cleanup:
| Cog | Data cleaned | Interval |
|---|---|---|
| captcha.py | Expired captcha sessions | 10 minutes |
| automod.py | Anti-spam message history | 5 minutes |
| report.py | Expired report cooldowns | 30 minutes |
| antiraid.py | Raid detection windows | 1 minute |
5. Database Pool Isolation
Problem addressed
Direct database access from multiple cogs simultaneously can create race conditions and hanging connections on error.
Implementation
All queries go through a centralized pool with context management:
```python
# utils/database.py
class Database:
    def __init__(self, pool):
        self._pool = pool  # Access restricted via property

    @property
    def pool(self):
        return self._pool

    async def fetch_one(self, query: str, *args):
        async with self._pool.acquire() as conn:
            return await conn.fetchrow(query, *args)
```
Independent queries are parallelized with asyncio.gather to reduce latency:
```python
# Parallel loading instead of sequential
trust_score, warnings, infractions = await asyncio.gather(
    db.fetch_trust_score(guild_id, user_id),
    db.fetch_warnings(guild_id, user_id),
    db.fetch_infractions(guild_id, user_id),
)
```
6. Captcha Verification
Algorithm
The math captcha uses secrets.choice (CSPRNG) instead of random to prevent answer prediction:
```python
import secrets

def generate_captcha() -> tuple[str, int]:
    a = secrets.choice(range(1, 20))
    b = secrets.choice(range(1, 20))
    op = secrets.choice(['+', '-', '×'])
    # Map each operator to its result
    answers = {'+': a + b, '-': a - b, '×': a * b}
    question = f"{a} {op} {b} = ?"  # Question format is illustrative
    return question, answers[op]
```
A per-session asyncio.Lock prevents race conditions if the user clicks multiple times simultaneously.
Limits
- 3 attempts maximum per session
- 5-minute timeout per session
- Automatic expiration: unfinished sessions are cleaned up every 10 minutes
- Result: +15 Trust Score (success) or -20 Trust Score + kick (failure)
7. Role Hierarchy and Escalation Prevention
Guardian systematically verifies the role hierarchy before any moderation action:
Check before ban/kick/mute:
1. Is the target a bot? → Refuse
2. Is the target the server owner? → Refuse
3. Target's highest role ≥ bot's highest role? → Refuse
4. Target's highest role ≥ moderator's highest role? → Refuse
5. Action authorized ✓
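The five checks can be sketched over a simplified member model; in a real discord.py cog the comparison would use `member.top_role`, and `Member`/`can_moderate` here are illustrative names:

```python
# Sketch of the hierarchy checks above, in order.
from dataclasses import dataclass

@dataclass
class Member:
    top_role: int        # Higher number = higher role in the hierarchy
    is_bot: bool = False
    is_owner: bool = False

def can_moderate(target: Member, bot: Member, moderator: Member) -> bool:
    if target.is_bot:
        return False                          # 1. Never act on bots
    if target.is_owner:
        return False                          # 2. Never act on the owner
    if target.top_role >= bot.top_role:
        return False                          # 3. Bot must outrank target
    if target.top_role >= moderator.top_role:
        return False                          # 4. Moderator must outrank target
    return True                               # 5. Action authorized
```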
Moderator abuse detection
If a moderator performs too many actions in a short time (configurable threshold, default: 3 actions/10s), Guardian:
- Logs the event as suspicious
- Notifies administrators
- Can restrict the moderator account’s permissions if the threshold is exceeded
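The rate check can be sketched as a sliding window; `ActionWindow` is an illustrative name using the default 3 actions / 10 s threshold:

```python
# Sliding-window counter sketch for moderator abuse detection.
from collections import deque

class ActionWindow:
    def __init__(self, threshold: int = 3, window: float = 10.0):
        self.threshold = threshold
        self.window = window
        self._timestamps: deque[float] = deque()

    def record(self, now: float) -> bool:
        """Record one moderation action; return True if the rate is suspicious."""
        self._timestamps.append(now)
        # Drop actions that fell out of the window.
        while self._timestamps and now - self._timestamps[0] > self.window:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold
```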
8. Global Ban Confidence Scores
Every entry in the global blacklist carries a confidence score:
| Category | Confidence | Description |
|---|---|---|
| scammer | 90% | Confirmed scammer |
| raider | 85% | Identified raider |
| spammer | 75% | Documented spammer |
| other | 60% | Generic category |
Servers can configure a minimum confidence threshold below which automatic banning on join is not triggered, avoiding false positives on low-confidence entries.
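The per-server gate reduces to one comparison; `should_autoban` and the `CONFIDENCE` map are illustrative, with scores taken from the table above:

```python
# Sketch of the confidence threshold check on member join.
CONFIDENCE = {"scammer": 0.90, "raider": 0.85, "spammer": 0.75, "other": 0.60}

def should_autoban(category: str, min_confidence: float) -> bool:
    """Auto-ban on join only when the entry meets the server's threshold."""
    return CONFIDENCE.get(category, 0.0) >= min_confidence
```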
Security Parameters Summary
| Protection | Key parameter | Default value |
|---|---|---|
| Max redirects | MAX_HOPS | 3 |
| Timeout per hop | TIMEOUT_PER_HOP | 5s |
| Max file size | MAX_FILE_SIZE | 10 MB |
| Max image pixels | MAX_IMAGE_PIXELS | 16 MP (4096²) |
| API semaphore | api_semaphore | 5 concurrent |
| URL scan cache | TTL | 1 hour |
| Max captcha attempts | — | 3 attempts / 5 min |
| Session cleanup | Interval | 10 minutes |
| Default spam threshold | messages/window | 5 msgs / 5s |
| PBKDF2 iterations | Backups | 480,000 |