Security & Threat Detection with Server Logs

A practical guide to identifying threats, detecting fake bots, and building an effective security monitoring practice using server log analysis.

The Fake Bot Problem

User-Agent strings are one of the most commonly relied-upon signals for identifying crawlers, but they are trivially spoofed. Any HTTP client can set its User-Agent to anything it wants, and many attackers take advantage of this.

Attackers impersonate Googlebot and other legitimate crawlers for several reasons: a convincing bot User-Agent can slip past robots.txt-based blocking, inherit the generous rate limits sites grant search engines, and reach content that is served only to known crawlers.

Warning

Never trust User-Agent strings alone. Always verify crawler identity by checking the request's source IP against the bot operator's officially published IP ranges. LogLens does this verification automatically for all major crawlers.
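Google publishes a two-step verification procedure: reverse-DNS the requesting IP, check that the hostname ends in googlebot.com or google.com, then forward-resolve that hostname and confirm it maps back to the same IP. A minimal sketch of that check using only the Python standard library (function names here are illustrative, not a LogLens API):

```python
import socket

# Hostname suffixes Google documents for its crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def has_google_suffix(hostname: str) -> bool:
    """Pure check: does the PTR hostname belong to a Google crawler domain?"""
    return hostname.rstrip(".").endswith(GOOGLE_SUFFIXES)

def verify_googlebot(ip: str) -> bool:
    """Reverse-DNS the IP, check the hostname suffix, then forward-confirm
    that the hostname resolves back to the same IP (defeats PTR spoofing)."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)          # reverse lookup
        if not has_google_suffix(hostname):
            return False
        return ip in socket.gethostbyname_ex(hostname)[2]  # forward confirm
    except (socket.herror, socket.gaierror):
        return False
```

The forward-confirmation step matters: anyone can set a PTR record on their own IP space that claims to be googlebot.com, but they cannot make Google's DNS resolve that hostname back to their IP.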

Identifying Content Scrapers

Content scrapers have distinctive patterns in server logs that set them apart from legitimate users and crawlers. Once you know what to look for, they become straightforward to spot.

Rapid sequential requests

Scrapers typically make dozens or hundreds of requests per minute from a single IP or a small cluster of IPs. Legitimate users rarely exceed a few pages per minute, and even aggressive crawlers like Googlebot respect crawl-delay directives.

No asset loading

Scrapers request HTML pages only. They do not load CSS, JavaScript, images, fonts, or any other assets that a real browser would need to render the page. If you see an IP fetching page after page with zero corresponding asset requests, it is almost certainly a scraper.

Systematic URL traversal

Look for request patterns that follow a predictable structure: alphabetical ordering, paginated sequences (/page/1, /page/2, /page/3...), or the exact order URLs appear in your sitemap. Human browsing is inherently irregular; machine traversal is not.

Missing referrer and cookie data

Scrapers rarely send referrer headers or manage cookies. A stream of requests with empty Referer headers and no cookies, combined with the patterns above, is a strong signal.

Cloud provider IP addresses

Most scrapers run on cloud infrastructure. Requests originating from AWS, Google Cloud Platform, or Microsoft Azure IP ranges — especially when combined with the patterns above — are highly likely to be automated scraping.
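The signals above compose well into a single per-IP score. A sketch of how they might be combined over Apache/Nginx combined-format log lines, assuming three of the heuristics (request volume, zero asset loads, empty referrers); thresholds and the asset-extension list are illustrative:

```python
import re
from collections import defaultdict

# Combined Log Format: ip ident user [time] "METHOD path HTTP/x" status size "referer" "ua"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "(?P<referer>[^"]*)"'
)
ASSET_EXTS = (".css", ".js", ".png", ".jpg", ".gif", ".svg", ".woff", ".woff2", ".ico")

def score_scrapers(lines, min_requests=100):
    """Flag IPs that fetch many pages, load zero assets, and never send a referrer."""
    stats = defaultdict(lambda: {"pages": 0, "assets": 0, "no_ref": 0})
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue
        s = stats[m["ip"]]
        if m["path"].lower().endswith(ASSET_EXTS):
            s["assets"] += 1
        else:
            s["pages"] += 1
        if m["referer"] in ("", "-"):
            s["no_ref"] += 1
    flagged = []
    for ip, s in stats.items():
        total = s["pages"] + s["assets"]
        if total >= min_requests and s["assets"] == 0 and s["no_ref"] == total:
            flagged.append(ip)
    return flagged
```

Requiring all three signals together keeps false positives low: a real browser behind a privacy extension may strip referrers, but it will still fetch CSS and images.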

Tip

LogLens automatically flags IPs that exhibit scraping patterns. Use the IP Analysis page to filter by cloud provider ASN and cross-reference with request rates for fast identification.

Vulnerability Scanning Patterns

Automated vulnerability scanners probe for common weaknesses by requesting paths associated with known exploits, exposed configuration files, and popular admin interfaces. These paths should never appear in legitimate traffic.

Watch for request surges to sensitive paths

Key insight

A surge of 404 responses to sensitive paths is one of the clearest indicators of automated vulnerability scanning. These requests typically arrive in bursts — tens or hundreds within a few minutes — and often originate from a single IP or a small set of rotating IPs.
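A burst detector for this pattern can be built with a sliding window per IP. The sketch below assumes timestamp-ordered events and an illustrative list of probe paths; the window and threshold are tuning parameters, not fixed rules:

```python
from collections import defaultdict, deque

# Paths commonly probed by scanners; extend this for your own stack (illustrative).
SENSITIVE = ("/wp-login.php", "/.env", "/phpmyadmin", "/admin", "/.git/config")

def detect_scan_bursts(events, window_s=300, threshold=20):
    """events: iterable of (timestamp, ip, path, status), ordered by timestamp.
    Flag IPs with >= threshold 404s to sensitive paths inside a sliding window."""
    recent = defaultdict(deque)   # ip -> deque of recent 404 timestamps
    flagged = set()
    for ts, ip, path, status in events:
        if status != 404 or not path.startswith(SENSITIVE):
            continue
        q = recent[ip]
        q.append(ts)
        while q and ts - q[0] > window_s:   # drop hits outside the window
            q.popleft()
        if len(q) >= threshold:
            flagged.add(ip)
    return flagged
```

Filtering on 404 specifically is deliberate: a 404 to /wp-login.php on a site that does not run WordPress is pure scanner noise, while a 200 to the same path on a WordPress site is normal traffic.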

Suspicious IP Pattern Detection

Beyond specific attack signatures, certain IP-level patterns warrant immediate investigation.

Volume anomalies

A single IP making 1,000 or more requests per hour is abnormal for almost any website. Legitimate users rarely exceed a few hundred page views in a session, even on high-engagement sites.

Rotating IPs from the same subnet

Sophisticated attackers rotate through IP addresses within the same /24 or /16 subnet to avoid per-IP rate limits. When you see multiple IPs from the same block all making elevated requests, treat them as a single actor.
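One way to surface this is to aggregate per-IP request counts up to the containing subnet before applying volume thresholds. A minimal sketch using the standard `ipaddress` module (the function name is illustrative):

```python
import ipaddress
from collections import Counter

def aggregate_by_subnet(ip_counts, prefix=24):
    """Collapse per-IP request counts into per-subnet totals (default /24),
    so rotating IPs within one block surface as a single actor."""
    totals = Counter()
    for ip, count in ip_counts.items():
        net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        totals[str(net)] += count
    return totals
```

Applied to the example above, two IPs each making 350 to 400 requests per hour might individually sit under a 500-per-hour threshold, yet their shared /24 clearly exceeds it.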

Traffic from unexpected regions

If your site primarily serves users in North America and Europe, a sudden spike of requests from a region where you have no audience is worth investigating — especially if paired with other suspicious signals.

Unusual timing patterns

Requests arriving between 2:00 AM and 5:00 AM local time with User-Agent strings that claim to be mainstream browsers (Chrome, Firefox, Safari) are suspicious. Real users are largely asleep; automated tools are not.

Tip

Use LogLens IP Analysis to sort by request volume and filter by time of day. Combine with geographic filters to quickly surface the patterns described above.

Setting Up Effective Alerts

Detection is only useful if it surfaces threats quickly enough to act on them. Here are the alerts every site should configure.

Traffic spike alert

Trigger when any 15-minute window exceeds 5x the baseline for total requests. This catches DDoS attempts, aggressive scraping bursts, and scanning campaigns.

Error rate alert

Trigger when 5xx errors exceed 5% of total requests over a rolling window. Server errors during a traffic spike often indicate that an attack is impacting availability.

New high-volume bot alert

Trigger when an unverified bot makes 500 or more requests per hour. This catches new scrapers and impersonators before they can do significant damage.

Alert tuning advice

Start with thresholds set high — 10x above baseline for traffic spikes — and tighten gradually as you learn your normal patterns. Use rolling baselines (e.g., same hour last week) rather than static thresholds to account for natural traffic variation.
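The rolling-baseline comparison reduces to a small predicate. This sketch assumes the baseline is the request count from the same 15-minute window one week earlier; the absolute floor is an added assumption to keep quiet-hours noise (e.g. 5 requests vs. a baseline of 0) from firing alerts:

```python
def should_alert(current_count, baseline_count, multiplier=5.0, floor=100):
    """Alert when the current window exceeds `multiplier` times the baseline
    (same 15-minute window last week) AND clears an absolute floor."""
    if current_count < floor:
        return False              # too few requests to matter either way
    return current_count > multiplier * baseline_count
```

Starting with `multiplier=10.0` and tightening toward 5.0, as recommended above, changes one parameter rather than the alerting logic.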

Response Strategies

Different threat types call for different responses. Acting quickly with the right approach prevents damage while avoiding collateral impact on legitimate traffic.

| Threat | Recommended Response | Details |
| --- | --- | --- |
| Fake bots | Block at CDN/edge | Use Cloudflare WAF rules or AWS WAF to block IPs that claim a bot User-Agent but fail IP verification |
| Content scrapers | Rate limit | Enforce 60 requests per minute for unverified clients; tighten further for repeat offenders |
| Vulnerability scanners | Block + alert | Block the source IPs and investigate the specific paths being targeted to confirm no exposure |
| Credential stuffing | Rate limit + CAPTCHA | Apply strict rate limits to authentication endpoints and require CAPTCHA after failed attempts |
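The 60-requests-per-minute policy for unverified clients can be enforced with a per-IP sliding window. A self-contained sketch (in practice you would do this at the CDN or reverse proxy rather than in application code; the class name is illustrative):

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window_s` seconds per client IP."""

    def __init__(self, limit=60, window_s=60.0):
        self.limit = limit
        self.window_s = window_s
        self.hits = defaultdict(deque)   # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        """Return True if the request is within limits; record it if so."""
        now = time.monotonic() if now is None else now
        q = self.hits[ip]
        while q and now - q[0] >= self.window_s:   # expire old hits
            q.popleft()
        if len(q) >= self.limit:
            return False          # over limit: reject, rate-limit, or challenge
        q.append(now)
        return True
```

A sliding window avoids the burst-at-the-boundary problem of fixed-window counters, where a client can fit 2x the limit into the seconds straddling a window reset.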

Security Review Cadence

Effective security monitoring is a habit, not a one-time setup. Establish a regular review cadence to stay ahead of evolving threats.

Weekly

Monthly

Quarterly

Next guide
Bot Management & AI Crawlers