Anti-Bot Stealth Reference
Research notes on making Playwright WebKit less detectable by anti-bot systems. Compiled April 2026.
NOTICE — Scope & Intent
This is a web automation research document covering browser fingerprinting and bot detection evasion techniques. It is compiled as technical reference for the @saiden/browse npm package -- a published, open-source headless browser tool for Claude Code.
Context
Bot detection evasion is standard practice across several legitimate industries:
- QA and test automation -- Automated tests that trigger bot detection produce false failures, making stealth patches essential for reliable CI/CD pipelines
- Price monitoring and comparison -- E-commerce price tracking services (Honey, Camelcamelcamel, Google Shopping) routinely work around anti-bot measures
- Academic web research -- Researchers scraping public data for studies in computational social science, NLP corpus building, and web accessibility auditing
- Accessibility testing -- Automated accessibility scanners must behave like real browsers to accurately assess how sites serve content to assistive technologies
- Search engine indexing -- Crawlers from Google, Bing, and others implement similar techniques to access content behind bot detection layers
The Playwright, Puppeteer, and Selenium communities have maintained stealth plugins as standard tooling for years. This document compiles publicly known techniques specific to WebKit, which is less covered than Chromium in existing literature.
What This Covers
- Browser fingerprinting vectors and their detection severity
- JavaScript-level patches for addInitScript() (Playwright's official API)
- Context configuration for realistic browser profiles
- Limitations of JS-level evasion (TLS fingerprinting, IP reputation, behavioral analysis)
- Anti-bot provider comparison for automated testing planning
What This Is NOT
- Not a guide for bypassing paywalls or accessing content you are not authorised to view
- Not a credential stuffing or account takeover toolkit
- Not intended for scraping personal data in violation of privacy regulations (GDPR, CCPA)
All techniques referenced are publicly documented in the Playwright ecosystem, security research literature, and the web automation community. Sources are cited at the end of this document.
Current State
Browse uses Playwright WebKit with a bare context — no stealth patches. This is trivially detected by every major anti-bot system (Cloudflare, DataDome, PerimeterX/HUMAN, Akamai).
Detection Vectors
| Vector | Severity | Fixable from JS? |
|---|---|---|
| navigator.webdriver set to true | Critical | Yes |
| Empty navigator.plugins / mimeTypes | High | Yes |
| Default viewport (1280x720 in Playwright) | High | Yes |
| Missing/generic User-Agent | High | Yes |
| WebGL renderer = SwiftShader / generic | Medium | Yes |
| Permissions API inconsistencies | Medium | Yes |
| iframe cross-frame fingerprinting | Medium | Yes |
| TLS fingerprint (JA3/JA4) | Critical | No |
| IP reputation (datacenter IPs) | Critical | No |
| ML behavioral analysis | High | No |
| Cloudflare Turnstile / JS challenges | High | No |
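The JS-fixable rows of the table can be turned into a small self-check that a test suite runs against its own context. The sketch below is only an illustration: the FingerprintSnapshot shape and the auditFingerprint name are hypothetical, and the snapshot is assumed to be collected in the page beforehand (for example with page.evaluate). It covers only the first four rows, and it assumes Playwright's out-of-the-box viewport is 1280x720.

```typescript
// Hypothetical snapshot of the properties named in the table above,
// gathered inside the page before auditing.
interface FingerprintSnapshot {
  webdriver: boolean | undefined;
  pluginCount: number;
  mimeTypeCount: number;
  viewport: { width: number; height: number };
  userAgent: string;
}

type Severity = "Critical" | "High" | "Medium";

interface Finding {
  vector: string;
  severity: Severity;
}

// Returns the JS-fixable vectors that the snapshot still trips.
function auditFingerprint(s: FingerprintSnapshot): Finding[] {
  const findings: Finding[] = [];
  if (s.webdriver === true) {
    findings.push({ vector: "navigator.webdriver set to true", severity: "Critical" });
  }
  if (s.pluginCount === 0 || s.mimeTypeCount === 0) {
    findings.push({ vector: "Empty navigator.plugins / mimeTypes", severity: "High" });
  }
  // Assumption: an untouched Playwright context reports 1280x720.
  if (s.viewport.width === 1280 && s.viewport.height === 720) {
    findings.push({ vector: "Default viewport", severity: "High" });
  }
  // Treat an empty UA, or one without a Safari token, as generic for WebKit.
  if (s.userAgent.trim() === "" || !/Safari\//.test(s.userAgent)) {
    findings.push({ vector: "Missing/generic User-Agent", severity: "High" });
  }
  return findings;
}
```

An empty result only means these four checks pass. It says nothing about the rows marked "No" in the table.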
Stealth Ecosystem & WebKit
The two main stealth libraries only support Chromium:
- playwright-stealth (Python): patches ~12 Chrome-specific APIs
- playwright-extra + stealth plugin (Node.js): ~17 evasion modules targeting Chrome internals
WebKit and Firefox have entirely different internals. No stealth plugin exists for either. All patches for WebKit must be applied manually via addInitScript().
Recommended Patches
All patches use context.addInitScript() which runs before any page script in any Playwright engine (WebKit included).
1. WebDriver Flag
The single most important patch. Set to undefined, not false — some detectors specifically check for false as a signal of patching.
await context.addInitScript(() => {
  Object.defineProperty(navigator, 'webdriver', {
    get: () => undefined,
  });
});
2. Context Hardening
Configure the browser context to look like a real Safari session:
const context = await browser.newContext({
  viewport: { width: 1920, height: 1080 },
  // Safari freezes the macOS version token at 10_15_7 regardless of the real OS version.
  userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Safari/605.1.15',
  locale: 'en-US',
  // Keep the timezone plausible for the chosen locale.
  timezoneId: 'America/New_York',
  colorScheme: 'light',
  extraHTTPHeaders: {
    'Accept-Language': 'en-US,en;q=0.9',
  },
});
Key points:
- Viewport should be realistic (1920x1080, 1440x900, 1536x864)
- User-Agent must match the engine — use a Safari UA for WebKit
- Locale, timezone, and Accept-Language should be consistent with each other
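The key points above can be checked mechanically before a context is created. The following is a minimal sketch under stated assumptions: ProfileOptions and checkProfileConsistency are hypothetical names, not part of Playwright or browse, and the viewport list is simply the three sizes mentioned above. Timezone is deliberately not checked, because there is no reliable one-to-one mapping from locale to timezone.

```typescript
// Hypothetical subset of the context options discussed above.
interface ProfileOptions {
  userAgent: string;
  locale: string;
  acceptLanguage: string;
  viewport: { width: number; height: number };
}

// The viewport sizes listed in the key points.
const REALISTIC_VIEWPORTS: ReadonlyArray<{ width: number; height: number }> = [
  { width: 1920, height: 1080 },
  { width: 1440, height: 900 },
  { width: 1536, height: 864 },
];

// Returns human-readable warnings; an empty array means no issue was found.
function checkProfileConsistency(p: ProfileOptions): string[] {
  const warnings: string[] = [];

  // A Safari UA carries "Version/x Safari/y" and none of the Chromium tokens.
  const looksLikeSafari =
    /Version\/[\d.]+ Safari\//.test(p.userAgent) &&
    !/Chrome\/|Chromium\/|Edg\//.test(p.userAgent);
  if (!looksLikeSafari) {
    warnings.push("userAgent does not look like a Safari UA");
  }

  // The first Accept-Language tag (without its q-value) should equal the locale.
  const firstEntry = p.acceptLanguage.split(",")[0] ?? "";
  const primaryTag = (firstEntry.split(";")[0] ?? "").trim();
  if (primaryTag.toLowerCase() !== p.locale.toLowerCase()) {
    warnings.push("Accept-Language does not start with the locale");
  }

  const knownViewport = REALISTIC_VIEWPORTS.some(
    (v) => v.width === p.viewport.width && v.height === p.viewport.height,
  );
  if (!knownViewport) {
    warnings.push("viewport is not one of the listed realistic sizes");
  }
  return warnings;
}
```

Running this in a unit test keeps the profile from drifting when someone edits one field and forgets the others.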
3. Plugins & MimeTypes
Headless reports empty arrays. Fake them:
await context.addInitScript(() => {
  Object.defineProperty(navigator, 'plugins', {
    get: () => [1, 2, 3, 4, 5],
  });
  Object.defineProperty(navigator, 'mimeTypes', {
    get: () => [1, 2],
  });
});
A more sophisticated version would create proper PluginArray and MimeTypeArray objects with item(), namedItem(), and refresh() methods, but the simple version passes most checks.
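As a rough illustration of that more sophisticated version, the sketch below builds a plain array-like object with item(), namedItem(), and refresh(). FakePlugin, FakePluginArray, and makePluginArray are hypothetical names. The five plugin names are the fixed PDF viewer list from the HTML Standard and should be verified against the Safari version being imitated. The result is not a real PluginArray instance, so prototype or instanceof checks would still expose it.

```typescript
// Hypothetical plain-object stand-ins; these are not real Plugin instances.
interface FakePlugin {
  name: string;
  filename: string;
  description: string;
}

interface FakePluginArray {
  readonly length: number;
  item(index: number): FakePlugin | null;
  namedItem(name: string): FakePlugin | null;
  refresh(): void;
  [index: number]: FakePlugin;
}

// Plugin names from the HTML Standard's fixed PDF viewer list (verify per target).
const PDF_PLUGIN_NAMES: readonly string[] = [
  "PDF Viewer",
  "Chrome PDF Viewer",
  "Chromium PDF Viewer",
  "Microsoft Edge PDF Viewer",
  "WebKit built-in PDF",
];

function makePluginArray(plugins: readonly FakePlugin[]): FakePluginArray {
  const array: FakePluginArray = {
    length: plugins.length,
    item: (index) => plugins[index] ?? null,
    namedItem: (name) => plugins.filter((p) => p.name === name)[0] ?? null,
    refresh: () => {}, // no-op, as in current browsers
  };
  // Expose entries by numeric index, like the native collection does.
  plugins.forEach((plugin, i) => {
    array[i] = plugin;
  });
  return array;
}

const fakePlugins = makePluginArray(
  PDF_PLUGIN_NAMES.map((name) => ({
    name,
    filename: "internal-pdf-viewer",
    description: "Portable Document Format",
  })),
);
```

Inside addInitScript the same factory would have to be defined in the page itself, since the callback is serialized and cannot close over Node-side variables.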
4. Permissions API
Fix the inconsistency between Notification.permission and navigator.permissions.query:
await context.addInitScript(() => {
  const permissions = window.navigator.permissions;
  // Bind the native method to its receiver; calling it unbound throws a TypeError.
  const originalQuery = permissions.query.bind(permissions);
  permissions.query = (parameters: any) =>
    parameters.name === 'notifications'
      ? Promise.resolve({ state: Notification.permission } as PermissionStatus)
      : originalQuery(parameters);
});
5. WebGL Renderer
Mask the GPU vendor/renderer strings. Parameters 37445 and 37446 are UNMASKED_VENDOR_WEBGL and UNMASKED_RENDERER_WEBGL:
await context.addInitScript(() => {
  const getParameter = WebGLRenderingContext.prototype.getParameter;
  WebGLRenderingContext.prototype.getParameter = function (parameter) {
    if (parameter === 37445) return 'Apple GPU';
    if (parameter === 37446) return 'Apple M1 Pro';
    return getParameter.call(this, parameter);
  };
});
Choose values that match the User-Agent. Apple GPU + Apple Silicon for Safari on macOS.
6. iframe ContentWindow Isolation
Some fingerprinters check navigator.webdriver inside iframes to catch incomplete patches:
await context.addInitScript(() => {
  const desc = Object.getOwnPropertyDescriptor(HTMLIFrameElement.prototype, 'contentWindow');
  Object.defineProperty(HTMLIFrameElement.prototype, 'contentWindow', {
    get: function () {
      const win = desc?.get?.call(this);
      if (win) {
        try {
          Object.defineProperty(win.navigator, 'webdriver', {
            get: () => undefined,
          });
        } catch (_) {}
      }
      return win;
    },
  });
});
7. Session Persistence
Fresh browser contexts with no cookies or history are a strong bot signal. Use browse's existing session_save / session_restore tools to persist cookies, localStorage, and sessionStorage across runs.
What Cannot Be Fixed from JavaScript
TLS Fingerprinting (JA3/JA4)
Anti-bot systems fingerprint the TLS Client Hello handshake — cipher suites, extensions, and their ordering. WebKit's TLS stack is compiled C++; no amount of JavaScript can change it. Playwright WebKit's JA3 hash doesn't match any shipping Safari release.
Workarounds:
- Residential proxies with TLS relay (proxy terminates TLS with its own stack)
- curl-impersonate for non-browser HTTP requests
- Switch to Chromium, where the TLS fingerprint matches real Chrome more closely
IP Reputation
Datacenter IPs (Hetzner, AWS, GCP, etc.) are pre-flagged in commercial anti-bot databases.
Workarounds:
- Residential proxy rotation (BrightData, Oxylabs, etc.)
- Mobile proxies
- Running from a real residential IP (home connection)
Behavioral Analysis
DataDome, Cloudflare, and PerimeterX use ML models trained on billions of real sessions. They analyze:
- Mouse movement patterns (speed, acceleration, curves)
- Scroll behavior (chunked vs smooth, pause patterns)
- Typing cadence
- Navigation timing
- Click patterns (direct element clicks vs natural approach)
Workarounds:
- Add realistic delays between actions (page.waitForTimeout(random))
- Simulate mouse movements before clicks
- Scroll in chunks with pauses
- Type character by character with variable delays
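The timing side of these workarounds can be kept in small pure helpers, so the pacing is testable without a browser. This is only a sketch: jitteredDelayMs, planScroll, and keystrokeDelays are hypothetical names, and the random source is injectable so tests can make it deterministic. Uniform jitter like this is an assumption made for simplicity; it is not claimed to reproduce human timing.

```typescript
// A random source returning a value in [0, 1), like Math.random.
type Rng = () => number;

// Picks a delay between minMs and maxMs (uniformly distributed, rounded).
function jitteredDelayMs(minMs: number, maxMs: number, rng: Rng = Math.random): number {
  return Math.round(minMs + rng() * (maxMs - minMs));
}

// Splits a total scroll distance into chunks of at most chunkPx pixels.
function planScroll(totalPx: number, chunkPx: number): number[] {
  if (chunkPx <= 0) {
    throw new RangeError("chunkPx must be positive");
  }
  const steps: number[] = [];
  let remaining = totalPx;
  while (remaining > 0) {
    const step = Math.min(chunkPx, remaining);
    steps.push(step);
    remaining -= step;
  }
  return steps;
}

// Produces one delay per character for character-by-character typing.
function keystrokeDelays(text: string, minMs: number, maxMs: number, rng: Rng = Math.random): number[] {
  const delays: number[] = [];
  for (let i = 0; i < text.length; i++) {
    delays.push(jitteredDelayMs(minMs, maxMs, rng));
  }
  return delays;
}
```

In a Playwright script the values would typically feed page.waitForTimeout(ms) between actions and page.mouse.wheel(0, step) for each scroll chunk.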
CAPTCHA / JavaScript Challenges
Cloudflare Turnstile, hCaptcha, and reCAPTCHA require real interaction or solving services.
Workarounds:
- CAPTCHA solving APIs: CapSolver, 2Captcha (~$2-5 per 1,000 solves)
- Wait for challenge resolution: 3-8 seconds after navigation
- Detect challenge pages by checking for known markers ("Just a moment", cf-challenge, _cf_chl_opt)
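Marker-based challenge detection is simple enough to sketch as a pure function over the page title and HTML. The names findChallengeMarkers and looksLikeChallengePage are hypothetical, and the marker list contains only the three strings mentioned above. Real challenge pages change over time, so treat a match as a hint, not a verdict.

```typescript
// The three markers named in the workaround list above.
const CHALLENGE_MARKERS: readonly string[] = ["Just a moment", "cf-challenge", "_cf_chl_opt"];

// Returns every known marker found in the title or the HTML source.
function findChallengeMarkers(title: string, html: string): string[] {
  const haystack = title + "\n" + html;
  return CHALLENGE_MARKERS.filter((marker) => haystack.indexOf(marker) !== -1);
}

// True when at least one marker is present.
function looksLikeChallengePage(title: string, html: string): boolean {
  return findChallengeMarkers(title, html).length > 0;
}
```

With Playwright, the inputs would come from page.title() and page.content() after navigation. A caller could then wait the 3-8 seconds mentioned above and check again before reporting a failure.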
Implementation Strategy
Recommended: Stealth Flag
Add an opt-in stealth option to launch():
async launch(options?: { stealth?: boolean }): Promise<void> {
  this.browser = await webkit.launch({ headless: this.options.headless });
  this.context = await this.browser.newContext({
    viewport: { width: this.options.width, height: this.options.height },
    ...(options?.stealth && {
      userAgent: SAFARI_USER_AGENT,
      locale: 'en-US',
      timezoneId: Intl.DateTimeFormat().resolvedOptions().timeZone,
      colorScheme: 'light',
      extraHTTPHeaders: { 'Accept-Language': 'en-US,en;q=0.9' },
    }),
  });
  if (options?.stealth) {
    await this.applyStealthPatches();
  }
  this.page = await this.context.newPage();
}
This keeps the default clean for testing while allowing stealth for real-world browsing.
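The option merging in launch() can be pulled into a pure function, which makes the "default stays clean" property easy to assert in a unit test. The sketch below is one possible shape, not browse's actual code. buildContextOptions and the ContextOptions interface are hypothetical, and the SAFARI_USER_AGENT value is a placeholder assumption, since the source does not define the constant. The host timezone is passed in as a parameter so the function stays deterministic.

```typescript
// Placeholder value (assumption): the source does not define this constant.
const SAFARI_USER_AGENT =
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Safari/605.1.15";

// Hypothetical subset of Playwright's context options used in launch().
interface ContextOptions {
  viewport: { width: number; height: number };
  userAgent?: string;
  locale?: string;
  timezoneId?: string;
  colorScheme?: "light" | "dark";
  extraHTTPHeaders?: Record<string, string>;
}

// Builds the options object; stealth fields are added only when requested.
function buildContextOptions(
  size: { width: number; height: number },
  stealth: boolean,
  hostTimeZone: string,
): ContextOptions {
  const base: ContextOptions = {
    viewport: { width: size.width, height: size.height },
  };
  if (!stealth) {
    return base;
  }
  return {
    ...base,
    userAgent: SAFARI_USER_AGENT,
    locale: "en-US",
    timezoneId: hostTimeZone,
    colorScheme: "light",
    extraHTTPHeaders: { "Accept-Language": "en-US,en;q=0.9" },
  };
}
```

A caller would pass Intl.DateTimeFormat().resolvedOptions().timeZone as hostTimeZone, matching the snippet above.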
Nuclear Option: Chromium Engine
If stealth becomes a core requirement, add a browser engine option:
launch({ engine: 'chromium', stealth: true })
Chromium has the richest stealth ecosystem:
- playwright-extra + stealth plugin (17 evasion modules)
- playwright-with-fingerprints (full fingerprint replacement)
- Better TLS fingerprint match to real Chrome
- Most anti-bot systems are tuned for Chrome, so evasions are better tested
Trade-off: Chromium is ~200MB heavier than WebKit.
Anti-Bot Provider Cheat Sheet
| Provider | Primary Detection | Difficulty |
|---|---|---|
| Cloudflare (standard) | TLS + JS challenge | Medium |
| Cloudflare (Turnstile) | Interactive challenge | Hard |
| DataDome | Behavioral analysis | Hard |
| PerimeterX / HUMAN | Deep fingerprinting (_px scripts) | Hard |
| Akamai Bot Manager | TLS + sensor data | Hard |
| Kasada | Obfuscated JS challenge | Very Hard |
| Basic WAFs | User-Agent + rate limiting | Easy |
References
- Playwright Anti-Bot Detection: What Works (2026) | AlterLab
- Playwright Stealth: Bypass Bot Detection | Scrapfly
- Playwright Stealth Mode: The 7 Patches That Matter | DEV Community
- How to Avoid Bot Detection with Playwright | BrowserStack
- How To Make Playwright Undetectable | ScrapeOps
- Detecting Vanilla Playwright | ScrapingAnt
- Playwright Fingerprinting: Explained & Bypass | ZenRows