🛡️ Methodology Checklist
- Site restriction: `site:[DOMAIN]` to enumerate indexed content
- File type search: `site:[DOMAIN] (filetype:pdf OR filetype:xlsx OR filetype:docx)`
- Login pages: `site:[DOMAIN] (inurl:login OR inurl:admin)`
- Error pages with stack traces: `site:[DOMAIN] "error" "Exception"`
- Camera/device interfaces: `intitle:"webcam" OR intitle:"AXIS" OR intitle:"hikvision"`
- Default creds pages: `intitle:"default password" site:[DOMAIN]`
- Shodan dork: `org:"[ORG]" port:22 product:"OpenSSH"`
- GHDB (Exploit-DB Google Hacking Database): look up target-relevant dorks
🎯 Operational Context
Use when: Passive recon phase. Extract exposed pages, credentials, config files, and login portals without touching the target.

Think Dumber First: Before running any active scanner, run 10 minutes of dorks. Credentials in GitHub, exposed admin panels, and juicy PDFs are often found this way with zero footprint.

Skip when: Target has strict no-passive-recon rules; use only active enumeration per scope agreement.
⚡ Tactical Cheatsheet
| Command | Tactical Outcome |
|---|---|
| `site:[DOMAIN]` | All indexed pages for the domain |
| `site:[DOMAIN] -site:www.[DOMAIN]` | Exclude `www` to surface other subdomains |
| `site:[DOMAIN] inurl:login` | Find login portals |
| `site:[DOMAIN] (inurl:login OR inurl:admin OR inurl:portal)` | Find login and admin panels |
| `site:[DOMAIN] filetype:pdf` | Find exposed PDF documents |
| `site:[DOMAIN] (filetype:xls OR filetype:xlsx OR filetype:docx)` | Find Office documents |
| `site:[DOMAIN] (ext:conf OR ext:cnf OR ext:ini OR ext:env)` | Find config files |
| `site:[DOMAIN] (filetype:sql OR filetype:bak OR inurl:backup)` | Find backups/DB dumps |
| `site:[DOMAIN] intitle:"index of" "parent directory"` | Find directory listings |
| `site:[DOMAIN] (intext:"error" OR intext:"warning")` | Find stack traces / debug info |
| `site:[DOMAIN] (inurl:dev OR inurl:stg)` | Find dev/staging environments |
| `intext:[COMPANY] inurl:amazonaws.com` | AWS S3 bucket search |
| `intext:[COMPANY] inurl:blob.core.windows.net` | Azure blob search |
| `cache:[DOMAIN]` | View Google's cached version, which can show deleted content (Google retired this operator in 2024; try the Wayback Machine if it returns nothing) |
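
The placeholders in the table above can be filled in programmatically before a recon session. The following is a minimal sketch, not a finished tool: `DORK_TEMPLATES` and `render_dorks` are hypothetical names, and the template set is just a subset of the rows above.

```python
# Hypothetical helper: fill the [DOMAIN] / [COMPANY] placeholders used in the table.
DORK_TEMPLATES = {
    "indexed pages": "site:[DOMAIN]",
    "non-www subdomains": "site:[DOMAIN] -site:www.[DOMAIN]",
    "login and admin portals": "site:[DOMAIN] (inurl:login OR inurl:admin OR inurl:portal)",
    "config files": "site:[DOMAIN] (ext:conf OR ext:cnf OR ext:ini OR ext:env)",
    "backups and dumps": "site:[DOMAIN] (filetype:sql OR filetype:bak OR inurl:backup)",
    "directory listings": 'site:[DOMAIN] intitle:"index of" "parent directory"',
    "S3 references": "intext:[COMPANY] inurl:amazonaws.com",
}


def render_dorks(domain: str, company: str) -> dict[str, str]:
    """Return every template with its placeholders replaced by real values."""
    # Quote multi-word company names so they are searched as one phrase.
    company_term = f'"{company}"' if " " in company else company
    return {
        label: tpl.replace("[DOMAIN]", domain).replace("[COMPANY]", company_term)
        for label, tpl in DORK_TEMPLATES.items()
    }
```

Calling `render_dorks("example.com", "Example Corp")` returns a label-to-query mapping that can be pasted into a search engine one query at a time.
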
🔬 Deep Dive & Workflow
Why Use Search Engine Discovery
- Completely passive: queries go to the search engine, so the target has zero visibility of your recon
- Breadth: search engines index forgotten subdomains and files
- Vulnerability finding: indexed content exposes configs, credentials, logic errors
Google Search Operators
| Operator | Function | Example |
|---|---|---|
| `site:` | Restrict to domain | `site:target.com` |
| `inurl:` | Terms in URL | `inurl:admin` |
| `filetype:` | File extension | `filetype:pdf` |
| `intitle:` | Terms in page title | `intitle:"index of"` |
| `intext:` | Terms in page body | `intext:"password"` |
| `cache:` | Google's cached copy (retired by Google in 2024) | `cache:target.com` |
| `-` (minus) | Exclude term | `-site:www.target.com` |
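
Operators from the table combine into one query: OR groups go in parentheses, and exclusions are prefixed with a minus. A small sketch of that composition follows; the helper names `any_of`, `quote_if_needed`, and `build_query` are made up for illustration.

```python
def quote_if_needed(value: str) -> str:
    """Wrap multi-word values in double quotes so they match as a phrase."""
    return f'"{value}"' if " " in value else value


def any_of(operator: str, values: list[str]) -> str:
    """Build an OR group such as (inurl:login OR inurl:admin)."""
    if not values:
        raise ValueError("any_of needs at least one value")
    terms = [f"{operator}:{quote_if_needed(v)}" for v in values]
    # A single term needs no parentheses.
    return terms[0] if len(terms) == 1 else "(" + " OR ".join(terms) + ")"


def build_query(site: str, *groups: str, exclude_sites: tuple[str, ...] = ()) -> str:
    """Join a site: restriction, any operator groups, and -site: exclusions."""
    parts = [f"site:{site}", *groups]
    parts += [f"-site:{s}" for s in exclude_sites]
    return " ".join(parts)
```

For example, `build_query("target.com", any_of("inurl", ["login", "admin"]), exclude_sites=("www.target.com",))` yields `site:target.com (inurl:login OR inurl:admin) -site:www.target.com`.
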
Exam Strategy (5-Step Workflow)
1. `site:[DOMAIN]` to see what's indexed
2. `site:[DOMAIN] -site:www.[DOMAIN]` to find obscure subdomains
3. `site:[DOMAIN] filetype:pdf` (also doc, xls), then use `exiftool` on the downloads
4. `site:[DOMAIN] (intext:"error" OR intext:"warning")` for stack traces / debug output
5. `site:[DOMAIN] (inurl:dev OR inurl:stg)` for non-production environments
GHDB: When stuck, search the Google Hacking Database for the specific tech you identified (e.g., “Joomla Google Dorks”) for technology-specific queries.
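
The `exiftool` step of the workflow is where usernames and software versions tend to surface. The sketch below assumes `exiftool` is installed and on `PATH`, and that its `-json` output contains tags such as `Author` or `Producer`; the tag list and function names are illustrative assumptions, and the tags actually present vary by file type.

```python
import json
import subprocess

# Assumed tag names of interest; which ones appear depends on the file format.
INTERESTING_TAGS = ("Author", "Creator", "Producer", "LastModifiedBy", "Company")


def run_exiftool(paths: list[str]) -> str:
    """Run the exiftool CLI on downloaded files and return its JSON output."""
    # The exit code is not checked here, since a partial result is still useful.
    result = subprocess.run(
        ["exiftool", "-json", *paths],
        capture_output=True,
        text=True,
    )
    return result.stdout


def summarize_metadata(exiftool_json: str) -> dict[str, set[str]]:
    """Collect the distinct values of each interesting tag across all files."""
    summary: dict[str, set[str]] = {}
    for record in json.loads(exiftool_json):
        for tag in INTERESTING_TAGS:
            value = record.get(tag)
            if value:
                summary.setdefault(tag, set()).add(str(value))
    return summary
```

Feeding `run_exiftool(["report.pdf", "budget.xlsx"])` into `summarize_metadata` gives a per-tag set of values, which is a quick way to build a candidate username list.
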
🛠️ Troubleshooting & Edge Cases
| Problem | Cause | Fix |
|---|---|---|
| Google returns no results for dork | Rate limiting or CAPTCHA | Switch to Bing, DuckDuckGo, or use site:target.com without operators first |
| site: dork shows irrelevant pages | Domain too broad | Narrow with inurl:admin site:target.com or filetype:sql site:target.com |
| filetype:pdf dork finds nothing | PDFs not indexed | Try ext:docx OR ext:xlsx site:target.com — Office docs often indexed |
| intitle: operator returns wrong results | Query string too generic | Combine operators: intitle:"index of" inurl:backup site:target.com |
| Shodan search returns 0 results | Target uses CDN masking real IP | Filter by org:"Target Corp" or ssl.cert.subject.cn:target.com |
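
When Google throttles a dork, the same query can be tried on another engine. As a rough sketch, the helper below URL-encodes one dork for several engines; `SEARCH_ENGINES` and `search_urls` are hypothetical names, and operator support differs between engines, so a dork that works on Google may need adjusting elsewhere.

```python
from urllib.parse import quote_plus

# Public search endpoints; the query goes in the q parameter.
SEARCH_ENGINES = {
    "google": "https://www.google.com/search?q={query}",
    "bing": "https://www.bing.com/search?q={query}",
    "duckduckgo": "https://duckduckgo.com/?q={query}",
}


def search_urls(dork: str) -> dict[str, str]:
    """Return one ready-to-open search URL per engine for the given dork."""
    encoded = quote_plus(dork)  # spaces become '+', ':' becomes '%3A'
    return {name: tpl.format(query=encoded) for name, tpl in SEARCH_ENGINES.items()}
```

Opening the URLs by hand in a browser keeps request volume low and avoids automated scraping of the result pages.
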
📝 Reporting Trigger
Finding Title: Sensitive Information Exposed via Search Engine Indexing
Impact: An attacker can discover credentials, backup files, and internal paths without any direct interaction with the target, enabling targeted exploitation with zero detection.
Root Cause: Web server or application misconfiguration allows sensitive file types (.sql, .bak, .env, .config) or directory listings to be indexed by search engines.
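Recommendation: Remove sensitive files from the web root or place them behind authentication, configure the web server to deny directory listing, and audit publicly accessible file types. Do not rely on robots.txt alone: it only asks crawlers to skip a path and advertises that path to anyone who reads the file. For pages that must stay public but unindexed, use a noindex directive (meta tag or X-Robots-Tag header). Use Google Search Console to request removal of sensitive pages that are already indexed.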
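
To back the finding with evidence, URLs collected during dorking can be filtered for the sensitive file types named in the root cause. This is only a sketch: `SENSITIVE_EXTENSIONS`, `is_sensitive`, and `flag_sensitive` are illustrative names, and the extension set should be adjusted per engagement.

```python
from posixpath import splitext
from urllib.parse import urlsplit

# Extensions taken from the root-cause statement above.
SENSITIVE_EXTENSIONS = {".sql", ".bak", ".env", ".config"}


def is_sensitive(url: str) -> bool:
    """True if the URL path ends in one of the sensitive extensions."""
    path = urlsplit(url).path.lower()  # drops the query string and fragment
    name = path.rsplit("/", 1)[-1]
    # splitext treats a dotfile such as ".env" as having no extension,
    # so the bare file name is compared as well.
    return splitext(name)[1] in SENSITIVE_EXTENSIONS or name in SENSITIVE_EXTENSIONS


def flag_sensitive(urls: list[str]) -> list[str]:
    """Keep only the URLs worth listing as evidence in the report."""
    return [u for u in urls if is_sensitive(u)]
```
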