🛡️ Methodology Checklist

  • Site restriction: site:[DOMAIN] — enumerate indexed content
  • File type search: site:[DOMAIN] (filetype:pdf OR filetype:xlsx OR filetype:docx)
  • Login pages: site:[DOMAIN] inurl:login OR inurl:admin
  • Error pages with stack traces: site:[DOMAIN] "error" "Exception"
  • Camera/device interfaces: intitle:"webcam" OR intitle:"AXIS" OR intitle:"hikvision"
  • Default creds pages: intitle:"default password" site:[DOMAIN]
  • Shodan dork: org:"[ORG]" port:22 product:"OpenSSH"
  • GHDB (Exploit-DB Google Hacking DB) for target-relevant dorks

🎯 Operational Context

Use when: Passive recon phase, to extract exposed pages, credentials, config files, and login portals without touching the target.
Think Dumber First: Before running any active scanner, run 10 minutes of dorks. Credentials in GitHub, exposed admin panels, and juicy PDFs are often found this way with zero footprint.
Skip when: Target has strict no-passive-recon rules; use only active enumeration per scope agreement.


⚡ Tactical Cheatsheet

Command | Tactical Outcome
site:[DOMAIN] | All indexed pages for the domain
site:[DOMAIN] -site:www.[DOMAIN] | Exclude www to surface other subdomains
site:[DOMAIN] inurl:login | Find login portals
site:[DOMAIN] (inurl:login OR inurl:admin OR inurl:portal) | Find admin panels
site:[DOMAIN] filetype:pdf | Find exposed PDF documents
site:[DOMAIN] (filetype:xls OR filetype:xlsx OR filetype:docx) | Find Office documents
site:[DOMAIN] (ext:conf OR ext:cnf OR ext:ini OR ext:env) | Find config files
site:[DOMAIN] (filetype:sql OR filetype:bak OR inurl:backup) | Find backups/DB dumps
site:[DOMAIN] intitle:"index of" "parent directory" | Find directory listings
site:[DOMAIN] (intext:"error" OR intext:"warning") | Find stack traces / debug info
site:[DOMAIN] (inurl:dev OR inurl:stg) | Find dev/staging environments
intext:[COMPANY] inurl:amazonaws.com | AWS S3 bucket search
intext:[COMPANY] inurl:blob.core.windows.net | Azure blob search
cache:[DOMAIN] | Google’s cached copy (operator retired by Google in 2024; use the Wayback Machine for removed content)
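
The rows above can be expanded for a concrete target with a small helper. This is a minimal sketch, not part of the original notes: DORK_TEMPLATES, expand_dorks, and the example.com / Example values are hypothetical names chosen for illustration. It only builds query strings and sends nothing.

```python
# Hypothetical helper: fill the [DOMAIN] / [COMPANY] placeholders from the cheatsheet.
DORK_TEMPLATES = {
    "indexed pages": "site:[DOMAIN]",
    "non-www hosts": "site:[DOMAIN] -site:www.[DOMAIN]",
    "login and admin portals": "site:[DOMAIN] (inurl:login OR inurl:admin OR inurl:portal)",
    "config files": "site:[DOMAIN] (ext:conf OR ext:cnf OR ext:ini OR ext:env)",
    "directory listings": 'site:[DOMAIN] intitle:"index of" "parent directory"',
    "S3 references": "intext:[COMPANY] inurl:amazonaws.com",
}


def expand_dorks(domain: str, company: str) -> dict[str, str]:
    """Return a copy of DORK_TEMPLATES with the placeholders filled in."""
    return {
        label: template.replace("[DOMAIN]", domain).replace("[COMPANY]", company)
        for label, template in DORK_TEMPLATES.items()
    }


if __name__ == "__main__":
    for label, query in expand_dorks("example.com", "Example").items():
        print(f"{label}: {query}")
```

A multi-word company name would need to be wrapped in double quotes before it is passed in, since the sketch substitutes the value verbatim.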

🔬 Deep Dive & Workflow

Why Use Search Engine Discovery

  • Passive: queries go to the search engine, so the target sees nothing until you open a result or download a file
  • Breadth: search engines index forgotten subdomains and files
  • Vulnerability finding: exposes configs, credentials, logic errors

Google Search Operators

Operator | Function | Example
site: | Restrict to domain | site:target.com
inurl: | Terms in URL | inurl:admin
filetype: | File extension | filetype:pdf
intitle: | Terms in page title | intitle:"index of"
intext: | Terms in page body | intext:"password"
cache: | Google’s cached copy (retired by Google in 2024) | cache:target.com
- (minus) | Exclude term | -site:www.target.com
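
Operators combine: a space acts as AND, OR groups belong in parentheses, and a leading minus excludes. The sketch below shows one way to compose them. The helper names any_of, exclude, and dork are hypothetical, introduced only for this illustration.

```python
def any_of(operator: str, values: list[str]) -> str:
    """Build an OR group such as (inurl:login OR inurl:admin)."""
    if not values:
        raise ValueError("any_of needs at least one value")
    terms = [f"{operator}:{value}" for value in values]
    # A single term needs no parentheses.
    return terms[0] if len(terms) == 1 else "(" + " OR ".join(terms) + ")"


def exclude(term: str) -> str:
    """Prefix a term with the minus operator."""
    return f"-{term}"


def dork(*parts: str) -> str:
    """Join non-empty parts with spaces; the search engine treats a space as AND."""
    return " ".join(part for part in parts if part)


query = dork(
    "site:target.com",
    exclude("site:www.target.com"),
    any_of("filetype", ["sql", "bak"]),
)
print(query)  # site:target.com -site:www.target.com (filetype:sql OR filetype:bak)
```

Repeating the operator on every alternative matters: in filetype:pdf OR xlsx, the second term is a plain keyword, not a file type filter.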

Exam Strategy (5-Step Workflow)

  1. site:[DOMAIN] — see what’s indexed
  2. site:[DOMAIN] -site:www.[DOMAIN] — find obscure subdomains
  3. site:[DOMAIN] filetype:pdf (repeat for doc, xls), then run exiftool on the downloads to pull author and software metadata
  4. site:[DOMAIN] intext:"error" OR intext:"warning" — stack traces / debug
  5. site:[DOMAIN] inurl:dev OR inurl:stg — non-production environments
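
Step 3 can be scripted once the documents are downloaded. The sketch below assumes exiftool is installed and on PATH and uses its -json output. The function names, the choice of tags, and the sample records are illustrative assumptions, not part of the original workflow.

```python
import json
import subprocess

# Tags that often carry usernames or software versions (an illustrative selection).
INTERESTING_TAGS = ("Author", "Creator", "Producer")


def run_exiftool(paths: list[str]) -> list[dict]:
    """Run `exiftool -json` on local files and return one dict per file."""
    result = subprocess.run(
        ["exiftool", "-json", *paths],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def interesting_metadata(records: list[dict]) -> dict[str, dict[str, str]]:
    """Keep only INTERESTING_TAGS, keyed by each record's SourceFile."""
    summary = {}
    for record in records:
        found = {tag: str(record[tag]) for tag in INTERESTING_TAGS if tag in record}
        if found:
            summary[record.get("SourceFile", "<unknown>")] = found
    return summary


if __name__ == "__main__":
    # Made-up records shaped like exiftool's JSON, so the sketch runs without exiftool.
    sample = [
        {"SourceFile": "report.pdf", "Author": "jsmith", "Producer": "Microsoft Word 2016"},
        {"SourceFile": "blank.pdf", "PageCount": 1},
    ]
    print(interesting_metadata(sample))
```

With real files, interesting_metadata(run_exiftool(["report.pdf"])) would give the same shape of output. Author names found this way are candidate usernames for later phases.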

GHDB: When stuck, search the Google Hacking Database for the specific tech you identified (e.g., “Joomla Google Dorks”) for technology-specific queries.


🛠️ Troubleshooting & Edge Cases

Problem | Cause | Fix
Google returns no results for dork | Rate limiting or CAPTCHA | Switch to Bing, DuckDuckGo, or use site:target.com without operators first
site: dork shows irrelevant pages | Domain too broad | Narrow with inurl:admin site:target.com or filetype:sql site:target.com
filetype:pdf dork finds nothing | PDFs not indexed | Try ext:docx OR ext:xlsx site:target.com; Office docs often indexed
intitle: operator returns wrong results | Query string too generic | Combine operators: intitle:"index of" inurl:backup site:target.com
Shodan search returns 0 results | Target uses CDN masking real IP | Filter by org:"Target Corp" or ssl.cert.subject.cn:target.com
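
When Google rate-limits a dork, the same query can be opened on another engine. The sketch below builds URL-encoded search links for manual use in a browser. SEARCH_ENDPOINTS and search_urls are hypothetical names, and operator support differs between engines, so a dork may need adjusting after switching.

```python
from urllib.parse import urlencode

# Public search pages; each accepts the query in the `q` parameter.
SEARCH_ENDPOINTS = {
    "google": "https://www.google.com/search",
    "bing": "https://www.bing.com/search",
    "duckduckgo": "https://duckduckgo.com/",
}


def search_urls(query: str) -> dict[str, str]:
    """Return one URL per engine with the dork percent-encoded."""
    encoded = urlencode({"q": query})
    return {name: f"{base}?{encoded}" for name, base in SEARCH_ENDPOINTS.items()}


if __name__ == "__main__":
    for name, url in search_urls("site:target.com filetype:sql").items():
        print(f"{name}: {url}")
```

The links are meant to be opened by hand. Automated querying is what tends to trigger the CAPTCHA in the first place.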

📝 Reporting Trigger

Finding Title: Sensitive Information Exposed via Search Engine Indexing
Impact: An attacker can discover credentials, backup files, and internal paths without any direct interaction with the target, enabling targeted exploitation with no traffic for the target to detect.
Root Cause: Web server or application misconfiguration allows sensitive file types (.sql, .bak, .env, .config) or directory listings to be publicly served and indexed by search engines.
Recommendation: Remove sensitive files from the web root or place them behind authentication, configure the web server to deny directory listing, and audit publicly accessible file types. Do not rely on robots.txt alone: it does not block access, and it advertises the paths it lists. Use Google Search Console to request removal of indexed sensitive pages.