🛡️ Methodology Checklist

  • PDF: pdf2john [FILE].pdf > hash.txt → crack with John
  • Office (old .doc/.xls): office2john [FILE].doc > hash.txt
  • Office (modern .docx/.xlsx): office2john [FILE].docx > hash.txt → john --format=Office hash.txt
  • SSH private key: ssh2john id_rsa > hash.txt → crack
  • PFX/PKCS#12 certificate: pfx2john cert.pfx > hash.txt (for an encrypted PEM key, use ssh2john)
  • BitLocker: bitlocker2john [DISK].img > hash.txt
  • Verify decrypted file is accessible after cracking

🎯 Operational Context

Use when: Password-protected Office documents, PDFs, or other files turn up; extract the hash and crack it to reach the sensitive content.

Think Dumber First: office2john doc.docx > doc.hash && john doc.hash --wordlist=rockyou.txt. Office 2013+ encrypts with AES-256 and a deliberately slow key derivation (100,000 SHA-512 iterations), so CPU cracking is slow, but try rockyou first since many users choose simple passwords.

Skip when: The file uses modern Office encryption and a quick wordlist pass has already failed; exhaustive cracking takes days on CPU, so look for an unencrypted copy or target the author's system instead.


⚡ Tactical Cheatsheet

Command | Tactical Outcome
for ext in $(echo ".xls .xls* .xltx .od* .doc .doc* .pdf .pot .pot* .pp*"); do find / -name "*$ext" 2>/dev/null | grep -v "lib\|fonts\|share\|core"; done | Hunt for sensitive file types
grep -rnE '^\-{5}BEGIN [A-Z0-9]+ PRIVATE KEY\-{5}$' /* 2>/dev/null | Hunt for SSH private keys by header
ssh-keygen -yf ~/.ssh/id_rsa | Check if SSH key is passphrase-protected
locate *2john* | List all hash extraction tools
python3 ssh2john.py [KEY_FILE] > ssh.hash | Extract hash from SSH private key
john --wordlist=/usr/share/wordlists/rockyou.txt ssh.hash | Crack SSH key passphrase
john ssh.hash --show | Display cracked passphrase
python3 office2john.py [FILE].docx > office.hash | Extract hash from Word/Excel doc
john --wordlist=rockyou.txt office.hash | Crack Office document password
python3 pdf2john.py [FILE].pdf > pdf.hash | Extract hash from PDF
john --wordlist=rockyou.txt pdf.hash | Crack PDF password
zip2john [ARCHIVE].zip > zip.hash | Extract hash from ZIP archive
john --wordlist=rockyou.txt zip.hash | Crack ZIP password
keepass2john [DB].kdbx > keepass.hash | Extract hash from KeePass database
john --wordlist=rockyou.txt keepass.hash | Crack KeePass master password

🔬 Deep Dive & Workflow

Encryption Context

File encryption is becoming standard practice due to GDPR and corporate security policies. Two types encountered:

  • Symmetric (AES-256) — one key for both encrypt and decrypt; common for local file storage
  • Asymmetric — public key encrypts, private key decrypts; common for transmission

Hunting for Encrypted Files

Hunt by extension (office docs, PDFs):

for ext in $(echo ".xls .xls* .xltx .od* .doc .doc* .pdf .pot .pp*"); do
  echo -e "\nExtension: $ext"
  find / -name "*$ext" 2>/dev/null | grep -v "lib\|fonts\|share\|core"
done
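
The loop above can also be written as a single find call, which avoids the unquoted glob expansion. This is a minimal sketch, not the original method: the function name hunt_docs is hypothetical, and it excludes whole directory components named lib, fonts, share, or core, which is slightly narrower than the substring filter used by grep -v above.

```shell
# hunt_docs ROOT: list Office/PDF-style files under ROOT (default "/"),
# skipping paths that pass through lib/, fonts/, share/ or core/ directories.
hunt_docs() {
  find "${1:-/}" -type f \
    \( -iname '*.xls*' -o -iname '*.xltx' -o -iname '*.od*' \
       -o -iname '*.doc*' -o -iname '*.pdf' -o -iname '*.pot*' -o -iname '*.pp*' \) \
    ! -path '*/lib/*' ! -path '*/fonts/*' \
    ! -path '*/share/*' ! -path '*/core/*' \
    2>/dev/null
}
```

Usage: hunt_docs /home prints one matching path per line.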

Hunt SSH keys by content (no extension needed):

grep -rnE '^\-{5}BEGIN [A-Z0-9]+ PRIVATE KEY\-{5}$' /* 2>/dev/null
# Matches: BEGIN RSA PRIVATE KEY, BEGIN OPENSSH PRIVATE KEY, etc.
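
When only the file paths matter, a variant that prints each matching file once can be easier to feed into later steps. A minimal sketch under assumptions: list_private_keys is a hypothetical helper name, and the pattern is loosened so the algorithm word is optional, which also catches the PKCS#8 header "BEGIN PRIVATE KEY".

```shell
# list_private_keys DIR: print the path of every file under DIR (default "/")
# that contains a PEM private-key header line.
list_private_keys() {
  grep -rlE -- '^-----BEGIN ([A-Z0-9]+ )?PRIVATE KEY-----$' "${1:-/}" 2>/dev/null
}
```

Usage: list_private_keys /home lists candidate key files, one per line.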

Check if key is encrypted:

ssh-keygen -yf id_rsa
# Outputs pubkey → NOT encrypted
# Prompts "Enter passphrase" → ENCRYPTED → crack it
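
For triaging many keys without an interactive prompt, the text headers of PEM files already reveal encryption in two common cases. The sketch below is an assumption-laden shortcut, not a replacement for ssh-keygen: pem_key_state is a hypothetical name, and keys in the newer OpenSSH format are reported as "unknown" because their cipher name sits inside the base64 body.

```shell
# pem_key_state FILE: print "encrypted", "unencrypted", or "unknown".
pem_key_state() {
  if grep -qE -- '^-----BEGIN ENCRYPTED PRIVATE KEY-----$|^Proc-Type: 4,ENCRYPTED' "$1" 2>/dev/null; then
    echo encrypted      # PKCS#8 encrypted header or legacy PEM Proc-Type header
  elif grep -q -- '^-----BEGIN OPENSSH PRIVATE KEY-----$' "$1" 2>/dev/null; then
    echo unknown        # not visible in the headers; check with ssh-keygen -yf
  elif grep -qE -- '^-----BEGIN ([A-Z0-9]+ )?PRIVATE KEY-----$' "$1" 2>/dev/null; then
    echo unencrypted    # legacy PEM or PKCS#8 without any encryption marker
  else
    echo unknown        # not a recognizable private key file
  fi
}
```

Anything reported as "unknown" still needs the ssh-keygen -yf check shown above.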

The 2john Workflow Pattern

John cannot directly crack binary files. Extract the hash representation first:

binary file → *2john tool → text hash → john crack → plaintext password

All converter tools follow the same pattern:

python3 [FILE_TYPE]2john.py target_file > hash.txt
john --wordlist=rockyou.txt hash.txt
john hash.txt --show
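
The three steps can be wrapped in one helper. This is a sketch only: crack_with_wordlist and the JOHN override variable are hypothetical names introduced here, and the converter is assumed to be directly executable (a .py converter would need a small wrapper that calls python3).

```shell
# crack_with_wordlist CONVERTER TARGET WORDLIST
# Runs: extract hash -> wordlist attack -> show result.
# JOHN (assumed override) selects the john binary; defaults to "john".
crack_with_wordlist() {
  converter=$1 target=$2 wordlist=$3
  hash_file="$(basename "$target").hash"

  # Step 1: binary file -> text hash (written to the current directory)
  "$converter" "$target" > "$hash_file" || return 1
  [ -s "$hash_file" ] || { echo "no hash extracted from $target" >&2; return 1; }

  # Step 2: wordlist attack; Step 3: print whatever was cracked
  "${JOHN:-john}" --wordlist="$wordlist" "$hash_file"
  "${JOHN:-john}" "$hash_file" --show
}
```

Usage: crack_with_wordlist zip2john [ARCHIVE].zip rockyou.txt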

Converter Quick Reference

File Type | Converter | Notes
SSH private key | ssh2john.py | Also handles encrypted PEM
Office doc (docx/xlsx) | office2john.py | Word, Excel, PowerPoint
PDF | pdf2john.py |
ZIP archive | zip2john | Binary tool, no .py
RAR archive | rar2john | Binary tool
KeePass .kdbx | keepass2john | High-value: often contains many creds
BitLocker VHD | bitlocker2john | See Password_Cracking_Archives
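
The reference above can be turned into a small lookup for scripting. A minimal sketch: converter_for is a hypothetical helper, the extension list is an assumption based on the entries in this page, and the converter names are used as written here (installed names may differ). SSH keys and BitLocker images are left out because they have no reliable extension.

```shell
# converter_for FILE: print the *2john converter for FILE based on its
# extension (case-insensitive); return 1 if the extension is not recognized.
converter_for() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *.doc|*.docx|*.xls|*.xlsx|*.ppt|*.pptx) echo office2john.py ;;
    *.pdf)  echo pdf2john.py ;;
    *.zip)  echo zip2john ;;
    *.rar)  echo rar2john ;;
    *.kdbx) echo keepass2john ;;
    *.pfx)  echo pfx2john ;;
    *)      return 1 ;;
  esac
}
```

Usage: converter_for report.docx prints office2john.py.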

Why SSH Keys Are High Value

A cracked SSH key passphrase + the private key file = authenticated access to any server that trusts that key. During post-exploitation, look for:

  • ~/.ssh/id_rsa (user’s default key)
  • ~/.ssh/*.pem (AWS/cloud keys)
  • /etc/ssh/ssh_host_*key (server host keys)
  • Backup directories with .ssh folders

🛠️ Troubleshooting & Edge Cases

Problem | Cause | Fix
office2john not found | apt install john may install the core build, which lacks the jumbo converters | Build John the Ripper jumbo from source (the openwall/john repository); the *2john scripts ship in its run/ directory
PDF hash mode unknown | Multiple PDF encryption versions | Use pdfcrack, or pick the hashcat mode that matches the PDF version: 10400/10410/10420 (PDF 1.1-1.3), 10500 (PDF 1.4-1.6)
Office 2013+ document cracks too slowly | AES-256 with 100,000 SHA-512 key-derivation iterations | Use a GPU: hashcat mode 9600 (Office 2013 and later), 9500 (Office 2010), 9400 (Office 2007); AWS P3 for reasonable speed
Cracked password but document won’t open | Special character encoding | Retype the password instead of pasting it; check for backtick versus quote look-alikes and UTF-8 versus ANSI encoding differences
john shows ‘Almost done: Processing the remaining buffered candidate passwords’ | The wordlist has been read in full, but the last batch of candidates is still being tested | Let it finish; with slow hash formats this final batch can still take a while
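
When moving an Office hash from John to hashcat, the version tag inside the hash line indicates the mode. The sketch below assumes the usual office2john output shapes ($office$*2007*..., $office$*2010*..., $office$*2013*..., $oldoffice$N*...) and uses a hypothetical helper name, hashcat_mode_for_office; documents from Office versions newer than 2013 generally carry the 2013 tag.

```shell
# hashcat_mode_for_office HASHLINE: print the hashcat mode for an
# office2john hash line; return 1 if the line is not an Office hash.
hashcat_mode_for_office() {
  case "$1" in
    *'$office$*2007*'*) echo 9400 ;;
    *'$office$*2010*'*) echo 9500 ;;
    *'$office$*2013*'*) echo 9600 ;;
    *'$oldoffice$0'*|*'$oldoffice$1'*) echo 9700 ;;   # Office <= 2003, MD5 + RC4
    *'$oldoffice$3'*|*'$oldoffice$4'*) echo 9800 ;;   # Office <= 2003, SHA1 + RC4
    *) return 1 ;;
  esac
}
```

Usage: hashcat_mode_for_office "$(head -n 1 office.hash)". Note that hashcat expects the hash without the leading "filename:" field that office2john adds, so strip that prefix (or use hashcat's --username option) before cracking.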

📝 Reporting Trigger

Finding Title: Password-Protected Office Document Cracked: Sensitive Content Exposed

Impact: Cracking document passwords reveals sensitive business documents (contracts, financial data, strategic plans, credentials) that employees protected under the assumption that document passwords provide adequate data security.

Root Cause: Document password used as primary data protection mechanism without enterprise DRM or proper access controls. Weak password selected based on personal patterns.

Recommendation: Replace document-level password protection with enterprise DRM (Azure Information Protection, Microsoft Purview). Implement data classification and enforce appropriate access controls based on sensitivity. Store sensitive documents in systems with proper access logging.