🛡️ Methodology Checklist
- PDF: pdf2john [FILE].pdf > hash.txt → crack with John
- Office (old .doc/.xls): office2john [FILE].doc > hash.txt
- Office (modern .docx/.xlsx): office2john [FILE].docx > hash.txt → john --format=Office hash.txt
- SSH private key: ssh2john id_rsa > hash.txt → crack
- PEM/PFX certificate: pfx2john cert.pfx > hash.txt
- BitLocker: bitlocker2john -i [DISK].img > hash.txt
- Verify decrypted file is accessible after cracking
🎯 Operational Context
Use when: Password-protected Office documents, PDFs, or other files found — extract hash and crack for sensitive content access.
Think Dumber First: office2john doc.docx > doc.hash && john doc.hash --wordlist=rockyou.txt. Office 2016+ uses AES-256 with a key derived through 100,000 SHA-512 iterations, so CPU cracking is slow, but try rockyou first since many users choose simple passwords.
Skip when: File uses modern Office encryption (2016+) and the quick wordlist pass fails. The iterated SHA-512 key derivation means a larger attack takes days on CPU; look for an unencrypted version or go after the author's system instead.
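A quick way to choose between "Use when" and "Skip when" is to read the version and iteration count out of the extracted hash. The sketch below assumes the office2john line has the shape name:$office$*VERSION*SPINCOUNT*... Check that layout against your own office2john output. The function name office_hash_info and the placeholder hash in the usage note are illustrative only.

```shell
# office_hash_info HASHFILE
# Prints "<version> <spin_count> <hashcat_mode>" for the first $office$ line.
# ASSUMPTION: fields are separated by '*' and ordered VERSION, SPINCOUNT, ...
office_hash_info() {
    awk -F'[*]' '/\$office\$/ {
        mode = "unknown"
        if ($2 == "2007") mode = 9400   # hashcat: MS Office 2007
        if ($2 == "2010") mode = 9500   # hashcat: MS Office 2010
        if ($2 == "2013") mode = 9600   # hashcat: MS Office 2013
        print $2, $3, mode
        exit
    }' "$1"
}
```

Usage, with a placeholder hash (not a real one): printf '%s\n' 'doc.docx:$office$*2013*100000*256*16*SALT*VERIFIER*VERIFIERHASH' > doc.hash; office_hash_info doc.hash prints 2013 100000 9600. A spin count of 100000 is the signal that a CPU-only run will be slow.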
⚡ Tactical Cheatsheet
| Command | Tactical Outcome |
|---|---|
| for ext in $(echo ".xls .xls* .xltx .od* .doc .doc* .pdf .pot .pot* .pp*"); do find / -name "*$ext" ! -path "*lib*" ! -path "*fonts*" ! -path "*share*" ! -path "*core*" 2>/dev/null; done | Hunt for sensitive file types |
| grep -rnE '^\-{5}BEGIN [A-Z0-9]+ PRIVATE KEY\-{5}$' /* 2>/dev/null | Hunt for SSH private keys by header |
| ssh-keygen -yf ~/.ssh/id_rsa | Check if SSH key is passphrase-protected |
| locate *2john* | List all hash extraction tools |
| python3 ssh2john.py [KEY_FILE] > ssh.hash | Extract hash from SSH private key |
| john --wordlist=/usr/share/wordlists/rockyou.txt ssh.hash | Crack SSH key passphrase |
| john ssh.hash --show | Display cracked passphrase |
| python3 office2john.py [FILE].docx > office.hash | Extract hash from Word/Excel doc |
| john --wordlist=rockyou.txt office.hash | Crack Office document password |
| python3 pdf2john.py [FILE].pdf > pdf.hash | Extract hash from PDF |
| john --wordlist=rockyou.txt pdf.hash | Crack PDF password |
| zip2john [ARCHIVE].zip > zip.hash | Extract hash from ZIP archive |
| john --wordlist=rockyou.txt zip.hash | Crack ZIP password |
| keepass2john [DB].kdbx > keepass.hash | Extract hash from KeePass database |
| john --wordlist=rockyou.txt keepass.hash | Crack KeePass master password |
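The file hunt from the first row can be wrapped in a function that takes a search root, which makes it easy to try on a small directory before running it against /. This is a minimal sketch: the name hunt_docs is hypothetical, it restricts results to regular files, and it uses grep -vE so the alternation does not depend on GNU-style \| support.

```shell
# hunt_docs [ROOT]
# Lists office/PDF-style files under ROOT (default: /), skipping paths that
# contain lib, fonts, share or core. Read errors from find are discarded.
hunt_docs() {
    root="${1:-/}"
    find "$root" -type f \( -name '*.xls*' -o -name '*.xltx' -o -name '*.od*' \
        -o -name '*.doc*' -o -name '*.pdf' -o -name '*.pot*' -o -name '*.pp*' \) \
        2>/dev/null | grep -vE 'lib|fonts|share|core'
}
```

Usage: hunt_docs /home lists candidates under /home only. The path filter is a substring match, so it also hides legitimate hits in directories such as /home/user/shared.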
🔬 Deep Dive & Workflow
Encryption Context
File encryption is becoming standard practice due to GDPR and corporate security policies. Two types encountered:
- Symmetric (AES-256) — one key for both encrypt and decrypt; common for local file storage
- Asymmetric — public key encrypts, private key decrypts; common for transmission
Hunting for Encrypted Files
Hunt by extension (office docs, PDFs):
for ext in $(echo ".xls .xls* .xltx .od* .doc .doc* .pdf .pot .pp*"); do
echo -e "\nExtension: $ext"
find / -name "*$ext" 2>/dev/null | grep -v "lib\|fonts\|share\|core"
done
Hunt SSH keys by content (no extension needed):
grep -rnE '^\-{5}BEGIN [A-Z0-9]+ PRIVATE KEY\-{5}$' /* 2>/dev/null
# Matches: BEGIN RSA PRIVATE KEY, BEGIN OPENSSH PRIVATE KEY, etc.
Check if key is encrypted:
ssh-keygen -yf id_rsa
# Outputs pubkey → NOT encrypted
# Prompts "Enter passphrase" → ENCRYPTED → crack it
The 2john Workflow Pattern
John cannot directly crack binary files. Extract the hash representation first:
binary file → *2john tool → text hash → john crack → plaintext password
All converter tools follow the same pattern:
python3 [FILE_TYPE]2john.py target_file > hash.txt
john --wordlist=rockyou.txt hash.txt
john hash.txt --show
Converter Quick Reference
| File Type | Converter | Notes |
|---|---|---|
| SSH private key | ssh2john.py | Also handles encrypted PEM |
| Office doc (docx/xlsx) | office2john.py | Word, Excel, PowerPoint |
| PDF | pdf2john.py | |
| ZIP archive | zip2john | Binary tool, no .py |
| RAR archive | rar2john | Binary tool |
| KeePass .kdbx | keepass2john | High-value: often contains many creds |
| BitLocker VHD | bitlocker2john | See Password_Cracking_Archives |
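The table and the extract-then-crack pattern can be combined into a small dispatcher. This is a sketch, not a tool shipped with John: pick_converter and print_plan are hypothetical names and the extension list is an assumption drawn from the rows above. The plan is only printed, never executed, so add python3 or a full path where your install needs it.

```shell
# pick_converter FILE -> prints the converter name, returns 1 if none is known.
pick_converter() {
    case "$1" in
        *.docx|*.xlsx|*.pptx|*.doc|*.xls|*.ppt) echo "office2john.py" ;;
        *.pdf)                      echo "pdf2john.py" ;;
        *.zip)                      echo "zip2john" ;;
        *.rar)                      echo "rar2john" ;;
        *.kdbx)                     echo "keepass2john" ;;
        *.vhd)                      echo "bitlocker2john" ;;
        *id_rsa|*id_ed25519|*.pem)  echo "ssh2john.py" ;;
        *)                          return 1 ;;
    esac
}

# print_plan FILE -> prints the three workflow commands without running them.
print_plan() {
    _conv="$(pick_converter "$1")" || { echo "no converter known for $1" >&2; return 1; }
    echo "$_conv $1 > hash.txt"
    echo "john --wordlist=rockyou.txt hash.txt"
    echo "john hash.txt --show"
}
```

Usage: print_plan backup.zip prints the zip2john extraction line followed by the two john commands. Matching on extension is only a first guess, so fall back to the file command when a file is misnamed.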
Why SSH Keys Are High Value
A cracked SSH key passphrase + the private key file = authenticated access to any server that trusts that key. During post-exploitation, look for:
- ~/.ssh/id_rsa (user’s default key)
- ~/.ssh/*.pem (AWS/cloud keys)
- /etc/ssh/ssh_host_*key (server host keys)
- Backup directories with .ssh folders
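When many key files turn up in these locations, a header check can triage them before running ssh-keygen -yf on each one. The sketch below relies on two PEM conventions: legacy encrypted keys carry a Proc-Type: 4,ENCRYPTED header, and PKCS#8 encrypted keys begin with BEGIN ENCRYPTED PRIVATE KEY. OpenSSH-format keys hide the cipher inside the base64 body, so the function only flags them for a manual ssh-keygen -yf check. The name classify_key and its output labels are hypothetical.

```shell
# classify_key FILE
# Prints one of: not-a-key, encrypted, openssh-format, unencrypted
classify_key() {
    if ! grep -qE '^-{5}BEGIN [A-Z0-9 ]*PRIVATE KEY-{5}$' "$1" 2>/dev/null; then
        echo "not-a-key"
    elif grep -qE '^Proc-Type: 4,ENCRYPTED|^-{5}BEGIN ENCRYPTED PRIVATE KEY' "$1"; then
        echo "encrypted"        # extract with ssh2john and crack
    elif grep -q '^-----BEGIN OPENSSH PRIVATE KEY-----$' "$1"; then
        echo "openssh-format"   # header is inconclusive; run ssh-keygen -yf
    else
        echo "unencrypted"      # usable as-is
    fi
}
```

Usage: for f in ~/.ssh/*; do echo "$f: $(classify_key "$f")"; done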
🛠️ Troubleshooting & Edge Cases
| Problem | Cause | Fix |
|---|---|---|
| office2john not found | Old john installation | apt install john may install an older build without the 2john converter scripts; build John the Ripper jumbo from source (git clone the openwall/john repository) for full format support |
| PDF hash mode unknown | Multiple PDF encryption versions | Use pdfcrack or hashcat modes 10400/10410/10420/10500 for different PDF versions |
| Office 2016+ document cracks too slow | AES-256 with a 100,000-iteration SHA-512 key derivation | GPU required: hashcat modes 9400 (Office 2007), 9500 (Office 2010), 9600 (Office 2013 and later); AWS P3 for reasonable speed |
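| Cracked password but document won’t open | Special character encoding | Check for look-alike quote characters (backtick, straight vs curly quotes) and retry with each variant; UTF-8 vs ANSI encoding differences can also change how special characters are interpreted |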
| john shows ‘Almost done: Processing the remaining buffered candidate passwords’ | Wordlist fully read; buffered candidates still being tested | Let it finish; the message comes from John (not hashcat), and on slow hash types the remaining candidates can still take a while |
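Before starting an engagement, it can help to confirm which of the tools from the first row are actually reachable. The sketch below uses the hypothetical helper name missing_tools and only checks PATH, so converter scripts stored elsewhere (for example under a john run directory) will be reported as missing even when locate can find them.

```shell
# missing_tools NAME...
# Prints each name that is not resolvable as a command on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Usage: missing_tools john zip2john keepass2john office2john.py prints only the names that need installing or locating by full path.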
📝 Reporting Trigger
Finding Title: Password-Protected Office Document Cracked: Sensitive Content Exposed
Impact: Cracking document passwords reveals sensitive business documents (contracts, financial data, strategic plans, credentials) that employees protected under the assumption that document passwords provide adequate data security.
Root Cause: Document password used as primary data protection mechanism without enterprise DRM or proper access controls. Weak password selected based on personal patterns.
Recommendation: Replace document-level password protection with enterprise DRM (Azure Information Protection, Microsoft Purview). Implement data classification and enforce appropriate access controls based on sensitivity. Store sensitive documents in systems with proper access logging.