OSINT Techniques & Tools

Feb 20, 2025
lawbyte

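Open Source Intelligence (OSINT) gathering is usually the first phase of an engagement. The goal is to learn as much as possible about a target from public sources before touching its infrastructure. Most of the techniques below are passive; a few (zone transfers, SMTP user enumeration, active amass scans) do connect to the target directly.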

Google Dorks

# Find login pages
site:target.com inurl:login
site:target.com inurl:admin
site:target.com intitle:"login"

# Exposed sensitive files
site:target.com ext:pdf
site:target.com ext:xlsx OR ext:xls OR ext:csv
site:target.com ext:doc OR ext:docx
site:target.com filetype:sql
site:target.com filetype:env
site:target.com filetype:log

# Configuration files
site:target.com ext:xml | ext:conf | ext:cnf | ext:reg | ext:inf | ext:rdp | ext:cfg | ext:txt | ext:ora | ext:ini

# Exposed directories
site:target.com intitle:"index of"
site:target.com intitle:"index of" "parent directory"

# API keys / secrets
site:target.com "api_key" OR "apikey" OR "secret_key"
site:target.com "BEGIN RSA PRIVATE KEY"

# Cameras / IoT
inurl:"/view/index.shtml"
intitle:"Live View - AXIS"
inurl:top.htm inurl:currenttime

# Subdomains and related domains
site:*.target.com -site:www.target.com
site:target.com -inurl:www

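# Cached/old versions
# Google retired the cache: operator in 2024, so this may return nothing; the Wayback Machine section covers archived copies
cache:target.com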
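
The dorks above all follow the same shape (a site: scope plus one operator), so they can be generated per target instead of retyped. A minimal sketch, assuming a hypothetical helper named gen_dorks and a small operator list picked from the queries above:

```shell
#!/usr/bin/env bash
# Print a starter set of Google dorks for one domain.
# gen_dorks is a hypothetical helper; the operator list is a sample
# taken from the cheatsheet and can be extended freely.
gen_dorks() {
  local domain="$1"
  local suffix
  for suffix in \
    'inurl:login' \
    'inurl:admin' \
    'intitle:"index of"' \
    'filetype:sql OR filetype:env OR filetype:log' \
    '"api_key" OR "apikey" OR "secret_key"'
  do
    printf 'site:%s %s\n' "$domain" "$suffix"
  done
}

gen_dorks target.com
```

Each printed line can be pasted into the search box as-is.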

Email Enumeration

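# theHarvester (passive email/domain recon)
# Supported -b sources change between releases, and recent versions may no longer include google, linkedin, or twitter; run theHarvester -h to list them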
theHarvester -d target.com -b all
theHarvester -d target.com -b google,bing,linkedin,twitter
theHarvester -d target.com -b google -l 200

# Specific sources
theHarvester -d target.com -b linkedin # LinkedIn employees
theHarvester -d target.com -b hunter # hunter.io

# hunter.io (manual) — find email format
# https://hunter.io/search/target.com

# Email validation
# https://verifyemailaddress.io
# smtp-user-enum (active: talks to the target's mail server, which must answer VRFY or RCPT probes)
smtp-user-enum -M VRFY -U users.txt -t mail.target.com
smtp-user-enum -M RCPT -U users.txt -t mail.target.com -f test@test.com
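
Once the email format is known, a list of employee names can be turned into candidate addresses. The sketch below is one way to do that in plain bash; gen_emails is a hypothetical helper, and the two formats it emits (first.last and first initial plus last) are assumptions to replace with whatever format the target actually uses.

```shell
#!/usr/bin/env bash
# Turn "First Last" lines on stdin into candidate email addresses.
# Uses the first and last word of each line; single-word lines are skipped.
gen_emails() {
  local domain="$1" first last
  local -a parts
  while read -r -a parts; do
    [ "${#parts[@]}" -ge 2 ] || continue
    first=$(printf '%s' "${parts[0]}" | tr '[:upper:]' '[:lower:]')
    last=$(printf '%s' "${parts[${#parts[@]}-1]}" | tr '[:upper:]' '[:lower:]')
    printf '%s.%s@%s\n' "$first" "$last" "$domain"        # first.last
    printf '%s%s@%s\n' "${first:0:1}" "$last" "$domain"   # flast
  done
}

printf 'Jane Doe\nJohn Smith\n' | gen_emails target.com
```

The output is one address per line, which makes it easy to review or deduplicate before any validation step.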

Domain & DNS Recon

# WHOIS
whois target.com
whois IP_ADDRESS # IP WHOIS: netblock owner and abuse contacts

# DNS records
dig target.com ANY # often refused or answered minimally (RFC 8482); query record types individually
dig target.com NS
dig target.com MX
dig target.com TXT
dig @8.8.8.8 target.com A

# Zone transfer attempt
dig axfr @ns1.target.com target.com
host -t axfr target.com ns1.target.com

# Reverse DNS
host IP_ADDRESS
dig -x IP_ADDRESS

# ASN lookup
whois -h whois.radb.net IP_ADDRESS
curl https://ipinfo.io/IP_ADDRESS

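# Certificate Transparency: find subdomains
# %25 is the URL-encoded % wildcard; jq -r prints raw strings so multi-name entries split onto separate lines
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u
curl -s "https://crt.sh/?q=target.com" | grep -oP '(?<=<TD>)[^<]+\.target\.com' | sort -u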

# SSL cert info
openssl s_client -connect target.com:443 </dev/null 2>/dev/null | openssl x509 -noout -text

# amass (comprehensive)
amass enum -passive -d target.com
amass enum -active -d target.com -brute # active: DNS brute forcing, zone transfers, certificate grabs
amass intel -whois -d target.com
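
Names pulled from Certificate Transparency logs and other sources tend to arrive with wildcards, mixed case, duplicates, and the occasional unrelated domain. A small filter can normalize them before further use. This is a sketch only; clean_ct_names is a hypothetical helper that reads one name per line on stdin.

```shell
#!/usr/bin/env bash
# Normalize hostnames read from stdin (one per line):
# lowercase, strip a leading "*." and trailing whitespace,
# keep only the domain itself and names under it, then deduplicate.
clean_ct_names() {
  local domain="$1"
  tr '[:upper:]' '[:lower:]' \
    | sed -e 's/^\*\.//' -e 's/[[:space:]]*$//' \
    | grep -E "(^|\.)${domain//./\\.}\$" \
    | LC_ALL=C sort -u
}

# Example with inline data instead of a live query:
printf '*.Target.com\nwww.target.com\nnottarget.com\n' | clean_ct_names target.com
```

The anchored pattern is what keeps look-alikes such as nottarget.com out of the result.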

GitHub / Code Recon

# Search GitHub for secrets
# Search at github.com:
# "target.com" password
# "target.com" secret_key
# "target.com" api_key
# org:target language:python "password"

# trufflehog — find secrets in git history
trufflehog github --org=TargetOrg
trufflehog github --repo=https://github.com/target/repo
trufflehog filesystem /path/to/cloned/repo

# gitleaks
gitleaks detect --source /path/to/repo
gitleaks detect --source /path/to/repo --report-format json --report-path results.json

# gitrob
gitrob analyze target-org

# GitHub code search API (requires an authenticated request)
# Find code mentioning the target domain
curl -s "https://api.github.com/search/code?q=target.com+password" \
-H "Authorization: token GITHUB_TOKEN" | jq -r '.items[].html_url'
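
Secret scanners such as trufflehog and gitleaks are, at their core, matching known credential shapes against file contents. The idea can be sketched with grep over a cloned repository. scan_secrets is a hypothetical helper, its two patterns are illustrative rather than exhaustive, and unlike the tools above it only looks at the working tree, not the git history.

```shell
#!/usr/bin/env bash
# Grep a checked-out repo for two well-known secret shapes:
#   AKIA followed by 16 uppercase letters or digits (AWS access key ID)
#   a PEM private key header
# Prints matches as path:line:content.
scan_secrets() {
  local dir="$1"
  grep -rnE --exclude-dir=.git \
    -e 'AKIA[0-9A-Z]{16}' \
    -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' \
    "$dir"
}

# Usage (path is a placeholder):
# scan_secrets /path/to/cloned/repo
```

Anything this finds is worth confirming with a dedicated scanner, which also covers deleted commits.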

LinkedIn & Social OSINT

# linkedin2username — generate username wordlists from LinkedIn
python linkedin2username.py -u your@email.com -p yourpass -c TargetCompany

# CrossLinked
python crosslinked.py -f '{first}.{last}@target.com' "Target Company"
python crosslinked.py -f '{f}{last}@target.com' "Target Company"

# Manual LinkedIn techniques:
# - Find employees: site:linkedin.com/in "Target Company"
# - Job postings reveal tech stack: site:linkedin.com/jobs "Target Company"
# - Former employees often expose internal tooling in their profiles

# Twitter/X recon
# https://nitter.net/search — scrape without account
# https://social-searcher.com

# Facebook Graph (needs an access token; user search is restricted or removed in current Graph API versions)
# https://graph.facebook.com/search?type=user&q=target

# Username across platforms
# https://whatsmyname.app
# https://namechk.com

Shodan / Censys

# Shodan CLI
shodan init API_KEY
shodan search 'org:"Target Company"'
shodan search "hostname:target.com"
shodan search 'ssl:"target.com"'
shodan search 'http.favicon.hash:HASH'

# Find exposed services
shodan search 'org:"Target Corp" port:22'
shodan search 'org:"Target Corp" product:nginx'
shodan search 'org:"Target Corp" http.status:200'

# IP info
shodan host IP_ADDRESS
shodan myip

# Censys CLI
censys search "target.com" --index-type hosts
censys view IP_ADDRESS --index-type hosts

# Censys web
# https://search.censys.io
# Query: services.tls.certificates.leaf_data.subject.common_name: target.com
# Query: services.http.response.html_title: "Target Corp"

Wayback Machine / Archive

# Get all archived URLs for a domain
waybackurls target.com
gau target.com # GetAllURLs (combines Wayback + OTX + Common Crawl)
gauplus -t 5 target.com

# Find old/deleted endpoints
# The first row of the CDX JSON output is the header (["original"]), so skip it
curl -s "https://web.archive.org/cdx/search/cdx?url=*.target.com&output=json&fl=original&collapse=urlkey" | \
python3 -c "import sys,json;[print(x[0]) for x in json.load(sys.stdin)[1:]]"

# httpx to filter live
waybackurls target.com | httpx -silent -status-code
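
Archive tools return thousands of URLs, most of them uninteresting. A quick filter can narrow the list to backup, config, and database files plus anything carrying query parameters. This is a sketch under stated assumptions: interesting_urls is a hypothetical helper and the extension list is a starting point, not a standard.

```shell
#!/usr/bin/env bash
# Keep URLs from stdin that look worth a manual review:
# a sensitive-looking file extension, or at least one query parameter.
interesting_urls() {
  grep -Ei '\.(sql|bak|zip|env|log|conf|old)(\?|$)|\?[^=]+=' | LC_ALL=C sort -u
}

# Example with inline data; in practice pipe waybackurls or gau into it.
printf 'https://target.com/index.html\nhttps://target.com/db.sql\n' | interesting_urls
```

Filtering before the httpx step keeps the number of live requests small.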

Automated OSINT Frameworks

# Recon-ng (everything after the first command is typed at the recon-ng prompt)
recon-ng
marketplace install all
workspaces create target
modules load recon/domains-hosts/google_site_web
options set SOURCE target.com
run

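# SpiderFoot (web UI)
# Installed from source here; on Kali the packaged spiderfoot command takes the same -l option
git clone https://github.com/smicallef/spiderfoot.git && cd spiderfoot
pip3 install -r requirements.txt
python3 sf.py -l 127.0.0.1:5001 # bind to localhost; 0.0.0.0 exposes the UI to the whole network
# Navigate to http://localhost:5001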

# Maltego (GUI, community edition free)
# Graphical link analysis for OSINT

# OSINT Framework (web reference)
# https://osintframework.com

# Photon — fast web crawler for OSINT
python photon.py -u https://target.com -l 3 -t 50 --wayback

# Metagoofil — extract metadata from public documents
metagoofil -d target.com -t pdf,doc,xls -l 50 -n 10 -o /tmp/meta

Metadata Extraction

# exiftool — extract metadata from images/docs
exiftool document.pdf
exiftool image.jpg
exiftool -r /directory/ # recursive

# Interesting fields: Author, Creator, Producer, GPS coords, software version

# FOCA (Windows GUI): discovers documents published on a target's sites and analyzes their metadata
# https://github.com/ElevenPaths/FOCA

# Bulk download + analyze
wget -r -l 1 -A pdf,docx,xlsx https://target.com/documents/
exiftool -r /path/to/downloaded/ | grep -i "author\|creator\|software"
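
The grep above keeps every matching line, including repeats across hundreds of files. Collapsing the output to unique field and value pairs makes the author names and software versions easier to read. A minimal sketch, assuming exiftool's default "Tag Name : value" text layout; summarize_meta is a hypothetical helper and the field list is an assumption.

```shell
#!/usr/bin/env bash
# Reduce exiftool text output on stdin to unique "Field: value" pairs
# for fields that tend to leak people and software.
summarize_meta() {
  awk -F' *: ' '
    $1 ~ /^(Author|Creator|Producer|Creator Tool|Software)$/ {
      value = substr($0, index($0, ": ") + 2)   # keep any colons inside the value
      print $1 ": " value
    }' | LC_ALL=C sort -u
}

# Usage (path is a placeholder):
# exiftool -r /path/to/downloaded/ | summarize_meta
```

Author values often double as username candidates for the email formats discussed earlier.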

Cloud & Bucket Recon

# S3 buckets
# Naming patterns: target-backup, targetcorp, target-dev, target-prod
aws s3 ls s3://target-backup --no-sign-request
aws s3 ls s3://targetcorp --no-sign-request

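# S3Scanner (syntax differs by version: older Python releases run as a script, newer ones as the s3scanner binary; check --help)
python s3scanner.py --bucket-file buckets.txt
s3scanner --bucket target-backup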

# CloudBrute
cloudbrute -d target.com -k target -m storage

# GrayhatWarfare — search public buckets
# https://buckets.grayhatwarfare.com

# Azure blob storage
# https://target.blob.core.windows.net/

# GCS
# https://storage.googleapis.com/target/
gsutil ls gs://target-bucket
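
Bucket names usually follow the naming patterns listed at the top of this section, so a candidate list can be generated from a keyword. The sketch below is one simple way to do it; gen_buckets is a hypothetical helper and the suffix list is an assumption to adapt per target.

```shell
#!/usr/bin/env bash
# Combine a keyword with common suffixes to build bucket-name guesses,
# both hyphenated (target-backup) and joined (targetbackup).
gen_buckets() {
  local keyword="$1" suffix
  printf '%s\n' "$keyword"
  for suffix in backup dev prod staging assets; do
    printf '%s-%s\n' "$keyword" "$suffix"
    printf '%s%s\n' "$keyword" "$suffix"
  done
}

gen_buckets target
```

Redirecting the output to buckets.txt gives a file in the shape the scanner commands above expect.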
