Visit outline: Open Source Intelligence (OSINT) Advanced (Course code: osint)
Categories: Open Source Intelligence (OSINT)
Summary
Overview
This course provides a comprehensive introduction to Open Source Intelligence (OSINT) techniques, focusing on tools, methodologies, and ethical practices for digital investigations. It covers social media analysis (Twitter/X), geolocation and video forensics, identity protection, privacy tools (VPNs, anonymized browsers, fake data generators), legal data sources (government databases), and an overview of the dark web. The session emphasizes operational security (OPSEC), ethical boundaries, and practical hands-on exercises to build investigative skills while safeguarding the analyst’s identity.
Topic (Timeline)
1. Social Media Investigation Tools and Techniques [00:00:50 - 00:10:26]
- Demonstrated use of Twitter/X features to monitor location-based tweets and filter by recency using the “Latest” option.
- Introduced Botometer to assess the likelihood that a Twitter profile is a bot, using a 0–5 score (0 = human, 5 = bot) based on behavior, account age, and how public the profile is.
- Demonstrated OneMillionTweetMap to visualize global tweet activity by hashtag (e.g., #Narcopresidente) and identify regional trends.
- Taught advanced Twitter search dorks:
- `from:username` to retrieve a user’s tweets.
- `from:username keyword` to filter by content (e.g., “economía”).
- `from:username (to:otheruser)` to find interactions between two accounts.
- Emphasized the importance of precise syntax, including correct use of spaces and parentheses in search queries.
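The three query shapes above can be assembled programmatically, which helps avoid the spacing and parenthesis mistakes the session warned about. The following is a minimal sketch; the function name `build_query` and the phrase-quoting of multi-word keywords are illustrative assumptions, not something shown in the recording.

```python
def build_query(user, keyword=None, to_user=None):
    """Compose a Twitter/X search string from the from:/to:/keyword operators."""
    # Accept handles written with or without a leading "@".
    parts = [f"from:{user.lstrip('@')}"]
    if keyword:
        # Assumption: wrap multi-word keywords in quotes to search them as a phrase.
        parts.append(f'"{keyword}"' if " " in keyword else keyword)
    if to_user:
        # Parentheses mirror the syntax used in the session.
        parts.append(f"(to:{to_user.lstrip('@')})")
    return " ".join(parts)


print(build_query("username"))
print(build_query("username", keyword="economía"))
print(build_query("username", to_user="otheruser"))
```

The resulting string is pasted into the Twitter/X search box; selecting “Latest” then orders the hits by recency.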
2. OSINT Case Study: Geolocation from Video Evidence [00:14:33 - 00:24:52]
- Guided learners through a practical case to identify a person’s residence using a video of a public incident.
- Key evidence extracted:
- Pixelated street sign (partially obscured).
- House number “71” visible in background.
- Vehicle license plates for further investigation.
- Postal code “53283” visible in a frame.
- Partial street name “Onix” identified from a frame.
- Used Google Maps and Google Earth to cross-reference postal code and street name, confirming the location.
- Compared outdated Street View imagery (2009) with 3D satellite view in Google Earth to match architectural features (e.g., tower, house layout).
- Verified the location by identifying a swimming pool in Google Earth, matching a claim made by the subject in another video.
- Concluded that geolocation requires multi-source verification and contextual analysis of visual details.
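One way to picture the multi-source verification described above is to score each candidate location by how many independent clues it satisfies. The sketch below is illustrative only: the clue values come from the case study, while the `Candidate` records, field names, and street names are hypothetical placeholders rather than real addresses.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    street: str
    number: str
    postal_code: str
    has_pool: bool


def score(candidate, clues):
    """Count how many independent clues a candidate location satisfies."""
    checks = [
        clues["street_fragment"].lower() in candidate.street.lower(),
        candidate.number == clues["house_number"],
        candidate.postal_code == clues["postal_code"],
        candidate.has_pool == clues["has_pool"],
    ]
    return sum(checks)


# Clues extracted from the video frames and the subject's other video.
clues = {"street_fragment": "Onix", "house_number": "71",
         "postal_code": "53283", "has_pool": True}

# Hypothetical candidates, as they might be noted down from map searches.
candidates = [
    Candidate("Example Onix Street", "71", "53283", True),
    Candidate("Example Onix Street", "17", "53283", False),
    Candidate("Another Street", "71", "53000", True),
]

ranked = sorted(candidates, key=lambda c: score(c, clues), reverse=True)
for c in ranked:
    print(score(c, clues), c.street, c.number)
```

A candidate that matches only one or two clues stays a lead; the session's point is that a conclusion needs several clues agreeing at once.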
3. OSINT Environment Setup and Virtualization [00:29:17 - 00:33:43]
- Introduced OSINT virtual machine (VM) environments: OSINTux (Spanish, pre-configured), Kali Linux, Urone, and CSI Linux.
- Recommended OSINTux for beginners due to Spanish documentation and pre-installed tools.
- Explained hardware requirements: minimum 4-core CPU, 8 GB RAM, 120 GB storage.
- Detailed VM configuration: use of Bridged network mode so that the VM obtains its own IP address on the network and its traffic is kept separate from the host’s.
- Highlighted benefits of VMs: isolation from personal data, rapid recovery via snapshots, reduced digital footprint, and separation of investigation files from personal files.
4. Tool Installation and Automation [00:34:28 - 00:41:55]
- Demonstrated installation of “The Harvester” tool via GitHub:
- Downloaded ZIP from official repository.
- Extracted to `Software Curso OSINT/herramientas para OSINT/frameworks de automatización`.
- Opened a command prompt in the extracted folder by typing `cmd.exe` in the File Explorer address bar.
- Executed `pip install theharvester` and verified installation with `theharvester -h`.
- Performed a test query: `theharvester -d gobierno.mx -b google` to extract emails and domains.
- Noted that Google may block queries if detection is triggered; recommended using VMs for clean queries.
- Advised using AI assistants (e.g., ChatGPT) to help interpret and execute installation commands for unfamiliar tools.
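For repeated queries, the command shown in the session can be wrapped in a small script. This is a sketch under assumptions: it uses only the `-d` and `-b` flags demonstrated above, assumes the executable is on `PATH` under the name used in the session (the name may differ in letter case on other systems), and the helper names are illustrative.

```python
import shutil
import subprocess


def harvester_command(domain, source):
    """Build the argument list using the -d (domain) and -b (source) flags."""
    return ["theharvester", "-d", domain, "-b", source]


def run_harvester(domain, source, timeout=300):
    """Run the query and return its text output, or None if the tool is missing."""
    if shutil.which("theharvester") is None:
        return None
    result = subprocess.run(
        harvester_command(domain, source),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout


# Show the command that would be executed, without running it.
print(" ".join(harvester_command("gobierno.mx", "google")))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when a domain or source name contains unexpected characters.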
5. OSINT Tool Management and Best Practices [00:42:56 - 00:46:24]
- Emphasized selecting tools based on investigation goals, not using all tools indiscriminately.
- Stressed the need to keep tools updated to maintain functionality and fix broken modules.
- Recommended using dedicated, non-personal email accounts and strong, unique passwords for OSINT activities.
- Advised validating tools for legal and privacy compliance, and avoiding intrusive or illegal ones.
- Required maintaining a log (e.g., Word doc) of all queries, timestamps, and results for evidentiary purposes.
- Recommended periodic evaluation and removal of outdated or ineffective tools to avoid clutter.
- Warned against using cracked or pirated software in professional investigations due to legal and evidentiary risks.
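The query log recommended above was kept in a Word document during the session; an append-only CSV file is one possible alternative that is easier to fill in automatically. The sketch below is an assumption-laden illustration: the column names, the `log_query` helper, and the SHA-256 digest of the raw output (added so a saved result can later be shown to be unchanged) are not part of the course material.

```python
import csv
import hashlib
import os
import tempfile
from datetime import datetime, timezone

FIELDS = ["timestamp_utc", "tool", "query", "result_summary", "result_sha256"]


def log_query(path, tool, query, result_text, summary):
    """Append one row describing a query and a digest of its raw output."""
    row = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "tool": tool,
        "query": query,
        "result_summary": summary,
        "result_sha256": hashlib.sha256(result_text.encode("utf-8")).hexdigest(),
    }
    # Write the header only when the log file is new or empty.
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
    return row


if __name__ == "__main__":
    # Demo in a temporary folder; the output text is a placeholder.
    demo_path = os.path.join(tempfile.mkdtemp(), "osint_log.csv")
    entry = log_query(demo_path, "theharvester", "-d gobierno.mx -b google",
                      "placeholder raw output", "placeholder summary")
    print(entry["timestamp_utc"], entry["result_sha256"][:12])
```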
6. Identity Protection and OPSEC [00:47:12 - 00:53:24]
- Defined OPSEC (Operational Security) as protecting the investigator’s identity and minimizing digital footprints.
- Identified key digital fingerprints: IP address, browser type, screen resolution, OS version, plugins, timezone, and cookies.
- Advised:
- Creating separate, disposable social media profiles for each investigation.
- Using temporary or disposable phone numbers and email addresses.
- Never using personal data (name, DOB, real address) in profiles.
- Avoiding password reuse and enabling MFA on investigation accounts.
- Using different devices or profiles for different investigations.
- Recommended manual creation of profiles to avoid automated patterns that could be detected by counter-intelligence.
7. Privacy Fundamentals: DNS, VPNs, Proxies, and Fingerprints [00:55:03 - 01:04:20]
- Explained DNS as a phonebook translating domain names to IP addresses; warned that ISPs log browsing history.
- Defined:
- VPN: Encrypted tunnel masking real IP, making traffic appear from another location.
- Proxy: Intermediary server that forwards requests; less secure than VPNs due to lack of encryption and potential for malicious operators.
- User Agent: Browser/OS metadata sent to websites.
- Fingerprint: Unique combination of browser, OS, screen size, plugins, etc., enabling tracking.
- Sock Puppet / Avatar: Fake online identity with fictional details (name, photo, location).
- Emphasized that fingerprinting enables targeted advertising and tracking across sites.
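To make the fingerprint definition concrete, the sketch below hashes a handful of browser attributes into a single identifier. It is a simplification for illustration: real trackers combine many more signals, and the attribute names and values here are hypothetical.

```python
import hashlib
import json


def fingerprint(attributes):
    """Hash a dict of browser attributes into a short, stable identifier."""
    # Sorting keys makes the result independent of attribute order.
    canonical = json.dumps(attributes, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical attribute values.
visitor_a = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS 10)",
    "screen": "1920x1080",
    "timezone": "America/Mexico_City",
    "plugins": ["pdf-viewer"],
}
# Same browser, different screen resolution.
visitor_b = dict(visitor_a, screen="1366x768")

print(fingerprint(visitor_a) == fingerprint(dict(visitor_a)))  # same attributes
print(fingerprint(visitor_a) == fingerprint(visitor_b))        # one attribute changed
```

No cookie is involved: the identifier is recomputed from what the browser reveals on every visit, which is why clearing cookies alone does not stop this kind of tracking.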
8. Privacy Tools and Anonymization Practices [01:05:04 - 01:26:02]
- Used `coveryourtracks.eff.org` to test browser fingerprinting: showed unsecured browser (IE) had unique, trackable fingerprint; secured browser (Brave with uBlock, Privacy Badger) showed “protected” status.
- Tested IP leakage using `ipleak.net`: revealed real IP; then connected to free VPN (iPhone3) to mask IP as Canadian.
- Recommended paid VPNs (PIA, NordVPN, TorGuard) for reliability and reduced detection by social media platforms.
- Advised against free VPNs for social media use due to high risk of account suspension.
- Introduced tools for generating fake data:
- `thispersondoesnotexist.com`: AI-generated faces for profile pictures.
- `datafakenerator.com`: generates fake names, addresses, phone numbers (country-specific, e.g., Mexico).
- `mohmal.com`: temporary email (45-minute lifespan) for one-time verifications.
- `proton.me`: encrypted, privacy-focused email with servers in Switzerland (described in the session as non-cooperative with government requests).
9. Ego Surfing and Personal Data Audit [01:28:44 - 01:34:03]
- Defined “Ego Surfing” as searching for one’s own personal data online to assess exposure.
- Benefits: identify leaked data (e.g., old passwords, photos, addresses), remove compromising content, detect impersonation accounts, and manage professional reputation.
- Advised learners to perform ego surfing using all course tools to audit their own digital footprint and mitigate risks.
10. Legal and Public Data Sources [01:34:06 - 01:42:29]
- Listed legitimate data sources for OSINT:
- Government open data: pensioner lists (e.g., ISSSTE), tax registries (SAT), public procurement (Compranet).
- Public registries: CURP, RFC, property records (catastro), telephone directories (Sección Amarilla).
- Judicial records, public licenses, and government program beneficiary lists (e.g., Prospera).
- Demonstrated downloading and analyzing public datasets (TXT, Excel) from official portals.
- Emphasized that using legally published data is ethical and powerful for cross-referencing investigations.
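Cross-referencing two downloaded datasets usually comes down to normalizing a shared field and joining on it. The sketch below assumes both files have been exported to CSV with a `name` column; the column names, helper names, and sample rows are invented placeholders, not data from any of the portals listed above.

```python
import csv
import io
import unicodedata


def normalize(name):
    """Uppercase, strip accents, and collapse whitespace so names compare reliably."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().split())


def cross_reference(csv_a, csv_b, key="name"):
    """Return pairs of rows from two CSV texts whose normalized key matches."""
    index_b = {}
    for row in csv.DictReader(io.StringIO(csv_b)):
        index_b.setdefault(normalize(row[key]), []).append(row)
    matches = []
    for row in csv.DictReader(io.StringIO(csv_a)):
        for other in index_b.get(normalize(row[key]), []):
            matches.append((row, other))
    return matches


# Placeholder data standing in for two public datasets.
registry = "name,agency\nJosé Pérez Example,Agency A\nAna Sample,Agency B\n"
contracts = "name,contract_id\nJOSE PEREZ EXAMPLE,C-001\nLuis Other,C-002\n"

for left, right in cross_reference(registry, contracts):
    print(left["name"], "|", left["agency"], "|", right["contract_id"])
```

A match on name alone can easily be a different person with the same name, so a hit like this is a lead to corroborate with another identifier rather than a finding.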
11. Dark Web, Deep Web, and Tor [01:43:55 - 01:55:53]
- Clarified distinctions:
- Clear Web: Indexable by search engines (e.g., government websites).
- Deep Web: Non-indexed but legal content requiring login (e.g., Facebook, email).
- Dark Web: Accessible only via special software (e.g., Tor), often hosting illegal content.
- Explained Tor’s multi-hop encryption: traffic is routed through three or more relays, so that no single node knows both the origin and the destination.
- Noted that some Tor nodes are operated by governments (CIA, FBI) for surveillance and counter-intelligence.
- Warned that Tor hosts illegal content (drugs, child exploitation, weapons) and carries risk of exposure to malware or malicious exit nodes.
- Advised extreme caution: use Tor only with updated software, avoid downloads, and never use personal accounts.
- Stated that search engines on Tor (e.g., DuckDuckGo) are available but limited; no Google equivalent exists.
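The layered routing described above can be illustrated with a toy model in which each relay removes one layer and learns only the next hop. This is a teaching sketch, not Tor's actual protocol or cryptography: the XOR keystream is not secure, and the relay names, keys, and destination are hypothetical.

```python
import hashlib
import json


def xor_stream(key, data):
    """Toy cipher: XOR data with a SHA-256-derived keystream. NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message, destination, path):
    """Wrap the message in one layer per relay, innermost layer for the last relay."""
    next_hop, payload = destination, message.encode("utf-8")
    for name, key in reversed(path):
        layer = json.dumps({"next": next_hop, "data": payload.hex()}).encode("utf-8")
        payload = xor_stream(key, layer)
        next_hop = name
    return payload


def peel(key, blob):
    """Remove one layer: returns the next hop and the still-wrapped inner payload."""
    layer = json.loads(xor_stream(key, blob).decode("utf-8"))
    return layer["next"], bytes.fromhex(layer["data"])


# Hypothetical three-relay path with placeholder keys.
path = [("guard", b"key-guard"), ("middle", b"key-middle"), ("exit", b"key-exit")]
blob = wrap("hello", "example.onion", path)

for name, key in path:
    next_hop, blob = peel(key, blob)
    print(f"{name} forwards to {next_hop}")
print(blob.decode("utf-8"))
```

In this model the first relay sees who sent the packet but only the name of the middle relay, while the last relay sees the destination but not the sender, which is the property the session attributed to Tor's multi-hop design.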
Appendix
Key Principles
- OPSEC First: Protect your identity before investigating others.
- Ethical Boundaries: Use only legal, public, or licensed data. Never crack software or bypass paywalls.
- Minimize Footprint: Use VMs, disposable identities, and encrypted tools to avoid being tracked.
- Verify Everything: Cross-reference geolocation, metadata, and multiple sources before drawing conclusions.
Tools Used
- Social Media: Botometer, OneMillionTweetMap, Twitter/X dorks (`from:`, `to:`, `keyword`).
- Geolocation: Google Maps, Google Earth (3D view, Street View comparison).
- OSINT Frameworks: OSINTux, Kali Linux.
- Installation: The Harvester (via pip).
- Privacy: uBlock Origin, Privacy Badger, ProtonMail, Mohmal, DataFakenerator, ThisPersonDoesNotExist, PIA/NordVPN/TorGuard.
- Dark Web: Tor Browser (for awareness only).
Common Pitfalls
- Using personal accounts or devices for investigations.
- Relying on outdated or unverified data (e.g., 2009 Street View).
- Assuming free tools (VPNs, email) are safe for social media use.
- Ignoring tool updates, leading to broken functionality.
- Failing to log queries and results, compromising evidentiary value.
Practice Suggestions
- Perform ego surfing on yourself using all tools covered.
- Install and test The Harvester on a test domain (e.g., your university).
- Set up a VM with OSINTux and practice geolocating a public image or video.
- Create a fake profile using an AI-generated image and fake data, then attempt to trace it back (to understand detection risks).
- Use Tor Browser to access a .onion site (e.g., DuckDuckGo’s onion version), but do not interact with any content.