Inside the Mind of a Digital Forensics Expert (Guest Post)

What you reveal online is more than you think. A digital forensics expert walks us through how OSINT turns innocent data into a detailed profile.

Secrets of Privacy and Dystopian Think Tank

Jul 23, 2025

What if someone could build a surprisingly complete profile of your life just by connecting a few digital breadcrumbs?

This guest post by Dystopian Think Tank, a digital forensics practitioner, pulls back the curtain on how investigators do exactly that, without ever “hacking” anything.

At Secrets of Privacy, we talk a lot about how to become a harder target. That starts with understanding what kind of data you leave behind and how that data can be used against you.

This isn’t a theoretical breakdown of Open-Source Intelligence (OSINT). It’s a practical walk-through from someone who uses these techniques in the real world.

Why publish this?

Because when you see how your IP address, usernames, purchase history, and even the metadata in your photos can be stitched together to reveal your identity, you start to think differently about your everyday online behavior.

If you’ve ever used the same username on multiple websites or posted/shared a photo online, you won’t want to miss this.

As a Digital Forensics practitioner, part of my trade involves tracking individuals. Not by hacking, but by following the information they leave openly available online. Although this practice is (for the most part) legal, some still view it as ethically questionable. The legal specifics actually vary from country to country, but there is saying: once it’s out there, it’s out there.

This article focuses on Digital Open-Source Intelligence (“OSINT”), specifically its practical applications in investigating individual.

In everyday terms, OSINT is the data you inadvertently leave behind when interacting online. Though over the years the term “open” has become somewhat in dispute. Some examples:

When you log in to your bank, the bank collects your IP address, browser type, and device information. This could become OSINT.
Making a purchase online leaves behind your IP address, delivery address, and a browser “fingerprint” used by advertisers. This could become OSINT.
Registering a website requires providing personal details to domain registrars, which is often publicly viewable. This data is OSINT.
Connecting to a gaming server leaves records of your IP address and device details. This could become OSINT.
Photos taken with digital cameras like a Canon Rebel 5D contain metadata, potentially linking you to locations or devices. This metadata and the pictures themselves (including the maths that put the images together) are OSINT.
Emails leaked from an old employer or account that become publicly searchable. This leaked data is OSINT.

Combinations of the above are fertile seeds that grow into full profiles of information that are later used by investigators, scammers, and data brokers alike. Often these databases of collected information are cleaned to the point where even the original owner’s are uncertain where they collected the information from, particularly when automated techniques are used for the collection.

In an ideal world, OSINT would be defined by the word “open” and thus legal and available. However, there has always been an imbalance of functionality versus privacy with the World Wide Web. A certain amount of data exchange between devices is required for applications and websites to function, and many countries do not consider that information to be sensitive or personal. Even when they do, they often lack the legislative and practical means to enforce any policies.

Most users inadvertently leave behind digital "breadcrumbs" that investigators can follow. With recent developments in artificial intelligence and data analytics this process is becoming simplified. AI models are filled with data that users did not envisage being practicably readable to ordinary users. Some of my recent work has revolved around tricking AI into performing OSINT tasks inappropriately, and I can say it’s reasonably effective.

What was once protected behind expert knowledge and the black arts of scripting will soon be widely available. With this in mind, it doesn’t hurt to know what your data looks like from an “investigator perspective”. AI technology will, after all, model its approach on tried and tested techniques created by practitioners. In light of this, let’s take a look at a few breadcrumbs you might leave behind and how they might lead to more information about you.

Disclaimer: This article is not encouraging the use of breach data for unlawful purposes, but does discuss the use of such data. This article is also not intended for veteran OSINT practitioners and uses plain language as much as possible.

IP Addresses

An IP address functions like the return address on a letter. It's where the internet sends information intended for your computer or device. When collecting data about a person, investigators typically prioritize historical IP addresses over current ones. If you know a person has used a certain IP address over a long period of time you can work with that breadcrumb to collect more information.

A historical IP addresses can link to leaked emails, passwords, and phone numbers. IP addresses assigned by ISPs or VPN providers appear in public registries like ARIN, APNIC, or RIPE, making large-scale searches relatively straightforward; particularly since your IP address will be linked to your ISP, location, and other data. Even if your ISP cycles your IP address regularly, it’s trivial in this day and age to check all of the IP addresses in a block for relevant information.

While it’s possible to use VPNs and similar services to mask your IP address, or have your ISP change your IP address; these services often purchase their IP addresses in “netblocks” that follow patterns. These blocks are identifiable and visible via the aforementioned public registries (ARIN, APNIC, or RIPE). The big thing to keep in mind is an investigator just needs to know *when* you used an IP address, not if you’re currently using it now. Weak operational security or carelessness will often provide this information and can often provide "leaps" between on IP address and another.

Online games can be one unusual source. Sometimes it’s from a server breach, but some games (especially older titles) will leave this information available for various reasons. Online service games, and hacked servers will at times have leaks or breaches. In other instances, some games publish the addresses of everyone in their servers to web services as part of administration tools used by game staff which provides another source of data. Larger data breaches will contain usernames or email addresses used for logins. Some will even include further information such as details regarding your web browser which further simplify tracking you. Wikipedia also publishes IP addresses with user submitted comments, and in some instances, this type of data can lead back to delivered food transactions, purchases, forum logins, or more.

Searching large amounts of breach data for collections of these IPs is trivial; minutes not days or hours. From there, a single data point such as a name, email address, date of birth, or username we can start to build a web out of a target.

This is often how hackers target specific companies or entities. Each piece of data collected can be “rerun” against all the other data in your set to build a pattern. For example, an old password of yours can be used to find previously owned email or gaming accounts.

Passwords and Email Addresses

Passwords and emails frequently appear together in data breaches providing clues about user identity. Repeated use of similar passwords and usernames across services allows investigators to perform fuzzy searches, identifying patterns in compromised data.

Email addresses, passwords, and usernames form a sort of “trifecta”. If the target changes one or even two, the third can still be used as an anchor point to find new accounts and information. Using automated methods, even fairly major changes won’t throw an investigator off.

Current day investigation tools are sophisticated. You can quickly run entire datasets to find variations in passwords (e.g., TH1Ng114, 114TH1Ng, Th1-ng114). Some of the tools I currently use are configured to automatically manage both language changes and variations. For example: Chinese names often have variations in spelling and arrangement, but with automated tooling this won’t throw off analysis. With AI already on our doorstep, it’s only getting easier to further prune large datasets.

With this in mind, the objective is always harvesting more details. For instance, linking an email address or phone number with an active WhatsApp or Gravatar profile can provide further personal context or a picture of you. The same applies to similar themed usernames or other patterns. People commonly will reuse handles for games consoles, delivery services, book review services, and sometimes even adult services. These accounts may also lead to friends and associates who are linked to even more accounts that the target might own.

Constructing patterns from network data, naming data, and password data is a powerful way to cut down the number of accounts that could potentially be related to an investigator’s target, and it typically can be done in an hour or two. Less if the target is well exposed or has unique properties in their naming conventions.

It can also get a little darker. OSINT tools often come with or can be modified to include “grey hat” or “black hat” functions, such as automated password resets; such resets will reveal secret questions or the last digits of phone numbers in an automated fashion. Even just the last couple of digits of your phone number or a secret question can help confirm an account’s association with you. Because these attempts are so common that some large companies won’t even notify you that someone attempted to login to your account. Even if you do get a notification and change your password— that’s not what the investigator is looking for. They’re looking for things like your social media accounts.

Social Media and Infrastructure

Over the years, companies like Facebook have attempted to manage their platforms to reduce data scraping and the level of OSINT practitioner access. In many cases this has been somewhat successful in at least limiting the speed at which data can be accessed. However, techniques are alive and well that allow investigators to collate data from groups, posts, friends lists, friends of friends, and the rest. These techniques are not restricted to law enforcement.

Screenshot from Maltego, a cyber investigator platform which visualizes and assists in collecting OSINT, showing an investigator’s view of data and accounts belonging to Elon Musk.

Locking down privacy on Social Media platforms is an option, and one you should exercise; but remember that someone somewhere always has access to everything you post. The data must flow somewhere, and data brokers and collectors exist for every major platform.

If you are a person of note on platforms such as X or Bluesky there is a good chance someone is scraping you already, and they’re doing it quite quickly. Petabytes of historical data are available from X, Telegram, Bluesky and other sources because, simply put, these sources are not private. Never have been.

While employers have been coy about it, I have been asked to conduct “checks” on potential new hires in the past and it almost always centered around Social Media concerns. Where people tend to get “caught” by investigators is when a “profile” of them can be constructed that can be used to seek out other information or interests. Perhaps you were caught in data leak of a fitness device. The investigator wasn’t positive it was your details until they found photos of you running. Perhaps you think you have a “private” Facebook account, but you join all the same or similar groups as your older profile and have overlaps in friends.

Web and Property Records

We will mention in passing that property records, court filings, and other factors can come into play here. If you have a common name, it may be difficult to narrow down where you live or a house you own. However, with the information gained in above sections an investigator can start to use these more “vanilla” sources to collect further information about you though typically it requires a small cost. The real meat of this section is in websites.

If you’ve ever signed up a website, you are likely aware that “WHOIS” records can be viewed; essentially, they are the publicly associated details of that Web Domain. This will often include a contact number for the registrant. There are many services available which provide historical data on domain registrations; often many many years worth. WHOIS records will only provide recent and current web registrations, but it is possible to purchase or acquire access to large databases of historical filings. Domain registrations include registrants' full names, phone numbers, and addresses. Even if you have arranged your current registration details to be private, if they were ever historically available then they can be retrieved. Again, this is not limited to law enforcement.

Although domain data quality varies, it remains useful in broader OSINT investigations. Most people don’t realize or even remember historical domain records they may have created which makes them an excellent source. In some cases, a target has changed their name, married, or similar and a historical domain lookup is one way to discover that information. Archive services such as the “Way Back Machine” may also keep copies of information from these old websites that were linked to you. Old usernames, nicknames, email addresses, associates, etc….

With OSINT it’s all about pulling on small threads until the ball comes loose. I’ll also stress again that these techniques are automated for the most part. Tools will automatically list all email addresses, phone numbers, and other information related to a website with the proper automation. Similar automation can harvest copies of websites from “Wayback”, create lists of possible email addresses of a target, and connect identities. Even if the identities used by a target are “false” they can often be relinked back to the real person.

Archives such as “Wayback” also bring us to multimedia collection. Photos and documents are truly an underestimated source of data by many practitioners.

Screenshot of the Internet Archive WayBackMahine.

Physical Cameras and Device Metadata

Previously, camera serial numbers were searchable via Google. They put an end to that with good reason; to prevent users from being tracked via their photos. The trick, and overall purpose, of OSINT therefore is to always prune. Searching for photos belonging to a target is hard when the dataset is all known images, but if we can narrow it down and use it as a technique to confirm a handful of accounts or potential identities… that’s powerful.

Investigators can still cross-reference device ownership through metadata embedded in photos (e.g., EXIF data, camera serial numbers) if they can collect the images first. This could be of consequence if a target has a “second life” but use and reuse the same hardware regularly. Advanced techniques such as Photo Response Non-Uniformity (PRNU) analysis match image noise patterns from known user images to new images without the requirement for metadata; this is known as blind authentication. In some rare, but unique, circumstances other natural statistics / properties of your camera can factor in.

Specific degradation or damage, lens scratches, specific settings, post processing, compression, file structure... Much like the properties of your browser can be pulled together into a “fingerprint” the same is true of photos. So even if you delete the metadata from your device, it doesn’t mean your device can’t be linked to its output. Databases of comparison images and camera statistics are available to purchase for this very specific use. The same rules also apply to document scanners and other traditional imaging devices. Even without access to a database, one can still scrape the web for similar samples quite quickly and using automated tooling.

Typically these techniques, sometimes referred to as “image ballistics” are used to prove a person took a particular photo or made a particular document, but they have applications in tracking people. With the rise in artificial intelligence, this type of analysis will only become more available.

Conclusions

Complete online anonymity is challenging due to the extensive digital footprints most individuals unknowingly leave behind. For most, it’s likely not possible; limiting exposure is often the best approach.

Strong privacy hygiene means pro-actively managing online identities and accepting some level of publicly available personal information. Though people often overlook being candid with their personal lives. Transparency in personal interactions often leads to better security than secrecy.

Dystopian Think Tank is a Substack publication that blends speculative fiction with deep reflections on technology. It's written by a professional in Digital Forensics and Incident Response.

I've led forensics teams in criminal, civil, and security related matters. My work has supported courts, regulatory bodies, law enforcement, multinational firms, and government departments. It has also periodically supported cats, my chronic reading habit, and even the odd person. My most controversial opinion is that artificial intelligence still can’t generate forensic forgeries.

Love collaborating with other Substackers, so feel free to reach out any time.

Closing thoughts From Secrets of Privacy

Most people have no idea how exposed they really are.

This post gives you a rare look into the mindset and methods of someone who works on the investigative side of the privacy equation. It's a useful reminder that you don’t need to be famous or wealthy to be a target. Sometimes, just being online is enough.

So what can you do?

Minimize your exposure: Use VPNs, aliases, and privacy-friendly services to reduce the digital fingerprints you leave behind.
Shield Your Data: Discover the Privacy Perks of VPNs
Secrets of Privacy
·
May 23, 2024
Read full story
Stop reusing usernames and passwords: Pattern recognition is one of the fastest ways investigators (or bad actors) can link your accounts.
Best Practices for Creating Usernames on Websites
Secrets of Privacy
·
December 15, 2023
Read full story
Lock down your social media: Public profiles are a goldmine for OSINT. Review what’s visible, and to whom, and restrict as much as possible for your particular situation.
Be mindful of metadata: Stripping EXIF data from photos and documents is a small habit that can pay off big.

You don’t need to be perfect. But every privacy win you stack makes you a harder target.

Thanks to our guest author for sharing their perspective. If you'd like to see more behind-the-scenes posts like this, let us know.

And if you’re new here, consider subscribing to Secrets of Privacy. We publish practical, no-hype guides to help you protect your identity, information, and peace of mind online.

Got something to add? Is there a specific topic you’d like us to cover? Drop a comment to keep the conversation going.

💥 P.S. If you found this post helpful, would you please consider restacking it and sharing it with your friends, family and audience?

This helps spread the words and keeps us writing content that will help you bolster your privacy and become a harder target.

Looking for help with a privacy issue or privacy concern? Chances are we’ve covered it already or will soon. Follow us on X and LinkedIn for updates on this topic and other internet privacy related topics. We’re also now on Rumble and YouTube. Subscribe today to be notified when videos are published.

Disclaimer: None of the above is to be deemed legal advice of any kind. These are opinions written by a privacy attorney with years of working for, with and against Big Tech and Big Data. And this post is for informational purposes only and is not intended for use in furtherance of any unlawful activity. This post may also contain affiliate links, which means that at no additional cost to you, we earn a commission if you click through and make a purchase.Check out our Personal Privacy Stack here. It’s a simple, easy way to start De-Googleling your life.

AI scams are here and getting more sophisticated. One of the best things you can do to protect yourself is to remove your personal information from Google and the data broker sites. That starves the scammers of vital information, making you a much harder target. You can DIY, or pay a reasonable fee to DeleteMe to do it for you. Sign up today and get 20% off using our affiliate link here. We’ve used DeleteMe for almost five years and love it for the peace of mind. It’s also a huge time saver.

Privacy freedom is more affordable than you think. We tackle the top Big Tech digital services and price out privacy friendly competitors here. The results may surprise you.
Check out our specialized privacy and security guides in our digital shop. Below is a sample of what’s available. People are really enjoying the “How Exposed Are You Online” guide. Get started here.

If you’re reading this but haven’t yet signed up, join the booming Secrets of Privacy community for free (2.3K+ subscribers strong) and get our newsletter delivered to your inbox by subscribing here 👇

A guest post by

Dystopian Think Tank

I write about the secrets buried in hard drives and the ones lurking beyond the stars. Author of Interface, Protocol, and Short Stories.

Paul Snyder

Aug 18

Great info.

I realize that the “Breadcrumb” analogy has been widely adopted by the digital community, though in questioning a recent college grad when he was throwing around the term from whence it was derived…

He had no idea 😑

I used to lead and train SAR teams. The standard bit of levity was to get them thoroughly lost and then ask “Who had breadcrumb duty?”

Those beyond a certain age got it. Below that age… performative and obligatory laughter covered their cluelessness.

Hansel had a reasonable plan upon their first foray into the Deep Woods… he used white pebbles to designate their return path. All went according to plan.

Their second journey, however, was plagued by a lack of materials, in that he ran out of the high visibility pebbles and was forced to substitute “breadcrumbs”.

Birds have a thing for breadcrumbs, consumption occurs, lost kids, witches, hilarity ensues.

If only digital markers were randomly consumed by the natural environment, we might not have so many problems.

“We” collectively leave something more along the lines of obelisks… virtual monuments indicating our links and preferences. And even when some nefarious “bird” consumes them, they’re all still “out there” for the rest of the flock to feed upon, ceaselessly.

Fun to blow IT Frat Boys minds with a bit of Grimm, especially when they’re somewhat inebriated.

Thanks again for the post. Best.

2 replies

Secrets of Privacy

Ha. Yeah, now that you say that, "breadcrumbs" probably isn't the ideal metaphor.

2 more comments...

Shield Your Data: Discover the Privacy Perks of VPNs

Best Practices for Creating Usernames on Websites

Discussion about this post

Ready for more?