File Metadata Remover

A tool I made for removing metadata from files.

Metadata Remover - GitHub

Removes metadata from files, also known as metadata scrubbing.

Why?

The problem:

A crucial step in the cyber kill chain is reconnaissance, where an attacker scopes out the target using open source intelligence (OSINT) tools and skills to find “vulnerable” spots to enter through.

Attackers, or anyone else, can easily find public files on a website with a Google search:

site:website-name filetype:filetype

Example: site:aws.amazon.com filetype:pdf

(looks for PDFs on aws.amazon.com)
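To audit your own site the same way, the query pattern above can be generated for several file types at once. This is a small sketch, not part of the tool itself. The function names `build_dork` and `search_url` are hypothetical, and the search URL assumes Google's usual `q` query parameter.

```python
from urllib.parse import urlencode


def build_dork(site: str, filetype: str) -> str:
    """Return a 'site:... filetype:...' query in the pattern shown above."""
    return f"site:{site} filetype:{filetype}"


def search_url(query: str) -> str:
    """Turn a query into a search URL (assumes the 'q' query parameter)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Document types that commonly carry author and software metadata.
for filetype in ("pdf", "docx", "xlsx", "pptx"):
    print(search_url(build_dork("aws.amazon.com", filetype)))
```

Replacing the site with your own domain gives a quick list of searches to run before an attacker does.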

These files can contain metadata that reveals the software and software versions a company uses.

Attackers can then look for unpatched vulnerabilities in those versions and try to exploit them.


One mitigation, covered here, is to remove metadata from any files published online in order to reduce a company’s attack surface.


Instructions you can follow:

  1. Install exiftool: sudo apt install libimage-exiftool-perl
  2. Run: exiftool -all= "file-to-be-scrubbed.pdf"

However, for PDFs this does not fully remove the embedded metadata; the original values can still be recovered from the file. Pdftk is used to reduce the chances of that metadata being recovered.
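The post does not show how the tool chains its dependencies, so the following is only a sketch of one possible two-step scrub, not the tool's actual implementation. It assumes exiftool's `-all=` and `-o` options to write a stripped copy, and swaps in `qpdf --linearize` (qpdf is in the dependency list) as the rewrite step that drops the leftover data, rather than pdftk. The function names and file paths are hypothetical.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def scrub_commands(src, tmp, dst):
    """Build the two commands; kept separate so they can be inspected or logged."""
    return [
        # Step 1: write a copy of src with all writable metadata tags cleared.
        ["exiftool", "-all=", "-o", str(tmp), str(src)],
        # Step 2 (assumption): rewrite the PDF so the old metadata objects
        # are not carried over into the published file.
        ["qpdf", "--linearize", str(tmp), str(dst)],
    ]


def scrub_pdf(src, dst):
    """Run both steps, failing early if a required tool is missing."""
    for tool in ("exiftool", "qpdf"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not installed or not on PATH")
    with tempfile.TemporaryDirectory() as work:
        tmp = Path(work) / "stripped.pdf"
        for cmd in scrub_commands(src, tmp, dst):
            subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Calling `scrub_pdf("file-to-be-scrubbed.pdf", "file-clean.pdf")` leaves the original untouched and writes the cleaned copy to the second path.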


What I made:

Here’s the site:

[Screenshot of the site]

I upload the file “OrganizationsCoreAssignment.pdf” (SANS example):

[Screenshot of the file upload]

I download the cleaned file and compare the metadata:

[Screenshot of the metadata comparison]

As you can see, both the author and the software fields, which are potential OSINT information, were removed.
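The same before-and-after comparison can be scripted rather than checked by eye. This is a rough sketch assuming exiftool's `-json` output, which prints a JSON array with one object per file. The helper names and the sample field values are hypothetical.

```python
import json
import subprocess


def read_metadata(path):
    """Return exiftool's tags for one file as a dict (requires exiftool)."""
    result = subprocess.run(
        ["exiftool", "-json", str(path)],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)[0]


def removed_fields(before, after):
    """List the tag names present before scrubbing but absent afterwards."""
    return sorted(set(before) - set(after))


# Illustrative values only, not taken from the file in the screenshots.
before = {"Author": "Jane Doe", "Producer": "Some PDF Library 1.0", "PageCount": 3}
after = {"PageCount": 3}
print(removed_fields(before, after))
```

With real files, `removed_fields(read_metadata("original.pdf"), read_metadata("cleaned.pdf"))` shows which tags the scrub actually dropped.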

Dependencies: exiftool, pdftk, qpdf, pdfinfo


Further OSINT reconnaissance mitigation topics I’d like to cover later on:

  • Web scraping
  • Email address obfuscation
  • Social media privacy
  • Network exposure (ports)
  • Job posting sanitization (although job posts might need an overhaul anyway to filter out bots somehow)