A modern, multithreaded GUI application that identifies real file types via binary signatures (Magic Numbers) to uncover hidden extensions, detect masqueraded malware, and mitigate phishing vectors.
#Cybersecurity #MalwareAnalysis #MagicNumbers #Python #CustomTkinter #DigitalForensics #AntiPhishing #DefensiveSecurity #BinaryAnalysis #SecOps
The File Type Identifier Using Magic Numbers is a legitimate, lightweight defensive security and malware analysis tool.
In cybersecurity, attackers frequently disguise executable files or malicious scripts as benign documents (e.g., renaming malware.exe to invoice.pdf or photo.jpg). Relying purely on the operating system's file extension handler leaves users vulnerable to spoofing.
This utility addresses that vulnerability by bypassing the file extension entirely. It reads the raw binary header (the first few bytes) of a file, matches it against a comprehensive signature database (Magic Numbers), compares the true identity with the user-facing file extension, and immediately flags any discrepancies or potential threats.
- Eliminate Blind Trust: Break reliance on superficial file extensions.
- Empower Malware Analysts: Provide triage capabilities to identify binary formats.
- Enhance Phishing Defense: Detect deceptive file configurations before execution.
- Deliver Accessible Security: Package digital forensic concepts into an intuitive, high-performance GUI.
Most binary file formats embed a characteristic sequence of bytes at or near the very beginning of the file, known as a Magic Number or File Signature. Plain-text formats generally have no such signature.
- Operating systems use Extensions (`.png`) for superficial association.
- Applications use Magic Numbers (`\x89PNG\r\n\x1a\n`) for data integrity.
This application forces a strict validation comparison:
[User Selects File / Folder]
        │
        ▼
[Scanner Module] ───► Opens file in Raw Binary Read Mode ("rb")
        │
        ▼
[Header Extraction] ───► Reads first 64 bytes of data
        │
        ▼
[Signature Comparison] ◄─── Matches against magic_db.py JSON-like Dictionary
        │
        ├───► Match Found? ───► Extracted True MIME/Type
        └───► No Match? ───► Categorized as "unknown"
        │
        ▼
[Validation Engine] ───► Boolean evaluation: (Extension == True Type)
        │
        ├───► TRUE ───► Log Status: Clear
        └───► FALSE ───► Log Status: MISMATCH (Flagged Threat)
        │
        ▼
[UI Main Thread Updates] ───► Pushes asynchronously via UI .after() hook
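The scanning and validation steps in the flow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `MAGIC_SIGNATURES` table, `identify_type`, and `check_file` are hypothetical names, and the real `magic_db.py` is described as far more comprehensive and may be structured differently.

```python
from pathlib import Path

# Hypothetical, deliberately tiny signature table (assumption for illustration).
MAGIC_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpg",
    b"%PDF-": "pdf",
    b"MZ": "exe",
    b"PK\x03\x04": "zip",
}

HEADER_SIZE = 64  # number of leading bytes to read, as in the flow above


def identify_type(path):
    """Return the type implied by the file header, or "unknown"."""
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    # Try longer signatures first so a short prefix cannot shadow a longer one.
    for signature in sorted(MAGIC_SIGNATURES, key=len, reverse=True):
        if header.startswith(signature):
            return MAGIC_SIGNATURES[signature]
    return "unknown"


def check_file(path):
    """Compare the claimed extension with the type found in the header."""
    claimed = Path(path).suffix.lower().lstrip(".")
    if claimed == "jpeg":  # treat the two common JPEG spellings as one type
        claimed = "jpg"
    actual = identify_type(path)
    return {
        "file": str(path),
        "claimed": claimed,
        "actual": actual,
        # Literal reading of (Extension == True Type): "unknown" also mismatches.
        "mismatch": claimed != actual,
    }
```

Under this sketch, a Windows executable renamed to `photo.jpg` would come back with `actual` set to `"exe"` and `mismatch` set to `True`. How files with an unrecognized header should be reported is a design choice the sketch does not settle.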
Phishing is a social engineering attack where malicious actors trick targets into revealing sensitive credentials or downloading malicious payloads. A high-risk delivery method is email attachments where the extension is manipulated to exploit human psychology.
Naming a file:
document.pdf.exe
If extensions are hidden in Windows, the user only sees:
document.pdf
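A name-only heuristic for this trick might look like the sketch below. The extension sets and the function name `has_deceptive_double_extension` are illustrative assumptions, and neither list is exhaustive.

```python
from pathlib import PurePath

# Illustrative, non-exhaustive sets (assumptions, not taken from the project).
EXECUTABLE_EXTENSIONS = {".exe", ".scr", ".bat", ".cmd", ".com", ".js", ".vbs", ".ps1"}
DECOY_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".jpg", ".png", ".txt"}


def has_deceptive_double_extension(filename):
    """True when a document-like extension is followed by an executable one."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    return (
        len(suffixes) >= 2
        and suffixes[-1] in EXECUTABLE_EXTENSIONS
        and suffixes[-2] in DECOY_EXTENSIONS
    )
```

Requiring a document-like decoy in front of the executable extension keeps names such as `archive.tar.gz` or `setup.exe` from being flagged, at the cost of missing decoys that are not in the list.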
Exploiting the Unicode Right-to-Left Override character (U+202E) to flip the visual representation of text. A file actually named:
annex_<U+202E>gpj.exe
appears visually as:
annex_exe.jpg
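Because the reordering relies on invisible control characters, a filename can be checked for them directly. The following is a small sketch; `BIDI_CONTROLS` and `contains_bidi_control` are hypothetical names, and the set covers the Unicode bidirectional embedding, override, and isolate controls.

```python
# Unicode bidirectional control characters that can reorder displayed text.
BIDI_CONTROLS = {
    "\u202a",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "\u202c",  # POP DIRECTIONAL FORMATTING
    "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066",  # LEFT-TO-RIGHT ISOLATE
    "\u2067",  # RIGHT-TO-LEFT ISOLATE
    "\u2068",  # FIRST STRONG ISOLATE
    "\u2069",  # POP DIRECTIONAL ISOLATE
}


def contains_bidi_control(filename):
    """True when the name contains any bidirectional control character."""
    return any(ch in BIDI_CONTROLS for ch in filename)
```

For example, `contains_bidi_control("annex_\u202egpj.exe")` is true, while the harmless-looking name it imitates, `annex_exe.jpg`, contains no such character.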
Sending a dangerous executable or script with its extension changed to a harmless-looking one, such as:
.txt or .png
and relying on a system misconfiguration or an exploited vulnerability to execute it.
- Never open unexpected attachments.
- Verify the sender through an out-of-band communication channel.
- Verify suspicious files using Magic Number analysis.
- Look for mismatches between extension and actual file type.
Example:
Claims to be: photo.jpg
Actual type: exe
This is a strong indicator of compromise (IoC).
Analysts can perform initial triage on user-reported phishing attachments without exposing endpoints to execution risks.
Responders can sweep directories for hidden executables deployed by threat actors maintaining persistence.
Useful for students studying:
- Digital Forensics
- Malware Reverse Engineering
- Low-Level Computing
- Binary Analysis
| Component | Technology |
|---|---|
| Programming Language | Python 3.10+ |
| GUI Framework | CustomTkinter |
| Drag-and-Drop | TkinterDnD2 |
| Image Processing | Pillow (PIL Fork) |
| Concurrency | Python Threading Library |
FileTypeIdentifier/
│
├── main.py
├── magic_db.py
├── scanner.py
├── exports.py
├── requirements.txt
│
├── assets/
│   └── logo.png
│
└── reports/
customtkinter
tkinterdnd2
Pillow

- Multi-Extension Check
- Asynchronous Worker Engines
- Real-Time Threat Metrics
- JSON & TXT Report Exporting
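One common way to build an asynchronous worker for a Tk-based GUI is a background thread that hands results to the main thread through a queue. The sketch below assumes that pattern; the project's own worker design is not shown in this README, and `scan_worker`, `drain_results`, and `start_scan` are hypothetical names.

```python
import queue
import threading


def scan_worker(paths, scan_one, results):
    """Runs in a background thread and never touches widgets directly."""
    for path in paths:
        results.put(scan_one(path))
    results.put(None)  # sentinel: scanning has finished


def drain_results(results, on_result):
    """Handle everything queued so far; return True once the sentinel is seen."""
    while True:
        try:
            item = results.get_nowait()
        except queue.Empty:
            return False
        if item is None:
            return True
        on_result(item)


def start_scan(paths, scan_one):
    """Start the worker thread and return the queue it reports to."""
    results = queue.Queue()
    thread = threading.Thread(
        target=scan_worker, args=(paths, scan_one, results), daemon=True
    )
    thread.start()
    return results, thread


# In a Tk/CustomTkinter window, the main thread could poll the queue with
# `.after()` (hypothetical `root` window and `add_row` callback):
#
# def poll():
#     if not drain_results(results, add_row):
#         root.after(100, poll)
```

Keeping all widget updates inside the polling function means only the main thread touches the interface, which is what Tk expects.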
Reads only:
f.read(64)
allowing extremely fast processing of large files.
No external communication required.
Runs on:
- Windows
- Linux
- macOS
Office Open XML documents (.docx, .xlsx, .pptx) and ZIP archives share the same signature:
50 4B 03 04
so telling them apart requires deeper inspection.
Attackers may intentionally remove or alter file signatures, resulting in:
unknown
classification.
This application runs entirely on the local machine.
No internet connection, web server, or open ports are required.
Install:
Python 3.10+
and ensure it is added to your system PATH.
FileTypeIdentifier/
Place all source files inside this folder.
pip install customtkinter tkinterdnd2 Pillow
python main.py
- Click Scan File
- Select a file
- Check the Mismatch column
YES = Potential Threat
NO = Match Verified
- Click Scan Folder
- Select a directory
- Review all results
- Export JSON
- Export TXT
Reports are saved directly to disk.
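As an illustration of what the two export formats could contain, here is a small sketch. The record fields (`file`, `claimed`, `actual`, `mismatch`) and the function names are assumptions for this example; the actual layout produced by `exports.py` may differ.

```python
import json
from pathlib import Path


def export_json(results, path):
    """Write the scan results as an indented JSON array."""
    Path(path).write_text(json.dumps(results, indent=2), encoding="utf-8")


def export_txt(results, path):
    """Write one tab-separated line per scanned file."""
    lines = []
    for r in results:
        flag = "YES" if r["mismatch"] else "NO"
        lines.append(
            f"{r['file']}\tclaimed={r['claimed']}\tactual={r['actual']}\tmismatch={flag}"
        )
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

JSON suits further processing by other tools, while the plain-text form is easier to paste into a ticket or an incident note.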
This software is provided "as is" for educational, forensic triage, and security awareness purposes.
It is not a replacement for:
- Endpoint Detection & Response (EDR)
- Malware Sandboxes
- Antivirus Platforms
- Threat Hunting Suites
A successful scan does not guarantee a file is malware-free.
- Byte-level file analysis using binary streams
- Multithreaded GUI architecture
- File spoofing and social engineering attack techniques
- Magic Number signature validation
- Event-driven interface design
Detect packed, compressed, or encrypted malware payloads.
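The README does not name a technique for this. One widely used heuristic, assumed here purely for illustration, is Shannon entropy: packed, compressed, or encrypted data tends to have a near-uniform byte distribution and therefore an entropy close to 8 bits per byte. The function name `shannon_entropy` is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Return the entropy of a bytes object in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

High entropy alone does not prove anything malicious, since ordinary archives and media files are also compressed; any threshold would need tuning against real samples.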
Differentiate:
.docx, .xlsx, .pptx, .zip
through internal archive inspection.
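Internal archive inspection could be approached along these lines, using the standard-library `zipfile` module. This is a sketch under the assumption that Office Open XML packages carry a `[Content_Types].xml` entry plus a `word/`, `xl/`, or `ppt/` folder; `OOXML_MARKERS` and `classify_zip_container` are hypothetical names.

```python
import zipfile

# Top-level folders that distinguish the three common Office Open XML types.
OOXML_MARKERS = {"word/": "docx", "xl/": "xlsx", "ppt/": "pptx"}


def classify_zip_container(path):
    """Return "docx", "xlsx", "pptx", "zip", or "unknown" for a file."""
    if not zipfile.is_zipfile(path):
        return "unknown"
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()  # reads the directory only, extracts nothing
    if "[Content_Types].xml" in names:
        for prefix, kind in OOXML_MARKERS.items():
            if any(name.startswith(prefix) for name in names):
                return kind
    return "zip"
```

Only the archive's entry list is read, so nothing inside the container is extracted or executed during the check.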
Combine file signature validation with advanced pattern-based malware detection.
Developed By: Syed Shaheer Hussain
Copyright Horizon: © 2026 Syed Shaheer Hussain. All Rights Reserved.
Licensing Agreement: Licensed strictly for security awareness, local administrative triage, and instructional computing models. Use responsibly to secure your environments.