Introducing Sidikjari: Metadata Extraction for Cybersecurity Professionals
- Keith Pachulski
- May 9

Introducing Sidikjari: Advanced Metadata Extraction for Cybersecurity Professionals
Metadata analysis is a crucial part of security assessments, but it can be tedious and time-consuming. That's why we developed Sidikjari, a Python-based tool that automates metadata extraction and analysis for security professionals. It's designed to streamline intelligence gathering during penetration tests, security audits, and digital forensics work.
What is Sidikjari?
Sidikjari (developed by Red Cell Security, LLC) is an open-source tool that automates metadata discovery and analysis. It efficiently crawls websites, retrieves documents, and extracts valuable metadata that organizations often inadvertently leave exposed. The tool reveals information that's technically public but typically hidden from casual observation, providing security professionals with actionable intelligence.
Key Features
Efficient Website Crawling
Sidikjari methodically explores target websites to identify documents and sensitive forms. You can configure the crawl depth based on the scope of your assessment and available time.
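To illustrate what a configurable crawl depth means in practice, here is a minimal sketch of a depth-limited, breadth-first crawl. It is not Sidikjari's own code: the `crawl` function, the `get_links` callback, and the `DOCUMENT_EXTENSIONS` list (built from the file types named in this post) are assumptions made for the example.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical extension list, based on the file types named in this post.
DOCUMENT_EXTENSIONS = (
    ".pdf", ".docx", ".xlsx", ".pptx",
    ".jpg", ".jpeg", ".png", ".gif", ".csv",
)


def crawl(start_url, get_links, max_depth=2):
    """Breadth-first crawl that follows links at most max_depth hops.

    get_links is a caller-supplied function that returns the absolute
    URLs found on a page. Returns (pages_fetched, documents_found).
    """
    seen = {start_url}
    fetched = set()
    documents = set()
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        fetched.add(url)
        for link in get_links(url):
            # Compare the URL path only, so query strings do not hide a document.
            path = urlparse(link).path.lower()
            if path.endswith(DOCUMENT_EXTENSIONS):
                documents.add(link)
            elif depth < max_depth and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return fetched, documents
```

Passing the link extractor in as a function keeps the traversal logic separate from networking, so request delays, threading, or a custom user agent could live inside `get_links` without changing the depth handling.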
Document Metadata Extraction
The tool's core strength lies in its ability to extract detailed metadata from various document types:
PDF documents
Microsoft Office files (DOCX, XLSX, PPTX)
Image files (JPG, JPEG, PNG, GIF)
CSV files
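As an illustration of the kind of metadata involved, the sketch below reads the core properties of an Office file using only the Python standard library. DOCX, XLSX, and PPTX files are ZIP archives that normally carry a docProps/core.xml part. This is an assumption-laden example, not how Sidikjari does it (the tool relies on ExifTool); the `read_core_properties` function and the `FIELDS` mapping are hypothetical names.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces used by the core properties part of Office Open XML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical mapping from output keys to core property elements.
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(path):
    """Return a dict of core properties from a DOCX, XLSX, or PPTX file."""
    with zipfile.ZipFile(path) as archive:
        try:
            xml_bytes = archive.read("docProps/core.xml")
        except KeyError:
            return {}  # the archive has no core properties part
    root = ET.fromstring(xml_bytes)
    result = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            result[name] = element.text
    return result
```

A call such as `read_core_properties("report.docx")` (a hypothetical file name) would return whichever of the four fields are present in that file.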
Advanced Information Discovery
Beyond basic metadata fields, Sidikjari looks for:
Usernames and author information
Email addresses
Internal domain names
Software versions and creation tools
File paths that may reveal system architecture
IP addresses embedded in documents
GPS coordinates from geotagged files
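Several of these items can be pulled out of raw metadata strings with pattern matching. The sketch below shows one possible approach for email addresses, IPv4 addresses, and usernames embedded in Windows file paths. The patterns and the `find_indicators` function are assumptions made for illustration and are deliberately simple; they are not taken from Sidikjari's source.

```python
import ipaddress
import re

# Deliberately simple patterns; real-world data may need stricter ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_CANDIDATE_RE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")
WINDOWS_USER_RE = re.compile(r"[A-Za-z]:\\Users\\([^\\]+)\\")


def find_indicators(text):
    """Return emails, IPv4 addresses, and path usernames found in text."""
    ips = set()
    for candidate in IPV4_CANDIDATE_RE.findall(text):
        try:
            # Reject candidates such as 999.1.1.1 that only look like addresses.
            ips.add(str(ipaddress.IPv4Address(candidate)))
        except ValueError:
            pass
    return {
        "emails": set(EMAIL_RE.findall(text)),
        "ips": ips,
        "usernames": set(WINDOWS_USER_RE.findall(text)),
    }
```

Validating each candidate with the `ipaddress` module keeps strings that merely resemble an address out of the results.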
Form Discovery and Analysis
The tool automatically identifies and captures sensitive web forms, including:
Login forms
Registration forms
Contact and subscription forms
Payment forms
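One way to picture form discovery is as a small heuristic over the input fields of each form. The following sketch uses Python's built-in HTML parser; the `FormCollector` class, the `classify` function, and the classification rules are all assumptions for illustration and may differ from what Sidikjari actually does.

```python
from html.parser import HTMLParser


class FormCollector(HTMLParser):
    """Collect each form's action and its input (type, name) pairs."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action") or "", "fields": []}
        elif tag == "input" and self._current is not None:
            field_type = (attrs.get("type") or "text").lower()
            field_name = (attrs.get("name") or "").lower()
            self._current["fields"].append((field_type, field_name))

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def classify(form):
    """Guess a form's purpose from its fields (hypothetical heuristic)."""
    types = [field_type for field_type, _ in form["fields"]]
    names = " ".join(field_name for _, field_name in form["fields"])
    if any(keyword in names for keyword in ("card", "cvv", "cvc")):
        return "payment"
    passwords = types.count("password")
    if passwords >= 2:
        return "registration"  # password plus confirmation field
    if passwords == 1:
        return "login"
    if "email" in types or "email" in names:
        return "contact_or_subscription"
    return "other"
```

Feeding a page's HTML to `FormCollector().feed(...)` and calling `classify` on each collected form gives a rough label that a reviewer can then confirm by hand.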
Domain Intelligence
Sidikjari performs domain analysis:
WHOIS information collection
DNS record analysis
SSL certificate analysis and security assessment
IP geolocation and network details
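As a rough sketch of the certificate portion of this analysis, the example below fetches a server certificate with Python's standard `ssl` module and summarizes its remaining validity and the DNS names it covers. The function names `fetch_certificate` and `summarize_certificate` are hypothetical, and this is only one possible approach, not Sidikjari's implementation.

```python
import socket
import ssl
import time


def fetch_certificate(host, port=443, timeout=5.0):
    """Return the validated peer certificate of host as a dict."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def summarize_certificate(cert, now=None):
    """Summarize expiry and DNS names from a getpeercert() style dict."""
    now = time.time() if now is None else now
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    dns_names = sorted(
        value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"
    )
    return {
        "days_remaining": int((expires - now) // 86400),
        "dns_names": dns_names,
    }
```

Because the default context verifies the chain and hostname, `fetch_certificate` raises an exception for expired or self-signed certificates, which is itself a finding worth recording. The subject alternative names often list additional hostnames that belong to the same organization.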
Enhanced Visualization
The tool generates interactive HTML reports with useful visual elements:
Website screenshots for context and reference
Interactive maps showing locations of geotagged content
Relationship graphs connecting discovered entities (users, emails, domains)
Collapsible document sections organized by file type for easier navigation
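The data behind a relationship graph can be as simple as an adjacency map that links entities found in the same document. The sketch below assumes each document has already been reduced to a list of `(kind, value)` pairs; that input shape and the `build_entity_graph` function are assumptions for the example, not Sidikjari's internal format.

```python
from collections import defaultdict
from itertools import combinations


def build_entity_graph(documents):
    """Link every pair of entities that co-occur in the same document.

    documents maps a document name to a list of (kind, value) tuples,
    for example ("user", "jdoe") or ("email", "jdoe@corp.example").
    Returns a dict mapping each entity to the set of its neighbours.
    """
    graph = defaultdict(set)
    for entities in documents.values():
        for first, second in combinations(sorted(set(entities)), 2):
            graph[first].add(second)
            graph[second].add(first)
    return dict(graph)
```

An entity that appears in many documents, such as a single author name, ends up with many neighbours, which is the pattern a visual graph makes easy to spot.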
Flexible Deployment Options
Sidikjari can be used in two primary modes:
Website scanning mode: Crawls websites to discover and analyze documents
Local directory mode: Processes files from a local directory for metadata extraction
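Local directory mode amounts to walking a folder tree and selecting the supported file types. A minimal sketch of that selection step follows; the `SUPPORTED_SUFFIXES` set is assembled from the formats listed in this post and the `find_supported_files` function is a hypothetical name, so treat both as assumptions.

```python
from pathlib import Path

# Hypothetical set, built from the file types named in this post.
SUPPORTED_SUFFIXES = {
    ".pdf", ".docx", ".xlsx", ".pptx",
    ".jpg", ".jpeg", ".png", ".gif", ".csv",
}


def find_supported_files(root):
    """Recursively list files under root whose suffix is supported."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

Lower-casing the suffix means files such as `SCAN.PDF` are not skipped on case-sensitive file systems.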
Real-World Applications
For Penetration Testers
Discover potential security weaknesses through metadata
Identify information leakage in published documents
Map organizational structure through author and user information
Uncover network infrastructure details
For Security Auditors
Assess document security practices
Verify proper data sanitization before publication
Identify sensitive information exposure risks
Document organizational data handling practices
For Digital Forensics
Extract timestamps and authorship information
Determine software versions used to create documents
Identify geographic locations from file metadata
Map relationships between digital assets
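Geographic locations in image metadata are typically stored as degrees, minutes, and seconds plus a hemisphere reference (N, S, E, or W). Plotting them on a map requires decimal degrees. The conversion is shown below as a small sketch; the `dms_to_decimal` function name is an assumption for this example.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds and a hemisphere to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # Southern latitudes and western longitudes are negative.
    return -value if ref.upper() in ("S", "W") else value
```

For example, 40 degrees 26 minutes 46 seconds N works out to roughly 40.4461, and the same magnitude with a W reference would be negative.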
Where to Download
Sidikjari is available as an open-source project on GitHub. You can download the latest version from the official repository:
GitHub Repository: https://github.com/sec0ps/sidikjari
Simply clone the repository or download the ZIP file to get started:
git clone https://github.com/sec0ps/sidikjari.git
cd sidikjari
Getting Started
Sidikjari requires Python and a few dependencies, with ExifTool being the primary external requirement. The tool supports customizable parameters including:
Target URL or local directory
Crawl depth
Threading for performance optimization
Time delays between requests
Custom user agents
Basic usage for website scanning:
python sidikjari.py --url example.com --depth 2 --output report_directory
For local directory analysis:
python sidikjari.py --local /path/to/documents --output report_directory
Security and Ethics
As with any powerful security tool, Sidikjari should be used responsibly and ethically. Always ensure you have proper authorization before scanning websites or analyzing documents that don't belong to you. The tool is designed for legitimate security assessments, penetration testing, and digital forensics work.
Practical Value
Metadata is a consistently valuable source of intelligence, because many organizations fail to sanitize it before publishing files. Sidikjari helps security professionals tap into this source efficiently, turning seemingly innocuous file properties into actionable intelligence.
For penetration testers, security auditors, and digital forensics investigators, Sidikjari saves time on metadata analysis and can surface details that manual inspection would likely miss. Its automation makes it a practical addition to a security assessment toolkit.
Have questions or want to see how Sidikjari fits into your security operations?
📅 Book time with us to chat with our team.