


Introducing Sidikjari: Advanced Metadata Extraction for Cybersecurity Professionals

Metadata analysis is a crucial part of security assessments, but it can be tedious and time-consuming. That's why we developed Sidikjari, a Python-based tool that automates metadata extraction and analysis for security professionals. It's designed to streamline intelligence gathering during penetration tests, security audits, and digital forensics work.


What is Sidikjari?

Sidikjari (developed by Red Cell Security, LLC) is an open-source tool that automates metadata discovery and analysis. It efficiently crawls websites, retrieves documents, and extracts valuable metadata that organizations often inadvertently leave exposed. The tool reveals information that's technically public but typically hidden from casual observation, providing security professionals with actionable intelligence.


Key Features


Efficient Website Crawling

Sidikjari methodically explores target websites to identify documents and sensitive forms. You can configure the crawl depth based on the scope of your assessment and available time.
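
To illustrate what a depth-limited crawl involves, here is a minimal sketch using only the Python standard library. It is not Sidikjari's actual crawler: the names `crawl`, `LinkParser`, and `DOC_EXTENSIONS` are hypothetical, and `fetch` is an assumed callable that returns a page's HTML for a given URL, so that request delays, user agents, and error handling can live elsewhere.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

# Assumed list of interesting extensions, taken from the file types named in this post.
DOC_EXTENSIONS = (".pdf", ".docx", ".xlsx", ".pptx",
                  ".jpg", ".jpeg", ".png", ".gif", ".csv")


class LinkParser(HTMLParser):
    """Collects href values from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl of a single host; returns sorted document URLs.

    max_depth is the number of link hops followed from start_url.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    documents = set()
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            # Resolve relative links and drop #fragments.
            link = urldefrag(urljoin(url, href)).url
            parsed = urlparse(link)
            if parsed.netloc != host:
                continue  # stay inside the assessment scope
            if parsed.path.lower().endswith(DOC_EXTENSIONS):
                documents.add(link)
            elif depth < max_depth and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return sorted(documents)
```

Passing `fetch` in as a parameter keeps the traversal logic separate from networking, which also makes the depth limit easy to test against canned pages.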


Document Metadata Extraction

The tool's core strength lies in its ability to extract detailed metadata from various document types:

  • PDF documents

  • Microsoft Office files (DOCX, XLSX, PPTX)

  • Image files (JPG, JPEG, PNG, GIF)

  • CSV files
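
As a rough illustration of where Office metadata lives: DOCX, XLSX, and PPTX files are ZIP archives, and author details are typically stored in `docProps/core.xml`. The sketch below reads a few of those fields with the standard library. The function name `read_core_properties` and the output keys are hypothetical, and Sidikjari itself may rely on ExifTool or other libraries rather than this approach.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the Open Packaging Conventions core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Output key -> element to look up (key names are illustrative).
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(source):
    """Return selected core properties from a DOCX/XLSX/PPTX path or file object.

    Raises KeyError if the archive has no docProps/core.xml part.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    result = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            result[name] = element.text
    return result
```

Fields that are missing or empty are simply left out of the result, so a sanitized document yields an empty dictionary.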


Advanced Information Discovery

Beyond basic file properties, Sidikjari searches the extracted metadata for:

  • Usernames and author information

  • Email addresses

  • Internal domain names

  • Software versions and creation tools

  • File paths that may reveal system architecture

  • IP addresses embedded in documents

  • GPS coordinates from geotagged files
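
Several of the items above can be pulled out of raw metadata strings with pattern matching. The following is a simplified sketch of that idea, not Sidikjari's implementation: the patterns and the `find_indicators` function are assumptions made for illustration, and they are intentionally loose (for example, a four-part version string can look like an IPv4 address).

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Windows-style absolute paths such as C:\Users\name\file.docx
WINDOWS_PATH_RE = re.compile(r'\b[A-Za-z]:\\[^\s"<>|]+')
# The profile folder name in a path often matches a login name.
USER_DIR_RE = re.compile(r"\\Users\\([^\\\s]+)", re.IGNORECASE)


def find_indicators(text):
    """Return de-duplicated, sorted indicators found in a metadata string."""
    emails = set(EMAIL_RE.findall(text))
    # Keep only dotted quads whose octets are all in the 0-255 range.
    ips = {
        ip for ip in IPV4_RE.findall(text)
        if all(int(part) <= 255 for part in ip.split("."))
    }
    paths = set(WINDOWS_PATH_RE.findall(text))
    usernames = set()
    for path in paths:
        match = USER_DIR_RE.search(path)
        if match:
            usernames.add(match.group(1))
    return {
        "emails": sorted(emails),
        "ips": sorted(ips),
        "paths": sorted(paths),
        "usernames": sorted(usernames),
    }
```

In practice the input would be the concatenated values of fields such as author, template path, or producer, and every hit still needs a human review before it goes into a report.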


Form Discovery and Analysis

The tool automatically identifies and captures sensitive web forms, including:

  • Login forms

  • Registration forms

  • Contact and subscription forms

  • Payment forms
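
One way to approach form discovery is to parse each page's HTML and classify forms by their input fields. The sketch below uses the standard library parser with a deliberately simple heuristic. `FormFinder`, `classify`, and the category labels are hypothetical, and the real tool's detection rules are not documented here.

```python
from html.parser import HTMLParser


def classify(inputs):
    """Guess a form's purpose from (type, name) pairs of its inputs."""
    types = [input_type for input_type, _ in inputs]
    names = " ".join(name for _, name in inputs)
    if any(keyword in names for keyword in ("card", "cvv", "cvc")):
        return "payment"
    if types.count("password") >= 2:
        return "registration"  # password plus confirmation field
    if "password" in types:
        return "login"
    if "email" in types or "email" in names:
        return "contact_or_subscription"
    return "other"


class FormFinder(HTMLParser):
    """Collects forms with their action, method, inputs, and guessed kind."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "inputs": [],
            }
        elif tag == "input" and self._current is not None:
            self._current["inputs"].append(
                ((attrs.get("type") or "text").lower(),
                 (attrs.get("name") or "").lower())
            )

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self._current["kind"] = classify(self._current["inputs"])
            self.forms.append(self._current)
            self._current = None
```

Usage is a matter of calling `FormFinder().feed(html)` on each crawled page and reading the `forms` list afterwards.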


Domain Intelligence

Sidikjari also gathers intelligence about the target domain itself:

  • WHOIS information collection

  • DNS record analysis

  • SSL certificate analysis and security assessment

  • IP geolocation and network details
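
As an example of the certificate side of this analysis, the sketch below retrieves a server certificate with Python's `ssl` module and summarizes its subject, issuer, alternative names, and remaining validity. The function names and the summary layout are assumptions for illustration. Sidikjari's own checks are likely broader.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_certificate(host, port=443, timeout=5.0):
    """Return the validated peer certificate as a dict (needs network access)."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def summarize_certificate(cert, now=None):
    """Summarize a certificate dict in the format returned by getpeercert()."""
    now = now or datetime.now(timezone.utc)
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(cert["notAfter"]), tz=timezone.utc
    )
    # subject and issuer are tuples of RDNs, each a tuple of (key, value) pairs.
    subject = dict(pair for rdn in cert.get("subject", ()) for pair in rdn)
    issuer = dict(pair for rdn in cert.get("issuer", ()) for pair in rdn)
    return {
        "common_name": subject.get("commonName"),
        "issuer": issuer.get("organizationName"),
        "alt_names": [value for kind, value in cert.get("subjectAltName", ())
                      if kind == "DNS"],
        "days_remaining": (expires - now).days,
    }
```

The alternative names are often the most useful part for reconnaissance, since they can list hostnames that do not appear anywhere on the public site.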


Enhanced Visualization

The tool generates interactive HTML reports with useful visual elements:

  • Website screenshots for context and reference

  • Interactive maps showing locations of geotagged content

  • Relationship graphs connecting discovered entities (users, emails, domains)

  • Collapsible document sections organized by file type for easier navigation
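
Plotting geotagged content on a map requires decimal coordinates, while EXIF stores GPS positions as degrees, minutes, and seconds plus a hemisphere reference (N/S/E/W). A small conversion helper, shown here as an illustrative sketch with a hypothetical function name, bridges the two:

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert an EXIF-style degrees/minutes/seconds value to decimal degrees.

    ref is the hemisphere letter: "N" or "E" give positive values,
    "S" or "W" give negative values.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref.upper() in ("S", "W") else value
```

For instance, 40° 26' 46" N works out to roughly 40.4461, and the same magnitude with a "W" reference would be negative.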


Flexible Deployment Options

Sidikjari can be used in two primary modes:

  • Website scanning mode: Crawls websites to discover and analyze documents

  • Local directory mode: Processes files from a local directory for metadata extraction
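
For a sense of what local directory mode has to do before any extraction starts, this sketch walks a folder recursively and groups supported files by type. The `SUPPORTED` set mirrors the file types listed earlier in this post. The function name and return shape are hypothetical, not the tool's internal API.

```python
from pathlib import Path

# Extensions taken from the document types this post says are supported.
SUPPORTED = {".pdf", ".docx", ".xlsx", ".pptx",
             ".jpg", ".jpeg", ".png", ".gif", ".csv"}


def group_supported_files(root):
    """Return {extension: [Path, ...]} for supported files under root."""
    groups = {}
    for path in sorted(Path(root).rglob("*")):
        suffix = path.suffix.lower()  # match .PDF as well as .pdf
        if path.is_file() and suffix in SUPPORTED:
            groups.setdefault(suffix.lstrip("."), []).append(path)
    return groups
```

Grouping by extension up front also lines up with a report that organizes documents by file type.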


Real-World Applications


For Penetration Testers

  • Discover potential security weaknesses through metadata

  • Identify information leakage in published documents

  • Map organizational structure through author and user information

  • Uncover network infrastructure details


For Security Auditors

  • Assess document security practices

  • Verify proper data sanitization before publication

  • Identify sensitive information exposure risks

  • Document organizational data handling practices


For Digital Forensics

  • Extract timestamps and authorship information

  • Determine software versions used to create documents

  • Identify geographic locations from file metadata

  • Map relationships between digital assets


Where to Download

Sidikjari is available as an open-source project on GitHub. You can download the latest version from the official repository at https://github.com/sec0ps/sidikjari.

Clone the repository or download the ZIP file to get started:

git clone https://github.com/sec0ps/sidikjari.git
cd sidikjari

Getting Started

Sidikjari requires Python and a few dependencies, with ExifTool being the primary external requirement. The tool supports customizable parameters including:

  • Target URL or local directory

  • Crawl depth

  • Threading for performance optimization

  • Time delays between requests

  • Custom user agents

Basic usage for website scanning:

python sidikjari.py --url example.com --depth 2 --output report_directory

For local directory analysis:

python sidikjari.py --local /path/to/documents --output report_directory
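
Because ExifTool is the primary external requirement, it can be useful to confirm it is installed before a scan. The sketch below checks for the `exiftool` binary on the PATH and shows one way to read a file's metadata through ExifTool's `-json` output. The helper names are hypothetical, and how Sidikjari itself invokes ExifTool is not covered in this post.

```python
import json
import shutil
import subprocess


def exiftool_available():
    """Return True if an exiftool executable is found on the PATH."""
    return shutil.which("exiftool") is not None


def parse_exiftool_json(output):
    """exiftool -json prints a JSON array with one object per input file."""
    records = json.loads(output)
    return records[0] if records else {}


def read_metadata(path):
    """Run exiftool on a single file and return its tags as a dict."""
    if not exiftool_available():
        raise RuntimeError("exiftool not found; install it and retry")
    completed = subprocess.run(
        ["exiftool", "-json", str(path)],
        capture_output=True, text=True, check=True,
    )
    return parse_exiftool_json(completed.stdout)
```

Keeping the JSON parsing in its own function means the logic can be exercised without ExifTool installed.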

Security and Ethics


As with any powerful security tool, Sidikjari should be used responsibly and ethically. Always ensure you have proper authorization before scanning websites or analyzing documents that don't belong to you. The tool is designed for legitimate security assessments, penetration testing, and digital forensics work.


Practical Value


Metadata is a consistently valuable source of intelligence because many organizations fail to sanitize it before publishing documents. Sidikjari helps security professionals tap into this source efficiently, turning seemingly innocuous file properties into actionable intelligence.


For penetration testers, security auditors, and digital forensics investigators, Sidikjari offers a time-saving approach to metadata analysis and can surface details that are easy to miss during manual inspection. Its automation makes it a practical addition to a security assessment toolkit.


Have questions or want to see how Sidikjari fits into your security operations?


📅 Book time with us to chat with our team.


 
 
 



© 2025 by Red Cell Security, LLC.
