511/68 Monday, December 8, 2025

A critical vulnerability in Apache Tika, tracked as CVE-2025-66516 with a maximum CVSS score of 10.0, enables XML External Entity (XXE) injection across multiple components: the core module (tika-core), the PDF parser module (tika-parser-pdf-module), and the legacy parsers module (tika-parsers). An attacker can embed a crafted XFA file inside a PDF so that Tika resolves external entities while parsing it, potentially exposing sensitive internal data or resources that should otherwise be protected.
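To illustrate the class of flaw, the sketch below shows generic XXE hardening for a JAXP XML parser in Java. It is not Tika's actual patch: the class name XxeHardeningSketch and its helper methods are hypothetical, and the rejection behavior assumes the JDK's built-in parser, which supports the Xerces disallow-doctype-decl feature.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical illustration of generic XXE hardening; not Tika's own code.
final class XxeHardeningSketch {

    // Builds a DOM parser that refuses any document containing a DOCTYPE,
    // which removes the ability to declare external entities at all.
    static DocumentBuilder hardenedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    // Returns true if the hardened parser refuses the input. Any parse error
    // counts as a refusal, including the fatal error raised for a DOCTYPE.
    static boolean isRejected(String xml) {
        try {
            hardenedBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A classic XXE payload: the entity points at a local file.
        String malicious = "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE x [<!ENTITY e SYSTEM \"file:///etc/hostname\">]>"
            + "<x>&e;</x>";
        System.out.println("payload rejected: " + isRejected(malicious));
        System.out.println("plain XML rejected: " + isRejected("<x>ok</x>"));
    }
}
```

Rejecting DOCTYPE declarations outright is the bluntest option. Applications that need DTDs would instead have to disable external general and parameter entities individually.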
Apache Tika is an open-source toolkit widely used for content analysis and extraction across many file formats. It is heavily integrated into search and indexing systems such as Apache Solr and Elasticsearch, document ingestion pipelines, compliance systems, and data analytics platforms. As a result, this vulnerability poses a significant risk to organizations that rely on Tika to process large volumes of files, particularly files from untrusted sources.
The vulnerability is related to CVE-2025-54988, but the new disclosure expands the scope of affected packages. Although the PDF parser module is the entry point for the flaw, the root cause and the fix both lie in tika-core. Updating only the PDF module without also upgrading tika-core to version 3.2.2 or later therefore leaves systems vulnerable.
Additionally, in Apache Tika 1.x, the PDFParser is bundled within tika-parsers, meaning the range of affected packages is broader than initially reported.
Affected versions include:
- tika-core 1.13 – 3.2.1
- tika-parsers 1.13 up to (but not including) 2.0.0
- tika-parser-pdf-module 2.0.0 – 3.2.1
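The ranges above can be turned into a quick dependency check. The following is a minimal sketch, assuming versions are plain dotted numbers. The class name TikaAffectedVersions and its methods are hypothetical, and pre-release qualifiers such as "-BETA" are simply ignored, so treat the result as a hint rather than an authoritative verdict.

```java
// Hypothetical helper that encodes the affected ranges listed in this advisory.
final class TikaAffectedVersions {

    // Parses "major.minor.patch" into three integers. Missing parts default
    // to 0, and anything after a '-' (for example "-BETA") is ignored.
    static int[] parse(String version) {
        String numeric = version.split("-", 2)[0];
        String[] parts = numeric.split("\\.");
        int[] nums = new int[3];
        for (int i = 0; i < nums.length && i < parts.length; i++) {
            nums[i] = Integer.parseInt(parts[i]);
        }
        return nums;
    }

    // Returns a negative, zero, or positive value, like Comparable.compareTo.
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < x.length; i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        return 0;
    }

    // True if the artifact and version fall inside a range listed above.
    static boolean isAffected(String artifactId, String version) {
        switch (artifactId) {
            case "tika-core":
                return compare(version, "1.13") >= 0 && compare(version, "3.2.1") <= 0;
            case "tika-parsers":
                return compare(version, "1.13") >= 0 && compare(version, "2.0.0") < 0;
            case "tika-parser-pdf-module":
                return compare(version, "2.0.0") >= 0 && compare(version, "3.2.1") <= 0;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAffected("tika-core", "3.2.1"));
        System.out.println(isAffected("tika-core", "3.2.2"));
    }
}
```

Because the fix lives in tika-core, a project should run this check against every Tika artifact on its classpath, including transitive dependencies, and not only against the PDF module.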
Users are strongly urged to upgrade immediately to the latest Apache Tika release, making sure tika-core is at version 3.2.2 or later, to prevent exploitation through malicious PDF files containing XFA and to reduce the risk of unauthorized access to internal system resources.
