author: Philip Paeps <philip@FreeBSD.org> 2025-04-02 08:56:02 +0000
committer: Philip Paeps <philip@FreeBSD.org> 2025-04-05 03:19:08 +0000
commit: 41b768ae1970ed484abaaea401453c3902df93c2
tree: a61c52cdc6d4964f40e7c9a2b2d61f4b64a93a6b
parent: 5ae5f71d505ccddc7de235d3f9e3d9bdb03dd454
contrib/expat: import expat 2.7.1

Changes:  https://github.com/libexpat/libexpat/blob/R_2_7_1/expat/Changes
          https://github.com/libexpat/libexpat/blob/R_2_7_0/expat/Changes
Security: CVE-2024-8176

(cherry picked from commit fe9278888fd4414abe2d922e469cf608005f4c65)

-rw-r--r--  contrib/expat/COPYING                        2
-rw-r--r--  contrib/expat/Changes                      123
-rw-r--r--  contrib/expat/Makefile.am                    4
-rw-r--r--  contrib/expat/Makefile.in                    4
-rw-r--r--  contrib/expat/README.md                     18
-rw-r--r--  contrib/expat/configure.ac                   4
-rw-r--r--  contrib/expat/doc/reference.html             9
-rw-r--r--  contrib/expat/doc/xmlwf.1                    2
-rw-r--r--  contrib/expat/doc/xmlwf.xml                  4
-rw-r--r--  contrib/expat/fuzz/xml_lpm_fuzzer.cpp      464
-rw-r--r--  contrib/expat/fuzz/xml_lpm_fuzzer.proto     58
-rw-r--r--  contrib/expat/fuzz/xml_parse_fuzzer.c        2
-rw-r--r--  contrib/expat/fuzz/xml_parsebuffer_fuzzer.c  2
-rw-r--r--  contrib/expat/lib/expat.h                    6
-rw-r--r--  contrib/expat/lib/internal.h                 5
-rw-r--r--  contrib/expat/lib/xmlparse.c               586
-rw-r--r--  contrib/expat/tests/acc_tests.c              5
-rw-r--r--  contrib/expat/tests/alloc_tests.c           27
-rw-r--r--  contrib/expat/tests/basic_tests.c          331
-rw-r--r--  contrib/expat/tests/benchmark/benchmark.c   57
-rw-r--r--  contrib/expat/tests/common.c                33
-rw-r--r--  contrib/expat/tests/common.h                 4
-rw-r--r--  contrib/expat/tests/handlers.c              23
-rw-r--r--  contrib/expat/tests/handlers.h               9
-rw-r--r--  contrib/expat/tests/minicheck.h              6
-rw-r--r--  contrib/expat/tests/misc_tests.c           247
-rwxr-xr-x  contrib/expat/tests/xmltest.sh               5
-rw-r--r--  contrib/expat/xmlwf/readfilemap.c            3
28 files changed, 1779 insertions, 264 deletions
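
Note on the version info bump: the imported Changes file records a libtool version info change from 11:1:10 (libexpat*.so.1.10.1) to 11:2:10 (libexpat*.so.1.10.2), and configure.ac sets LIBCURRENT, LIBREVISION and LIBAGE to match. The arithmetic behind those file names can be sketched as below. This is only an illustration that assumes the Linux/ELF libtool convention, where the suffix is (current - age).age.revision. Other platforms map the triple differently. The struct and function names are hypothetical and are not part of Expat or libtool.

```c
#include <assert.h>
#include <stdio.h>

/* Libtool version info triple, written "current:revision:age". */
struct libtool_version {
  unsigned current;
  unsigned revision;
  unsigned age;
};

/*
 * Hypothetical helper (not part of Expat or libtool): formats the numeric
 * suffix that follows ".so." under the Linux/ELF libtool convention,
 * i.e. (current - age).age.revision.
 * Returns the number of characters snprintf would have written.
 */
static int
format_so_suffix(struct libtool_version v, char *buf, size_t size) {
  assert(v.current >= v.age); /* libtool requires age <= current */
  return snprintf(buf, size, "%u.%u.%u", v.current - v.age, v.age, v.revision);
}
```

Called with {11, 2, 10} this yields "1.10.2", and with {11, 1, 10} it yields "1.10.1", which agrees with the two file names quoted in the Changes entry. Only the revision field moved, so the interface is unchanged between 2.7.0 and 2.7.1.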
diff --git a/contrib/expat/COPYING b/contrib/expat/COPYING
index ce9e5939291e..c6d184a8aae8 100644
--- a/contrib/expat/COPYING
+++ b/contrib/expat/COPYING
@@ -1,5 +1,5 @@
Copyright (c) 1998-2000 Thai Open Source Software Center Ltd and Clark Cooper
-Copyright (c) 2001-2022 Expat maintainers
+Copyright (c) 2001-2025 Expat maintainers
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
diff --git a/contrib/expat/Changes b/contrib/expat/Changes
index aa19f70ae219..9d6c64b6a460 100644
--- a/contrib/expat/Changes
+++ b/contrib/expat/Changes
@@ -11,16 +11,23 @@
!! The following topics need *additional skilled C developers* to progress !!
!! in a timely manner or at all (loosely ordered by descending priority): !!
!! !!
-!! - <blink>fixing a complex non-public security issue</blink>, !!
!! - teaming up on researching and fixing future security reports and !!
!! ClusterFuzz findings with few-days-max response times in communication !!
!! in order to (1) have a sound fix ready before the end of a 90 days !!
!! grace period and (2) in a sustainable manner, !!
+!! - helping CPython Expat bindings with supporting Expat's billion laughs !!
+!! attack protection API (https://github.com/python/cpython/issues/90949): !!
+!! - XML_SetBillionLaughsAttackProtectionActivationThreshold !!
+!! - XML_SetBillionLaughsAttackProtectionMaximumAmplification !!
+!! - helping Perl's XML::Parser Expat bindings with supporting Expat's !!
+!! security API (https://github.com/cpan-authors/XML-Parser/issues/102): !!
+!! - XML_SetBillionLaughsAttackProtectionActivationThreshold !!
+!! - XML_SetBillionLaughsAttackProtectionMaximumAmplification !!
+!! - XML_SetReparseDeferralEnabled !!
!! - implementing and auto-testing XML 1.0r5 support !!
!! (needs discussion before pull requests), !!
!! - smart ideas on fixing the Autotools CMake files generation issue !!
!! without breaking CI (needs discussion before pull requests), !!
-!! - the Windows binaries topic (needs requirements engineering first), !!
!! - pushing migration from `int` to `size_t` further !!
!! including edge-cases test coverage (needs discussion before anything). !!
!! !!
@@ -30,6 +37,116 @@
!! THANK YOU! Sebastian Pipping -- Berlin, 2024-03-09 !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+Release 2.7.1 Thu March 27 2025
+ Bug fixes:
+ #980 #989 Restore event pointer behavior from Expat 2.6.4
+ (that the fix to CVE-2024-8176 changed in 2.7.0);
+ affected API functions are:
+ - XML_GetCurrentByteCount
+ - XML_GetCurrentByteIndex
+ - XML_GetCurrentColumnNumber
+ - XML_GetCurrentLineNumber
+ - XML_GetInputContext
+
+ Other changes:
+ #976 #977 Autotools: Integrate files "fuzz/xml_lpm_fuzzer.{cpp,proto}"
+ with Automake that were missing from 2.7.0 release tarballs
+ #983 #984 Fix printf format specifiers for 32bit Emscripten
+ #992 docs: Promote OpenSSF Best Practices self-certification
+ #978 tests/benchmark: Resolve mistaken double close
+ #986 Address compiler warnings
+ #990 #993 Version info bumped from 11:1:10 (libexpat*.so.1.10.1)
+ to 11:2:10 (libexpat*.so.1.10.2); see https://verbump.de/
+ for what these numbers do
+
+ Infrastructure:
+ #982 CI: Start running Perl XML::Parser integration tests
+ #987 CI: Enforce Clang Static Analyzer clean code
+ #991 CI: Re-enable warning clang-analyzer-valist.Uninitialized
+ for clang-tidy
+ #981 CI: Cover compilation with musl
+ #983 #984 CI: Cover compilation with 32bit Emscripten
+ #976 #977 CI: Protect against fuzzer files missing from future
+ release archives
+
+ Special thanks to:
+ Berkay Eren Ürün
+ Matthew Fernandez
+ and
+ Perl XML::Parser
+
+Release 2.7.0 Thu March 13 2025
+ Security fixes:
+ #893 #973 CVE-2024-8176 -- Fix crash from chaining a large number
+ of entities caused by stack overflow by resolving use of
+ recursion, for all three uses of entities:
+ - general entities in character data ("<e>&g1;</e>")
+ - general entities in attribute values ("<e k1='&g1;'/>")
+ - parameter entities ("%p1;")
+ Known impact is (reliable and easy) denial of service:
+ CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H/E:H/RL:O/RC:C
+ (Base Score: 7.5, Temporal Score: 7.2)
+ Please note that a layer of compression around XML can
+ significantly reduce the minimum attack payload size.
+
+ Other changes:
+ #935 #937 Autotools: Make generated CMake files look for
+ libexpat.@SO_MAJOR@.dylib on macOS
+ #925 Autotools: Sync CMake templates with CMake 3.29
+ #945 #962 #966 CMake: Drop support for CMake <3.13
+ #942 CMake: Small fuzzing related improvements
+ #921 docs: Add missing documentation of error code
+ XML_ERROR_NOT_STARTED that was introduced with 2.6.4
+ #941 docs: Document need for C++11 compiler for use from C++
+ #959 tests/benchmark: Fix a (harmless) TOCTTOU
+ #944 Windows: Fix installer target location of file xmlwf.xml
+ for CMake
+ #953 Windows: Address warning -Wunknown-warning-option
+ about -Wno-pedantic-ms-format from LLVM MinGW
+ #971 Address Cppcheck warnings
+ #969 #970 Mass-migrate links from http:// to https://
+ #947 #958 ..
+ #974 #975 Document changes since the previous release
+ #974 #975 Version info bumped from 11:0:10 (libexpat*.so.1.10.0)
+ to 11:1:10 (libexpat*.so.1.10.1); see https://verbump.de/
+ for what these numbers do
+
+ Infrastructure:
+ #926 tests: Increase robustness
+ #927 #932 ..
+ #930 #933 tests: Increase test coverage
+ #617 #950 ..
+ #951 #952 ..
+ #954 #955 .. Fuzzing: Add new fuzzer "xml_lpm_fuzzer" based on
+ #961 Google's libprotobuf-mutator ("LPM")
+ #957 Fuzzing|CI: Start producing fuzzing code coverage reports
+ #936 CI: Pass -q -q for LCOV >=2.1 in coverage.sh
+ #942 CI: Small fuzzing related improvements
+ #139 #203 ..
+ #791 #946 CI: Make GitHub Actions build using MSVC on Windows and
+ produce 32bit and 64bit Windows binaries
+ #956 CI: Get off of about-to-be-removed Ubuntu 20.04
+ #960 #964 CI: Start uploading to Coverity Scan for static analysis
+ #972 CI: Stop loading DTD from the internet to address flaky CI
+ #971 CI: Adapt to breaking changes in Cppcheck
+
+ Special thanks to:
+ Alexander Gieringer
+ Berkay Eren Ürün
+ Hanno Böck
+ Jann Horn
+ Mark Brand
+ Sebastian Andrzej Siewior
+ Snild Dolkow
+ Thomas Pröll
+ Tomas Korbar
+ valord577
+ and
+ Google Project Zero
+ Linutronix
+ Red Hat
+ Siemens
+
Release 2.6.4 Wed November 6 2024
Security fixes:
#915 CVE-2024-50602 -- Fix crash within function XML_ResumeParser
@@ -46,6 +163,8 @@ Release 2.6.4 Wed November 6 2024
#904 tests: Resolve duplicate handler
#317 #918 tests: Improve tests on doctype closing (ex CVE-2019-15903)
#914 Fix signedness of format strings
+ #915 For use from C++, expat.h started requiring C++11 due to
+ use of C99 features
#919 #920 Version info bumped from 10:3:9 (libexpat*.so.1.9.3)
to 11:0:10 (libexpat*.so.1.10.0); see https://verbump.de/
for what these numbers do
diff --git a/contrib/expat/Makefile.am b/contrib/expat/Makefile.am
index 7d8e17c2cf86..c20531a8d6c6 100644
--- a/contrib/expat/Makefile.am
+++ b/contrib/expat/Makefile.am
@@ -6,7 +6,7 @@
# \___/_/\_\ .__/ \__,_|\__|
# |_| XML parser
#
-# Copyright (c) 2017-2023 Sebastian Pipping <sebastian@pipping.org>
+# Copyright (c) 2017-2025 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2018 KangLin <kl222@126.com>
# Copyright (c) 2022 Johnny Jazeix <jazeix@gmail.com>
# Copyright (c) 2023 Sony Corporation / Snild Dolkow <snild@sony.com>
@@ -96,6 +96,8 @@ EXTRA_DIST = \
conftools/expat.m4 \
conftools/get-version.sh \
\
+ fuzz/xml_lpm_fuzzer.cpp \
+ fuzz/xml_lpm_fuzzer.proto \
fuzz/xml_parsebuffer_fuzzer.c \
fuzz/xml_parse_fuzzer.c \
\
diff --git a/contrib/expat/Makefile.in b/contrib/expat/Makefile.in
index c0fcb5dd05d1..069ec4047eea 100644
--- a/contrib/expat/Makefile.in
+++ b/contrib/expat/Makefile.in
@@ -22,7 +22,7 @@
# \___/_/\_\ .__/ \__,_|\__|
# |_| XML parser
#
-# Copyright (c) 2017-2023 Sebastian Pipping <sebastian@pipping.org>
+# Copyright (c) 2017-2025 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2018 KangLin <kl222@126.com>
# Copyright (c) 2022 Johnny Jazeix <jazeix@gmail.com>
# Copyright (c) 2023 Sony Corporation / Snild Dolkow <snild@sony.com>
@@ -494,6 +494,8 @@ EXTRA_DIST = \
conftools/expat.m4 \
conftools/get-version.sh \
\
+ fuzz/xml_lpm_fuzzer.cpp \
+ fuzz/xml_lpm_fuzzer.proto \
fuzz/xml_parsebuffer_fuzzer.c \
fuzz/xml_parse_fuzzer.c \
\
diff --git a/contrib/expat/README.md b/contrib/expat/README.md
index 23d26dad2b92..77c6bf27d307 100644
--- a/contrib/expat/README.md
+++ b/contrib/expat/README.md
@@ -3,6 +3,7 @@
[![Packaging status](https://repology.org/badge/tiny-repos/expat.svg)](https://repology.org/metapackage/expat/versions)
[![Downloads SourceForge](https://img.shields.io/sourceforge/dt/expat?label=Downloads%20SourceForge)](https://sourceforge.net/projects/expat/files/)
[![Downloads GitHub](https://img.shields.io/github/downloads/libexpat/libexpat/total?label=Downloads%20GitHub)](https://github.com/libexpat/libexpat/releases)
+[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/10205/badge)](https://www.bestpractices.dev/projects/10205)
> [!CAUTION]
>
@@ -11,7 +12,7 @@
> at the top of the `Changes` file.
-# Expat, Release 2.6.4
+# Expat, Release 2.7.1
This is Expat, a C99 library for parsing
[XML 1.0 Fourth Edition](https://www.w3.org/TR/2006/REC-xml-20060816/), started by
@@ -22,9 +23,9 @@ are called when the parser discovers the associated structures in the
document being parsed. A start tag is an example of the kind of
structures for which you may register handlers.
-Expat supports the following compilers:
+Expat supports the following C99 compilers:
-- GNU GCC >=4.5
+- GNU GCC >=4.5 (for use from C) or GNU GCC >=4.8.1 (for use from C++)
- LLVM Clang >=3.5
- Microsoft Visual Studio >=16.0/2019 (rolling `${today} minus 5 years`)
@@ -52,7 +53,7 @@ This approach leverages CMake's own [module `FindEXPAT`](https://cmake.org/cmake
Notice the *uppercase* `EXPAT` in the following example:
```cmake
-cmake_minimum_required(VERSION 3.0) # or 3.10, see below
+cmake_minimum_required(VERSION 3.10)
project(hello VERSION 1.0.0)
@@ -62,12 +63,7 @@ add_executable(hello
hello.c
)
-# a) for CMake >=3.10 (see CMake's FindEXPAT docs)
target_link_libraries(hello PUBLIC EXPAT::EXPAT)
-
-# b) for CMake >=3.0
-target_include_directories(hello PRIVATE ${EXPAT_INCLUDE_DIRS})
-target_link_libraries(hello PUBLIC ${EXPAT_LIBRARIES})
```
### b) `find_package` with Config Mode
@@ -85,7 +81,7 @@ or
Notice the *lowercase* `expat` in the following example:
```cmake
-cmake_minimum_required(VERSION 3.0)
+cmake_minimum_required(VERSION 3.10)
project(hello VERSION 1.0.0)
@@ -295,7 +291,7 @@ EXPAT_ENABLE_INSTALL:BOOL=ON
// Use /MT flag (static CRT) when compiling in MSVC
EXPAT_MSVC_STATIC_CRT:BOOL=OFF
-// Build fuzzers via ossfuzz for the expat library
+// Build fuzzers via OSS-Fuzz for the expat library
EXPAT_OSSFUZZ_BUILD:BOOL=OFF
// Build a shared expat library
diff --git a/contrib/expat/configure.ac b/contrib/expat/configure.ac
index fffcd125e9c4..0c88b8867019 100644
--- a/contrib/expat/configure.ac
+++ b/contrib/expat/configure.ac
@@ -11,7 +11,7 @@ dnl Copyright (c) 2000 Clark Cooper <coopercc@users.sourceforge.net>
dnl Copyright (c) 2000-2005 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
dnl Copyright (c) 2001-2003 Greg Stein <gstein@users.sourceforge.net>
dnl Copyright (c) 2006-2012 Karl Waclawek <karl@waclawek.net>
-dnl Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
+dnl Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
dnl Copyright (c) 2017 S. P. Zeidler <spz@netbsd.org>
dnl Copyright (c) 2017 Stephen Groat <stephen@groat.us>
dnl Copyright (c) 2017-2020 Joe Orton <jorton@redhat.com>
@@ -85,7 +85,7 @@ dnl If the API changes incompatibly set LIBAGE back to 0
dnl
LIBCURRENT=11 # sync
-LIBREVISION=0 # with
+LIBREVISION=2 # with
LIBAGE=10 # CMakeLists.txt!
AC_CONFIG_HEADERS([expat_config.h])
diff --git a/contrib/expat/doc/reference.html b/contrib/expat/doc/reference.html
index c2ae9bb71431..2b3bd39580a9 100644
--- a/contrib/expat/doc/reference.html
+++ b/contrib/expat/doc/reference.html
@@ -14,7 +14,7 @@
Copyright (c) 2000 Clark Cooper <coopercc@users.sourceforge.net>
Copyright (c) 2000-2004 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2002-2012 Karl Waclawek <karl@waclawek.net>
- Copyright (c) 2017-2024 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2017-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017 Jakub Wilk <jwilk@jwilk.net>
Copyright (c) 2021 Tomas Korbar <tkorbar@redhat.com>
Copyright (c) 2021 Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
@@ -52,7 +52,7 @@
<div>
<h1>
The Expat XML Parser
- <small>Release 2.6.4</small>
+ <small>Release 2.7.1</small>
</h1>
</div>
<div class="content">
@@ -1267,6 +1267,11 @@ call-backs, except when parsing an external parameter entity and
<code>XML_STATUS_ERROR</code> otherwise. The possible error codes
are:</p>
<dl>
+ <dt><code>XML_ERROR_NOT_STARTED</code></dt>
+ <dd>
+ when stopping or suspending a parser before it has started,
+ added in Expat 2.6.4.
+ </dd>
<dt><code>XML_ERROR_SUSPENDED</code></dt>
<dd>when suspending an already suspended parser.</dd>
<dt><code>XML_ERROR_FINISHED</code></dt>
diff --git a/contrib/expat/doc/xmlwf.1 b/contrib/expat/doc/xmlwf.1
index 61b302581ce9..76aa7e30d074 100644
--- a/contrib/expat/doc/xmlwf.1
+++ b/contrib/expat/doc/xmlwf.1
@@ -5,7 +5,7 @@
\\$2 \(la\\$1\(ra\\$3
..
.if \n(.g .mso www.tmac
-.TH XMLWF 1 "November 6, 2024" "" ""
+.TH XMLWF 1 "March 27, 2025" "" ""
.SH NAME
xmlwf \- Determines if an XML document is well-formed
.SH SYNOPSIS
diff --git a/contrib/expat/doc/xmlwf.xml b/contrib/expat/doc/xmlwf.xml
index cf6d984af463..17e9cf51c191 100644
--- a/contrib/expat/doc/xmlwf.xml
+++ b/contrib/expat/doc/xmlwf.xml
@@ -9,7 +9,7 @@
Copyright (c) 2001 Scott Bronson <bronson@rinspin.com>
Copyright (c) 2002-2003 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2009 Karl Waclawek <karl@waclawek.net>
- Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016 Ardo van Rangelrooij <ardo@debian.org>
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2020 Joe Orton <jorton@redhat.com>
@@ -21,7 +21,7 @@
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
<!ENTITY dhfirstname "<firstname>Scott</firstname>">
<!ENTITY dhsurname "<surname>Bronson</surname>">
- <!ENTITY dhdate "<date>November 6, 2024</date>">
+ <!ENTITY dhdate "<date>March 27, 2025</date>">
<!-- Please adjust this^^ date whenever cutting a new release. -->
<!ENTITY dhsection "<manvolnum>1</manvolnum>">
<!ENTITY dhemail "<email>bronson@rinspin.com</email>">
diff --git a/contrib/expat/fuzz/xml_lpm_fuzzer.cpp b/contrib/expat/fuzz/xml_lpm_fuzzer.cpp
new file mode 100644
index 000000000000..f52ea7b21e40
--- /dev/null
+++ b/contrib/expat/fuzz/xml_lpm_fuzzer.cpp
@@ -0,0 +1,464 @@
+/*
+ __ __ _
+ ___\ \/ /_ __ __ _| |_
+ / _ \\ /| '_ \ / _` | __|
+ | __// \| |_) | (_| | |_
+ \___/_/\_\ .__/ \__,_|\__|
+ |_| XML parser
+
+ Copyright (c) 2022 Mark Brand <markbrand@google.com>
+ Copyright (c) 2025 Sebastian Pipping <sebastian@pipping.org>
+ Licensed under the MIT license:
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to permit
+ persons to whom the Software is furnished to do so, subject to the
+ following conditions:
+
+ The above copyright notice and this permission notice shall be included
+ in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
+ NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
+ DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ USE OR OTHER DEALINGS IN THE SOFTWARE.
+*/
+
+#if defined(NDEBUG)
+# undef NDEBUG // because checks below rely on assert(...)
+#endif
+
+#include <assert.h>
+#include <stdint.h>
+#include <vector>
+
+#include "expat.h"
+#include "xml_lpm_fuzzer.pb.h"
+#include "src/libfuzzer/libfuzzer_macro.h"
+
+static const char *g_encoding = nullptr;
+static const char *g_external_entity = nullptr;
+static size_t g_external_entity_size = 0;
+
+void
+SetEncoding(const xml_lpm_fuzzer::Encoding &e) {
+ switch (e) {
+ case xml_lpm_fuzzer::Encoding::UTF8:
+ g_encoding = "UTF-8";
+ break;
+
+ case xml_lpm_fuzzer::Encoding::UTF16:
+ g_encoding = "UTF-16";
+ break;
+
+ case xml_lpm_fuzzer::Encoding::ISO88591:
+ g_encoding = "ISO-8859-1";
+ break;
+
+ case xml_lpm_fuzzer::Encoding::ASCII:
+ g_encoding = "US-ASCII";
+ break;
+
+ case xml_lpm_fuzzer::Encoding::NONE:
+ g_encoding = NULL;
+ break;
+
+ default:
+ g_encoding = "UNKNOWN";
+ break;
+ }
+}
+
+static int g_allocation_count = 0;
+static std::vector<int> g_fail_allocations = {};
+
+void *
+MallocHook(size_t size) {
+ g_allocation_count += 1;
+ for (auto index : g_fail_allocations) {
+ if (index == g_allocation_count) {
+ return NULL;
+ }
+ }
+ return malloc(size);
+}
+
+void *
+ReallocHook(void *ptr, size_t size) {
+ g_allocation_count += 1;
+ for (auto index : g_fail_allocations) {
+ if (index == g_allocation_count) {
+ return NULL;
+ }
+ }
+ return realloc(ptr, size);
+}
+
+void
+FreeHook(void *ptr) {
+ free(ptr);
+}
+
+XML_Memory_Handling_Suite memory_handling_suite
+ = {MallocHook, ReallocHook, FreeHook};
+
+void InitializeParser(XML_Parser parser);
+
+// We want a parse function that supports resumption, so that we can cover the
+// suspend/resume code.
+enum XML_Status
+Parse(XML_Parser parser, const char *input, int input_len, int is_final) {
+ enum XML_Status status = XML_Parse(parser, input, input_len, is_final);
+ while (status == XML_STATUS_SUSPENDED) {
+ status = XML_ResumeParser(parser);
+ }
+ return status;
+}
+
+// When the fuzzer is compiled with instrumentation such as ASan, then the
+// accesses in TouchString will fault if they access invalid memory (ie. detect
+// either a use-after-free or buffer-overflow). By calling TouchString in each
+// of the callbacks, we can check that the arguments meet the API specifications
+// in terms of length/null-termination. no_optimize is used to ensure that the
+// compiler has to emit actual memory reads, instead of removing them.
+static volatile size_t no_optimize = 0;
+static void
+TouchString(const XML_Char *ptr, int len = -1) {
+ if (! ptr) {
+ return;
+ }
+
+ if (len == -1) {
+ for (XML_Char value = *ptr++; value; value = *ptr++) {
+ no_optimize += value;
+ }
+ } else {
+ for (int i = 0; i < len; ++i) {
+ no_optimize += ptr[i];
+ }
+ }
+}
+
+static void
+TouchNodeAndRecurse(XML_Content *content) {
+ switch (content->type) {
+ case XML_CTYPE_EMPTY:
+ case XML_CTYPE_ANY:
+ assert(content->quant == XML_CQUANT_NONE);
+ assert(content->name == NULL);
+ assert(content->numchildren == 0);
+ assert(content->children == NULL);
+ break;
+
+ case XML_CTYPE_MIXED:
+ assert(content->quant == XML_CQUANT_NONE
+ || content->quant == XML_CQUANT_REP);
+ assert(content->name == NULL);
+ for (unsigned int i = 0; i < content->numchildren; ++i) {
+ assert(content->children[i].type == XML_CTYPE_NAME);
+ assert(content->children[i].quant == XML_CQUANT_NONE);
+ assert(content->children[i].numchildren == 0);
+ assert(content->children[i].children == NULL);
+ TouchString(content->children[i].name);
+ }
+ break;
+
+ case XML_CTYPE_NAME:
+ assert((content->quant == XML_CQUANT_NONE)
+ || (content->quant == XML_CQUANT_OPT)
+ || (content->quant == XML_CQUANT_REP)
+ || (content->quant == XML_CQUANT_PLUS));
+ assert(content->numchildren == 0);
+ assert(content->children == NULL);
+ TouchString(content->name);
+ break;
+
+ case XML_CTYPE_CHOICE:
+ case XML_CTYPE_SEQ:
+ assert((content->quant == XML_CQUANT_NONE)
+ || (content->quant == XML_CQUANT_OPT)
+ || (content->quant == XML_CQUANT_REP)
+ || (content->quant == XML_CQUANT_PLUS));
+ assert(content->name == NULL);
+ for (unsigned int i = 0; i < content->numchildren; ++i) {
+ TouchNodeAndRecurse(&content->children[i]);
+ }
+ break;
+
+ default:
+ assert(false);
+ }
+}
+
+static void XMLCALL
+ElementDeclHandler(void *userData, const XML_Char *name, XML_Content *model) {
+ TouchString(name);
+ TouchNodeAndRecurse(model);
+ XML_FreeContentModel((XML_Parser)userData, model);
+}
+
+static void XMLCALL
+AttlistDeclHandler(void *userData, const XML_Char *elname,
+ const XML_Char *attname, const XML_Char *atttype,
+ const XML_Char *dflt, int isrequired) {
+ (void)userData;
+ TouchString(elname);
+ TouchString(attname);
+ TouchString(atttype);
+ TouchString(dflt);
+ (void)isrequired;
+}
+
+static void XMLCALL
+XmlDeclHandler(void *userData, const XML_Char *version,
+ const XML_Char *encoding, int standalone) {
+ (void)userData;
+ TouchString(version);
+ TouchString(encoding);
+ (void)standalone;
+}
+
+static void XMLCALL
+StartElementHandler(void *userData, const XML_Char *name,
+ const XML_Char **atts) {
+ (void)userData;
+ TouchString(name);
+ for (size_t i = 0; atts[i] != NULL; ++i) {
+ TouchString(atts[i]);
+ }
+}
+
+static void XMLCALL
+EndElementHandler(void *userData, const XML_Char *name) {
+ (void)userData;
+ TouchString(name);
+}
+
+static void XMLCALL
+CharacterDataHandler(void *userData, const XML_Char *s, int len) {
+ (void)userData;
+ TouchString(s, len);
+}
+
+static void XMLCALL
+ProcessingInstructionHandler(void *userData, const XML_Char *target,
+ const XML_Char *data) {
+ (void)userData;
+ TouchString(target);
+ TouchString(data);
+}
+
+static void XMLCALL
+CommentHandler(void *userData, const XML_Char *data) {
+ TouchString(data);
+ // Use the comment handler to trigger parser suspend, so that we can get
+ // coverage of that code.
+ XML_StopParser((XML_Parser)userData, XML_TRUE);
+}
+
+static void XMLCALL
+StartCdataSectionHandler(void *userData) {
+ (void)userData;
+}
+
+static void XMLCALL
+EndCdataSectionHandler(void *userData) {
+ (void)userData;
+}
+
+static void XMLCALL
+DefaultHandler(void *userData, const XML_Char *s, int len) {
+ (void)userData;
+ TouchString(s, len);
+}
+
+static void XMLCALL
+StartDoctypeDeclHandler(void *userData, const XML_Char *doctypeName,
+ const XML_Char *sysid, const XML_Char *pubid,
+ int has_internal_subset) {
+ (void)userData;
+ TouchString(doctypeName);
+ TouchString(sysid);
+ TouchString(pubid);
+ (void)has_internal_subset;
+}
+
+static void XMLCALL
+EndDoctypeDeclHandler(void *userData) {
+ (void)userData;
+}
+
+static void XMLCALL
+EntityDeclHandler(void *userData, const XML_Char *entityName,
+ int is_parameter_entity, const XML_Char *value,
+ int value_length, const XML_Char *base,
+ const XML_Char *systemId, const XML_Char *publicId,
+ const XML_Char *notationName) {
+ (void)userData;
+ TouchString(entityName);
+ (void)is_parameter_entity;
+ TouchString(value, value_length);
+ TouchString(base);
+ TouchString(systemId);
+ TouchString(publicId);
+ TouchString(notationName);
+}
+
+static void XMLCALL
+NotationDeclHandler(void *userData, const XML_Char *notationName,
+ const XML_Char *base, const XML_Char *systemId,
+ const XML_Char *publicId) {
+ (void)userData;
+ TouchString(notationName);
+ TouchString(base);
+ TouchString(systemId);
+ TouchString(publicId);
+}
+
+static void XMLCALL
+StartNamespaceDeclHandler(void *userData, const XML_Char *prefix,
+ const XML_Char *uri) {
+ (void)userData;
+ TouchString(prefix);
+ TouchString(uri);
+}
+
+static void XMLCALL
+EndNamespaceDeclHandler(void *userData, const XML_Char *prefix) {
+ (void)userData;
+ TouchString(prefix);
+}
+
+static int XMLCALL
+NotStandaloneHandler(void *userData) {
+ (void)userData;
+ return XML_STATUS_OK;
+}
+
+static int XMLCALL
+ExternalEntityRefHandler(XML_Parser parser, const XML_Char *context,
+ const XML_Char *base, const XML_Char *systemId,
+ const XML_Char *publicId) {
+ int rc = XML_STATUS_ERROR;
+ TouchString(context);
+ TouchString(base);
+ TouchString(systemId);
+ TouchString(publicId);
+
+ if (g_external_entity) {
+ XML_Parser ext_parser
+ = XML_ExternalEntityParserCreate(parser, context, g_encoding);
+ rc = Parse(ext_parser, g_external_entity, g_external_entity_size, 1);
+ XML_ParserFree(ext_parser);
+ }
+
+ return rc;
+}
+
+static void XMLCALL
+SkippedEntityHandler(void *userData, const XML_Char *entityName,
+ int is_parameter_entity) {
+ (void)userData;
+ TouchString(entityName);
+ (void)is_parameter_entity;
+}
+
+static int XMLCALL
+UnknownEncodingHandler(void *encodingHandlerData, const XML_Char *name,
+ XML_Encoding *info) {
+ (void)encodingHandlerData;
+ TouchString(name);
+ (void)info;
+ return XML_STATUS_ERROR;
+}
+
+void
+InitializeParser(XML_Parser parser) {
+ XML_SetUserData(parser, (void *)parser);
+ XML_SetHashSalt(parser, 0x41414141);
+ XML_SetParamEntityParsing(parser, XML_PARAM_ENTITY_PARSING_ALWAYS);
+
+ XML_SetElementDeclHandler(parser, ElementDeclHandler);
+ XML_SetAttlistDeclHandler(parser, AttlistDeclHandler);
+ XML_SetXmlDeclHandler(parser, XmlDeclHandler);
+ XML_SetElementHandler(parser, StartElementHandler, EndElementHandler);
+ XML_SetCharacterDataHandler(parser, CharacterDataHandler);
+ XML_SetProcessingInstructionHandler(parser, ProcessingInstructionHandler);
+ XML_SetCommentHandler(parser, CommentHandler);
+ XML_SetCdataSectionHandler(parser, StartCdataSectionHandler,
+ EndCdataSectionHandler);
+ // XML_SetDefaultHandler disables entity expansion
+ XML_SetDefaultHandlerExpand(parser, DefaultHandler);
+ XML_SetDoctypeDeclHandler(parser, StartDoctypeDeclHandler,
+ EndDoctypeDeclHandler);
+ // Note: This is mutually exclusive with XML_SetUnparsedEntityDeclHandler,
+ // and there isn't any significant code change between the two.
+ XML_SetEntityDeclHandler(parser, EntityDeclHandler);
+ XML_SetNotationDeclHandler(parser, NotationDeclHandler);
+ XML_SetNamespaceDeclHandler(parser, StartNamespaceDeclHandler,
+ EndNamespaceDeclHandler);
+ XML_SetNotStandaloneHandler(parser, NotStandaloneHandler);
+ XML_SetExternalEntityRefHandler(parser, ExternalEntityRefHandler);
+ XML_SetSkippedEntityHandler(parser, SkippedEntityHandler);
+ XML_SetUnknownEncodingHandler(parser, UnknownEncodingHandler, (void *)parser);
+}
+
+DEFINE_TEXT_PROTO_FUZZER(const xml_lpm_fuzzer::Testcase &testcase) {
+ g_external_entity = nullptr;
+
+ if (! testcase.actions_size()) {
+ return;
+ }
+
+ g_allocation_count = 0;
+ g_fail_allocations.clear();
+ for (int i = 0; i < testcase.fail_allocations_size(); ++i) {
+ g_fail_allocations.push_back(testcase.fail_allocations(i));
+ }
+
+ SetEncoding(testcase.encoding());
+ XML_Parser parser
+ = XML_ParserCreate_MM(g_encoding, &memory_handling_suite, "|");
+ InitializeParser(parser);
+
+ for (int i = 0; i < testcase.actions_size(); ++i) {
+ const auto &action = testcase.actions(i);
+ switch (action.action_case()) {
+ case xml_lpm_fuzzer::Action::kChunk:
+ if (XML_STATUS_ERROR
+ == Parse(parser, action.chunk().data(), action.chunk().size(), 0)) {
+ // Force a reset after parse error.
+ XML_ParserReset(parser, g_encoding);
+ InitializeParser(parser);
+ }
+ break;
+
+ case xml_lpm_fuzzer::Action::kLastChunk:
+ Parse(parser, action.last_chunk().data(), action.last_chunk().size(), 1);
+ XML_ParserReset(parser, g_encoding);
+ InitializeParser(parser);
+ break;
+
+ case xml_lpm_fuzzer::Action::kReset:
+ XML_ParserReset(parser, g_encoding);
+ InitializeParser(parser);
+ break;
+
+ case xml_lpm_fuzzer::Action::kExternalEntity:
+ g_external_entity = action.external_entity().data();
+ g_external_entity_size = action.external_entity().size();
+ break;
+
+ default:
+ break;
+ }
+ }
+
+ XML_ParserFree(parser);
+}
diff --git a/contrib/expat/fuzz/xml_lpm_fuzzer.proto b/contrib/expat/fuzz/xml_lpm_fuzzer.proto
new file mode 100644
index 000000000000..ddc4e958b919
--- /dev/null
+++ b/contrib/expat/fuzz/xml_lpm_fuzzer.proto
@@ -0,0 +1,58 @@
+/*
+ __ __ _
+ ___\ \/ /_ __ __ _| |_
+ / _ \\ /| '_ \ / _` | __|
+ | __// \| |_) | (_| | |_
+ \___/_/\_\ .__/ \__,_|\__|
+ |_| XML parser
+
+ Copyright (c) 2022 Mark Brand <markbrand@google.com>
+ Copyright (c) 2025 Sebastian Pipping <sebastian@pipping.org>
+ Licensed under the MIT license:
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to permit
+ persons to whom the Software is furnished to do so, subject to the
+ following conditions:
+
+ The above copyright notice and this permission notice shall be included
+ in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
+ NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
+ DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ USE OR OTHER DEALINGS IN THE SOFTWARE.
+*/
+
+syntax = "proto2";
+package xml_lpm_fuzzer;
+
+enum Encoding {
+ UTF8 = 0;
+ UTF16 = 1;
+ ISO88591 = 2;
+ ASCII = 3;
+ UNKNOWN = 4;
+ NONE = 5;
+}
+
+message Action {
+ oneof action {
+ string chunk = 1;
+ string last_chunk = 2;
+ bool reset = 3;
+ string external_entity = 4;
+ }
+}
+
+message Testcase {
+ required Encoding encoding = 1;
+ repeated Action actions = 2;
+ repeated int32 fail_allocations = 3;
+}
diff --git a/contrib/expat/fuzz/xml_parse_fuzzer.c b/contrib/expat/fuzz/xml_parse_fuzzer.c
index a7e8414ce355..6a1affe2b1f6 100644
--- a/contrib/expat/fuzz/xml_parse_fuzzer.c
+++ b/contrib/expat/fuzz/xml_parse_fuzzer.c
@@ -5,7 +5,7 @@
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
- * http://www.apache.org/licenses/LICENSE-2.0
+ * https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
diff --git a/contrib/expat/fuzz/xml_parsebuffer_fuzzer.c b/contrib/expat/fuzz/xml_parsebuffer_fuzzer.c
index 0327aa9f952e..cfc4af202851 100644
--- a/contrib/expat/fuzz/xml_parsebuffer_fuzzer.c
+++ b/contrib/expat/fuzz/xml_parsebuffer_fuzzer.c
@@ -5,7 +5,7 @@
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
- * http://www.apache.org/licenses/LICENSE-2.0
+ * https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
diff --git a/contrib/expat/lib/expat.h b/contrib/expat/lib/expat.h
index 523b37d8d578..610e1ddc0e94 100644
--- a/contrib/expat/lib/expat.h
+++ b/contrib/expat/lib/expat.h
@@ -11,7 +11,7 @@
Copyright (c) 2000-2005 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2001-2002 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2002-2016 Karl Waclawek <karl@waclawek.net>
- Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016 Cristian Rodríguez <crrodriguez@opensuse.org>
Copyright (c) 2016 Thomas Beutlich <tc@tbeu.de>
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
@@ -1067,8 +1067,8 @@ XML_SetReparseDeferralEnabled(XML_Parser parser, XML_Bool enabled);
See https://semver.org
*/
#define XML_MAJOR_VERSION 2
-#define XML_MINOR_VERSION 6
-#define XML_MICRO_VERSION 4
+#define XML_MINOR_VERSION 7
+#define XML_MICRO_VERSION 1
#ifdef __cplusplus
}
diff --git a/contrib/expat/lib/internal.h b/contrib/expat/lib/internal.h
index 167ec36804a4..6bde6ae6b31d 100644
--- a/contrib/expat/lib/internal.h
+++ b/contrib/expat/lib/internal.h
@@ -28,7 +28,7 @@
Copyright (c) 2002-2003 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2002-2006 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2003 Greg Stein <gstein@users.sourceforge.net>
- Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2018 Yury Gribov <tetra2005@gmail.com>
Copyright (c) 2019 David Loffredo <loffredo@steptools.com>
Copyright (c) 2023-2024 Sony Corporation / Snild Dolkow <snild@sony.com>
@@ -127,6 +127,9 @@
# elif ULONG_MAX == 18446744073709551615u // 2^64-1
# define EXPAT_FMT_PTRDIFF_T(midpart) "%" midpart "ld"
# define EXPAT_FMT_SIZE_T(midpart) "%" midpart "lu"
+# elif defined(EMSCRIPTEN) // 32bit mode Emscripten
+# define EXPAT_FMT_PTRDIFF_T(midpart) "%" midpart "ld"
+# define EXPAT_FMT_SIZE_T(midpart) "%" midpart "zu"
# else
# define EXPAT_FMT_PTRDIFF_T(midpart) "%" midpart "d"
# define EXPAT_FMT_SIZE_T(midpart) "%" midpart "u"
diff --git a/contrib/expat/lib/xmlparse.c b/contrib/expat/lib/xmlparse.c
index a4e091e7c33c..38a2d9657b6a 100644
--- a/contrib/expat/lib/xmlparse.c
+++ b/contrib/expat/lib/xmlparse.c
@@ -1,4 +1,4 @@
-/* c5625880f4bf417c1463deee4eb92d86ff413f802048621c57e25fe483eb59e4 (2.6.4+)
+/* d19ae032c224863c1527ba44d228cc34b99192c3a4c5a27af1f4e054d45ee031 (2.7.1+)
                             __  __            _
                          ___\ \/ /_ __   __ _| |_
                         / _ \\  /| '_ \ / _` | __|
@@ -13,7 +13,7 @@
Copyright (c) 2002-2016 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2005-2009 Steven Solie <steven@solie.ca>
Copyright (c) 2016 Eric Rahm <erahm@mozilla.com>
- Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016 Gaurav <g.gupta@samsung.com>
Copyright (c) 2016 Thomas Beutlich <tc@tbeu.de>
Copyright (c) 2016 Gustavo Grieco <gustavo.grieco@imag.fr>
@@ -39,7 +39,7 @@
Copyright (c) 2022 Sean McBride <sean@rogue-research.com>
Copyright (c) 2023 Owain Davies <owaind@bath.edu>
Copyright (c) 2023-2024 Sony Corporation / Snild Dolkow <snild@sony.com>
- Copyright (c) 2024 Berkay Eren Ürün <berkay.ueruen@siemens.com>
+ Copyright (c) 2024-2025 Berkay Eren Ürün <berkay.ueruen@siemens.com>
Copyright (c) 2024 Hanno Böck <hanno@gentoo.org>
Licensed under the MIT license:
@@ -325,6 +325,10 @@ typedef struct {
const XML_Char *publicId;
const XML_Char *notation;
XML_Bool open;
+ XML_Bool hasMore; /* true if entity has not been completely processed */
+ /* An entity can be open while being already completely processed (hasMore ==
+ XML_FALSE). The reason is the delayed closing of entities until their inner
+ entities are processed and closed */
XML_Bool is_param;
XML_Bool is_internal; /* true if declared in internal subset outside PE */
} ENTITY;
@@ -415,6 +419,12 @@ typedef struct {
int *scaffIndex;
} DTD;
+enum EntityType {
+ ENTITY_INTERNAL,
+ ENTITY_ATTRIBUTE,
+ ENTITY_VALUE,
+};
+
typedef struct open_internal_entity {
const char *internalEventPtr;
const char *internalEventEndPtr;
@@ -422,6 +432,7 @@ typedef struct open_internal_entity {
ENTITY *entity;
int startTagLevel;
XML_Bool betweenDecl; /* WFC: PE Between Declarations */
+ enum EntityType type;
} OPEN_INTERNAL_ENTITY;
enum XML_Account {
@@ -481,8 +492,8 @@ static enum XML_Error doProlog(XML_Parser parser, const ENCODING *enc,
const char *next, const char **nextPtr,
XML_Bool haveMore, XML_Bool allowClosingDoctype,
enum XML_Account account);
-static enum XML_Error processInternalEntity(XML_Parser parser, ENTITY *entity,
- XML_Bool betweenDecl);
+static enum XML_Error processEntity(XML_Parser parser, ENTITY *entity,
+ XML_Bool betweenDecl, enum EntityType type);
static enum XML_Error doContent(XML_Parser parser, int startTagLevel,
const ENCODING *enc, const char *start,
const char *end, const char **endPtr,
@@ -513,18 +524,22 @@ static enum XML_Error storeAttributeValue(XML_Parser parser,
const char *ptr, const char *end,
STRING_POOL *pool,
enum XML_Account account);
-static enum XML_Error appendAttributeValue(XML_Parser parser,
- const ENCODING *enc,
- XML_Bool isCdata, const char *ptr,
- const char *end, STRING_POOL *pool,
- enum XML_Account account);
+static enum XML_Error
+appendAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata,
+ const char *ptr, const char *end, STRING_POOL *pool,
+ enum XML_Account account, const char **nextPtr);
static ATTRIBUTE_ID *getAttributeId(XML_Parser parser, const ENCODING *enc,
const char *start, const char *end);
static int setElementTypePrefix(XML_Parser parser, ELEMENT_TYPE *elementType);
#if XML_GE == 1
static enum XML_Error storeEntityValue(XML_Parser parser, const ENCODING *enc,
const char *start, const char *end,
- enum XML_Account account);
+ enum XML_Account account,
+ const char **nextPtr);
+static enum XML_Error callStoreEntityValue(XML_Parser parser,
+ const ENCODING *enc,
+ const char *start, const char *end,
+ enum XML_Account account);
#else
static enum XML_Error storeSelfEntityValue(XML_Parser parser, ENTITY *entity);
#endif
@@ -709,6 +724,10 @@ struct XML_ParserStruct {
const char *m_positionPtr;
OPEN_INTERNAL_ENTITY *m_openInternalEntities;
OPEN_INTERNAL_ENTITY *m_freeInternalEntities;
+ OPEN_INTERNAL_ENTITY *m_openAttributeEntities;
+ OPEN_INTERNAL_ENTITY *m_freeAttributeEntities;
+ OPEN_INTERNAL_ENTITY *m_openValueEntities;
+ OPEN_INTERNAL_ENTITY *m_freeValueEntities;
XML_Bool m_defaultExpandInternalEntities;
int m_tagLevel;
ENTITY *m_declEntity;
@@ -756,6 +775,7 @@ struct XML_ParserStruct {
ACCOUNTING m_accounting;
ENTITY_STATS m_entity_stats;
#endif
+ XML_Bool m_reenter;
};
#define MALLOC(parser, s) (parser->m_mem.malloc_fcn((s)))
@@ -1028,7 +1048,29 @@ callProcessor(XML_Parser parser, const char *start, const char *end,
#if defined(XML_TESTING)
g_bytesScanned += (unsigned)have_now;
#endif
- const enum XML_Error ret = parser->m_processor(parser, start, end, endPtr);
+ // Run in a loop to eliminate dangerous recursion depths
+ enum XML_Error ret;
+ *endPtr = start;
+ while (1) {
+ // Use endPtr as the new start in each iteration, since it will
+ // be set to the next start point by m_processor.
+ ret = parser->m_processor(parser, *endPtr, end, endPtr);
+
+ // Make parsing status (and in particular XML_SUSPENDED) take
+ // precedence over re-enter flag when they disagree
+ if (parser->m_parsingStatus.parsing != XML_PARSING) {
+ parser->m_reenter = XML_FALSE;
+ }
+
+ if (! parser->m_reenter) {
+ break;
+ }
+
+ parser->m_reenter = XML_FALSE;
+ if (ret != XML_ERROR_NONE)
+ return ret;
+ }
+
if (ret == XML_ERROR_NONE) {
// if we consumed nothing, remember what we had on this parse attempt.
if (*endPtr == start) {
@@ -1139,6 +1181,8 @@ parserCreate(const XML_Char *encodingName,
parser->m_freeBindingList = NULL;
parser->m_freeTagList = NULL;
parser->m_freeInternalEntities = NULL;
+ parser->m_freeAttributeEntities = NULL;
+ parser->m_freeValueEntities = NULL;
parser->m_groupSize = 0;
parser->m_groupConnector = NULL;
@@ -1241,6 +1285,8 @@ parserInit(XML_Parser parser, const XML_Char *encodingName) {
parser->m_eventEndPtr = NULL;
parser->m_positionPtr = NULL;
parser->m_openInternalEntities = NULL;
+ parser->m_openAttributeEntities = NULL;
+ parser->m_openValueEntities = NULL;
parser->m_defaultExpandInternalEntities = XML_TRUE;
parser->m_tagLevel = 0;
parser->m_tagStack = NULL;
@@ -1251,6 +1297,8 @@ parserInit(XML_Parser parser, const XML_Char *encodingName) {
parser->m_unknownEncodingData = NULL;
parser->m_parentParser = NULL;
parser->m_parsingStatus.parsing = XML_INITIALIZED;
+ // Reentry can only be triggered inside m_processor calls
+ parser->m_reenter = XML_FALSE;
#ifdef XML_DTD
parser->m_isParamEntity = XML_FALSE;
parser->m_useForeignDTD = XML_FALSE;
@@ -1310,6 +1358,24 @@ XML_ParserReset(XML_Parser parser, const XML_Char *encodingName) {
openEntity->next = parser->m_freeInternalEntities;
parser->m_freeInternalEntities = openEntity;
}
+ /* move m_openAttributeEntities to m_freeAttributeEntities (i.e. same task but
+ * for attributes) */
+ openEntityList = parser->m_openAttributeEntities;
+ while (openEntityList) {
+ OPEN_INTERNAL_ENTITY *openEntity = openEntityList;
+ openEntityList = openEntity->next;
+ openEntity->next = parser->m_freeAttributeEntities;
+ parser->m_freeAttributeEntities = openEntity;
+ }
+ /* move m_openValueEntities to m_freeValueEntities (i.e. same task but
+ * for value entities) */
+ openEntityList = parser->m_openValueEntities;
+ while (openEntityList) {
+ OPEN_INTERNAL_ENTITY *openEntity = openEntityList;
+ openEntityList = openEntity->next;
+ openEntity->next = parser->m_freeValueEntities;
+ parser->m_freeValueEntities = openEntity;
+ }
moveToFreeBindingList(parser, parser->m_inheritedBindings);
FREE(parser, parser->m_unknownEncodingMem);
if (parser->m_unknownEncodingRelease)
@@ -1323,6 +1389,19 @@ XML_ParserReset(XML_Parser parser, const XML_Char *encodingName) {
return XML_TRUE;
}
+static XML_Bool
+parserBusy(XML_Parser parser) {
+ switch (parser->m_parsingStatus.parsing) {
+ case XML_PARSING:
+ case XML_SUSPENDED:
+ return XML_TRUE;
+ case XML_INITIALIZED:
+ case XML_FINISHED:
+ default:
+ return XML_FALSE;
+ }
+}
+
enum XML_Status XMLCALL
XML_SetEncoding(XML_Parser parser, const XML_Char *encodingName) {
if (parser == NULL)
@@ -1331,8 +1410,7 @@ XML_SetEncoding(XML_Parser parser, const XML_Char *encodingName) {
XXX There's no way for the caller to determine which of the
XXX possible error cases caused the XML_STATUS_ERROR return.
*/
- if (parser->m_parsingStatus.parsing == XML_PARSING
- || parser->m_parsingStatus.parsing == XML_SUSPENDED)
+ if (parserBusy(parser))
return XML_STATUS_ERROR;
/* Get rid of any previous encoding name */
@@ -1569,7 +1647,34 @@ XML_ParserFree(XML_Parser parser) {
entityList = entityList->next;
FREE(parser, openEntity);
}
-
+ /* free m_openAttributeEntities and m_freeAttributeEntities */
+ entityList = parser->m_openAttributeEntities;
+ for (;;) {
+ OPEN_INTERNAL_ENTITY *openEntity;
+ if (entityList == NULL) {
+ if (parser->m_freeAttributeEntities == NULL)
+ break;
+ entityList = parser->m_freeAttributeEntities;
+ parser->m_freeAttributeEntities = NULL;
+ }
+ openEntity = entityList;
+ entityList = entityList->next;
+ FREE(parser, openEntity);
+ }
+ /* free m_openValueEntities and m_freeValueEntities */
+ entityList = parser->m_openValueEntities;
+ for (;;) {
+ OPEN_INTERNAL_ENTITY *openEntity;
+ if (entityList == NULL) {
+ if (parser->m_freeValueEntities == NULL)
+ break;
+ entityList = parser->m_freeValueEntities;
+ parser->m_freeValueEntities = NULL;
+ }
+ openEntity = entityList;
+ entityList = entityList->next;
+ FREE(parser, openEntity);
+ }
destroyBindings(parser->m_freeBindingList, parser);
destroyBindings(parser->m_inheritedBindings, parser);
poolDestroy(&parser->m_tempPool);
@@ -1611,8 +1716,7 @@ XML_UseForeignDTD(XML_Parser parser, XML_Bool useDTD) {
return XML_ERROR_INVALID_ARGUMENT;
#ifdef XML_DTD
/* block after XML_Parse()/XML_ParseBuffer() has been called */
- if (parser->m_parsingStatus.parsing == XML_PARSING
- || parser->m_parsingStatus.parsing == XML_SUSPENDED)
+ if (parserBusy(parser))
return XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING;
parser->m_useForeignDTD = useDTD;
return XML_ERROR_NONE;
@@ -1627,8 +1731,7 @@ XML_SetReturnNSTriplet(XML_Parser parser, int do_nst) {
if (parser == NULL)
return;
/* block after XML_Parse()/XML_ParseBuffer() has been called */
- if (parser->m_parsingStatus.parsing == XML_PARSING
- || parser->m_parsingStatus.parsing == XML_SUSPENDED)
+ if (parserBusy(parser))
return;
parser->m_ns_triplets = do_nst ? XML_TRUE : XML_FALSE;
}
@@ -1897,8 +2000,7 @@ XML_SetParamEntityParsing(XML_Parser parser,
if (parser == NULL)
return 0;
/* block after XML_Parse()/XML_ParseBuffer() has been called */
- if (parser->m_parsingStatus.parsing == XML_PARSING
- || parser->m_parsingStatus.parsing == XML_SUSPENDED)
+ if (parserBusy(parser))
return 0;
#ifdef XML_DTD
parser->m_paramEntityParsing = peParsing;
@@ -1915,8 +2017,7 @@ XML_SetHashSalt(XML_Parser parser, unsigned long hash_salt) {
if (parser->m_parentParser)
return XML_SetHashSalt(parser->m_parentParser, hash_salt);
/* block after XML_Parse()/XML_ParseBuffer() has been called */
- if (parser->m_parsingStatus.parsing == XML_PARSING
- || parser->m_parsingStatus.parsing == XML_SUSPENDED)
+ if (parserBusy(parser))
return 0;
parser->m_hash_secret_salt = hash_salt;
return 1;
@@ -2230,6 +2331,11 @@ XML_GetBuffer(XML_Parser parser, int len) {
return parser->m_bufferEnd;
}
+static void
+triggerReenter(XML_Parser parser) {
+ parser->m_reenter = XML_TRUE;
+}
+
enum XML_Status XMLCALL
XML_StopParser(XML_Parser parser, XML_Bool resumable) {
if (parser == NULL)
@@ -2704,8 +2810,9 @@ static enum XML_Error PTRCALL
contentProcessor(XML_Parser parser, const char *start, const char *end,
const char **endPtr) {
enum XML_Error result = doContent(
- parser, 0, parser->m_encoding, start, end, endPtr,
- (XML_Bool)! parser->m_parsingStatus.finalBuffer, XML_ACCOUNT_DIRECT);
+ parser, parser->m_parentParser ? 1 : 0, parser->m_encoding, start, end,
+ endPtr, (XML_Bool)! parser->m_parsingStatus.finalBuffer,
+ XML_ACCOUNT_DIRECT);
if (result == XML_ERROR_NONE) {
if (! storeRawNames(parser))
return XML_ERROR_NO_MEMORY;
@@ -2793,6 +2900,11 @@ externalEntityInitProcessor3(XML_Parser parser, const char *start,
return XML_ERROR_NONE;
case XML_FINISHED:
return XML_ERROR_ABORTED;
+ case XML_PARSING:
+ if (parser->m_reenter) {
+ return XML_ERROR_UNEXPECTED_STATE; // LCOV_EXCL_LINE
+ }
+ /* Fall through */
default:
start = next;
}
@@ -2966,7 +3078,7 @@ doContent(XML_Parser parser, int startTagLevel, const ENCODING *enc,
reportDefault(parser, enc, s, next);
break;
}
- result = processInternalEntity(parser, entity, XML_FALSE);
+ result = processEntity(parser, entity, XML_FALSE, ENTITY_INTERNAL);
if (result != XML_ERROR_NONE)
return result;
} else if (parser->m_externalEntityRefHandler) {
@@ -3092,7 +3204,9 @@ doContent(XML_Parser parser, int startTagLevel, const ENCODING *enc,
}
if ((parser->m_tagLevel == 0)
&& (parser->m_parsingStatus.parsing != XML_FINISHED)) {
- if (parser->m_parsingStatus.parsing == XML_SUSPENDED)
+ if (parser->m_parsingStatus.parsing == XML_SUSPENDED
+ || (parser->m_parsingStatus.parsing == XML_PARSING
+ && parser->m_reenter))
parser->m_processor = epilogProcessor;
else
return epilogProcessor(parser, next, end, nextPtr);
@@ -3153,7 +3267,9 @@ doContent(XML_Parser parser, int startTagLevel, const ENCODING *enc,
}
if ((parser->m_tagLevel == 0)
&& (parser->m_parsingStatus.parsing != XML_FINISHED)) {
- if (parser->m_parsingStatus.parsing == XML_SUSPENDED)
+ if (parser->m_parsingStatus.parsing == XML_SUSPENDED
+ || (parser->m_parsingStatus.parsing == XML_PARSING
+ && parser->m_reenter))
parser->m_processor = epilogProcessor;
else
return epilogProcessor(parser, next, end, nextPtr);
@@ -3286,14 +3402,22 @@ doContent(XML_Parser parser, int startTagLevel, const ENCODING *enc,
break;
/* LCOV_EXCL_STOP */
}
- *eventPP = s = next;
switch (parser->m_parsingStatus.parsing) {
case XML_SUSPENDED:
+ *eventPP = next;
*nextPtr = next;
return XML_ERROR_NONE;
case XML_FINISHED:
+ *eventPP = next;
return XML_ERROR_ABORTED;
+ case XML_PARSING:
+ if (parser->m_reenter) {
+ *nextPtr = next;
+ return XML_ERROR_NONE;
+ }
+ /* Fall through */
default:;
+ *eventPP = s = next;
}
}
/* not reached */
@@ -4210,14 +4334,21 @@ doCdataSection(XML_Parser parser, const ENCODING *enc, const char **startPtr,
/* LCOV_EXCL_STOP */
}
- *eventPP = s = next;
switch (parser->m_parsingStatus.parsing) {
case XML_SUSPENDED:
+ *eventPP = next;
*nextPtr = next;
return XML_ERROR_NONE;
case XML_FINISHED:
+ *eventPP = next;
return XML_ERROR_ABORTED;
+ case XML_PARSING:
+ if (parser->m_reenter) {
+ return XML_ERROR_UNEXPECTED_STATE; // LCOV_EXCL_LINE
+ }
+ /* Fall through */
default:;
+ *eventPP = s = next;
}
}
/* not reached */
@@ -4549,7 +4680,7 @@ entityValueInitProcessor(XML_Parser parser, const char *s, const char *end,
}
/* found end of entity value - can store it now */
return storeEntityValue(parser, parser->m_encoding, s, end,
- XML_ACCOUNT_DIRECT);
+ XML_ACCOUNT_DIRECT, NULL);
} else if (tok == XML_TOK_XML_DECL) {
enum XML_Error result;
result = processXmlDecl(parser, 0, start, next);
@@ -4676,7 +4807,7 @@ entityValueProcessor(XML_Parser parser, const char *s, const char *end,
break;
}
/* found end of entity value - can store it now */
- return storeEntityValue(parser, enc, s, end, XML_ACCOUNT_DIRECT);
+ return storeEntityValue(parser, enc, s, end, XML_ACCOUNT_DIRECT, NULL);
}
start = next;
}
@@ -5119,9 +5250,9 @@ doProlog(XML_Parser parser, const ENCODING *enc, const char *s, const char *end,
#if XML_GE == 1
// This will store the given replacement text in
// parser->m_declEntity->textPtr.
- enum XML_Error result
- = storeEntityValue(parser, enc, s + enc->minBytesPerChar,
- next - enc->minBytesPerChar, XML_ACCOUNT_NONE);
+ enum XML_Error result = callStoreEntityValue(
+ parser, enc, s + enc->minBytesPerChar, next - enc->minBytesPerChar,
+ XML_ACCOUNT_NONE);
if (parser->m_declEntity) {
parser->m_declEntity->textPtr = poolStart(&dtd->entityValuePool);
parser->m_declEntity->textLen
@@ -5546,7 +5677,7 @@ doProlog(XML_Parser parser, const ENCODING *enc, const char *s, const char *end,
enum XML_Error result;
XML_Bool betweenDecl
= (role == XML_ROLE_PARAM_ENTITY_REF ? XML_TRUE : XML_FALSE);
- result = processInternalEntity(parser, entity, betweenDecl);
+ result = processEntity(parser, entity, betweenDecl, ENTITY_INTERNAL);
if (result != XML_ERROR_NONE)
return result;
handleDefault = XML_FALSE;
@@ -5751,6 +5882,12 @@ doProlog(XML_Parser parser, const ENCODING *enc, const char *s, const char *end,
return XML_ERROR_NONE;
case XML_FINISHED:
return XML_ERROR_ABORTED;
+ case XML_PARSING:
+ if (parser->m_reenter) {
+ *nextPtr = next;
+ return XML_ERROR_NONE;
+ }
+ /* Fall through */
default:
s = next;
tok = XmlPrologTok(enc, s, end, &next);
@@ -5818,28 +5955,58 @@ epilogProcessor(XML_Parser parser, const char *s, const char *end,
default:
return XML_ERROR_JUNK_AFTER_DOC_ELEMENT;
}
- parser->m_eventPtr = s = next;
switch (parser->m_parsingStatus.parsing) {
case XML_SUSPENDED:
+ parser->m_eventPtr = next;
*nextPtr = next;
return XML_ERROR_NONE;
case XML_FINISHED:
+ parser->m_eventPtr = next;
return XML_ERROR_ABORTED;
+ case XML_PARSING:
+ if (parser->m_reenter) {
+ return XML_ERROR_UNEXPECTED_STATE; // LCOV_EXCL_LINE
+ }
+ /* Fall through */
default:;
+ parser->m_eventPtr = s = next;
}
}
}
static enum XML_Error
-processInternalEntity(XML_Parser parser, ENTITY *entity, XML_Bool betweenDecl) {
- const char *textStart, *textEnd;
- const char *next;
- enum XML_Error result;
- OPEN_INTERNAL_ENTITY *openEntity;
+processEntity(XML_Parser parser, ENTITY *entity, XML_Bool betweenDecl,
+ enum EntityType type) {
+ OPEN_INTERNAL_ENTITY *openEntity, **openEntityList, **freeEntityList;
+ switch (type) {
+ case ENTITY_INTERNAL:
+ parser->m_processor = internalEntityProcessor;
+ openEntityList = &parser->m_openInternalEntities;
+ freeEntityList = &parser->m_freeInternalEntities;
+ break;
+ case ENTITY_ATTRIBUTE:
+ openEntityList = &parser->m_openAttributeEntities;
+ freeEntityList = &parser->m_freeAttributeEntities;
+ break;
+ case ENTITY_VALUE:
+ openEntityList = &parser->m_openValueEntities;
+ freeEntityList = &parser->m_freeValueEntities;
+ break;
+ /* default case serves merely as a safety net in case of a
+ * wrong entityType. Therefore we exclude the following lines
+ * from the test coverage.
+ *
+ * LCOV_EXCL_START
+ */
+ default:
+ // Should not reach here
+ assert(0);
+ /* LCOV_EXCL_STOP */
+ }
- if (parser->m_freeInternalEntities) {
- openEntity = parser->m_freeInternalEntities;
- parser->m_freeInternalEntities = openEntity->next;
+ if (*freeEntityList) {
+ openEntity = *freeEntityList;
+ *freeEntityList = openEntity->next;
} else {
openEntity
= (OPEN_INTERNAL_ENTITY *)MALLOC(parser, sizeof(OPEN_INTERNAL_ENTITY));
@@ -5847,55 +6014,34 @@ processInternalEntity(XML_Parser parser, ENTITY *entity, XML_Bool betweenDecl) {
return XML_ERROR_NO_MEMORY;
}
entity->open = XML_TRUE;
+ entity->hasMore = XML_TRUE;
#if XML_GE == 1
entityTrackingOnOpen(parser, entity, __LINE__);
#endif
entity->processed = 0;
- openEntity->next = parser->m_openInternalEntities;
- parser->m_openInternalEntities = openEntity;
+ openEntity->next = *openEntityList;
+ *openEntityList = openEntity;
openEntity->entity = entity;
+ openEntity->type = type;
openEntity->startTagLevel = parser->m_tagLevel;
openEntity->betweenDecl = betweenDecl;
openEntity->internalEventPtr = NULL;
openEntity->internalEventEndPtr = NULL;
- textStart = (const char *)entity->textPtr;
- textEnd = (const char *)(entity->textPtr + entity->textLen);
- /* Set a safe default value in case 'next' does not get set */
- next = textStart;
-
- if (entity->is_param) {
- int tok
- = XmlPrologTok(parser->m_internalEncoding, textStart, textEnd, &next);
- result = doProlog(parser, parser->m_internalEncoding, textStart, textEnd,
- tok, next, &next, XML_FALSE, XML_FALSE,
- XML_ACCOUNT_ENTITY_EXPANSION);
- } else {
- result = doContent(parser, parser->m_tagLevel, parser->m_internalEncoding,
- textStart, textEnd, &next, XML_FALSE,
- XML_ACCOUNT_ENTITY_EXPANSION);
- }
- if (result == XML_ERROR_NONE) {
- if (textEnd != next && parser->m_parsingStatus.parsing == XML_SUSPENDED) {
- entity->processed = (int)(next - textStart);
- parser->m_processor = internalEntityProcessor;
- } else if (parser->m_openInternalEntities->entity == entity) {
-#if XML_GE == 1
- entityTrackingOnClose(parser, entity, __LINE__);
-#endif /* XML_GE == 1 */
- entity->open = XML_FALSE;
- parser->m_openInternalEntities = openEntity->next;
- /* put openEntity back in list of free instances */
- openEntity->next = parser->m_freeInternalEntities;
- parser->m_freeInternalEntities = openEntity;
- }
+ // Only internal entities make use of the reenter flag
+ // therefore no need to set it for other entity types
+ if (type == ENTITY_INTERNAL) {
+ triggerReenter(parser);
}
- return result;
+ return XML_ERROR_NONE;
}
static enum XML_Error PTRCALL
internalEntityProcessor(XML_Parser parser, const char *s, const char *end,
const char **nextPtr) {
+ UNUSED_P(s);
+ UNUSED_P(end);
+ UNUSED_P(nextPtr);
ENTITY *entity;
const char *textStart, *textEnd;
const char *next;
@@ -5905,68 +6051,67 @@ internalEntityProcessor(XML_Parser parser, const char *s, const char *end,
return XML_ERROR_UNEXPECTED_STATE;
entity = openEntity->entity;
- textStart = ((const char *)entity->textPtr) + entity->processed;
- textEnd = (const char *)(entity->textPtr + entity->textLen);
- /* Set a safe default value in case 'next' does not get set */
- next = textStart;
-
- if (entity->is_param) {
- int tok
- = XmlPrologTok(parser->m_internalEncoding, textStart, textEnd, &next);
- result = doProlog(parser, parser->m_internalEncoding, textStart, textEnd,
- tok, next, &next, XML_FALSE, XML_TRUE,
- XML_ACCOUNT_ENTITY_EXPANSION);
- } else {
- result = doContent(parser, openEntity->startTagLevel,
- parser->m_internalEncoding, textStart, textEnd, &next,
- XML_FALSE, XML_ACCOUNT_ENTITY_EXPANSION);
- }
- if (result != XML_ERROR_NONE)
- return result;
+ // This will return early
+ if (entity->hasMore) {
+ textStart = ((const char *)entity->textPtr) + entity->processed;
+ textEnd = (const char *)(entity->textPtr + entity->textLen);
+ /* Set a safe default value in case 'next' does not get set */
+ next = textStart;
+
+ if (entity->is_param) {
+ int tok
+ = XmlPrologTok(parser->m_internalEncoding, textStart, textEnd, &next);
+ result = doProlog(parser, parser->m_internalEncoding, textStart, textEnd,
+ tok, next, &next, XML_FALSE, XML_FALSE,
+ XML_ACCOUNT_ENTITY_EXPANSION);
+ } else {
+ result = doContent(parser, openEntity->startTagLevel,
+ parser->m_internalEncoding, textStart, textEnd, &next,
+ XML_FALSE, XML_ACCOUNT_ENTITY_EXPANSION);
+ }
- if (textEnd != next && parser->m_parsingStatus.parsing == XML_SUSPENDED) {
- entity->processed = (int)(next - (const char *)entity->textPtr);
+ if (result != XML_ERROR_NONE)
+ return result;
+ // Check if entity is complete, if not, mark down how much of it is
+ // processed
+ if (textEnd != next
+ && (parser->m_parsingStatus.parsing == XML_SUSPENDED
+ || (parser->m_parsingStatus.parsing == XML_PARSING
+ && parser->m_reenter))) {
+ entity->processed = (int)(next - (const char *)entity->textPtr);
+ return result;
+ }
+
+ // Entity is complete. We cannot close it here since we need to first
+ // process its possible inner entities (which are added to the
+ // m_openInternalEntities during doProlog or doContent calls above)
+ entity->hasMore = XML_FALSE;
+ triggerReenter(parser);
return result;
- }
+ } // End of entity processing, "if" block will return here
+ // Remove fully processed openEntity from open entity list.
#if XML_GE == 1
entityTrackingOnClose(parser, entity, __LINE__);
#endif
+ // openEntity is m_openInternalEntities' head, as we set it at the start of
+ // this function and we skipped doProlog and doContent calls with hasMore set
+ // to false. This means we can directly remove the head of
+ // m_openInternalEntities
+ assert(parser->m_openInternalEntities == openEntity);
entity->open = XML_FALSE;
- parser->m_openInternalEntities = openEntity->next;
+ parser->m_openInternalEntities = parser->m_openInternalEntities->next;
+
/* put openEntity back in list of free instances */
openEntity->next = parser->m_freeInternalEntities;
parser->m_freeInternalEntities = openEntity;
- // If there are more open entities we want to stop right here and have the
- // upcoming call to XML_ResumeParser continue with entity content, or it would
- // be ignored altogether.
- if (parser->m_openInternalEntities != NULL
- && parser->m_parsingStatus.parsing == XML_SUSPENDED) {
- return XML_ERROR_NONE;
- }
-
- if (entity->is_param) {
- int tok;
- parser->m_processor = prologProcessor;
- tok = XmlPrologTok(parser->m_encoding, s, end, &next);
- return doProlog(parser, parser->m_encoding, s, end, tok, next, nextPtr,
- (XML_Bool)! parser->m_parsingStatus.finalBuffer, XML_TRUE,
- XML_ACCOUNT_DIRECT);
- } else {
- parser->m_processor = contentProcessor;
- /* see externalEntityContentProcessor vs contentProcessor */
- result = doContent(parser, parser->m_parentParser ? 1 : 0,
- parser->m_encoding, s, end, nextPtr,
- (XML_Bool)! parser->m_parsingStatus.finalBuffer,
- XML_ACCOUNT_DIRECT);
- if (result == XML_ERROR_NONE) {
- if (! storeRawNames(parser))
- return XML_ERROR_NO_MEMORY;
- }
- return result;
+ if (parser->m_openInternalEntities == NULL) {
+ parser->m_processor = entity->is_param ? prologProcessor : contentProcessor;
}
+ triggerReenter(parser);
+ return XML_ERROR_NONE;
}
static enum XML_Error PTRCALL
@@ -5982,8 +6127,70 @@ static enum XML_Error
storeAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata,
const char *ptr, const char *end, STRING_POOL *pool,
enum XML_Account account) {
- enum XML_Error result
- = appendAttributeValue(parser, enc, isCdata, ptr, end, pool, account);
+ const char *next = ptr;
+ enum XML_Error result = XML_ERROR_NONE;
+
+ while (1) {
+ if (! parser->m_openAttributeEntities) {
+ result = appendAttributeValue(parser, enc, isCdata, next, end, pool,
+ account, &next);
+ } else {
+ OPEN_INTERNAL_ENTITY *const openEntity = parser->m_openAttributeEntities;
+ if (! openEntity)
+ return XML_ERROR_UNEXPECTED_STATE;
+
+ ENTITY *const entity = openEntity->entity;
+ const char *const textStart
+ = ((const char *)entity->textPtr) + entity->processed;
+ const char *const textEnd
+ = (const char *)(entity->textPtr + entity->textLen);
+ /* Set a safe default value in case 'next' does not get set */
+ const char *nextInEntity = textStart;
+ if (entity->hasMore) {
+ result = appendAttributeValue(
+ parser, parser->m_internalEncoding, isCdata, textStart, textEnd,
+ pool, XML_ACCOUNT_ENTITY_EXPANSION, &nextInEntity);
+ if (result != XML_ERROR_NONE)
+ break;
+ // Check if entity is complete, if not, mark down how much of it is
+ // processed. A XML_SUSPENDED check here is not required as
+ // appendAttributeValue will never suspend the parser.
+ if (textEnd != nextInEntity) {
+ entity->processed
+ = (int)(nextInEntity - (const char *)entity->textPtr);
+ continue;
+ }
+
+ // Entity is complete. We cannot close it here since we need to first
+ // process its possible inner entities (which are added to the
+ // m_openAttributeEntities during appendAttributeValue)
+ entity->hasMore = XML_FALSE;
+ continue;
+ } // End of entity processing, "if" block skips the rest
+
+ // Remove fully processed openEntity from open entity list.
+#if XML_GE == 1
+ entityTrackingOnClose(parser, entity, __LINE__);
+#endif
+ // openEntity is m_openAttributeEntities' head, since we set it at the
+ // start of this function and because we skipped appendAttributeValue call
+ // with hasMore set to false. This means we can directly remove the head
+ // of m_openAttributeEntities
+ assert(parser->m_openAttributeEntities == openEntity);
+ entity->open = XML_FALSE;
+ parser->m_openAttributeEntities = parser->m_openAttributeEntities->next;
+
+ /* put openEntity back in list of free instances */
+ openEntity->next = parser->m_freeAttributeEntities;
+ parser->m_freeAttributeEntities = openEntity;
+ }
+
+ // Break if an error occurred or there is nothing left to process
+ if (result || (parser->m_openAttributeEntities == NULL && end == next)) {
+ break;
+ }
+ }
+
if (result)
return result;
if (! isCdata && poolLength(pool) && poolLastChar(pool) == 0x20)
@@ -5996,7 +6203,7 @@ storeAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata,
static enum XML_Error
appendAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata,
const char *ptr, const char *end, STRING_POOL *pool,
- enum XML_Account account) {
+ enum XML_Account account, const char **nextPtr) {
DTD *const dtd = parser->m_dtd; /* save one level of indirection */
#ifndef XML_DTD
UNUSED_P(account);
@@ -6014,6 +6221,9 @@ appendAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata,
#endif
switch (tok) {
case XML_TOK_NONE:
+ if (nextPtr) {
+ *nextPtr = next;
+ }
return XML_ERROR_NONE;
case XML_TOK_INVALID:
if (enc == parser->m_encoding)
@@ -6154,21 +6364,11 @@ appendAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata,
return XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF;
} else {
enum XML_Error result;
- const XML_Char *textEnd = entity->textPtr + entity->textLen;
- entity->open = XML_TRUE;
-#if XML_GE == 1
- entityTrackingOnOpen(parser, entity, __LINE__);
-#endif
- result = appendAttributeValue(parser, parser->m_internalEncoding,
- isCdata, (const char *)entity->textPtr,
- (const char *)textEnd, pool,
- XML_ACCOUNT_ENTITY_EXPANSION);
-#if XML_GE == 1
- entityTrackingOnClose(parser, entity, __LINE__);
-#endif
- entity->open = XML_FALSE;
- if (result)
- return result;
+ result = processEntity(parser, entity, XML_FALSE, ENTITY_ATTRIBUTE);
+ if ((result == XML_ERROR_NONE) && (nextPtr != NULL)) {
+ *nextPtr = next;
+ }
+ return result;
}
} break;
default:
@@ -6197,7 +6397,7 @@ appendAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata,
static enum XML_Error
storeEntityValue(XML_Parser parser, const ENCODING *enc,
const char *entityTextPtr, const char *entityTextEnd,
- enum XML_Account account) {
+ enum XML_Account account, const char **nextPtr) {
DTD *const dtd = parser->m_dtd; /* save one level of indirection */
STRING_POOL *pool = &(dtd->entityValuePool);
enum XML_Error result = XML_ERROR_NONE;
@@ -6215,8 +6415,9 @@ storeEntityValue(XML_Parser parser, const ENCODING *enc,
return XML_ERROR_NO_MEMORY;
}
+ const char *next;
for (;;) {
- const char *next
+ next
= entityTextPtr; /* XmlEntityValueTok doesn't always set the last arg */
int tok = XmlEntityValueTok(enc, entityTextPtr, entityTextEnd, &next);
@@ -6278,16 +6479,8 @@ storeEntityValue(XML_Parser parser, const ENCODING *enc,
} else
dtd->keepProcessing = dtd->standalone;
} else {
- entity->open = XML_TRUE;
- entityTrackingOnOpen(parser, entity, __LINE__);
- result = storeEntityValue(
- parser, parser->m_internalEncoding, (const char *)entity->textPtr,
- (const char *)(entity->textPtr + entity->textLen),
- XML_ACCOUNT_ENTITY_EXPANSION);
- entityTrackingOnClose(parser, entity, __LINE__);
- entity->open = XML_FALSE;
- if (result)
- goto endEntityValue;
+ result = processEntity(parser, entity, XML_FALSE, ENTITY_VALUE);
+ goto endEntityValue;
}
break;
}
@@ -6375,6 +6568,81 @@ endEntityValue:
# ifdef XML_DTD
parser->m_prologState.inEntityValue = oldInEntityValue;
# endif /* XML_DTD */
+ // If 'nextPtr' is given, it should be updated during the processing
+ if (nextPtr != NULL) {
+ *nextPtr = next;
+ }
+ return result;
+}
+
+static enum XML_Error
+callStoreEntityValue(XML_Parser parser, const ENCODING *enc,
+ const char *entityTextPtr, const char *entityTextEnd,
+ enum XML_Account account) {
+ const char *next = entityTextPtr;
+ enum XML_Error result = XML_ERROR_NONE;
+ while (1) {
+ if (! parser->m_openValueEntities) {
+ result
+ = storeEntityValue(parser, enc, next, entityTextEnd, account, &next);
+ } else {
+ OPEN_INTERNAL_ENTITY *const openEntity = parser->m_openValueEntities;
+ if (! openEntity)
+ return XML_ERROR_UNEXPECTED_STATE;
+
+ ENTITY *const entity = openEntity->entity;
+ const char *const textStart
+ = ((const char *)entity->textPtr) + entity->processed;
+ const char *const textEnd
+ = (const char *)(entity->textPtr + entity->textLen);
+ /* Set a safe default value in case 'next' does not get set */
+ const char *nextInEntity = textStart;
+ if (entity->hasMore) {
+ result = storeEntityValue(parser, parser->m_internalEncoding, textStart,
+ textEnd, XML_ACCOUNT_ENTITY_EXPANSION,
+ &nextInEntity);
+ if (result != XML_ERROR_NONE)
+ break;
+ // Check if entity is complete, if not, mark down how much of it is
+ // processed. An XML_SUSPENDED check here is not required as
+ // storeEntityValue will never suspend the parser.
+ if (textEnd != nextInEntity) {
+ entity->processed
+ = (int)(nextInEntity - (const char *)entity->textPtr);
+ continue;
+ }
+
+ // Entity is complete. We cannot close it here since we need to first
+ // process its possible inner entities (which are added to the
+ // m_openValueEntities during storeEntityValue)
+ entity->hasMore = XML_FALSE;
+ continue;
+ } // End of entity processing, "if" block skips the rest
+
+ // Remove fully processed openEntity from open entity list.
+# if XML_GE == 1
+ entityTrackingOnClose(parser, entity, __LINE__);
+# endif
+ // openEntity is m_openValueEntities' head, since we set it at the
+ // start of this function and because we skipped storeEntityValue call
+ // with hasMore set to false. This means we can directly remove the head
+ // of m_openValueEntities
+ assert(parser->m_openValueEntities == openEntity);
+ entity->open = XML_FALSE;
+ parser->m_openValueEntities = parser->m_openValueEntities->next;
+
+ /* put openEntity back in list of free instances */
+ openEntity->next = parser->m_freeValueEntities;
+ parser->m_freeValueEntities = openEntity;
+ }
+
+ // Break if an error occurred or there is nothing left to process
+ if (result
+ || (parser->m_openValueEntities == NULL && entityTextEnd == next)) {
+ break;
+ }
+ }
+
return result;
}
@@ -7983,7 +8251,7 @@ entityTrackingReportStats(XML_Parser rootParser, ENTITY *entity,
(void *)rootParser, rootParser->m_entity_stats.countEverOpened,
rootParser->m_entity_stats.currentDepth,
rootParser->m_entity_stats.maximumDepthSeen,
- (rootParser->m_entity_stats.currentDepth - 1) * 2, "",
+ ((int)rootParser->m_entity_stats.currentDepth - 1) * 2, "",
entity->is_param ? "%" : "&", entityName, action, entity->textLen,
sourceLine);
}
@@ -8542,11 +8810,13 @@ unsignedCharToPrintable(unsigned char c) {
return "\\xFE";
case 255:
return "\\xFF";
+ // LCOV_EXCL_START
default:
assert(0); /* never gets here */
return "dead code";
}
assert(0); /* never gets here */
+ // LCOV_EXCL_STOP
}
#endif /* XML_GE == 1 */
diff --git a/contrib/expat/tests/acc_tests.c b/contrib/expat/tests/acc_tests.c
index f193aa58a492..b58647a2ab02 100644
--- a/contrib/expat/tests/acc_tests.c
+++ b/contrib/expat/tests/acc_tests.c
@@ -360,13 +360,16 @@ END_TEST
START_TEST(test_helper_unsigned_char_to_printable) {
// Smoke test
unsigned char uc = 0;
- for (; uc < (unsigned char)-1; uc++) {
+ for (;; uc++) {
set_subtest("char %u", (unsigned)uc);
const char *const printable = unsignedCharToPrintable(uc);
if (printable == NULL)
fail("unsignedCharToPrintable returned NULL");
else if (strlen(printable) < (size_t)1)
fail("unsignedCharToPrintable returned empty string");
+ if (uc == (unsigned char)-1) {
+ break;
+ }
}
// Two concrete samples
diff --git a/contrib/expat/tests/alloc_tests.c b/contrib/expat/tests/alloc_tests.c
index e5d46ebea821..12ea3b2a81d2 100644
--- a/contrib/expat/tests/alloc_tests.c
+++ b/contrib/expat/tests/alloc_tests.c
@@ -19,6 +19,7 @@
Copyright (c) 2020 Tim Gates <tim.gates@iress.com>
Copyright (c) 2021 Donghee Na <donghee.na@python.org>
Copyright (c) 2023 Sony Corporation / Snild Dolkow <snild@sony.com>
+ Copyright (c) 2025 Berkay Eren Ürün <berkay.ueruen@siemens.com>
Licensed under the MIT license:
Permission is hereby granted, free of charge, to any person obtaining
@@ -450,6 +451,31 @@ START_TEST(test_alloc_internal_entity) {
}
END_TEST
+START_TEST(test_alloc_parameter_entity) {
+ const char *text = "<!DOCTYPE foo ["
+ "<!ENTITY % param1 \"<!ENTITY internal 'some_text'>\">"
+ "%param1;"
+ "]> <foo>&internal;content</foo>";
+ int i;
+ const int alloc_test_max_repeats = 30;
+
+ for (i = 0; i < alloc_test_max_repeats; i++) {
+ g_allocation_count = i;
+ XML_SetParamEntityParsing(g_parser, XML_PARAM_ENTITY_PARSING_ALWAYS);
+ if (_XML_Parse_SINGLE_BYTES(g_parser, text, (int)strlen(text), XML_TRUE)
+ != XML_STATUS_ERROR)
+ break;
+ alloc_teardown();
+ alloc_setup();
+ }
+ g_allocation_count = -1;
+ if (i == 0)
+ fail("Parameter entity processed despite duff allocator");
+ if (i == alloc_test_max_repeats)
+ fail("Parameter entity not processed at max allocation count");
+}
+END_TEST
+
/* Test the robustness against allocation failure of element handling
* Based on test_dtd_default_handling().
*/
@@ -2079,6 +2105,7 @@ make_alloc_test_case(Suite *s) {
tcase_add_test__ifdef_xml_dtd(tc_alloc, test_alloc_external_entity);
tcase_add_test__ifdef_xml_dtd(tc_alloc, test_alloc_ext_entity_set_encoding);
tcase_add_test__ifdef_xml_dtd(tc_alloc, test_alloc_internal_entity);
+ tcase_add_test__ifdef_xml_dtd(tc_alloc, test_alloc_parameter_entity);
tcase_add_test__ifdef_xml_dtd(tc_alloc, test_alloc_dtd_default_handling);
tcase_add_test(tc_alloc, test_alloc_explicit_encoding);
tcase_add_test(tc_alloc, test_alloc_set_base);
diff --git a/contrib/expat/tests/basic_tests.c b/contrib/expat/tests/basic_tests.c
index d38b8fd18416..e813df8b6fd2 100644
--- a/contrib/expat/tests/basic_tests.c
+++ b/contrib/expat/tests/basic_tests.c
@@ -10,7 +10,7 @@
Copyright (c) 2003 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
Copyright (c) 2005-2012 Karl Waclawek <karl@waclawek.net>
- Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017-2022 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2017 Joe Orton <jorton@redhat.com>
Copyright (c) 2017 José Gutiérrez de la Concha <jose@zeroc.com>
@@ -19,6 +19,7 @@
Copyright (c) 2020 Tim Gates <tim.gates@iress.com>
Copyright (c) 2021 Donghee Na <donghee.na@python.org>
Copyright (c) 2023-2024 Sony Corporation / Snild Dolkow <snild@sony.com>
+ Copyright (c) 2024-2025 Berkay Eren Ürün <berkay.ueruen@siemens.com>
Licensed under the MIT license:
Permission is hereby granted, free of charge, to any person obtaining
@@ -1191,6 +1192,22 @@ START_TEST(test_not_standalone_handler_accept) {
}
END_TEST
+START_TEST(test_entity_start_tag_level_greater_than_one) {
+ const char *const text = "<!DOCTYPE t1 [\n"
+ " <!ENTITY e1 'hello'>\n"
+ "]>\n"
+ "<t1>\n"
+ " <t2>&e1;</t2>\n"
+ "</t1>\n";
+
+ XML_Parser parser = XML_ParserCreate(NULL);
+ assert_true(_XML_Parse_SINGLE_BYTES(parser, text, (int)strlen(text),
+ /*isFinal*/ XML_TRUE)
+ == XML_STATUS_OK);
+ XML_ParserFree(parser);
+}
+END_TEST
+
START_TEST(test_wfc_no_recursive_entity_refs) {
const char *text = "<!DOCTYPE doc [\n"
" <!ENTITY entity '&#38;entity;'>\n"
@@ -1202,6 +1219,93 @@ START_TEST(test_wfc_no_recursive_entity_refs) {
}
END_TEST
+START_TEST(test_no_indirectly_recursive_entity_refs) {
+ struct TestCase {
+ const char *doc;
+ bool usesParameterEntities;
+ };
+
+ const struct TestCase cases[] = {
+ // general entity + character data
+ {"<!DOCTYPE a [\n"
+ " <!ENTITY e1 '&e2;'>\n"
+ " <!ENTITY e2 '&e1;'>\n"
+ "]><a>&e2;</a>\n",
+ false},
+
+ // general entity + attribute value
+ {"<!DOCTYPE a [\n"
+ " <!ENTITY e1 '&e2;'>\n"
+ " <!ENTITY e2 '&e1;'>\n"
+ "]><a k1='&e2;' />\n",
+ false},
+
+ // parameter entity
+ {"<!DOCTYPE doc [\n"
+ " <!ENTITY % p1 '&#37;p2;'>\n"
+ " <!ENTITY % p2 '&#37;p1;'>\n"
+ " <!ENTITY % define_g \"<!ENTITY g '&#37;p2;'>\">\n"
+ " %define_g;\n"
+ "]>\n"
+ "<doc/>\n",
+ true},
+ };
+ const XML_Bool reset_or_not[] = {XML_TRUE, XML_FALSE};
+
+ for (size_t i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
+ for (size_t j = 0; j < sizeof(reset_or_not) / sizeof(reset_or_not[0]);
+ j++) {
+ const XML_Bool reset_wanted = reset_or_not[j];
+ const char *const doc = cases[i].doc;
+ const bool usesParameterEntities = cases[i].usesParameterEntities;
+
+ set_subtest("[%i,reset=%i] %s", (int)i, (int)j, doc);
+
+#ifdef XML_DTD // both GE and DTD
+ const bool rejection_expected = true;
+#elif XML_GE == 1 // GE but not DTD
+ const bool rejection_expected = ! usesParameterEntities;
+#else // neither DTD nor GE
+ const bool rejection_expected = false;
+#endif
+
+ XML_Parser parser = XML_ParserCreate(NULL);
+
+#ifdef XML_DTD
+ if (usesParameterEntities) {
+ assert_true(
+ XML_SetParamEntityParsing(parser, XML_PARAM_ENTITY_PARSING_ALWAYS)
+ == 1);
+ }
+#else
+ UNUSED_P(usesParameterEntities);
+#endif // XML_DTD
+
+ const enum XML_Status status
+ = _XML_Parse_SINGLE_BYTES(parser, doc, (int)strlen(doc),
+ /*isFinal*/ XML_TRUE);
+
+ if (rejection_expected) {
+ assert_true(status == XML_STATUS_ERROR);
+ assert_true(XML_GetErrorCode(parser) == XML_ERROR_RECURSIVE_ENTITY_REF);
+ } else {
+ assert_true(status == XML_STATUS_OK);
+ }
+
+ if (reset_wanted) {
+ // This covers free'ing of (eventually) all three open entity lists by
+ // XML_ParserReset.
+ XML_ParserReset(parser, NULL);
+ }
+
+ // This covers free'ing of (eventually) all three open entity lists by
+ // XML_ParserFree (unless XML_ParserReset has already done that above).
+ XML_ParserFree(parser);
+ }
+ }
+}
+END_TEST
+
START_TEST(test_recursive_external_parameter_entity_2) {
struct TestCase {
const char *doc;
@@ -1417,7 +1521,9 @@ START_TEST(test_suspend_parser_between_char_data_calls) {
XML_SetCharacterDataHandler(g_parser, clearing_aborting_character_handler);
g_resumable = XML_TRUE;
- if (_XML_Parse_SINGLE_BYTES(g_parser, text, (int)strlen(text), XML_TRUE)
+ // can't use SINGLE_BYTES here, because it'll return early on suspension, and
+ // we won't know exactly how much input we actually managed to give Expat.
+ if (XML_Parse(g_parser, text, (int)strlen(text), XML_TRUE)
!= XML_STATUS_SUSPENDED)
xml_failure(g_parser);
if (XML_GetErrorCode(g_parser) != XML_ERROR_NONE)
@@ -1446,7 +1552,9 @@ START_TEST(test_repeated_stop_parser_between_char_data_calls) {
XML_SetCharacterDataHandler(g_parser, parser_stop_character_handler);
g_resumable = XML_TRUE;
g_abortable = XML_FALSE;
- if (_XML_Parse_SINGLE_BYTES(g_parser, text, (int)strlen(text), XML_TRUE)
+ // can't use SINGLE_BYTES here, because it'll return early on suspension, and
+ // we won't know exactly how much input we actually managed to give Expat.
+ if (XML_Parse(g_parser, text, (int)strlen(text), XML_TRUE)
!= XML_STATUS_SUSPENDED)
fail("Failed to double-suspend parser");
@@ -1830,12 +1938,19 @@ END_TEST
/* Test suspending the parser in cdata handler */
START_TEST(test_suspend_parser_between_cdata_calls) {
+ if (g_chunkSize != 0) {
+ // this test does not use SINGLE_BYTES, because of suspension
+ return;
+ }
+
const char *text = long_cdata_text;
enum XML_Status result;
XML_SetCharacterDataHandler(g_parser, clearing_aborting_character_handler);
g_resumable = XML_TRUE;
- result = _XML_Parse_SINGLE_BYTES(g_parser, text, (int)strlen(text), XML_TRUE);
+ // can't use SINGLE_BYTES here, because it'll return early on suspension, and
+ // we won't know exactly how much input we actually managed to give Expat.
+ result = XML_Parse(g_parser, text, (int)strlen(text), XML_TRUE);
if (result != XML_STATUS_SUSPENDED) {
if (result == XML_STATUS_ERROR)
xml_failure(g_parser);
@@ -2378,6 +2493,11 @@ END_TEST
* entity. Exercises some obscure code in XML_ParserReset().
*/
START_TEST(test_reset_in_entity) {
+ if (g_chunkSize != 0) {
+ // this test does not use SINGLE_BYTES, because of suspension
+ return;
+ }
+
const char *text = "<!DOCTYPE doc [\n"
"<!ENTITY wombat 'wom'>\n"
"<!ENTITY entity 'hi &wom; there'>\n"
@@ -2387,7 +2507,9 @@ START_TEST(test_reset_in_entity) {
g_resumable = XML_TRUE;
XML_SetCharacterDataHandler(g_parser, clearing_aborting_character_handler);
- if (_XML_Parse_SINGLE_BYTES(g_parser, text, (int)strlen(text), XML_TRUE)
+ // can't use SINGLE_BYTES here, because it'll return early on suspension, and
+ // we won't know exactly how much input we actually managed to give Expat.
+ if (XML_Parse(g_parser, text, (int)strlen(text), XML_TRUE)
== XML_STATUS_ERROR)
xml_failure(g_parser);
XML_GetParsingStatus(g_parser, &status);
@@ -3634,7 +3756,9 @@ START_TEST(test_suspend_xdecl) {
XML_SetXmlDeclHandler(g_parser, entity_suspending_xdecl_handler);
XML_SetUserData(g_parser, g_parser);
g_resumable = XML_TRUE;
- if (_XML_Parse_SINGLE_BYTES(g_parser, text, (int)strlen(text), XML_TRUE)
+ // can't use SINGLE_BYTES here, because it'll return early on suspension, and
+ // we won't know exactly how much input we actually managed to give Expat.
+ if (XML_Parse(g_parser, text, (int)strlen(text), XML_TRUE)
!= XML_STATUS_SUSPENDED)
xml_failure(g_parser);
if (XML_GetErrorCode(g_parser) != XML_ERROR_NONE)
@@ -3830,13 +3954,20 @@ END_TEST
/* Test syntax error is caught at parse resumption */
START_TEST(test_resume_entity_with_syntax_error) {
+ if (g_chunkSize != 0) {
+ // this test does not use SINGLE_BYTES, because of suspension
+ return;
+ }
+
const char *text = "<!DOCTYPE doc [\n"
"<!ENTITY foo '<suspend>Hi</wombat>'>\n"
"]>\n"
"<doc>&foo;</doc>\n";
XML_SetStartElementHandler(g_parser, start_element_suspender);
- if (_XML_Parse_SINGLE_BYTES(g_parser, text, (int)strlen(text), XML_TRUE)
+ // can't use SINGLE_BYTES here, because it'll return early on suspension, and
+ // we won't know exactly how much input we actually managed to give Expat.
+ if (XML_Parse(g_parser, text, (int)strlen(text), XML_TRUE)
!= XML_STATUS_SUSPENDED)
xml_failure(g_parser);
if (XML_ResumeParser(g_parser) != XML_STATUS_ERROR)
@@ -3960,7 +4091,7 @@ START_TEST(test_skipped_null_loaded_ext_entity) {
= {"<!ENTITY % pe1 SYSTEM 'http://example.org/two.ent'>\n"
"<!ENTITY % pe2 '%pe1;'>\n"
"%pe2;\n",
- external_entity_null_loader};
+ external_entity_null_loader, NULL};
XML_SetUserData(g_parser, &test_data);
XML_SetParamEntityParsing(g_parser, XML_PARAM_ENTITY_PARSING_ALWAYS);
@@ -3978,7 +4109,7 @@ START_TEST(test_skipped_unloaded_ext_entity) {
= {"<!ENTITY % pe1 SYSTEM 'http://example.org/two.ent'>\n"
"<!ENTITY % pe2 '%pe1;'>\n"
"%pe2;\n",
- NULL};
+ NULL, NULL};
XML_SetUserData(g_parser, &test_data);
XML_SetParamEntityParsing(g_parser, XML_PARAM_ENTITY_PARSING_ALWAYS);
@@ -5278,6 +5409,151 @@ START_TEST(test_pool_integrity_with_unfinished_attr) {
}
END_TEST
+/* Test a possible early return location in internalEntityProcessor */
+START_TEST(test_entity_ref_no_elements) {
+ const char *const text = "<!DOCTYPE foo [\n"
+ "<!ENTITY e1 \"test\">\n"
+ "]> <foo>&e1;"; // intentionally missing newline
+
+ XML_Parser parser = XML_ParserCreate(NULL);
+ assert_true(_XML_Parse_SINGLE_BYTES(parser, text, (int)strlen(text), XML_TRUE)
+ == XML_STATUS_ERROR);
+ assert_true(XML_GetErrorCode(parser) == XML_ERROR_NO_ELEMENTS);
+ XML_ParserFree(parser);
+}
+END_TEST
+
+/* Tests if chained entity references lead to unbounded recursion */
+START_TEST(test_deep_nested_entity) {
+ const size_t N_LINES = 60000;
+ const size_t SIZE_PER_LINE = 50;
+
+ char *const text = (char *)malloc((N_LINES + 4) * SIZE_PER_LINE);
+ if (text == NULL) {
+ fail("malloc failed");
+ }
+
+ char *textPtr = text;
+
+ // Create the XML
+ textPtr += snprintf(textPtr, SIZE_PER_LINE,
+ "<!DOCTYPE foo [\n"
+ " <!ENTITY s0 'deepText'>\n");
+
+ for (size_t i = 1; i < N_LINES; ++i) {
+ textPtr += snprintf(textPtr, SIZE_PER_LINE, " <!ENTITY s%lu '&s%lu;'>\n",
+ (long unsigned)i, (long unsigned)(i - 1));
+ }
+
+ snprintf(textPtr, SIZE_PER_LINE, "]> <foo>&s%lu;</foo>\n",
+ (long unsigned)(N_LINES - 1));
+
+ const XML_Char *const expected = XCS("deepText");
+
+ CharData storage;
+ CharData_Init(&storage);
+
+ XML_Parser parser = XML_ParserCreate(NULL);
+
+ XML_SetCharacterDataHandler(parser, accumulate_characters);
+ XML_SetUserData(parser, &storage);
+
+ if (_XML_Parse_SINGLE_BYTES(parser, text, (int)strlen(text), XML_TRUE)
+ == XML_STATUS_ERROR)
+ xml_failure(parser);
+
+ CharData_CheckXMLChars(&storage, expected);
+ XML_ParserFree(parser);
+ free(text);
+}
+END_TEST
+
+/* Tests if chained entity references in attributes
+lead to unbounded recursion */
+START_TEST(test_deep_nested_attribute_entity) {
+ const size_t N_LINES = 60000;
+ const size_t SIZE_PER_LINE = 100;
+
+ char *const text = (char *)malloc((N_LINES + 4) * SIZE_PER_LINE);
+ if (text == NULL) {
+ fail("malloc failed");
+ }
+
+ char *textPtr = text;
+
+ // Create the XML
+ textPtr += snprintf(textPtr, SIZE_PER_LINE,
+ "<!DOCTYPE foo [\n"
+ " <!ENTITY s0 'deepText'>\n");
+
+ for (size_t i = 1; i < N_LINES; ++i) {
+ textPtr += snprintf(textPtr, SIZE_PER_LINE, " <!ENTITY s%lu '&s%lu;'>\n",
+ (long unsigned)i, (long unsigned)(i - 1));
+ }
+
+ snprintf(textPtr, SIZE_PER_LINE, "]> <foo name='&s%lu;'>mainText</foo>\n",
+ (long unsigned)(N_LINES - 1));
+
+ AttrInfo doc_info[] = {{XCS("name"), XCS("deepText")}, {NULL, NULL}};
+ ElementInfo info[] = {{XCS("foo"), 1, NULL, NULL}, {NULL, 0, NULL, NULL}};
+ info[0].attributes = doc_info;
+
+ XML_Parser parser = XML_ParserCreate(NULL);
+ ParserAndElementInfo parserPlusElemenInfo = {parser, info};
+
+ XML_SetStartElementHandler(parser, counting_start_element_handler);
+ XML_SetUserData(parser, &parserPlusElemenInfo);
+
+ if (_XML_Parse_SINGLE_BYTES(parser, text, (int)strlen(text), XML_TRUE)
+ == XML_STATUS_ERROR)
+ xml_failure(parser);
+
+ XML_ParserFree(parser);
+ free(text);
+}
+END_TEST
+
+START_TEST(test_deep_nested_entity_delayed_interpretation) {
+ const size_t N_LINES = 70000;
+ const size_t SIZE_PER_LINE = 100;
+
+ char *const text = (char *)malloc((N_LINES + 4) * SIZE_PER_LINE);
+ if (text == NULL) {
+ fail("malloc failed");
+ }
+
+ char *textPtr = text;
+
+ // Create the XML
+ textPtr += snprintf(textPtr, SIZE_PER_LINE,
+ "<!DOCTYPE foo [\n"
+ " <!ENTITY %% s0 'deepText'>\n");
+
+ for (size_t i = 1; i < N_LINES; ++i) {
+ textPtr += snprintf(textPtr, SIZE_PER_LINE,
+ " <!ENTITY %% s%lu '&#37;s%lu;'>\n", (long unsigned)i,
+ (long unsigned)(i - 1));
+ }
+
+ snprintf(textPtr, SIZE_PER_LINE,
+ " <!ENTITY %% define_g \"<!ENTITY g '&#37;s%lu;'>\">\n"
+ " %%define_g;\n"
+ "]>\n"
+ "<foo/>\n",
+ (long unsigned)(N_LINES - 1));
+
+ XML_Parser parser = XML_ParserCreate(NULL);
+
+ XML_SetParamEntityParsing(parser, XML_PARAM_ENTITY_PARSING_ALWAYS);
+ if (_XML_Parse_SINGLE_BYTES(parser, text, (int)strlen(text), XML_TRUE)
+ == XML_STATUS_ERROR)
+ xml_failure(parser);
+
+ XML_ParserFree(parser);
+ free(text);
+}
+END_TEST
+
START_TEST(test_nested_entity_suspend) {
const char *const text = "<!DOCTYPE a [\n"
" <!ENTITY e1 '<!--e1-->'>\n"
@@ -5308,6 +5584,35 @@ START_TEST(test_nested_entity_suspend) {
}
END_TEST
+START_TEST(test_nested_entity_suspend_2) {
+ const char *const text = "<!DOCTYPE doc [\n"
+ " <!ENTITY ge1 'head1Ztail1'>\n"
+ " <!ENTITY ge2 'head2&ge1;tail2'>\n"
+ " <!ENTITY ge3 'head3&ge2;tail3'>\n"
+ "]>\n"
+ "<doc>&ge3;</doc>";
+ const XML_Char *const expected = XCS("head3") XCS("head2") XCS("head1")
+ XCS("Z") XCS("tail1") XCS("tail2") XCS("tail3");
+ CharData storage;
+ CharData_Init(&storage);
+ XML_Parser parser = XML_ParserCreate(NULL);
+ ParserPlusStorage parserPlusStorage = {parser, &storage};
+
+ XML_SetCharacterDataHandler(parser, accumulate_char_data_and_suspend);
+ XML_SetUserData(parser, &parserPlusStorage);
+
+ enum XML_Status status = XML_Parse(parser, text, (int)strlen(text), XML_TRUE);
+ while (status == XML_STATUS_SUSPENDED) {
+ status = XML_ResumeParser(parser);
+ }
+ if (status != XML_STATUS_OK)
+ xml_failure(parser);
+
+ CharData_CheckXMLChars(&storage, expected);
+ XML_ParserFree(parser);
+}
+END_TEST
+
/* Regression test for quadratic parsing on large tokens */
START_TEST(test_big_tokens_scale_linearly) {
const struct {
@@ -5968,7 +6273,9 @@ make_basic_test_case(Suite *s) {
tcase_add_test(tc_basic, test_wfc_undeclared_entity_with_external_subset);
tcase_add_test(tc_basic, test_not_standalone_handler_reject);
tcase_add_test(tc_basic, test_not_standalone_handler_accept);
+ tcase_add_test(tc_basic, test_entity_start_tag_level_greater_than_one);
tcase_add_test__if_xml_ge(tc_basic, test_wfc_no_recursive_entity_refs);
+ tcase_add_test(tc_basic, test_no_indirectly_recursive_entity_refs);
tcase_add_test__ifdef_xml_dtd(tc_basic, test_ext_entity_invalid_parse);
tcase_add_test__if_xml_ge(tc_basic, test_dtd_default_handling);
tcase_add_test(tc_basic, test_dtd_attr_handling);
@@ -6147,7 +6454,13 @@ make_basic_test_case(Suite *s) {
tcase_add_test(tc_basic, test_empty_element_abort);
tcase_add_test__ifdef_xml_dtd(tc_basic,
test_pool_integrity_with_unfinished_attr);
+ tcase_add_test__if_xml_ge(tc_basic, test_entity_ref_no_elements);
+ tcase_add_test__if_xml_ge(tc_basic, test_deep_nested_entity);
+ tcase_add_test__if_xml_ge(tc_basic, test_deep_nested_attribute_entity);
+ tcase_add_test__if_xml_ge(tc_basic,
+ test_deep_nested_entity_delayed_interpretation);
tcase_add_test__if_xml_ge(tc_basic, test_nested_entity_suspend);
+ tcase_add_test__if_xml_ge(tc_basic, test_nested_entity_suspend_2);
tcase_add_test(tc_basic, test_big_tokens_scale_linearly);
tcase_add_test(tc_basic, test_set_reparse_deferral);
tcase_add_test(tc_basic, test_reparse_deferral_is_inherited);
diff --git a/contrib/expat/tests/benchmark/benchmark.c b/contrib/expat/tests/benchmark/benchmark.c
index 355d83f896de..a02b84a0131d 100644
--- a/contrib/expat/tests/benchmark/benchmark.c
+++ b/contrib/expat/tests/benchmark/benchmark.c
@@ -8,7 +8,7 @@
Copyright (c) 2003-2006 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
- Copyright (c) 2017-2023 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2017-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
Licensed under the MIT license:
@@ -32,10 +32,18 @@
USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
+#define _POSIX_C_SOURCE 1 // fdopen
+
+#if defined(_MSC_VER)
+# include <io.h> // _open, _close
+#else
+# include <unistd.h> // close
+#endif
+
+#include <fcntl.h> // open
#include <sys/stat.h>
#include <assert.h>
#include <stddef.h> // ptrdiff_t
-#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include "expat.h"
@@ -52,17 +60,18 @@
# define XML_FMT_STR "s"
#endif
-static void
+static int
usage(const char *prog, int rc) {
fprintf(stderr, "usage: %s [-n] filename bufferSize nr_of_loops\n", prog);
- exit(rc);
+ return rc;
}
int
main(int argc, char *argv[]) {
XML_Parser parser;
char *XMLBuf, *XMLBufEnd, *XMLBufPtr;
- FILE *fd;
+ int fd;
+ FILE *file;
struct stat fileAttr;
int nrOfLoops, bufferSize, i, isFinal;
size_t fileSize;
@@ -76,34 +85,48 @@ main(int argc, char *argv[]) {
ns = 1;
j = 1;
} else
- usage(argv[0], 1);
+ return usage(argv[0], 1);
}
}
if (argc != j + 4)
- usage(argv[0], 1);
+ return usage(argv[0], 1);
- if (stat(argv[j + 1], &fileAttr) != 0) {
- fprintf(stderr, "could not access file '%s'\n", argv[j + 1]);
+ fd = open(argv[j + 1], O_RDONLY);
+ if (fd == -1) {
+ fprintf(stderr, "could not open file '%s'\n", argv[j + 1]);
return 2;
}
- fd = fopen(argv[j + 1], "r");
- if (! fd) {
- fprintf(stderr, "could not open file '%s'\n", argv[j + 1]);
- exit(2);
+ if (fstat(fd, &fileAttr) != 0) {
+ close(fd);
+ fprintf(stderr, "could not fstat file '%s'\n", argv[j + 1]);
+ return 2;
+ }
+
+ file = fdopen(fd, "r");
+ if (! file) {
+ close(fd);
+ fprintf(stderr, "could not fdopen file '%s'\n", argv[j + 1]);
+ return 2;
}
bufferSize = atoi(argv[j + 2]);
nrOfLoops = atoi(argv[j + 3]);
if (bufferSize <= 0 || nrOfLoops <= 0) {
+ fclose(file); // NOTE: this closes fd as well
fprintf(stderr, "buffer size and nr of loops must be greater than zero.\n");
- exit(3);
+ return 3;
}
XMLBuf = malloc(fileAttr.st_size);
- fileSize = fread(XMLBuf, sizeof(char), fileAttr.st_size, fd);
- fclose(fd);
+ if (XMLBuf == NULL) {
+ fclose(file); // NOTE: this closes fd as well
+ fprintf(stderr, "out of memory.\n");
+ return 5;
+ }
+ fileSize = fread(XMLBuf, sizeof(char), fileAttr.st_size, file);
+ fclose(file); // NOTE: this closes fd as well
if (ns)
parser = XML_ParserCreateNS(NULL, '!');
@@ -132,7 +155,7 @@ main(int argc, char *argv[]) {
XML_GetCurrentColumnNumber(parser));
free(XMLBuf);
XML_ParserFree(parser);
- exit(4);
+ return 4;
}
XMLBufPtr += bufferSize;
} while (! isFinal);
diff --git a/contrib/expat/tests/common.c b/contrib/expat/tests/common.c
index 3aea8d74d1ee..b158385f56a8 100644
--- a/contrib/expat/tests/common.c
+++ b/contrib/expat/tests/common.c
@@ -10,7 +10,7 @@
Copyright (c) 2003 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
Copyright (c) 2005-2012 Karl Waclawek <karl@waclawek.net>
- Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017-2022 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2017 Joe Orton <jorton@redhat.com>
Copyright (c) 2017 José Gutiérrez de la Concha <jose@zeroc.com>
@@ -42,6 +42,8 @@
*/
#include <assert.h>
+#include <errno.h>
+#include <stdint.h> // for SIZE_MAX
#include <stdio.h>
#include <string.h>
@@ -202,6 +204,12 @@ _XML_Parse_SINGLE_BYTES(XML_Parser parser, const char *s, int len,
for (; len > chunksize; len -= chunksize, s += chunksize) {
enum XML_Status res = XML_Parse(parser, s, chunksize, XML_FALSE);
if (res != XML_STATUS_OK) {
+ if ((res == XML_STATUS_SUSPENDED) && (len > chunksize)) {
+ fail("Use of function _XML_Parse_SINGLE_BYTES with a chunk size "
+ "greater than 0 (from g_chunkSize) does not work well with "
+ "suspension. Please consider use of plain XML_Parse at this "
+ "place in your test, instead.");
+ }
return res;
}
}
@@ -294,3 +302,26 @@ duff_reallocator(void *ptr, size_t size) {
g_reallocation_count--;
return realloc(ptr, size);
}
+
+// Portable remake of strndup(3) for C99; does not care about space efficiency
+char *
+portable_strndup(const char *s, size_t n) {
+ if ((s == NULL) || (n == SIZE_MAX)) {
+ errno = EINVAL;
+ return NULL;
+ }
+
+ char *const buffer = (char *)malloc(n + 1);
+ if (buffer == NULL) {
+ errno = ENOMEM;
+ return NULL;
+ }
+
+ errno = 0;
+
+ memcpy(buffer, s, n);
+
+ buffer[n] = '\0';
+
+ return buffer;
+}
diff --git a/contrib/expat/tests/common.h b/contrib/expat/tests/common.h
index bc4c7da68071..2d1a5f207a09 100644
--- a/contrib/expat/tests/common.h
+++ b/contrib/expat/tests/common.h
@@ -10,7 +10,7 @@
Copyright (c) 2003 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
Copyright (c) 2005-2012 Karl Waclawek <karl@waclawek.net>
- Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017-2022 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2017 Joe Orton <jorton@redhat.com>
Copyright (c) 2017 José Gutiérrez de la Concha <jose@zeroc.com>
@@ -146,6 +146,8 @@ extern void *duff_allocator(size_t size);
extern void *duff_reallocator(void *ptr, size_t size);
+extern char *portable_strndup(const char *s, size_t n);
+
#endif /* XML_COMMON_H */
#ifdef __cplusplus
diff --git a/contrib/expat/tests/handlers.c b/contrib/expat/tests/handlers.c
index 0211985fe95c..ac459507580b 100644
--- a/contrib/expat/tests/handlers.c
+++ b/contrib/expat/tests/handlers.c
@@ -1843,6 +1843,15 @@ element_decl_suspender(void *userData, const XML_Char *name,
}
void XMLCALL
+suspend_after_element_declaration(void *userData, const XML_Char *name,
+ XML_Content *model) {
+ UNUSED_P(name);
+ XML_Parser parser = (XML_Parser)userData;
+ assert_true(XML_StopParser(parser, /*resumable*/ XML_TRUE) == XML_STATUS_OK);
+ XML_FreeContentModel(parser, model);
+}
+
+void XMLCALL
accumulate_pi_characters(void *userData, const XML_Char *target,
const XML_Char *data) {
CharData *storage = (CharData *)userData;
@@ -1883,6 +1892,20 @@ accumulate_entity_decl(void *userData, const XML_Char *entityName,
}
void XMLCALL
+accumulate_char_data_and_suspend(void *userData, const XML_Char *s, int len) {
+ ParserPlusStorage *const parserPlusStorage = (ParserPlusStorage *)userData;
+
+ CharData_AppendXMLChars(parserPlusStorage->storage, s, len);
+
+ for (int i = 0; i < len; i++) {
+ if (s[i] == 'Z') {
+ XML_StopParser(parserPlusStorage->parser, /*resumable=*/XML_TRUE);
+ break;
+ }
+ }
+}
+
+void XMLCALL
accumulate_start_element(void *userData, const XML_Char *name,
const XML_Char **atts) {
CharData *const storage = (CharData *)userData;
diff --git a/contrib/expat/tests/handlers.h b/contrib/expat/tests/handlers.h
index 8850bb948da3..fa6267fbbd08 100644
--- a/contrib/expat/tests/handlers.h
+++ b/contrib/expat/tests/handlers.h
@@ -325,6 +325,7 @@ extern int XMLCALL external_entity_devaluer(XML_Parser parser,
typedef struct ext_hdlr_data {
const char *parse_text;
XML_ExternalEntityRefHandler handler;
+ CharData *storage;
} ExtHdlrData;
extern int XMLCALL external_entity_oneshot_loader(XML_Parser parser,
@@ -557,6 +558,10 @@ extern void XMLCALL suspending_comment_handler(void *userData,
extern void XMLCALL element_decl_suspender(void *userData, const XML_Char *name,
XML_Content *model);
+extern void XMLCALL suspend_after_element_declaration(void *userData,
+ const XML_Char *name,
+ XML_Content *model);
+
extern void XMLCALL accumulate_pi_characters(void *userData,
const XML_Char *target,
const XML_Char *data);
@@ -569,6 +574,10 @@ extern void XMLCALL accumulate_entity_decl(
const XML_Char *systemId, const XML_Char *publicId,
const XML_Char *notationName);
+extern void XMLCALL accumulate_char_data_and_suspend(void *userData,
+ const XML_Char *s,
+ int len);
+
extern void XMLCALL accumulate_start_element(void *userData,
const XML_Char *name,
const XML_Char **atts);
diff --git a/contrib/expat/tests/minicheck.h b/contrib/expat/tests/minicheck.h
index 3d888f8d2a36..29ae4cb2420d 100644
--- a/contrib/expat/tests/minicheck.h
+++ b/contrib/expat/tests/minicheck.h
@@ -14,7 +14,7 @@
Copyright (c) 2004-2006 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2006-2012 Karl Waclawek <karl@waclawek.net>
- Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2022 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2023-2024 Sony Corporation / Snild Dolkow <snild@sony.com>
Licensed under the MIT license:
@@ -129,8 +129,10 @@ void _check_set_test_info(char const *function, char const *filename,
* Prototypes for the actual implementation.
*/
-# if defined(__GNUC__)
+# if defined(__has_attribute)
+# if __has_attribute(noreturn)
__attribute__((noreturn))
+# endif
# endif
void
_fail(const char *file, int line, const char *msg);
diff --git a/contrib/expat/tests/misc_tests.c b/contrib/expat/tests/misc_tests.c
index 9afe0922d6b2..fb95014b142f 100644
--- a/contrib/expat/tests/misc_tests.c
+++ b/contrib/expat/tests/misc_tests.c
@@ -10,7 +10,7 @@
Copyright (c) 2003 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
Copyright (c) 2005-2012 Karl Waclawek <karl@waclawek.net>
- Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
+ Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017-2022 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2017 Joe Orton <jorton@redhat.com>
Copyright (c) 2017 José Gutiérrez de la Concha <jose@zeroc.com>
@@ -59,6 +59,9 @@
#include "handlers.h"
#include "misc_tests.h"
+void XMLCALL accumulate_characters_ext_handler(void *userData,
+ const XML_Char *s, int len);
+
/* Test that a failure to allocate the parser structure fails gracefully */
START_TEST(test_misc_alloc_create_parser) {
XML_Memory_Handling_Suite memsuite = {duff_allocator, realloc, free};
@@ -208,7 +211,7 @@ START_TEST(test_misc_version) {
if (! versions_equal(&read_version, &parsed_version))
fail("Version mismatch");
- if (xcstrcmp(version_text, XCS("expat_2.6.4"))) /* needs bump on releases */
+ if (xcstrcmp(version_text, XCS("expat_2.7.1"))) /* needs bump on releases */
fail("XML_*_VERSION in expat.h out of sync?\n");
}
END_TEST
@@ -294,6 +297,7 @@ START_TEST(test_misc_stop_during_end_handler_issue_240_1) {
parser = XML_ParserCreate(NULL);
XML_SetElementHandler(parser, start_element_issue_240, end_element_issue_240);
mydata = (DataIssue240 *)malloc(sizeof(DataIssue240));
+ assert_true(mydata != NULL);
mydata->parser = parser;
mydata->deep = 0;
XML_SetUserData(parser, mydata);
@@ -315,6 +319,7 @@ START_TEST(test_misc_stop_during_end_handler_issue_240_2) {
parser = XML_ParserCreate(NULL);
XML_SetElementHandler(parser, start_element_issue_240, end_element_issue_240);
mydata = (DataIssue240 *)malloc(sizeof(DataIssue240));
+ assert_true(mydata != NULL);
mydata->parser = parser;
mydata->deep = 0;
XML_SetUserData(parser, mydata);
@@ -328,64 +333,119 @@ START_TEST(test_misc_stop_during_end_handler_issue_240_2) {
END_TEST
START_TEST(test_misc_deny_internal_entity_closing_doctype_issue_317) {
- const char *const inputOne = "<!DOCTYPE d [\n"
- "<!ENTITY % e ']><d/>'>\n"
- "\n"
- "%e;";
+ const char *const inputOne
+ = "<!DOCTYPE d [\n"
+ "<!ENTITY % element_d '<!ELEMENT d (#PCDATA)*>'>\n"
+ "%element_d;\n"
+ "<!ENTITY % e ']><d/>'>\n"
+ "\n"
+ "%e;";
const char *const inputTwo
= "<!DOCTYPE d [\n"
+ "<!ENTITY % element_d '<!ELEMENT d (#PCDATA)*>'>\n"
+ "%element_d;\n"
"<!ENTITY % e1 ']><d/>'><!ENTITY % e2 '&#37;e1;'>\n"
"\n"
"%e2;";
- const char *const inputThree = "<!DOCTYPE d [\n"
- "<!ENTITY % e ']><d'>\n"
- "\n"
- "%e;/>";
- const char *const inputIssue317 = "<!DOCTYPE doc [\n"
- "<!ENTITY % foo ']>\n"
- "<doc>Hell<oc (#PCDATA)*>'>\n"
- "%foo;\n"
- "]>\n"
- "<doc>Hello, world</dVc>";
+ const char *const inputThree
+ = "<!DOCTYPE d [\n"
+ "<!ENTITY % element_d '<!ELEMENT d (#PCDATA)*>'>\n"
+ "%element_d;\n"
+ "<!ENTITY % e ']><d'>\n"
+ "\n"
+ "%e;/>";
+ const char *const inputIssue317
+ = "<!DOCTYPE doc [\n"
+ "<!ENTITY % element_doc '<!ELEMENT doc (#PCDATA)*>'>\n"
+ "%element_doc;\n"
+ "<!ENTITY % foo ']>\n"
+ "<doc>Hell<oc (#PCDATA)*>'>\n"
+ "%foo;\n"
+ "]>\n"
+ "<doc>Hello, world</dVc>";
const char *const inputs[] = {inputOne, inputTwo, inputThree, inputIssue317};
+ const XML_Bool suspendOrNot[] = {XML_FALSE, XML_TRUE};
size_t inputIndex = 0;
for (; inputIndex < sizeof(inputs) / sizeof(inputs[0]); inputIndex++) {
- set_subtest("%s", inputs[inputIndex]);
- XML_Parser parser;
- enum XML_Status parseResult;
- int setParamEntityResult;
- XML_Size lineNumber;
- XML_Size columnNumber;
- const char *const input = inputs[inputIndex];
-
- parser = XML_ParserCreate(NULL);
- setParamEntityResult
- = XML_SetParamEntityParsing(parser, XML_PARAM_ENTITY_PARSING_ALWAYS);
- if (setParamEntityResult != 1)
- fail("Failed to set XML_PARAM_ENTITY_PARSING_ALWAYS.");
-
- parseResult = _XML_Parse_SINGLE_BYTES(parser, input, (int)strlen(input), 0);
- if (parseResult != XML_STATUS_ERROR) {
- parseResult = _XML_Parse_SINGLE_BYTES(parser, "", 0, 1);
+ for (size_t suspendOrNotIndex = 0;
+ suspendOrNotIndex < sizeof(suspendOrNot) / sizeof(suspendOrNot[0]);
+ suspendOrNotIndex++) {
+ const char *const input = inputs[inputIndex];
+ const XML_Bool suspend = suspendOrNot[suspendOrNotIndex];
+ if (suspend && (g_chunkSize > 0)) {
+ // We cannot use _XML_Parse_SINGLE_BYTES below due to suspension, so
+ // chunk sizes >0 would only repeat the very same test through
+ // plain XML_Parse; skip them to save runtime:
+ return;
+ }
+
+ set_subtest("[input=%d suspend=%s] %s", (int)inputIndex,
+ suspend ? "true" : "false", input);
+ XML_Parser parser;
+ enum XML_Status parseResult;
+ int setParamEntityResult;
+ XML_Size lineNumber;
+ XML_Size columnNumber;
+
+ parser = XML_ParserCreate(NULL);
+ setParamEntityResult
+ = XML_SetParamEntityParsing(parser, XML_PARAM_ENTITY_PARSING_ALWAYS);
+ if (setParamEntityResult != 1)
+ fail("Failed to set XML_PARAM_ENTITY_PARSING_ALWAYS.");
+
+ if (suspend) {
+ XML_SetUserData(parser, parser);
+ XML_SetElementDeclHandler(parser, suspend_after_element_declaration);
+ }
+
+ if (suspend) {
+ // can't use SINGLE_BYTES here, because it'll return early on
+ // suspension, and we won't know exactly how much input we actually
+ // managed to give Expat.
+ parseResult = XML_Parse(parser, input, (int)strlen(input), 0);
+
+ while (parseResult == XML_STATUS_SUSPENDED) {
+ parseResult = XML_ResumeParser(parser);
+ }
+
+ if (parseResult != XML_STATUS_ERROR) {
+ // can't use SINGLE_BYTES here, because it'll return early on
+ // suspension, and we won't know exactly how much input we actually
+ // managed to give Expat.
+ parseResult = XML_Parse(parser, "", 0, 1);
+ }
+
+ while (parseResult == XML_STATUS_SUSPENDED) {
+ parseResult = XML_ResumeParser(parser);
+ }
+ } else {
+ parseResult
+ = _XML_Parse_SINGLE_BYTES(parser, input, (int)strlen(input), 0);
+
+ if (parseResult != XML_STATUS_ERROR) {
+ parseResult = _XML_Parse_SINGLE_BYTES(parser, "", 0, 1);
+ }
+ }
+
if (parseResult != XML_STATUS_ERROR) {
fail("Parsing was expected to fail but succeeded.");
}
- }
- if (XML_GetErrorCode(parser) != XML_ERROR_INVALID_TOKEN)
- fail("Error code does not match XML_ERROR_INVALID_TOKEN");
+ if (XML_GetErrorCode(parser) != XML_ERROR_INVALID_TOKEN)
+ fail("Error code does not match XML_ERROR_INVALID_TOKEN");
- lineNumber = XML_GetCurrentLineNumber(parser);
- if (lineNumber != 4)
- fail("XML_GetCurrentLineNumber does not work as expected.");
+ lineNumber = XML_GetCurrentLineNumber(parser);
+ if (lineNumber != 6)
+ fail("XML_GetCurrentLineNumber does not work as expected.");
- columnNumber = XML_GetCurrentColumnNumber(parser);
- if (columnNumber != 0)
- fail("XML_GetCurrentColumnNumber does not work as expected.");
+ columnNumber = XML_GetCurrentColumnNumber(parser);
+ if (columnNumber != 0)
+ fail("XML_GetCurrentColumnNumber does not work as expected.");
- XML_ParserFree(parser);
+ XML_ParserFree(parser);
+ }
}
}
END_TEST
@@ -519,6 +579,105 @@ START_TEST(test_misc_stopparser_rejects_unstarted_parser) {
}
END_TEST
+/* Adaptation of accumulate_characters that takes ExtHdlrData input to work with
+ * test_renter_loop_finite_content below */
+void XMLCALL
+accumulate_characters_ext_handler(void *userData, const XML_Char *s, int len) {
+ ExtHdlrData *const test_data = (ExtHdlrData *)userData;
+ CharData_AppendXMLChars(test_data->storage, s, len);
+}
+
+/* Test that internalEntityProcessor does not re-enter forever;
+ * based on files tests/xmlconf/xmltest/valid/ext-sa/012.{xml,ent} */
+START_TEST(test_renter_loop_finite_content) {
+ CharData storage;
+ CharData_Init(&storage);
+ const char *const text = "<!DOCTYPE doc [\n"
+ "<!ENTITY e1 '&e2;'>\n"
+ "<!ENTITY e2 '&e3;'>\n"
+ "<!ENTITY e3 SYSTEM '012.ent'>\n"
+ "<!ENTITY e4 '&e5;'>\n"
+ "<!ENTITY e5 '(e5)'>\n"
+ "<!ELEMENT doc (#PCDATA)>\n"
+ "]>\n"
+ "<doc>&e1;</doc>\n";
+ ExtHdlrData test_data = {"&e4;\n", external_entity_null_loader, &storage};
+ const XML_Char *const expected = XCS("(e5)\n");
+
+ XML_Parser parser = XML_ParserCreate(NULL);
+ assert_true(parser != NULL);
+ XML_SetUserData(parser, &test_data);
+ XML_SetExternalEntityRefHandler(parser, external_entity_oneshot_loader);
+ XML_SetCharacterDataHandler(parser, accumulate_characters_ext_handler);
+ if (_XML_Parse_SINGLE_BYTES(parser, text, (int)strlen(text), XML_TRUE)
+ == XML_STATUS_ERROR)
+ xml_failure(parser);
+
+ CharData_CheckXMLChars(&storage, expected);
+ XML_ParserFree(parser);
+}
+END_TEST
+
+// Inspired by function XML_OriginalString of Perl's XML::Parser
+static char *
+dup_original_string(XML_Parser parser) {
+ const int byte_count = XML_GetCurrentByteCount(parser);
+
+ assert_true(byte_count >= 0);
+
+ int offset = -1;
+ int size = -1;
+
+ const char *const context = XML_GetInputContext(parser, &offset, &size);
+
+#if XML_CONTEXT_BYTES > 0
+ assert_true(context != NULL);
+ assert_true(offset >= 0);
+ assert_true(size >= 0);
+ return portable_strndup(context + offset, byte_count);
+#else
+ assert_true(context == NULL);
+ return NULL;
+#endif
+}
+
+static void
+on_characters_issue_980(void *userData, const XML_Char *s, int len) {
+ (void)s;
+ (void)len;
+ XML_Parser parser = (XML_Parser)userData;
+
+ char *const original_string = dup_original_string(parser);
+
+#if XML_CONTEXT_BYTES > 0
+ assert_true(original_string != NULL);
+ assert_true(strcmp(original_string, "&draft.day;") == 0);
+ free(original_string);
+#else
+ assert_true(original_string == NULL);
+#endif
+}
+
+START_TEST(test_misc_expected_event_ptr_issue_980) {
+ // NOTE: This is a tiny subset of sample "REC-xml-19980210.xml"
+ // from Perl's XML::Parser
+ const char *const doc = "<!DOCTYPE day [\n"
+ " <!ENTITY draft.day '10'>\n"
+ "]>\n"
+ "<day>&draft.day;</day>\n";
+
+ XML_Parser parser = XML_ParserCreate(NULL);
+ XML_SetUserData(parser, parser);
+ XML_SetCharacterDataHandler(parser, on_characters_issue_980);
+
+ assert_true(_XML_Parse_SINGLE_BYTES(parser, doc, (int)strlen(doc),
+ /*isFinal=*/XML_TRUE)
+ == XML_STATUS_OK);
+
+ XML_ParserFree(parser);
+}
+END_TEST
+
void
make_miscellaneous_test_case(Suite *s) {
TCase *tc_misc = tcase_create("miscellaneous tests");
@@ -545,4 +704,6 @@ make_miscellaneous_test_case(Suite *s) {
tcase_add_test(tc_misc, test_misc_char_handler_stop_without_leak);
tcase_add_test(tc_misc, test_misc_resumeparser_not_crashing);
tcase_add_test(tc_misc, test_misc_stopparser_rejects_unstarted_parser);
+ tcase_add_test__if_xml_ge(tc_misc, test_renter_loop_finite_content);
+ tcase_add_test(tc_misc, test_misc_expected_event_ptr_issue_980);
}
diff --git a/contrib/expat/tests/xmltest.sh b/contrib/expat/tests/xmltest.sh
index dc409d01e456..56e66c56f6ea 100755
--- a/contrib/expat/tests/xmltest.sh
+++ b/contrib/expat/tests/xmltest.sh
@@ -2,8 +2,8 @@
# EXPAT TEST SCRIPT FOR W3C XML TEST SUITE
#
# This script can be used to exercise Expat against the
-# w3c.org xml test suite, available from
-# http://www.w3.org/XML/Test/xmlts20020606.zip.
+# w3c.org xml test suite, available from:
+# https://www.w3.org/XML/Test/xmlts20020606.zip
#
# To run this script, first set XMLWF below so that xmlwf can be
# found, then set the output directory with OUTPUT.
@@ -30,6 +30,7 @@
# Copyright (c) 2002 Karl Waclawek <karl@waclawek.net>
# Copyright (c) 2008-2019 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
+# Copyright (c) 2025 Hanno Böck <hanno@gentoo.org>
# Licensed under the MIT license:
#
# Permission is hereby granted, free of charge, to any person obtaining
diff --git a/contrib/expat/xmlwf/readfilemap.c b/contrib/expat/xmlwf/readfilemap.c
index 2cb53feef8d5..d8e7fce42ead 100644
--- a/contrib/expat/xmlwf/readfilemap.c
+++ b/contrib/expat/xmlwf/readfilemap.c
@@ -14,6 +14,7 @@
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2017 Franek Korta <fkorta@gmail.com>
Copyright (c) 2022 Sean McBride <sean@rogue-research.com>
+ Copyright (c) 2025 Hanno Böck <hanno@gentoo.org>
Licensed under the MIT license:
Permission is hereby granted, free of charge, to any person obtaining
@@ -55,7 +56,7 @@
# define EXPAT_read_count_t int
# define EXPAT_read_req_t unsigned int
#else /* POSIX */
-/* http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html */
+/* https://pubs.opengroup.org/onlinepubs/009695399/functions/read.html */
# define EXPAT_read read
# define EXPAT_read_count_t ssize_t
# define EXPAT_read_req_t size_t
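
Note on portable_strndup: the tests/common.h hunk above only declares it, and dup_original_string in tests/misc_tests.c calls it to copy the current event out of the input context. Its body lives in tests/common.c, which is outside this part of the diff. As a rough sketch only, assuming it behaves like POSIX strndup (copy at most n bytes, always NUL-terminate, caller frees), a stand-in could look like the following. The names sketch_strndup and sketch_strndup_demo are hypothetical and not part of Expat.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in, not Expat's code: copy at most n bytes of s into a
   freshly allocated, NUL-terminated buffer that the caller must free(). */
static char *
sketch_strndup(const char *s, size_t n) {
  if (s == NULL || n == SIZE_MAX)
    return NULL; /* n + 1 below would overflow for SIZE_MAX */

  size_t len = 0;
  while (len < n && s[len] != '\0')
    len++; /* stop early at the end of a shorter string */

  char *const copy = (char *)malloc(len + 1);
  if (copy == NULL)
    return NULL;

  memcpy(copy, s, len);
  copy[len] = '\0';
  return copy;
}

/* Mirrors how dup_original_string slices the current event out of the
   input context: start pointer plus byte count. */
static void
sketch_strndup_demo(void) {
  const char *const context = "<day>&draft.day;</day>";
  char *const event = sketch_strndup(context + 5, 11);
  assert(event != NULL);
  assert(strcmp(event, "&draft.day;") == 0);
  free(event);
}
```

The demo uses the same start-plus-byte-count slicing that test_misc_expected_event_ptr_issue_980 relies on when it compares the original string against "&draft.day;".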