pkgsrc-Changes archive
CVS commit: pkgsrc/www/py-mod_wsgi
Module Name: pkgsrc
Committed By: wiz
Date: Tue Nov 15 08:52:09 UTC 2022
Modified Files:
pkgsrc/www/py-mod_wsgi: Makefile distinfo
Removed Files:
pkgsrc/www/py-mod_wsgi/patches: patch-configure
patch-src_server_wsgi__python.h
Log Message:
py-ap24-mod_wsgi: update to 4.9.4.
4.9.4
Bugs Fixed
Apache 2.4.54 changed the default value for LimitRequestBody
from 0, which indicates there is no limit, to 1Gi. If the Apache
configuration supplied with a distribution wasn’t explicitly
setting LimitRequestBody to 0 at global server scope for the
purposes of documenting the default, and it was actually relying
on the compiled in default, then when using mod_wsgi daemon
mode, if a request body size greater than 1Gi was encountered
the mod_wsgi daemon mode process would crash.
Fix ability to build mod_wsgi against Apache 2.2. Do note that
in general only recent versions of Apache 2.4 are supported.
4.9.3
Bugs Fixed
When using WSGITrustedProxies and WSGITrustedProxyHeaders in
the Apache configuration, or --trust-proxy and --trust-proxy-header
options with mod_wsgi-express, if you trusted the X-Client-IP
header and a request was received from an untrusted client,
the header was not being correctly removed from the set of
headers passed through to the WSGI application.
This only occurred with the X-Client-IP header and the same
problem was not present if trusting the X-Real-IP or X-Forwarded-For
headers.
The purpose of this feature for trusting a front end proxy was
in this case for the headers:
X-Client-IP X-Real-IP X-Forwarded-For
and was designed to allow the value of REMOTE_ADDR passed to
the WSGI application to be rewritten to the IP address that a
trusted proxy said was the real remote address of the client.
In other words, if a request was received from a proxy the IP
address of which was trusted, REMOTE_ADDR would be set to the
value of the single designated header out of those listed above
which was to be trusted.
In the case where the proxy was trusted, in addition to
REMOTE_ADDR being rewritten, only the trusted header would be
passed through. That is, if X-Real-IP was the trusted header,
then HTTP_X_REAL_IP would be passed to the WSGI application,
but HTTP_X_CLIENT_IP and HTTP_X_FORWARDED_FOR would be dropped
if corresponding headers had also been supplied. That the header
used to rewrite REMOTE_ADDR was passed through still was only
intended for the purpose of documenting where the value of
REMOTE_ADDR came from. A WSGI application when relying on this
feature should only ever use the value of REMOTE_ADDR and should
ignore the header passed through.
The behaviour as described was therefore based on a WSGI
application not at the same time enabling any WSGI or web
framework middleware to try and process any proxy headers a
second time and REMOTE_ADDR should be the single source of
truth. Albeit the headers which were passed through should have
resulted in the same result for REMOTE_ADDR if the proxy headers
were processed a second time.
Where the client a request was received from was not a trusted
proxy, REMOTE_ADDR would not be rewritten and would be left as
the IP of the client, and none of the headers listed above were
supposed to be passed through.
That REMOTE_ADDR is not rewritten is implemented correctly when
the client is not a trusted proxy, but of the three headers
listed above, HTTP_X_CLIENT_IP was not being dropped if the
corresponding header was supplied.
If the WSGI application followed best practice and only relied
on the value of REMOTE_ADDR as the source of truth for the
remote client address, then that HTTP_X_CLIENT_IP was not being
dropped should pose no security risk. There would however be
a problem if a WSGI application was still enabling a WSGI or
web framework specific middleware to process the proxy headers
a second time even though not required. In this case, the
middleware used by the WSGI application may still trust the
X-Client-IP header and rewrite REMOTE_ADDR allowing a malicious
client to pretend to have a different IP address.
In addition to the WSGI application having redundant checks
for the proxy headers, to take advantage of this, a client
would also need direct access to the Apache/mod_wsgi server
instance.
In the case that only clients on your private network behind
your proxy could access the Apache/mod_wsgi server instance,
that would imply any malicious actor already had access to your
private network and had access to hosts in that private network
or could attach their own device to that private network.
In the case where your Apache/mod_wsgi server instance could
be accessed from the same external networks as a proxy forwarding
requests to it, such as may occur if making use of a CDN proxy
cache, a client would still need to know the direct address
used by the Apache/mod_wsgi server instance.
Note that only one proxy header for designating the IP of a
client should ever be trusted. If you trust more than one, then
which will be used if both are present is undefined as it is
dependent on the order that Apache processes headers. This
hasn’t changed and as before to avoid ambiguity you should only
trust one of the proxy headers recognised for this purpose.
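The intended behaviour described in the 4.9.3 notes can be
modelled in a few lines of Python. This is only an illustrative
sketch and not mod_wsgi's implementation: the function name
apply_proxy_trust, its parameters, the exact-match test against
a set of proxy addresses, and taking the first entry of a
comma-separated header value are all assumptions made for the
example.

```python
# Hypothetical model of the header handling described above.
# None of these names come from mod_wsgi itself.
PROXY_HEADERS = (
    "HTTP_X_CLIENT_IP",
    "HTTP_X_REAL_IP",
    "HTTP_X_FORWARDED_FOR",
)


def apply_proxy_trust(environ, trusted_proxies, trusted_header):
    """Return a copy of environ with the proxy headers filtered."""
    result = dict(environ)
    from_trusted_proxy = result.get("REMOTE_ADDR") in trusted_proxies

    for key in PROXY_HEADERS:
        # Only the single trusted header survives, and only when the
        # request came from a trusted proxy.
        if not (from_trusted_proxy and key == trusted_header):
            result.pop(key, None)

    if from_trusted_proxy and trusted_header in result:
        # Assumption: use the first address of a comma-separated value.
        value = result[trusted_header]
        result["REMOTE_ADDR"] = value.split(",")[0].strip()

    return result
```

With an untrusted client, all three headers are removed and
REMOTE_ADDR is left alone, which is the case the 4.9.3 fix
restores for X-Client-IP.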
4.9.2
Bugs Fixed
When using mod_wsgi-express in daemon mode, and source code
reloading was enabled, an invalid URL path which contained a
byte sequence which could not be decoded as UTF-8 was causing
a process crash.
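For illustration only, the kind of input involved is a URL path
whose bytes are not valid UTF-8. The helper below is a
hypothetical sketch of handling such a path defensively in
Python; it is not the change made in mod_wsgi.

```python
def decode_url_path(raw_path):
    """Decode a raw URL path as UTF-8, or return None if it is invalid."""
    try:
        return raw_path.decode("utf-8")
    except UnicodeDecodeError:
        # The byte sequence cannot be decoded; report it to the caller
        # instead of letting the exception propagate.
        return None
```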
4.9.1
Bugs Fixed
When using --enable-debugger of mod_wsgi-express to enable Pdb,
it was failing due to prior changes to run Apache in a sub
process to avoid Apache being shutdown when the window size
changed. This was because standard input was being detached
from Apache and so it was not possible to interact with Pdb.
Now when --enable-debugger is used, or any feature which uses
--debug-mode, Apache will not be run in a sub process so that
you can still use standard input to interact with the process
if needed. This does mean that a window size change event will
again cause Apache to shutdown in these cases though.
Update code so it compiles on Python 3.11. Python 3.11 makes
structures for Python frame objects opaque and requires functions
to access struct members.
Features Changed
Historically when a process was being shutdown, mod_wsgi would
do its best to destroy any Python sub interpreters as well as
the main Python interpreter. This was done in case applications
attempted to run any actions on process shutdown via atexit
registered callbacks or other means.
Because of changes in Python 3.9, and possibly because mod_wsgi
makes use of externally created C threads to handle requests,
and not Python native threads, there is now a suspicion that
attempting to delete Python sub interpreters can hang. It is
believed this may relate to Python core now expecting all Python
thread state objects to have been deleted before the Python
sub interpreter can be destroyed. If they aren’t then Python
core code can block indefinitely. If the issue isn’t the
externally created C threads that mod_wsgi uses, it might
instead be arising as a problem when a hosted WSGI application
creates its own background threads but they are still running
when the attempt is made to destroy the sub interpreter.
In the case of using daemon mode the result is that processes
can hang on shutdown, but will still at least be deleted after
5 seconds due to how Apache process management will forcibly
kill managed processes after 5 seconds if they do not exit
cleanly themselves. In other words the issue may not be noticed.
For embedded mode however, the Apache child process can hang
around indefinitely, possibly only being deleted if some higher
level system application manager such as systemd is able to
detect the problem and forcibly delete the hung process.
Although mod_wsgi always attempts to ensure that the externally
created C threads are not still handling HTTP requests and thus
not active prior to destroying the Python interpreter, it is
impossible to guarantee this. Similarly, there is no way to
guarantee that background threads created by a WSGI application
aren’t still running. As such, it isn’t possible to safely
attempt to delete the Python thread state objects before deleting
the Python sub interpreter.
Because of this uncertainty mod_wsgi now provides a way to
disable the attempt to destroy the Python sub interpreters or
the main Python interpreter when the process is being shutdown.
This will though mean that atexit registered callbacks will
not be called if this option is enabled. It is therefore
important that you use mod_wsgi’s own mechanism of being notified
when a process is being shutdown to perform any special actions.
import mod_wsgi

def shutdown_handler(event, **kwargs):
    print('SHUTDOWN-HANDLER', event, kwargs)

mod_wsgi.subscribe_shutdown(shutdown_handler)
Use of this shutdown notification was necessary anyway to
reliably attempt to stop background threads created by the WSGI
application since atexit registered callbacks are not called
by Python core until after it thinks all threads have been
stopped. In other words, atexit registered callbacks couldn’t be
used to reliably stop background threads. Thus use of the
mod_wsgi mechanism for performing actions on process shutdown
is the preferred way.
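As a rough sketch of using the shutdown notification to stop a
background thread, the example below signals a worker through a
threading.Event and then waits for it. Only
mod_wsgi.subscribe_shutdown() and the handler signature come
from the notes above; the names stop_requested, background_task
and worker, the timeouts, and the guarded import (so the file
also loads outside mod_wsgi) are assumptions for illustration.

```python
import threading

try:
    import mod_wsgi  # normally only usable when running under mod_wsgi
except ImportError:
    mod_wsgi = None

# Hypothetical flag used to ask the background thread to stop.
stop_requested = threading.Event()


def background_task():
    # Wake up periodically until a stop has been requested.
    while not stop_requested.wait(timeout=0.1):
        pass  # periodic work would go here


worker = threading.Thread(target=background_task, daemon=True)
worker.start()


def shutdown_handler(event, **kwargs):
    # Signal the worker, then give it a bounded time to finish.
    stop_requested.set()
    worker.join(timeout=3.0)


# Register only if the function is actually available.
subscribe = getattr(mod_wsgi, "subscribe_shutdown", None)
if subscribe is not None:
    subscribe(shutdown_handler)
```

Keeping the join bounded matters because, as described above,
Apache will forcibly kill a daemon process that does not exit
within a few seconds.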
Overall it is expected that the majority of users will not
notice this change as it is very rare to see WSGI applications
want to perform special actions on process shutdown. If you
are affected, you should use mod_wsgi’s mechanism to perform
special actions on process shutdown.
If you need to enable this mode whereby no attempt is made to
destroy the Python interpreter (including sub interpreters) on
process shutdown, you can add at global scope in the Apache
configuration:
WSGIDestroyInterpreter Off
If you are using mod_wsgi-express, you can instead supply the
command line option --orphan-interpreter.
4.9.0
Bugs Fixed
The mod_wsgi code wouldn’t compile on Python 3.10 as various
Python C API functions were removed. Note that the changes
required switching to alternate C APIs. The changes were made
for all Python versions back to Python 3.6 and were not
conditional on Python 3.10+ being used. This is why the minor
version got bumped.
When using CMMI (configure/make/make install) method for
compiling mod_wsgi if embedded mode was being disabled at
compile time, compilation would fail.
When maximum-requests option was used with mod_wsgi daemon mode,
and a graceful restart signal was sent to the daemon process
while there was an active request, the process would only
shutdown when the graceful timeout period had expired, and not
as soon as any active requests had completed, if that had
occurred before the graceful timeout had expired.
When using the startup-timeout and restart-interval options of
the WSGIDaemonProcess directive together, checking for the
expiration time of the startup time was done incorrectly,
resulting in process restart being delayed if startup had
failed. At worst case this was the lesser of the time periods
specified by the options restart-interval, deadlock-timeout,
graceful-timeout and eviction-timeout. If request-timeout were
defined it would however still be calculated correctly. As
request-timeout was by default defined when using
mod_wsgi-express, this issue usually only affected mod_wsgi
when manually configuring Apache.
Features Changed
Historically when using embedded mode, wsgi.multithread in the
WSGI environ dictionary has reported True when any multithread
capable Apache MPM were used (eg., worker, event), even if the
current number of configured threads per child process was
overridden to be 1. Why this was the case has been forgotten,
but generally wouldn’t matter since no one would ever set up
Apache with a multithread MPM and then configure the number of
threads to be 1. If that was desired then prefork MPM would be
used.
With mod_wsgi-express since 4.8.0 making it much easier to use
embedded mode and have a sane configuration used, since it is
generated for you, the value of wsgi.multithread has been
changed such that it will now correctly report False if using
embedded mode, a multithread capable MPM is used, but the number
of configured threads is set to 1.
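A WSGI application can read this flag from the environ
dictionary. The minimal application below simply reports the
value it sees; wsgi.multithread is a standard WSGI environ key,
while the response text and the fallback of True when the key
is missing are choices made for this sketch.

```python
def application(environ, start_response):
    # wsgi.multithread is supplied by the WSGI server for each request.
    multithreaded = environ.get("wsgi.multithread", True)
    body = ("wsgi.multithread=%s\n" % multithreaded).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```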
The graceful-timeout option for WSGIDaemonProcess now defaults
to 15 seconds. This was always the case when mod_wsgi-express
was used but the default was never applied back to the case
where mod_wsgi was being configured manually.
A default of 15 seconds for graceful-timeout is being added to
avoid the problem where sending a SIGUSR1 to a daemon mode
process would never see the process shutdown due to there never
being a time when there were no active requests. This might
occur when there were a stuck request that never completed, or
numerous long running requests which always overlapped in time
meaning the process was never idle.
You can still force graceful-timeout to be 0 to restore the
original behaviour, but that is probably not recommended.
4.8.0
Bugs Fixed
Fixed potential for process crash on Apache startup when the
WSGI script file or other Python script file were being preloaded.
This was triggered when WSGIImportScript was used, or if
WSGIScriptAlias or WSGIScriptAliasMatch were used and both the
process-group and application-group options were used with
those directives.
The potential for this problem arising was extremely high on
Alpine Linux, but seems to be very rare on a full Linux or macOS
distribution where glibc was being used.
Include a potential workaround so that virtual environments work
on Windows.
Use of virtual environments in embedded systems on Windows has
been broken ever since python -m venv was introduced.
Initially virtualenv was not affected, although when it changed
to use the new style Python virtual environment layout the same
as python -m venv it also broke. This was with the introduction
of virtualenv version 20.0.0 or thereabouts.
The underlying cause is lack of support for using virtual
environments in CPython for the new style virtual environments.
The bug has existed in CPython since back in 2014 and has not
been fixed. For details of the issue see
https://bugs.python.org/issue22213.
For non-Windows systems a workaround had been used to resolve
the problem, but the same workaround has never worked on Windows.
The change in this version tries a different workaround for
Windows environments.
Added a workaround for the fact that Python doesn’t actually
set the _main_thread attribute of the threading module to the
main thread which initialized the main interpreter or sub
interpreter, but the first thread that imports the threading
module. In an embedded system such as mod_wsgi it could be a
request thread, not the main thread, that would import the
threading module.
This issue was causing the asgiref module used in Django to
fail when using signal.set_wakeup_fd() as code was thinking it
was in the main thread when it wasn’t. See
https://github.com/django/asgiref/issues/143.
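The check that code such as this typically relies on can be
sketched as below, using only the standard threading module.
The helper names in_main_thread and check_from_worker are
hypothetical. In a plain Python program a worker thread is
never reported as the main thread; the workaround described
above is about keeping that true inside an embedded
interpreter.

```python
import threading


def in_main_thread():
    """Report whether the caller is what threading treats as the main thread."""
    return threading.current_thread() is threading.main_thread()


def check_from_worker():
    """Run the check inside a separate thread and return its result."""
    results = []
    thread = threading.Thread(target=lambda: results.append(in_main_thread()))
    thread.start()
    thread.join()
    return results[0]
```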
Using WSGILazyInitialization Off would cause Python to abort
the Apache parent process. The issue has been resolved, but
you are warned that you should not be using this option anyway
as it is dangerous and opens up security holes with the potential
for user code to run as the root user when Python is initialized.
Fix a Python deprecation warning for PyArg_ParseTuple() which
would cause the process to crash when deprecation warnings were
turned on globally for an application. Crash was occurring
whenever anything was output to Apache error log via print().
Features Changed
The --isatty option of mod_wsgi-express has been removed and
the behaviour enabled by the option is now the default. The
default behaviour is now that if mod_wsgi-express is run in an
interactive terminal, then Apache will be started within a sub
process of the mod_wsgi-express script and the SIGWINCH signal
will be blocked and not passed through to Apache. This means
that a window resizing event will no longer cause mod_wsgi-express
to shutdown unexpectedly.
When trying to set resource limits and they can’t be set, the
system error number will now be included in the error message.
New Features
Added the mod_wsgi.subscribe_shutdown() function for registering
a callback to be called when the process is being shutdown.
This is needed because atexit.register() doesn’t work as required
for the main Python interpreter, specifically the atexit callback
isn’t called before the main interpreter thread attempts to
wait on threads on shutdown, thus preventing one from shutting
down daemon threads and waiting on them.
This feature to get a callback on process shutdown was previously
available by using mod_wsgi.subscribe_events(), but that would
also report events to the callback on requests as they happen,
thus adding extra overhead if not using the request events.
The new registration function can thus be used where only
interested in the event for the process being shutdown.
Added an --embedded-mode option to mod_wsgi-express to make it
easier to force it into embedded mode for high throughput, CPU
bound applications with minimal response times. In this case
the number of Apache child worker processes used for embedded
mode will be dictated by the --processes and --threads options,
completely overriding any automatic mechanism to set those
parameters. Any auto scaling done by Apache for the child worker
processes will also be disabled.
This gives preference to using Apache worker MPM instead of
event MPM, as event MPM doesn’t work correctly when told to
run with less than three threads per process. You can switch
back to using event MPM by using the --server-mpm option, but
will need to ensure that you have three threads per process or
more.
Locking of the Python global interpreter lock has been reviewed
with changes resulting in a reduction in overhead, or otherwise
changing the interaction between threads such that at high
request rate with a hello world application, a greater request
throughput can be achieved. How much improvement you see with
your own applications will depend on what your application does
and whether you have short response times to begin with. If
you have an I/O bound application with long response times you
likely aren’t going to see any difference.
Internal metrics collection has been improved with additional
information provided in process metrics and a new request
metrics feature added giving access to aggregated metrics over
the time of a reporting period. This includes bucketed time
data on requests so you can calculate the distribution of server,
queue and application time.
Note that the new request metrics is still a work in progress
and may be modified or enhanced, causing breaking changes in
the format of data returned.
Hidden experimental support for running mod_wsgi-express
start-server on Windows. It will not show in list of sub commands
mod_wsgi-express accepts on Windows, but it is there. There
are still various issues that need to be sorted out but need
assistance from someone who knows more about programming Python
on Windows and Windows programming in general to get it all
working properly. If you are interested in helping, reach out
on the mod_wsgi mailing list.
To generate a diff of this commit:
cvs rdiff -u -r1.22 -r1.23 pkgsrc/www/py-mod_wsgi/Makefile
cvs rdiff -u -r1.19 -r1.20 pkgsrc/www/py-mod_wsgi/distinfo
cvs rdiff -u -r1.1 -r0 pkgsrc/www/py-mod_wsgi/patches/patch-configure \
pkgsrc/www/py-mod_wsgi/patches/patch-src_server_wsgi__python.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Modified files:
Index: pkgsrc/www/py-mod_wsgi/Makefile
diff -u pkgsrc/www/py-mod_wsgi/Makefile:1.22 pkgsrc/www/py-mod_wsgi/Makefile:1.23
--- pkgsrc/www/py-mod_wsgi/Makefile:1.22 Wed Jan 5 20:47:37 2022
+++ pkgsrc/www/py-mod_wsgi/Makefile Tue Nov 15 08:52:09 2022
@@ -1,8 +1,7 @@
-# $NetBSD: Makefile,v 1.22 2022/01/05 20:47:37 wiz Exp $
+# $NetBSD: Makefile,v 1.23 2022/11/15 08:52:09 wiz Exp $
-DISTNAME= mod_wsgi-4.7.1
+DISTNAME= mod_wsgi-4.9.4
PKGNAME= ${PYPKGPREFIX}-${APACHE_PKG_PREFIX}-${DISTNAME}
-PKGREVISION= 2
CATEGORIES= www python
MASTER_SITES= ${MASTER_SITE_PYPI:=m/mod_wsgi/}
Index: pkgsrc/www/py-mod_wsgi/distinfo
diff -u pkgsrc/www/py-mod_wsgi/distinfo:1.19 pkgsrc/www/py-mod_wsgi/distinfo:1.20
--- pkgsrc/www/py-mod_wsgi/distinfo:1.19 Sun Dec 19 14:12:48 2021
+++ pkgsrc/www/py-mod_wsgi/distinfo Tue Nov 15 08:52:09 2022
@@ -1,7 +1,5 @@
-$NetBSD: distinfo,v 1.19 2021/12/19 14:12:48 wiz Exp $
+$NetBSD: distinfo,v 1.20 2022/11/15 08:52:09 wiz Exp $
-BLAKE2s (mod_wsgi-4.7.1.tar.gz) = f74642297af9ecfed637416cb118885acc11c3d1d8715bbd92d47503e3d009ed
-SHA512 (mod_wsgi-4.7.1.tar.gz) = 2c9d83737fe0ca5c599d3915e47047db2d06880ac3721c94350cd2d9ae930c20058e350f07c918dd301e50bf3433480e1bad479f4ffd382e6b2e42675352734e
-Size (mod_wsgi-4.7.1.tar.gz) = 498301 bytes
-SHA1 (patch-configure) = 7ece56413dfcb8de755dab722ebac632f3d1166f
-SHA1 (patch-src_server_wsgi__python.h) = 70c153e55642714d7172246748b6e4038e87d831
+BLAKE2s (mod_wsgi-4.9.4.tar.gz) = 39922e1c24dba83ca3e14288d722127fc0c2ad07c12e4ce15d5c24f55fbc6222
+SHA512 (mod_wsgi-4.9.4.tar.gz) = e99c062a8fa9fdb9ce50f8d902ff9c9b572f7c470bfad6db0ad34b52ee476814845331ebded86c62e335bbfd8887c56a3a62109c332a951f883314b8350a3ae4
+Size (mod_wsgi-4.9.4.tar.gz) = 497531 bytes
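The three distinfo values for a distfile (BLAKE2s, SHA512 and
size) can be recomputed with Python's hashlib for a manual
cross-check. This is a sketch, not part of pkgsrc tooling: the
function name distinfo_values and the chunk size are
assumptions, and the path to the downloaded tarball has to be
supplied by the caller.

```python
import hashlib


def distinfo_values(path, chunk_size=65536):
    """Compute the BLAKE2s digest, SHA512 digest and size of a file."""
    blake = hashlib.blake2s()
    sha = hashlib.sha512()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large distfiles need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            blake.update(chunk)
            sha.update(chunk)
            size += len(chunk)
    return {
        "BLAKE2s": blake.hexdigest(),
        "SHA512": sha.hexdigest(),
        "Size": size,
    }
```

The returned hex digests can then be compared by eye against
the corresponding lines in distinfo.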