pkgsrc-Changes archive

CVS commit: pkgsrc/www/py-mod_wsgi



Module Name:    pkgsrc
Committed By:   wiz
Date:           Tue Nov 15 08:52:09 UTC 2022

Modified Files:
        pkgsrc/www/py-mod_wsgi: Makefile distinfo
Removed Files:
        pkgsrc/www/py-mod_wsgi/patches: patch-configure
            patch-src_server_wsgi__python.h

Log Message:
py-ap24-mod_wsgi: update to 4.9.4.

4.9.4

Bugs Fixed

    Apache 2.4.54 changed the default value for LimitRequestBody
    from 0, which indicates there is no limit, to 1GiB. If the
    Apache configuration supplied with a distribution wasn’t
    explicitly setting LimitRequestBody to 0 at global server scope
    for the purposes of documenting the default, and was actually
    relying on the compiled-in default, then when using mod_wsgi
    daemon mode, a request body larger than 1GiB would crash the
    mod_wsgi daemon mode process.

    Fix ability to build mod_wsgi against Apache 2.2. Do note that
    in general only recent versions of Apache 2.4 are supported.

4.9.3

Bugs Fixed

    When using WSGITrustedProxies and WSGITrustedProxyHeaders in
    the Apache configuration, or --trust-proxy and --trust-proxy-header
    options with mod_wsgi-express, if you trusted the X-Client-IP
    header and a request was received from an untrusted client,
    the header was not being correctly removed from the set of
    headers passed through to the WSGI application.

    This only occurred with the X-Client-IP header and the same
    problem was not present if trusting the X-Real-IP or X-Forwarded-For
    headers.

    The purpose of this feature for trusting a front end proxy was
    in this case for the headers:

            X-Client-IP X-Real-IP X-Forwarded-For

    and was designed to allow the value of REMOTE_ADDR passed to
    the WSGI application to be rewritten to the IP address that a
    trusted proxy said was the real remote address of the client.

    In other words, if a request was received from a proxy the IP
    address of which was trusted, REMOTE_ADDR would be set to the
    value of the single designated header out of those listed above
    which was to be trusted.

    In the case where the proxy was trusted, in addition to
    REMOTE_ADDR being rewritten, only the trusted header would be
    passed through. That is, if X-Real-IP was the trusted header,
    then HTTP_X_REAL_IP would be passed to the WSGI application,
    but HTTP_X_CLIENT_IP and HTTP_X_FORWARDED_FOR would be dropped
    if corresponding headers had also been supplied. That the header
    used to rewrite REMOTE_ADDR was passed through still was only
    intended for the purpose of documenting where the value of
    REMOTE_ADDR came from. A WSGI application when relying on this
    feature should only ever use the value of REMOTE_ADDR and should
    ignore the header passed through.

    The behaviour as described was therefore based on a WSGI
    application not at the same time enabling any WSGI or web
    framework middleware to try and process any proxy headers a
    second time and REMOTE_ADDR should be the single source of
    truth. Albeit the headers which were passed through should have
    resulted in the same result for REMOTE_ADDR if the proxy headers
    were processed a second time.

    Now in the case of the client a request was received from not
    being a trusted proxy, then REMOTE_ADDR would not be rewritten,
    and would be left as the IP of the client, and none of the
    headers listed above were supposed to be passed through.
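
    The intended handling described above can be modelled in a few
    lines of Python. This is only an illustrative sketch and not
    mod_wsgi's implementation: the function name and its parameters
    are hypothetical, and the special parsing needed for a
    list-valued X-Forwarded-For header is deliberately left out.

```python
# WSGI environ keys for the three proxy headers discussed above.
PROXY_HEADER_KEYS = (
    "HTTP_X_CLIENT_IP",
    "HTTP_X_REAL_IP",
    "HTTP_X_FORWARDED_FOR",
)


def apply_proxy_trust(environ, trusted_key, proxy_is_trusted):
    """Return a copy of environ with the intended proxy handling applied.

    Hypothetical model only. It assumes the trusted header carries a
    single address, so list-valued X-Forwarded-For is not modelled.
    """
    result = dict(environ)
    trusted_value = environ.get(trusted_key)

    for key in PROXY_HEADER_KEYS:
        # Only the designated header survives, and only when the
        # request really came from a trusted proxy.
        if not (proxy_is_trusted and key == trusted_key):
            result.pop(key, None)

    if proxy_is_trusted and trusted_value:
        # REMOTE_ADDR becomes the address the trusted proxy reported.
        result["REMOTE_ADDR"] = trusted_value

    return result
```

    With X-Real-IP as the trusted header, a request from a trusted
    proxy keeps HTTP_X_REAL_IP and gets a rewritten REMOTE_ADDR,
    while a request from any other client loses all three headers
    and keeps its own REMOTE_ADDR.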

    That REMOTE_ADDR is not rewritten is implemented correctly when
    the client is not a trusted proxy, but of the three headers
    listed above, HTTP_X_CLIENT_IP was not being dropped if the
    corresponding header was supplied.

    If the WSGI application followed best practice and only relied
    on the value of REMOTE_ADDR as the source of truth for the
    remote client address, then the fact that HTTP_X_CLIENT_IP was
    not being dropped should pose no security risk. There would
    however be
    a problem if a WSGI application was still enabling a WSGI or
    web framework specific middleware to process the proxy headers
    a second time even though not required. In this case, the
    middleware used by the WSGI application may still trust the
    X-Client-IP header and rewrite REMOTE_ADDR allowing a malicious
    client to pretend to have a different IP address.

    In addition to the WSGI application having redundant checks
    for the proxy headers, to take advantage of this, a client
    would also need direct access to the Apache/mod_wsgi server
    instance.

    In the case that only clients on your private network behind
    your proxy could access the Apache/mod_wsgi server instance,
    that would imply any malicious actor already had access to your
    private network and had access to hosts in that private network
    or could attach their own device to that private network.

    In the case where your Apache/mod_wsgi server instance could
    be accessed from the same external networks as a proxy forwarding
    requests to it, such as may occur if making use of a CDN proxy
    cache, a client would still need to know the direct address
    used by the Apache/mod_wsgi server instance.

    Note that only one proxy header for designating the IP of a
    client should ever be trusted. If you trust more than one, then
    which will be used if both are present is undefined as it is
    dependent on the order that Apache processes headers. This
    hasn’t changed and as before to avoid ambiguity you should only
    trust one of the proxy headers recognised for this purpose.

4.9.2

Bugs Fixed

    When using mod_wsgi-express in daemon mode, and source code
    reloading was enabled, an invalid URL path which contained a
    byte sequence which could not be decoded as UTF-8 was causing
    a process crash.
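
    The kind of input involved is easy to reproduce in isolation.
    The sketch below is not mod_wsgi's code; decode_path is a
    hypothetical helper that shows a byte sequence which is not
    valid UTF-8 and one defensive way to handle it.

```python
def decode_path(raw_path):
    """Decode a URL path given as bytes, or return None if it is not UTF-8.

    Hypothetical helper for illustration only.
    """
    try:
        return raw_path.decode("utf-8")
    except UnicodeDecodeError:
        # 0xFF can never appear in well-formed UTF-8, so a path such
        # as b"/bad\xff" ends up here instead of raising to the caller.
        return None
```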

4.9.1

Bugs Fixed

    When using --enable-debugger of mod_wsgi-express to enable Pdb,
    it was failing due to prior changes to run Apache in a sub
    process to avoid Apache being shutdown when the window size
    changed. This was because standard input was being detached
    from Apache and so it was not possible to interact with Pdb.
    Now when --enable-debugger is used, or any feature which uses
    --debug-mode, Apache will not be run in a sub process so that
    you can still use standard input to interact with the process
    if needed. This does mean that a window size change event will
    again cause Apache to shutdown in these cases though.

    Update code so it compiles on Python 3.11. Python 3.11 makes
    structures for Python frame objects opaque and requires functions
    to access struct members.

Features Changed

    Historically when a process was being shutdown, mod_wsgi would
    do its best to destroy any Python sub interpreters as well as
    the main Python interpreter. This was done in case applications
    attempted to run any actions on process shutdown via atexit
    registered callbacks or other means.

    Because of changes in Python 3.9, and possibly because mod_wsgi
    makes use of externally created C threads to handle requests,
    and not Python native threads, there is now a suspicion that
    attempting to delete Python sub interpreters can hang. It is
    believed this may relate to Python core now expecting all Python
    thread state objects to have been deleted before the Python
    sub interpreter can be destroyed. If they aren’t then Python
    core code can block indefinitely. If the issue isn’t the
    externally created C threads that mod_wsgi uses, it might
    instead be arising as a problem when a hosted WSGI application
    creates its own background threads but they are still running
    when the attempt is made to destroy the sub interpreter.

    In the case of using daemon mode the result is that processes
    can hang on shutdown, but will still at least be deleted after
    5 seconds due to how Apache process management will forcibly
    kill managed processes after 5 seconds if they do not exit
    cleanly themselves. In other words the issue may not be noticed.

    For embedded mode however, the Apache child process can hang
    around indefinitely, possibly only being deleted if some higher
    level system application manager such as systemd is able to
    detect the problem and forcibly delete the hung process.

    Although mod_wsgi always attempts to ensure that the externally
    created C threads are not still handling HTTP requests and thus
    not active prior to destroying the Python interpreter, it is
    impossible to guarantee this. Similarly, there is no way to
    guarantee that background threads created by a WSGI application
    aren’t still running. As such, it isn’t possible to safely
    attempt to delete the Python thread state objects before deleting
    the Python sub interpreter.

    Because of this uncertainty mod_wsgi now provides a way to
    disable the attempt to destroy the Python sub interpreters or
    the main Python interpreter when the process is being shutdown.
    This will though mean that atexit registered callbacks will
    not be called if this option is enabled. It is therefore
    important that you use mod_wsgi’s own mechanism of being notified
    when a process is being shutdown to perform any special actions.

    import mod_wsgi

    def shutdown_handler(event, **kwargs):
      print('SHUTDOWN-HANDLER', event, kwargs)

    mod_wsgi.subscribe_shutdown(shutdown_handler)

    Use of this shutdown notification was necessary anyway to
    reliably attempt to stop background threads created by the WSGI
    application since atexit registered callbacks are not called
    by Python core until after it thinks all threads have been
    stopped. In other words, atexit registered callbacks couldn’t be
    used to reliably stop background threads. Thus use of the
    mod_wsgi mechanism for performing actions on process shutdown
    is the preferred way.
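
    As a minimal sketch of stopping a background thread this way,
    assuming a hypothetical worker loop and timeout values: the
    handler signature follows the example above, and the import
    guard is only there so that the module can also be loaded
    outside of mod_wsgi, for example when testing.

```python
import threading

try:
    import mod_wsgi  # provided by the server when running under mod_wsgi
except ImportError:
    mod_wsgi = None

stop_requested = threading.Event()


def background_task():
    # Hypothetical periodic job: wake up once a second until told to stop.
    while not stop_requested.wait(1.0):
        pass  # real work would go here


worker = threading.Thread(target=background_task, daemon=True)
worker.start()


def shutdown_handler(event, **kwargs):
    # Ask the worker to finish, then give it a bounded time to exit.
    stop_requested.set()
    worker.join(timeout=3.0)


# Register only if the running mod_wsgi actually offers the function.
subscribe = getattr(mod_wsgi, "subscribe_shutdown", None)
if subscribe is not None:
    subscribe(shutdown_handler)
```

    Using an Event rather than a plain flag lets the worker wake
    immediately when shutdown is requested instead of finishing its
    current sleep first.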

    Overall it is expected that the majority of users will not
    notice this change as it is very rare to see WSGI applications
    want to perform special actions on process shutdown. If you
    are affected, you should use mod_wsgi’s mechanism to perform
    special actions on process shutdown.

    If you need to enable this mode whereby no attempt is made to
    destroy the Python interpreter (including sub interpreters) on
    process shutdown, you can add at global scope in the Apache
    configuration:

    WSGIDestroyInterpreter Off

    If you are using mod_wsgi-express, you can instead supply the
    command line option --orphan-interpreter.

4.9.0

Bugs Fixed

    The mod_wsgi code wouldn’t compile on Python 3.10 as various
    Python C API functions were removed. Note that the changes
    required switching to alternate C APIs. The changes were made
    for all Python versions back to Python 3.6 and were not
    conditional on Python 3.10+ being used. This is why the minor
    version got bumped.

    When using the CMMI (configure/make/make install) method for
    compiling mod_wsgi, if embedded mode was being disabled at
    compile time, compilation would fail.

    When the maximum-requests option was used with mod_wsgi daemon
    mode, and a graceful restart signal was sent to the daemon
    process while there was an active request, the process would
    only shutdown when the graceful timeout period had expired, and
    not as soon as any active requests had completed, if that had
    occurred before the graceful timeout had expired.

    When using the startup-timeout and restart-interval options of
    the WSGIDaemonProcess directive together, checking for the
    expiration time of the startup time was done incorrectly,
    resulting in process restart being delayed if startup had
    failed. At worst the delay was the lesser of the time periods
    specified by the options restart-interval, deadlock-timeout,
    graceful-timeout and eviction-timeout. If request-timeout were
    defined it would however still be calculated correctly. As
    request-timeout was by default defined when using
    mod_wsgi-express, this issue usually only affected mod_wsgi
    when manually configuring Apache.

Features Changed

    Historically when using embedded mode, wsgi.multithread in the
    WSGI environ dictionary has reported True when any multithread
    capable Apache MPM were used (eg., worker, event), even if the
    current number of configured threads per child process was
    overridden to be 1. Why this was the case has been forgotten,
    but generally wouldn’t matter since no one would ever set up
    Apache with a multithread MPM and then configure the number of
    threads to be 1. If that was desired then prefork MPM would be
    used.

    With mod_wsgi-express since 4.8.0 making it much easier to use
    embedded mode and have a sane configuration used, since it is
    generated for you, the value of wsgi.multithread has been
    changed such that it will now correctly report False if using
    embedded mode, a multithread capable MPM is used, but the number
    of configured threads is set to 1.
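
    An application can read this flag directly from the WSGI
    environ. The following is a minimal sketch; the response text
    is purely illustrative.

```python
def application(environ, start_response):
    """Report whether the server says requests may run concurrently
    in multiple threads of this process."""
    multithread = bool(environ.get("wsgi.multithread", False))
    body = ("wsgi.multithread=%s\n" % multithread).encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```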

    The graceful-timeout option for WSGIDaemonProcess now defaults
    to 15 seconds. This was always the case when mod_wsgi-express
    was used but the default was never applied back to the case
    where mod_wsgi was being configured manually.

    A default of 15 seconds for graceful-timeout is being added to
    avoid the problem where sending a SIGUSR1 to a daemon mode
    process would never see the process shutdown due to there never
    being a time when there were no active requests. This might
    occur when there was a stuck request that never completed, or
    numerous long running requests which always overlapped in time
    meaning the process was never idle.

    You can still force graceful-timeout to be 0 to restore the
    original behaviour, but that is probably not recommended.

4.8.0

Bugs Fixed

    Fixed potential for process crash on Apache startup when the
    WSGI script file or other Python script file were being preloaded.
    This was triggered when WSGIImportScript was used, or if
    WSGIScriptAlias or WSGIScriptAliasMatch were used and both the
    process-group and application-group options were used with
    those directives.

    The potential for this problem arising was extremely high on
    Alpine Linux, but seems to be very rare on a full Linux or macOS
    distribution where glibc was being used.

    Include a potential workaround so that virtual environments work
    on Windows.

    Use of virtual environments in embedded systems on Windows has
    been broken ever since python -m venv was introduced.

    Initially virtualenv was not affected, although when it changed
    to use the new style Python virtual environment layout the same
    as python -m venv it also broke. This happened around
    virtualenv version 20.0.0.

    The underlying cause is lack of support for using virtual
    environments in CPython for the new style virtual environments.
    The bug has existed in CPython since back in 2014 and has not
    been fixed. For details of the issue see
    https://bugs.python.org/issue22213.

    For non-Windows systems a workaround had been used to resolve
    the problem, but the same workaround has never worked on Windows.
    The change in this version tries a different workaround for
    Windows environments.

    Added a workaround for the fact that Python doesn’t actually
    set the _main_thread attribute of the threading module to the
    main thread which initialized the main interpreter or sub
    interpreter, but the first thread that imports the threading
    module. In an embedded system such as mod_wsgi it could be a
    request thread, not the main thread, that would import the
    threading module.

    This issue was causing the asgiref module used in Django to
    fail when using signal.set_wakeup_fd() as code was thinking it
    was in the main thread when it wasn’t. See
    https://github.com/django/asgiref/issues/143.
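
    Code that must only run in the main thread is commonly guarded
    by a check along the following lines. This is a sketch with
    hypothetical helper names, not the asgiref code itself. Without
    the workaround, the first function could wrongly return True on
    a request thread if that thread was the first to import the
    threading module.

```python
import threading


def in_main_thread():
    """True when the caller runs on the thread Python records as main."""
    return threading.current_thread() is threading.main_thread()


def check_from_worker():
    """Run the same check on a freshly started thread and return its result."""
    results = []
    worker = threading.Thread(target=lambda: results.append(in_main_thread()))
    worker.start()
    worker.join()
    return results[0]
```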

    Using WSGILazyInitialization Off would cause Python to abort
    the Apache parent process. The issue has been resolved, but
    you are warned that you should not be using this option anyway
    as it is dangerous and opens up security holes with the potential
    for user code to run as the root user when Python is initialized.

    Fix a Python deprecation warning for PyArg_ParseTuple() which
    would cause the process to crash when deprecation warnings were
    turned on globally for an application. Crash was occurring
    whenever anything was output to Apache error log via print().

Features Changed

    The --isatty option of mod_wsgi-express has been removed and
    the behaviour enabled by the option is now the default. The
    default behaviour is now that if mod_wsgi-express is run in an
    interactive terminal, then Apache will be started within a sub
    process of the mod_wsgi-express script and the SIGWINCH signal
    will be blocked and not passed through to Apache. This means
    that a window resizing event will no longer cause mod_wsgi-express
    to shutdown unexpectedly.

    When trying to set resource limits and they can’t be set, the
    system error number will now be included in the error message.

New Features

    Added the mod_wsgi.subscribe_shutdown() function for registering
    a callback to be called when the process is being shutdown.
    This is needed because atexit.register() doesn’t work as required
    for the main Python interpreter, specifically the atexit callback
    isn’t called before the main interpreter thread attempts to
    wait on threads on shutdown, thus preventing one from shutting
    down daemon threads and waiting on them.

    This feature to get a callback on process shutdown was previously
    available by using mod_wsgi.subscribe_events(), but that would
    also report events to the callback on requests as they happen,
    thus adding extra overhead if not using the request events.
    The new registration function can thus be used where only
    interested in the event for the process being shutdown.

    Added an --embedded-mode option to mod_wsgi-express to make it
    easier to force it into embedded mode for high throughput, CPU
    bound applications with minimal response times. In this case
    the number of Apache child worker processes used for embedded
    mode will be dictated by the --processes and --threads option,
    completely overriding any automatic mechanism to set those
    parameters. Any auto scaling done by Apache for the child worker
    processes will also be disabled.

    This gives preference to using Apache worker MPM instead of
    event MPM, as event MPM doesn’t work correctly when told to
    run with less than three threads per process. You can switch
    back to using event MPM by using the --server-mpm option, but
    will need to ensure that you have three threads per process or
    more.

    Locking of the Python global interpreter lock has been reviewed
    with changes resulting in a reduction in overhead, or otherwise
    changing the interaction between threads such that at high
    request rate with a hello world application, a greater request
    throughput can be achieved. How much improvement you see with
    your own applications will depend on what your application does
    and whether you have short response times to begin with. If
    you have an I/O bound application with long response times you
    likely aren’t going to see any difference.

    Internal metrics collection has been improved with additional
    information provided in process metrics and a new request
    metrics feature added giving access to aggregated metrics over
    the time of a reporting period. This includes bucketed time
    data on requests so you can calculate the distribution of server,
    queue and application time.

    Note that the new request metrics is still a work in progress
    and may be modified or enhanced, causing breaking changes in
    the format of data returned.

    Hidden experimental support for running mod_wsgi-express
    start-server on Windows. It will not show in list of sub commands
    mod_wsgi-express accepts on Windows, but it is there. There
    are still various issues that need to be sorted out but need
    assistance from someone who knows more about programming Python
    on Windows and Windows programming in general to get it all
    working properly. If you are interested in helping, reach out
    on the mod_wsgi mailing list.


To generate a diff of this commit:
cvs rdiff -u -r1.22 -r1.23 pkgsrc/www/py-mod_wsgi/Makefile
cvs rdiff -u -r1.19 -r1.20 pkgsrc/www/py-mod_wsgi/distinfo
cvs rdiff -u -r1.1 -r0 pkgsrc/www/py-mod_wsgi/patches/patch-configure \
    pkgsrc/www/py-mod_wsgi/patches/patch-src_server_wsgi__python.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: pkgsrc/www/py-mod_wsgi/Makefile
diff -u pkgsrc/www/py-mod_wsgi/Makefile:1.22 pkgsrc/www/py-mod_wsgi/Makefile:1.23
--- pkgsrc/www/py-mod_wsgi/Makefile:1.22        Wed Jan  5 20:47:37 2022
+++ pkgsrc/www/py-mod_wsgi/Makefile     Tue Nov 15 08:52:09 2022
@@ -1,8 +1,7 @@
-# $NetBSD: Makefile,v 1.22 2022/01/05 20:47:37 wiz Exp $
+# $NetBSD: Makefile,v 1.23 2022/11/15 08:52:09 wiz Exp $
 
-DISTNAME=      mod_wsgi-4.7.1
+DISTNAME=      mod_wsgi-4.9.4
 PKGNAME=       ${PYPKGPREFIX}-${APACHE_PKG_PREFIX}-${DISTNAME}
-PKGREVISION=   2
 CATEGORIES=    www python
 MASTER_SITES=  ${MASTER_SITE_PYPI:=m/mod_wsgi/}
 

Index: pkgsrc/www/py-mod_wsgi/distinfo
diff -u pkgsrc/www/py-mod_wsgi/distinfo:1.19 pkgsrc/www/py-mod_wsgi/distinfo:1.20
--- pkgsrc/www/py-mod_wsgi/distinfo:1.19        Sun Dec 19 14:12:48 2021
+++ pkgsrc/www/py-mod_wsgi/distinfo     Tue Nov 15 08:52:09 2022
@@ -1,7 +1,5 @@
-$NetBSD: distinfo,v 1.19 2021/12/19 14:12:48 wiz Exp $
+$NetBSD: distinfo,v 1.20 2022/11/15 08:52:09 wiz Exp $
 
-BLAKE2s (mod_wsgi-4.7.1.tar.gz) = f74642297af9ecfed637416cb118885acc11c3d1d8715bbd92d47503e3d009ed
-SHA512 (mod_wsgi-4.7.1.tar.gz) = 2c9d83737fe0ca5c599d3915e47047db2d06880ac3721c94350cd2d9ae930c20058e350f07c918dd301e50bf3433480e1bad479f4ffd382e6b2e42675352734e
-Size (mod_wsgi-4.7.1.tar.gz) = 498301 bytes
-SHA1 (patch-configure) = 7ece56413dfcb8de755dab722ebac632f3d1166f
-SHA1 (patch-src_server_wsgi__python.h) = 70c153e55642714d7172246748b6e4038e87d831
+BLAKE2s (mod_wsgi-4.9.4.tar.gz) = 39922e1c24dba83ca3e14288d722127fc0c2ad07c12e4ce15d5c24f55fbc6222
+SHA512 (mod_wsgi-4.9.4.tar.gz) = e99c062a8fa9fdb9ce50f8d902ff9c9b572f7c470bfad6db0ad34b52ee476814845331ebded86c62e335bbfd8887c56a3a62109c332a951f883314b8350a3ae4
+Size (mod_wsgi-4.9.4.tar.gz) = 497531 bytes


