tech-userlevel archive
Thread-local storage issues arose again? Firefox frequently crashes on 10.0 aarch64
Hi,
As I mentioned in
http://mail-index.netbsd.org/netbsd-users/2024/04/12/msg030915.html,
Firefox tab processes crash very frequently on NetBSD/aarch64 10.0.
Building it with PKG_OPTIONS.firefox+=debug-info revealed that when it
crashes it segfaults at one of these two places non-deterministically:
third_party/rlbox/include/rlbox_noop_sandbox.hpp:
rlbox_noop_sandbox_thread_data* get_rlbox_noop_sandbox_thread_data();
# define RLBOX_NOOP_SANDBOX_STATIC_VARIABLES() \
  thread_local rlbox::rlbox_noop_sandbox_thread_data \
    rlbox_noop_sandbox_thread_info{ 0, 0 }; \
  namespace rlbox { \
    rlbox_noop_sandbox_thread_data* get_rlbox_noop_sandbox_thread_data() \
    { \
      return &rlbox_noop_sandbox_thread_info; \
    } \
  } \
  static_assert(true, "Enforce semi-colon")
> ...
template<typename T, typename T_Converted, typename... T_Args>
auto impl_invoke_with_func_ptr(T_Converted* func_ptr, T_Args&&... params)
{
#ifdef RLBOX_EMBEDDER_PROVIDES_TLS_STATIC_VARIABLES
  auto& thread_data = *get_rlbox_noop_sandbox_thread_data();
#endif
  auto old_sandbox = thread_data.sandbox; // <-- CRASHES HERE!
  thread_data.sandbox = this;
  auto on_exit =
    detail::make_scope_exit([&] { thread_data.sandbox = old_sandbox; });
  return (*func_ptr)(params...);
}
media/libjpeg/simd/arm/aarch64/jsimd.c:
static THREAD_LOCAL unsigned int simd_support = ~0;
JSIMD_FASTST3 | JSIMD_FASTTBL;
> ...
LOCAL(void)
init_simd(void)
{
#ifndef NO_GETENV
  char env[2] = { 0 };
#endif
#if defined(__linux__) || defined(ANDROID) || defined(__ANDROID__)
  int bufsize = 1024; /* an initial guess for the line buffer size limit */
#endif
  if (simd_support != ~0U) // <-- CRASHES HERE!
    return;
  simd_support = 0;
So both of these cases involve TLS, that is, tab processes segfault
while attempting to access thread-local variables. At run-time these
functions reside in libxul.so, which is dlopen'ed by the main process. I
recall there were a few issues in TLS handling in the past but
riastradh@ fixed them before we branched 10.0, right?
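For readers less familiar with the pattern shared by both crash sites, here
is a minimal sketch in C of a lazily initialized thread-local flag, modelled
on simd_support above. The names tls_flag, init_flag, worker and run_workers
are hypothetical and appear nowhere in Firefox. This is a single translation
unit with no dlopen(3) involved, so it is not expected to reproduce the
crash; it only illustrates that every thread starts with its own copy of the
variable holding the sentinel value.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for simd_support: ~0U means "not initialized yet". */
static _Thread_local unsigned int tls_flag = ~0U;

/* Lazily initialize the calling thread's copy.
 * Returns 1 if this call performed the initialization, 0 otherwise. */
static int init_flag(void)
{
    if (tls_flag != ~0U)    /* first TLS access, like the faulting line in init_simd() */
        return 0;
    tls_flag = 0;
    return 1;
}

/* Thread entry point: record whether this thread saw the untouched sentinel. */
static void *worker(void *arg)
{
    int *first = arg;
    *first = init_flag();
    return NULL;
}

/* Start nthreads threads (at most 16) and count how many of them found
 * their own copy of tls_flag still set to the sentinel. Returns -1 on error. */
int run_workers(int nthreads)
{
    pthread_t tid[16];
    int first[16];
    int i, started = 0, count = 0, err = 0;

    if (nthreads < 0 || nthreads > 16)
        return -1;
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &first[i]) != 0) {
            err = 1;
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        count += first[i];
    }
    return err ? -1 : count;
}
```

With a working TLS implementation, run_workers(n) should return n, because
no thread ever observes another thread's write to tls_flag.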
"readelf -r libxul.so" shows no R_AARCH64_TLS_TPR in its relocation
table, only R_AARCH64_TLSDESC, so I believe these variables use the
local-dynamic model. I tried to create a minimal reproducer but it
didn't crash:
https://gist.github.com/depressed-pho/b6894fdaef94a1b9aa5459b1a2f65590
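For context, the general shape of such a test can be sketched as below. This
is not the code from the gist above; it is an assumption-laden outline in C
of a harness that dlopen(3)s a shared object and calls one of its functions
from a freshly created thread, which is roughly how the tab processes reach
the TLS variables in libxul.so. The function name probe_in_thread, the type
probe_fn and the idea of a library exporting an "int probe(void)" style
function that touches a thread-local variable are all hypothetical.

```c
#include <dlfcn.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical signature of a function exported by the library under test;
 * it is assumed to read or write a thread-local variable. */
typedef int (*probe_fn)(void);

/* Thread entry point: call the probe function from a non-main thread. */
static void *call_probe(void *arg)
{
    probe_fn fn = *(probe_fn *)arg;
    (void)fn();
    return NULL;
}

/* Load lib_path at run time, look up symbol, and call it in a new thread.
 * Returns 0 on success, -1 if dlopen fails, -2 if the symbol is missing,
 * -3 if the thread cannot be created. */
int probe_in_thread(const char *lib_path, const char *symbol)
{
    void *handle;
    probe_fn fn;
    pthread_t tid;

    handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return -1;

    /* The usual POSIX idiom for turning a dlsym() result into a function pointer. */
    *(void **)&fn = dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return -2;
    }

    if (pthread_create(&tid, NULL, call_probe, &fn) != 0) {
        dlclose(handle);
        return -3;
    }
    pthread_join(tid, NULL);
    dlclose(handle);
    return 0;
}
```

A caller would pass the path of a test library and the name of its probe
function; a segfault inside the thread, rather than a return value, would be
the interesting outcome.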
So I speculated that there is some kind of limit on the size of TLS
blocks that dlopen(3) can sanely handle, and that libxul.so exceeds it. As
I mentioned in the previous mail, I modified /usr/pkg/bin/firefox based
on this speculation:
#!/bin/sh
LD_PRELOAD=/usr/pkg/lib/firefox/libxul.so /usr/pkg/lib/firefox/firefox "$@"
To my surprise this actually worked! Firefox hasn't crashed even once
since this modification! Help, riastradh@, TLS is convoluted and I have
nearly zero knowledge about this monstrosity!