Subject: Native pthreads issue with MySQL replication.
To: NetBSD-current Discussion List <current-users@NetBSD.ORG>
From: Andrew Gillham <gillham@vaultron.com>
List: current-users
Date: 10/27/2003 23:28:38
I'm using the pkgsrc mysql-server with native pthreads on my -current
system.
(MySQL 3.23.58 on i386 1.6ZE SMP box)
I have mysql set up as a slave replication server, and it is running fine,
except when shutting down or restarting. It seems to shut down ok when not
doing replication.
Apparently the replication thread is not shutting down correctly and the
mysqld process hangs, and I have to use 'kill -9' to clean up.
The box is idle, so it is not a load issue, and I can easily reproduce it.
The mysql 'show processlist' command looks like this:
+----+--------+-----------+----+---------+------+-----------------------+------------------+
| Id | User   | Host      | db | Command | Time | State                 | Info             |
+----+--------+-----------+----+---------+------+-----------------------+------------------+
| 1  | system | none      |    | Connect | 2    | Reading master update |                  |
| 2  | root   | localhost |    | Query   | 0    |                       | show processlist |
+----+--------+-----------+----+---------+------+-----------------------+------------------+
The normally running process:
  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
11798 mysql     18    0    10M 2716K sigwai/0   0:00  0.00%  0.00% mysqld

USER    PID %CPU %MEM   VSZ  RSS TT STAT STARTED    TIME COMMAND
mysql 11798  0.0  0.5 10012 2716 p0 Sa   10:44PM 0:00.07 /usr/pkg/.../mysqld
When I run 'mysqladmin shutdown', the state of the process doesn't change
and it doesn't exit.
Running mysqld under gdb didn't help; I ended up panicking the box after
doing 'kill -ABRT' on the mysql process, and then 'quit' in gdb.
The panic:
login: uvm_fault(0xe34a53c0, 0, 0, 1) -> 0xe
kernel: page fault trap, code=0
Stopped in pid 1772.1 (gdb) at netbsd:kpsignal2+0x11b: testl %eax,0(%ebx,%edx,4)
db{0}> tr
kpsignal2(e3e879cc,e3f03e64,1,e3e87d0c,0) at netbsd:kpsignal2+0x11b
psignal1(e3e879cc,1,1,e3e879cc,0) at netbsd:psignal1+0x29
orphanpg(e34df140,0,e3f03eec,c035b593,c076afa8) at netbsd:orphanpg+0x33
fixjobc(e3e87d0c,e34df100,0,e34a2fd8,0) at netbsd:fixjobc+0x76
exit1(e34a8e58,0,0,0,e3f03f5c) at netbsd:exit1+0x13f
sys_exit(e34a8e58,e3f03f64,e3f03f5c,1,7) at netbsd:sys_exit+0x23
syscall_plain(e3f03fa8,1f,1f,821001f,bfbf001f) at netbsd:syscall_plain+0x173
db{0}>
Anyway, I have a slightly older version of MySQL built statically with the
mit-pthreads included with MySQL and it works correctly, so I would
guess it is related to the native pthreads. Perhaps signal related?
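
A small standalone test along these lines might help separate a pthreads
signal problem from a MySQL one. This is only a sketch, not taken from the
MySQL source: the names (run_shutdown_test, worker, signal_thread) are made
up, and it assumes the shutdown path involves one thread sitting in sigwait()
and another blocked in a read that has to be interrupted with pthread_kill().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* All names below are hypothetical; this is not MySQL code. */
static atomic_int shutdown_requested;  /* set by the signal thread */
static atomic_int worker_done;         /* set by the worker just before it exits */
static int pipe_fds[2];                /* nothing is ever written to this pipe */
static pthread_t worker_tid;

/* Empty handler: its only job is to make a blocking read() return EINTR. */
static void wakeup_handler(int sig) { (void)sig; }

/* Stand-in for a thread blocked on I/O (e.g. waiting on a socket). */
static void *worker(void *arg)
{
    char buf[16];
    sigset_t usr1;
    (void)arg;

    /* Make sure the wakeup signal can be delivered to this thread. */
    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);
    pthread_sigmask(SIG_UNBLOCK, &usr1, NULL);

    while (!atomic_load(&shutdown_requested)) {
        ssize_t n = read(pipe_fds[0], buf, sizeof buf);
        if (n == 0 || (n < 0 && errno != EINTR))
            break;                     /* EOF or an unexpected error */
    }
    atomic_store(&worker_done, 1);
    return NULL;
}

/* Dedicated thread that waits for SIGTERM, then nudges the worker. */
static void *signal_thread(void *arg)
{
    sigset_t *set = arg;
    struct timespec pause = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int sig;

    (void)sigwait(set, &sig);
    atomic_store(&shutdown_requested, 1);

    /* Resend until the worker confirms it is out of read(); a single
       pthread_kill() could arrive just before the worker blocks. */
    while (!atomic_load(&worker_done)) {
        pthread_kill(worker_tid, SIGUSR1);
        nanosleep(&pause, NULL);
    }
    return NULL;
}

/* Returns 0 if both threads shut down; hangs if the wakeup never works. */
int run_shutdown_test(void)
{
    sigset_t set, old;
    struct sigaction sa;
    pthread_t sig_tid;

    atomic_store(&shutdown_requested, 0);
    atomic_store(&worker_done, 0);
    if (pipe(pipe_fds) != 0)
        return -1;

    /* No SA_RESTART, so read() is interrupted rather than restarted. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Block SIGTERM before creating threads so only sigwait() sees it. */
    sigemptyset(&set);
    sigaddset(&set, SIGTERM);
    pthread_sigmask(SIG_BLOCK, &set, &old);

    if (pthread_create(&worker_tid, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_create(&sig_tid, NULL, signal_thread, &set) != 0)
        return -1;

    kill(getpid(), SIGTERM);           /* plays the role of the shutdown request */
    pthread_join(sig_tid, NULL);
    pthread_join(worker_tid, NULL);

    pthread_sigmask(SIG_SETMASK, &old, NULL);
    close(pipe_fds[0]);
    close(pipe_fds[1]);
    return atomic_load(&worker_done) ? 0 : -1;
}
```

Calling run_shutdown_test() from a trivial main() and building with
'cc -pthread' should be enough to try it. If this hangs with native pthreads
but finishes when built against another threads library, that would point at
signal delivery rather than at MySQL itself.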
If anyone is using MySQL with replication with native pthreads, please
let me know. Ideas on how to debug this a bit more are also welcome.
-Andrew