NetBSD-Users archive
Strange semaphore behavior
Hello,
I have been developing RPL/2 (http://www.rpl2.net) for a long time. This
software was mainly written on Linux and Solaris, but I have tested it
on NetBSD 6 and 7 without any particular trouble.
I have installed NetBSD 8 on a new workstation, and the computations I
launch there abort with an RPL/2 system error. The same computations run
without any trouble on Solaris or Linux. There may be bugs in RPL/2
itself, so I have tried to debug it.
RPL/2 aborts on a semaphore operation, so I have replaced the regular
sem_wait and sem_post with:
# define sem_wait(a) ({ int value; sem_getvalue(a, &value); \
        printf("[%d-%llu] Semaphore %s (%p) " \
               "waiting at %s() " \
               "line #%d <%d>\n", (int) getpid(), \
               (unsigned long long) pthread_self(), \
               #a, a, __FUNCTION__, __LINE__, value), fflush(stdout); \
        sem_wait(a); })
# define sem_post(a) ({ int value; sem_getvalue(a, &value); \
        printf("[%d-%llu] Semaphore %s (%p) " \
               "posting at %s() " \
               "line #%d <%d>\n", (int) getpid(), \
               (unsigned long long) pthread_self(), \
               #a, a, __FUNCTION__, __LINE__, value), fflush(stdout); \
        sem_post(a); })
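The macros above print the semaphore value before the call only, so the
trace shows neither the return code nor errno. A minimal sketch of a
variant that also logs the result after the call is given here. The
names traced_sem_wait, traced_sem_post, TRACED_SEM_WAIT, TRACED_SEM_POST
and traced_demo are hypothetical and are not part of RPL/2; only
standard POSIX calls are assumed.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from RPL/2): performs the operation, then logs
 * the return code, errno and the value seen before and after the call. */
static int traced_sem_op(sem_t *sem, int is_wait, const char *name,
                         const char *func, int line)
{
    int before = -1, after = -1;   /* -1 means sem_getvalue() failed */

    sem_getvalue(sem, &before);
    int rc = is_wait ? sem_wait(sem) : sem_post(sem);
    int saved_errno = errno;       /* save it before stdio can change it */
    sem_getvalue(sem, &after);

    fprintf(stderr, "%s (%p) %s at %s():%d rc=%d errno=%s <%d -> %d>\n",
            name, (void *) sem, is_wait ? "wait" : "post", func, line, rc,
            rc == 0 ? "none" : strerror(saved_errno), before, after);

    errno = saved_errno;           /* callers may still test for EINTR */
    return rc;
}

#define TRACED_SEM_WAIT(s) traced_sem_op((s), 1, #s, __func__, __LINE__)
#define TRACED_SEM_POST(s) traced_sem_op((s), 0, #s, __func__, __LINE__)

/* Single-threaded self-check: returns the final semaphore value (0 is
 * expected) or a negative number if one of the calls failed. */
int traced_demo(void)
{
    sem_t sem;
    int value = -1;

    if (sem_init(&sem, 0, 0) != 0)
        return -1;
    if (TRACED_SEM_POST(&sem) != 0)
        return -2;
    if (TRACED_SEM_WAIT(&sem) != 0)
        return -3;
    sem_getvalue(&sem, &value);
    sem_destroy(&sem);
    return value;
}
```

In a multi-threaded program the two values printed are only snapshots,
since another thread may post or wait between the calls. The return code
and errno are the reliable part of such a trace.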
Of course, all source files were recompiled. I obtain a very long
output file that contains:
[2871-135822618777600] Semaphore &((*s_etat_processus).semaphore_fork)
(0x7b87aab6fc50) waiting at librpl_analyse() line #1021 <2>
LAST ERROR: Invalid argument
ERROR 2009 AT librpl_analyse() FROM analyse-conv.c LINE 1028
[2871-135822618777600] librpl_analyse() from analyse-conv.c at line
1028: BACKTRACE only defined in glibc
+++Système : Erreur dans la gestion des processus [2871]
OK. The faulty line is:
while(sem_wait(&((*s_etat_processus).semaphore_fork)) != 0)
{
...
}
and errno is EINVAL. A few lines above, I have written:
if (sem_post(&((*s_etat_processus).semaphore_fork)) != 0)
{
...
}
which triggers no error. Of course, I have verified that this semaphore
has not been destroyed.
The real source code is:
# ifndef SEMAPHORES_NOMMES
    if (sem_post(&((*s_etat_processus).semaphore_fork)) != 0)
# else
    if (sem_post((*s_etat_processus).semaphore_fork) != 0)
# endif
    {
        (*s_etat_processus).erreur_systeme = d_es_processus;
        return;
    }
# ifndef SEMAPHORES_NOMMES
    while(sem_wait(&((*s_etat_processus).semaphore_fork)) != 0)
# else
    while(sem_wait((*s_etat_processus).semaphore_fork) != 0)
# endif
    {
        if (errno != EINTR)
        {
            (*s_etat_processus).erreur_systeme = d_es_processus;
            return;
        }
    }
(SEMAPHORES_NOMMES is undefined on NetBSD.)
Next, I ran grep on the output file to check all operations on this
semaphore:
$ grep 0x7b87aab6fc50 out
...
[2871-135822618777600] Semaphore &((*s_etat_processus).semaphore_fork)
(0x7b87aab6fc50) waiting at librpl_analyse() line #1021 <1>
// SEM 1->0 OK
[2871-135822618777600] Semaphore &((*s_etat_processus).semaphore_fork)
(0x7b87aab6fc50) posting at librpl_analyse() line #1011 <0>
// SEM 0->1 OK
[2871-135822618777600] Semaphore &((*s_etat_processus).semaphore_fork)
(0x7b87aab6fc50) waiting at librpl_analyse() line #1021 <1>
// SEM 1->0 OK
[2871-135822618777600] Semaphore &((*s_etat_processus).semaphore_fork)
(0x7b87aab6fc50) posting at librpl_analyse() line #1011 <1>
// SEM 1->2 NOK !!!! Sem should be equal to 0 before post !
This semaphore is initialized with:
sem_init(&((*s_etat_processus).semaphore_fork), 0, 0);
and is not shared between processes, only between threads.
Thus, within the same process (same PID) and the same thread (same TID),
a call to sem_wait() seems not to change the semaphore value. Of course,
there is no reason for the value to be greater than 1.
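The expected behaviour described above can be written down as a small
single-threaded check: starting from sem_init(..., 0, 0), each post
should bring the value to 1 and each wait back to 0. This is only a
sketch of that invariant. The function name count_value_mismatches is
hypothetical, and it does not reproduce the RPL/2 workload.

```c
#include <semaphore.h>

/* Hypothetical check: returns how many times the observed value differed
 * from the expected one, or -1 if any semaphore call failed. */
int count_value_mismatches(int rounds)
{
    sem_t sem;
    int value = -1;
    int mismatches = 0;

    if (sem_init(&sem, 0, 0) != 0)      /* not shared, initial value 0 */
        return -1;

    for (int i = 0; i < rounds; i++) {
        /* 0 -> 1 */
        if (sem_post(&sem) != 0 || sem_getvalue(&sem, &value) != 0) {
            sem_destroy(&sem);
            return -1;
        }
        if (value != 1)
            mismatches++;

        /* 1 -> 0 */
        if (sem_wait(&sem) != 0 || sem_getvalue(&sem, &value) != 0) {
            sem_destroy(&sem);
            return -1;
        }
        if (value != 0)
            mismatches++;
    }

    sem_destroy(&sem);
    return mismatches;
}
```

A call such as count_value_mismatches(1000) should return 0 on a correct
implementation when no other thread touches the semaphore.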
This code is built with:
schwarz# /usr/pkg/gcc8/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/usr/pkg/gcc8/bin/gcc
COLLECT_LTO_WRAPPER=/usr/pkg/gcc8/libexec/gcc/x86_64--netbsd/8.2.0/lto-wrapper
Target: x86_64--netbsd
Configured with: ../gcc-8.2.0/configure --disable-libstdcxx-pch
--with-system-zlib --enable-nls --with-libiconv-prefix=/usr
--enable-__cxa_atexit --with-gxx-include-dir=/usr/pkg/gcc8/include/c++/
--disable-libssp --enable-languages='c obj-c++ objc fortran c++'
--enable-shared --enable-long-long --with-local-prefix=/usr/pkg/gcc8
--enable-threads=posix --with-boot-ldflags='-static-libstdc++
-static-libgcc -Wl,-R/usr/pkg/lib ' --with-gnu-ld --with-ld=/usr/bin/ld
--with-gnu-as --with-as=/usr/bin/as --with-arch=nocona
--with-tune=nocona --with-fpmath=sse --prefix=/usr/pkg/gcc8
--build=x86_64--netbsd --host=x86_64--netbsd
--infodir=/usr/pkg/gcc8/info --mandir=/usr/pkg/gcc8/man
Thread model: posix
gcc version 8.2.0 (GCC)
and compilation flags are:
CFLAGS = -g -O2 -mtune=native -march=native -fno-strict-overflow
-malign-double -Wall -funsigned-char -Wno-pointer-sign
(on an i7-4770 workstation)
I have tried to write a simple program that triggers this 'bug',
without any success. In my test program, the first bug appears only
after the creation of more than 3000 threads... How can I be sure that
the error comes from my code and not from a NetBSD bug?
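For reference, a stand-alone stress test in the spirit of the one
described above could look like the following sketch. It assumes that
the problem is related to the number of threads created and that each
thread uses its own unnamed semaphore; both points are assumptions, and
the names worker and stress_threads are hypothetical. It is not the
actual test program.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <stdint.h>

/* Each thread does one post/wait round trip on a private semaphore and
 * returns non-NULL if any call failed or the final value is not 0. */
static void *worker(void *arg)
{
    sem_t sem;
    int value = -1;
    intptr_t failed = 0;

    (void) arg;

    if (sem_init(&sem, 0, 0) != 0)
        return (void *) (intptr_t) 1;

    if (sem_post(&sem) != 0)
        failed = 1;
    if (!failed && sem_wait(&sem) != 0)
        failed = 1;
    if (!failed && (sem_getvalue(&sem, &value) != 0 || value != 0))
        failed = 1;

    sem_destroy(&sem);
    return (void *) failed;
}

/* Creates and joins `total` threads one after another. Returns the number
 * of threads that reported a failure, or -1 on a pthread error. */
int stress_threads(int total)
{
    int failures = 0;

    for (int i = 0; i < total; i++) {
        pthread_t tid;
        void *result = NULL;

        if (pthread_create(&tid, NULL, worker, NULL) != 0)
            return -1;
        if (pthread_join(tid, &result) != 0)
            return -1;
        if (result != NULL)
            failures++;
    }

    return failures;
}
```

Running stress_threads(4000) on NetBSD 8 and on Linux or Solaris, and
comparing the results, would show whether such a minimal pattern is
enough to trigger the error independently of RPL/2.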
Best regards,
JB