Subject: Re: db->seq never gets to end
To: None <netbsd-help@NetBSD.org>
From: Jeremy C. Reed <reed@reedmedia.net>
List: netbsd-help
Date: 08/29/2007 18:20:14
On Tue, 14 Aug 2007, Alistair Crooks wrote:
> On Thu, Aug 09, 2007 at 10:57:39PM -0500, Jeremy C. Reed wrote:
> > Hopefully this list is okay ...
> >
> > I am using:
> >
> > for (r = db->seq(db, &dbk, &dbd, R_FIRST); !r;
> > r = db->seq(db, &dbk, &dbd, R_NEXT)) {
> >
> > But sometimes it never ends.
> >
> > I added a counter. And it would get to hundreds of thousands and data
> > would repeat. I only have less than 1500 keys. I'd also get dbd.size that
> > would be hundreds of thousands of bytes (but should only be 20 bytes).
> >
> > Is there any way to ask the hash(3) how many elements it has?
>
> Maybe it's just me, but I didn't even realise that you could
> do a sequential scan of a hashed database. dbopen(3) says:
>
> R_LAST and R_PREV are available only for the
> DB_BTREE and DB_RECNO access methods because they
> each imply that the keys have an inherent order
> which does not change.
>
> Given that statement, I don't see how R_FIRST and R_NEXT are
> any different, but I'm getting old and confused.
>
> Was this software meant to use a different version of Berkeley
> db?
>
> Regards,
> Alistair
I never saw any reply to your first point.
The "spamd" software was written to use Berkeley DB as provided in the
base install of OpenBSD. I assume that it is nearly the same as ours.
My spamd databases continue to get corrupted and I have many log entries
like:
Aug 29 12:26:46 ca spamd[11901]: can't delete 74.220.163.20
mail.pwhosting.com <> <Anne@lists.reedmedia.net> from spamd db (No buffer
space available)
Aug 29 12:25:43 ca spamd[11901]: queueing deletion of et.paqueta.com.br <>
<Kieffer@lists.reR!UF.PUF2uUF^Q
(notice missing end >)
Aug 29 17:04:31 c-0500 spamd[28330]: can't delete Rhein-Neckar.DE>
<reed@reedmedia.net>5^TUF^DSUF^UhUF^C from spamd db (No buffer space
available)
Should a Berkeley DB key and/or data be sanitized to replace special
characters before using?
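For context on that question: as far as I understand dbopen(3), keys and
data are plain byte strings described by the data and size fields of a DBT,
so they are not NUL-terminated unless the writer stored the NUL. Garbage
after an address in a log line could therefore also come from printing
dbk.data with %s without honoring dbk.size. A minimal sketch of a size-aware
copy follows. The helper name printable_copy() is hypothetical (it is not
part of spamd or db(3)), and its data/size parameters stand in for dbk.data
and dbk.size.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Copy at most `size` bytes from `data` into `out` as a NUL-terminated,
 * printable string. Non-printable bytes are replaced with '?'.
 * Returns the number of characters written, not counting the NUL.
 * `data` and `size` stand in for the DBT fields (e.g. dbk.data, dbk.size).
 */
size_t
printable_copy(const void *data, size_t size, char *out, size_t outlen)
{
	const unsigned char *p = data;
	size_t i, n;

	assert(out != NULL && outlen > 0);

	/* Never read past `size` and never write past `outlen - 1`. */
	n = size < outlen - 1 ? size : outlen - 1;
	for (i = 0; i < n; i++)
		out[i] = isprint(p[i]) ? (char)p[i] : '?';
	out[n] = '\0';
	return n;
}
```

This only makes logging safe. It would not repair a record that is already
corrupt in the database.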
I have been using spamd for over a year. Previously spamd used BTREE, and
since January it has been using HASH. The OpenBSD CVS log says: "Using DB_BTREE for
spamd is wrong, order is never required and the rebalancing really slags
big databases."
Note I have the problems on two different 3.1 servers (one i386 and the
other XEN3_DOMU on i386).
The side effects are: it often fails with "bogus entry in spamd
database", so my pf spamd-white table is not updated; new GREY entries are
created when a WHITE entry for the IP already exists; the pf table is
sometimes replaced with only a small number of the WHITE entries; and
sometimes spamd loops through database entries as if it had hundreds of
thousands of them, using up all memory (I stopped that by forcing a maximum
on the number of entries it can loop through).
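The limit I mean is roughly the following. This is only a sketch, not the
actual spamd change. Per dbopen(3), seq returns 0 on success, 1 when there
are no more key/data pairs, and -1 on error, so the sketch keeps those three
cases apart and adds the cap. All names here (seq_step, bounded_scan,
fake_db, fake_step) are made up for illustration. In real code the step
function would wrap db->seq(db, &dbk, &dbd, first ? R_FIRST : R_NEXT); the
fake stepper exists only so the loop can be tried without a database.

```c
#include <assert.h>
#include <stddef.h>

/* Step function mirroring db->seq(): 0 = got a pair, 1 = no more, -1 = error. */
typedef int (*seq_step)(void *ctx, int first);

enum scan_result { SCAN_DONE, SCAN_ERROR, SCAN_LIMIT };

/*
 * Walk all pairs, but give up once more than `max_entries` pairs show up.
 * `*seen` receives the number of pairs that were accepted.
 */
enum scan_result
bounded_scan(seq_step step, void *ctx, size_t max_entries, size_t *seen)
{
	size_t n = 0;
	int r;

	assert(step != NULL && seen != NULL);

	for (r = step(ctx, 1); r == 0; r = step(ctx, 0)) {
		if (n == max_entries) {
			*seen = n;
			return SCAN_LIMIT;	/* more pairs than expected */
		}
		n++;	/* real code would process the key/data pair here */
	}
	*seen = n;
	return r < 0 ? SCAN_ERROR : SCAN_DONE;
}

/* Stand-in for a database: yields `total` pairs, or never ends if `endless`. */
struct fake_db {
	size_t pos, total;
	int endless;
};

int
fake_step(void *ctx, int first)
{
	struct fake_db *f = ctx;

	if (first)
		f->pos = 0;
	if (!f->endless && f->pos >= f->total)
		return 1;
	f->pos++;
	return 0;
}
```

Hitting SCAN_LIMIT only shows that the scan returned more pairs than
expected. It stops the memory blowup but says nothing about why the
database got into that state.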
Jeremy C. Reed