Subject: Re: RFC: memmem(3)
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Jaromir Dolecek <jdolecek@netbsd.org>
List: tech-userlevel
Date: 03/02/2003 21:54:17
There is even Boyer-Moore string search implementation in tree -
bm(3).
I keep meaning to write implementation of Aho-Corasik string
search implementation ...
Jaromir
der Mouse wrote:
[ Charset ISO-8859-1 unsupported, converting... ]
> >> + searching is usually much faster
> > That was my first thought too - if something needs this, it would
> > probably better use some more clever algorithm, like Rabin/Karp or
> > Knuth/Morris/Pratt (with similar interface changes as you describe).
>
> I'm not sure which algorithm I'm using in terms of names like those.
> The name Boyer/Moore comes to mind, but I'm not sure exactly what that
> is so I could be misled there.
>
> Here's the interface I'm using:
>
> void
> searchstr_maketbl(unsigned char (*table)[256],
> const unsigned char *string, unsigned int len,
> const unsigned char equiv[256]);
>
> int
> searchstr(const unsigned char *string, unsigned int stringlen,
> unsigned char (*table)[256], unsigned int tablelen);
>
> The first function is the overhead call I referred to; the second
> actually does the search. The interface is not very general (in terms
> of the underlying algorithm) because I wanted to avoid making the
> library routines allocate memory.
>
> Hm, the table argument to searchstr really ought to be
> const unsigned char (*)[256]. And of course 256 should be something
> more like UCHAR_MAX....
>
> /~\ The ASCII der Mouse
> \ / Ribbon Campaign
> X Against HTML mouse@rodents.montreal.qc.ca
> / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
>
--
Jaromir Dolecek <jdolecek@NetBSD.org> http://www.NetBSD.org/
-=- We should be mindful of the potential goal, but as the tantric -=-
-=- Buddhist masters say, ``You may notice during meditation that you -=-
-=- sometimes levitate or glow. Do not let this distract you.'' -=-