Subject: GNU metacharacter support for bsdgrep-devel
To: None <tech-userlevel@netbsd.org>
From: Bruce J.A. Nourish <bjan+tech-userlevel@bjan.net>
List: tech-userlevel
Date: 01/20/2004 01:23:53
Hey everyone,
Breaking my long tradition of merely slagging off other people's work,
rather than doing any myself, I give you support for GNU grep's
metacharacters in bgrep. I hope this brings the prospect of a free grep
in the NetBSD base system a step closer to reality. Specifically,
we now grok:
\< and \> - word begin and word end
\w (\W) - (non-) word characters
\b (\B) - (non-) word begin or end
This is done by preprocessing the pattern: each escape is rewritten as a
bracket expression (\< becomes [[:<:]], for instance) before it gets
regcomp(3)'d.
Note that I am a novice C programmer and you should review the patch for
correctness before you do anything crazy, like applying it. I diffed
against the *unpatched* source that results from doing "make extract" in
textproc/bsdgrep-devel.
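To try the preprocessing idea outside of grep.c, here is a minimal standalone sketch. It is not the patch itself, and it rests on a few assumptions. The names gnu_escape_replacement and expand_gnu_escapes are made up for illustration. It builds a fresh string with malloc(3) in two passes rather than resizing in place with grep_realloc. It maps \w to [[:alnum:]_], assuming underscore should count as a word character as it does in GNU grep. \b and \B are left out. The output only makes sense to a regcomp(3) that understands [[:<:]] and [[:>:]], as the BSD one does.

```c
#include <stdlib.h>
#include <string.h>

/* Replacement for the escape "\c", or NULL if c is not one we translate. */
static const char *
gnu_escape_replacement(char c)
{
	switch (c) {
	case '<': return "[[:<:]]";
	case '>': return "[[:>:]]";
	case 'w': return "[[:alnum:]_]";	/* assumption: '_' is a word char */
	case 'W': return "[^[:alnum:]_]";
	default:  return NULL;
	}
}

/*
 * Return a newly malloc'd copy of pat with the escapes above expanded,
 * or NULL if allocation fails.  The caller frees the result.
 */
char *
expand_gnu_escapes(const char *pat)
{
	const char *p, *rep;
	char *out, *q;
	size_t n = 1;	/* room for the terminating NUL */

	/* Pass 1: compute the expanded length. */
	for (p = pat; *p != '\0'; p++) {
		if (*p == '\\' && p[1] != '\0') {
			rep = gnu_escape_replacement(p[1]);
			n += (rep != NULL) ? strlen(rep) : 2;
			p++;	/* consume the escaped character too */
		} else
			n++;
	}

	if ((out = malloc(n)) == NULL)
		return NULL;

	/* Pass 2: copy, substituting as we go. */
	for (p = pat, q = out; *p != '\0'; p++) {
		if (*p == '\\' && p[1] != '\0') {
			rep = gnu_escape_replacement(p[1]);
			if (rep != NULL) {
				memcpy(q, rep, strlen(rep));
				q += strlen(rep);
			} else {
				/* Unknown escape (including "\\"): keep both bytes. */
				*q++ = p[0];
				*q++ = p[1];
			}
			p++;
		} else
			*q++ = *p;
	}
	*q = '\0';
	return out;
}
```

Because every backslash is consumed together with the character after it, an escaped backslash followed by w (the four-character pattern \\w) comes through unchanged instead of being treated as \w.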
--- grep.c.orig 2003-11-08 13:10:08.000000000 -0700
+++ grep.c
@@ -37,6 +37,7 @@ __RCSID("$NetBSD: grep.c,v 1.43 2003/11/
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <zlib.h>
 #include <err.h>
 #include <errno.h>
 #include <getopt.h>
@@ -184,6 +185,49 @@ struct option long_options[] =
 	{NULL, no_argument, NULL, 0}
 };
+static char *
+expand(char *pat, size_t *len)
+{
+	char *tmp, *rep;
+	size_t prelen, replen, postlen;
+
+	for (tmp = pat; (tmp = strchr(tmp, '\\')) != NULL; ) {
+		switch (*++tmp) {
+		case '<':
+			rep = "[[:<:]]";
+			break;
+		case '>':
+			rep = "[[:>:]]";
+			break;
+		case 'w':	/* GNU \w also matches '_' */
+			rep = "[[:alnum:]_]";
+			break;
+		case 'W':
+			rep = "[^[:alnum:]_]";
+			break;
+		case 'b':	/* XXX check: re_format(7) only shows [[:<:]] standalone */
+			rep = "[[:<:][:>:]]";
+			break;
+		case 'B':	/* XXX likewise */
+			rep = "[^[:<:][:>:]]";
+			break;
+		default:
+			tmp += (*tmp != '\0');	/* skip escaped char: "\\w" stays literal */
+			continue;
+		}
+		replen = strlen(rep);
+		prelen = tmp - pat - 1;		/* bytes before the backslash */
+		postlen = *len - prelen - 2;	/* bytes after the two-char escape */
+		pat = grep_realloc(pat, *len + replen);
+		tmp = pat + prelen;
+		memmove(tmp + replen, tmp + 2, postlen + 1);	/* + 1 keeps the NUL */
+		memcpy(tmp, rep, replen);
+		*len = prelen + replen + postlen;
+		tmp += replen;
+	}
+	return pat;
+}
+
 static void
 add_pattern(char *pat, size_t len)
 {
@@ -200,6 +244,8 @@ add_pattern(char *pat, size_t len)
 	pattern[patterns] = grep_malloc(len + 1);
 	strncpy(pattern[patterns], pat, len);
 	pattern[patterns][len] = '\0';
+	if (!Fflag)
+		pattern[patterns] = expand(pattern[patterns], &len);
 	++patterns;
 }
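The splice that expand() performs is the fiddly part, so here is the buffer arithmetic as a standalone sketch. The assumptions: splice is a made-up helper name, plain realloc(3) stands in for grep_realloc, and the caller guarantees that off + cut does not exceed *len. The point to notice is that the tail has to be moved together with its terminating NUL, since the next strchr(3) relies on it.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Replace the `cut` bytes at offset `off` of the NUL-terminated, malloc'd
 * string buf (strlen(buf) == *len) with rep.  Returns the possibly moved
 * buffer and updates *len, or returns NULL (leaving buf untouched) if
 * realloc fails.
 */
char *
splice(char *buf, size_t *len, size_t off, size_t cut, const char *rep)
{
	size_t replen = strlen(rep);
	size_t postlen = *len - off - cut;	/* bytes after the cut, NUL excluded */
	char *nbuf;

	/* Grow first when the string gets longer; never shrink here. */
	if (replen > cut) {
		nbuf = realloc(buf, *len - cut + replen + 1);
		if (nbuf == NULL)
			return NULL;
		buf = nbuf;
	}
	/* Move the tail together with its NUL, then drop rep into the gap. */
	memmove(buf + off + replen, buf + off + cut, postlen + 1);
	memcpy(buf + off, rep, replen);
	*len = off + replen + postlen;
	return buf;
}
```

In terms of the patch, off is prelen, cut is always 2 (the backslash and the character after it), and the new length is prelen + replen + postlen.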
--
Bruce J.A. Nourish <bjan+public@bjan.net> http://bjan.freeshell.org