NetBSD-Bugs archive


Re: bin/46255: apropos(1) sometimes report unrelated responses



The following reply was made to PR bin/46255; it has been noted by GNATS.

From: Abhinav Upadhyay <er.abhinav.upadhyay%gmail.com@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: bin/46255: apropos(1) sometimes report unrelated responses
Date: Sun, 25 Mar 2012 13:57:28 +0900

 On Sun, Mar 25, 2012 at 8:00 AM,  <njoly%pasteur.fr@localhost> wrote:
 >>Number:         46255
 >>Category:       bin
 >>Synopsis:       apropos(1) sometimes report unrelated results
 >>Confidential:   no
 >>Severity:       non-critical
 >>Priority:       medium
 >>Responsible:    bin-bug-people
 >>State:          open
 >>Class:          sw-bug
 >>Submitter-Id:   net
 >>Arrival-Date:   Sat Mar 24 23:00:00 +0000 2012
 >>Originator:     Nicolas Joly
 >>Release:        NetBSD 6.99.4
 >>Organization:
 > Institut Pasteur
 >>Environment:
 > System: NetBSD lanfeust.sis.pasteur.fr 6.99.4 NetBSD 6.99.4 (LANFEUST) #5: Sat Mar 24 14:34:56 CET 2012 njoly%lanfeust.sis.pasteur.fr@localhost:/local/src/NetBSD/obj.amd64/sys/arch/amd64/compile/LANFEUST amd64
 > Architecture: x86_64
 > Machine: amd64
 >>Description:
 > Sometimes, apropos(1) return un-related results. By example, the `apropos lfs'
 > command report correct entries that include the searched word but some
 > un-related ones for the LF word .
 >
 > newfs_lfs(8)    construct a new LFS file system
 > rump_lfs(8)     mount a lfs image with a userspace server
 > scan_ffs(8)     find FFSv1/FFSv2/LFS partitions on a disk or file
 > lfs_segclean(2) mark a segment clean
 > mvme68k/lpt(4)  parallel port driver
 > lfs_segwait(2)  wait until a segment is written
 > x86/lpt(4)      Parallel port driver
 > installboot(8)  install disk bootstrap software
 > PCRE(3) - Perl-compatible regular expressions
 > PCRE(3) - Perl-compatible regular expressions
 >
 > For the 10 results reported, 6 are correct and 4 are wrong (2 lpt and 2 PCRE).
 >
 > Things are worse for `apropos crs' which only report pages with "cr" word,
 > not even a single "crs" result is found.
 >
 > njoly@lanfeust [~]> apropos -n 1000 crs | head
 > mvme68k/lpt(4)  parallel port driver
 > ...the driver. Minor Bit Function 128 Use the interruptless driver. (polling) 64 Do not initialize the device on the port. 32 Automatic LF on CR. 16 Select 1.6uS strobe pulse width (default is 6.4uS) pcc(4) , pcctwo(4)
 > [...]
 > njoly@lanfeust [~]> apropos -n 1000 crs | grep -ic crs
 
 This is caused by the stemmer. It strips the suffix 's' from the end
 of every token in an attempt to reduce each token to its root word,
 so `lfs' is indexed and searched as `lf', and `crs' as `cr'. That is
 of course wrong for technical terms such as lfs, and for
 abbreviations. I think the fix would require writing a custom
 tokenizer for SQLite's FTS engine, one that does not stem such
 technical keywords, but that would be a bit of an undertaking :)
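 
 To illustrate the effect, here is a minimal Python sketch. It assumes
 the stemmer applies the plural rules of the Porter algorithm (its
 step 1a) and leaves out every later step. The names step_1a,
 index_token and KEEP_AS_IS are hypothetical and are not taken from
 apropos(1) or SQLite. The exemption set only hints at what a custom
 tokenizer might do; it is not a proposed implementation.

```python
def step_1a(token: str) -> str:
    """Apply only the plural rules (step 1a) of the Porter stemmer."""
    if token.endswith("sses"):
        return token[:-2]      # caresses -> caress
    if token.endswith("ies"):
        return token[:-2]      # ponies -> poni
    if token.endswith("ss"):
        return token           # caress stays caress
    if token.endswith("s"):
        return token[:-1]      # cats -> cat, but also lfs -> lf
    return token


# Hypothetical exemption list: technical terms that must not be stemmed.
KEEP_AS_IS = frozenset({"lfs", "ffs", "crs"})


def index_token(word: str, keep: frozenset = frozenset()) -> str:
    """Lowercase a word, then stem it unless it is in the exemption set."""
    token = word.lower()
    return token if token in keep else step_1a(token)


for word in ("lfs", "LF", "crs", "CR"):
    print(f"{word}: stemmed={index_token(word)} "
          f"exempted={index_token(word, KEEP_AS_IS)}")
```

 With plain stemming, `lfs' and `LF' collapse to the same token, which
 would explain why pages that only mention LF turn up in the results
 for `apropos lfs'. With the exemption set they stay distinct.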
 
 On the other hand, since the new apropos(1) supports full text search,
 I think you will get better mileage out of it by specifying somewhat
 more detailed queries. It is hard to get 100% relevant results, but I
 hope to improve it.
 
 --
 Abhinav
 

