NetBSD-Users archive
re-introducing ARFE; teaching ARFE new tricks
ARFE is a new text-processing toolkit that I recently started sharing
with the NetBSD community. I introduced ARFE on tech-userlevel@ in
August,
<http://mail-index.netbsd.org/tech-userlevel/2015/08/11/msg009269.html>.
ARFE is an experiment in making UNIX text-processing more *applicable*
and *accessible*.
applicable: ARFE is designed to process record- and field-oriented
texts as they actually appear in UNIX and on the web: semi-regular,
defect-ridden, and conforming to no written specification.
accessible: ARFE avoids excessive abstraction, tricky notation, and
precarious flexibility to the greatest practical extent. ARFE
encourages users to be *concrete*. Typically an ARFE user expresses
the input and output forms they expect by providing an example
instead of a description, and ARFE tolerates some mistakes in the
examples users give. With ARFE, it isn't necessary to name input
data fields, or to count off input fields from the beginning of a
record.
I'm creating ARFE in the hope that it will invite non-programmers to
attempt text-processing automation, and that programmers will find that
it relieves some text-processing tedium so that they can tackle bigger
problems.
Since I introduced ARFE, I have added new data detectors: ARFE now
detects hexadecimal numbers, IPv4 addresses, and MAC addresses.
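To give a rough idea of what a data detector does, here is a minimal
sketch in Python. It is not ARFE's implementation: the detector names,
the regular expressions, and the detect() function are assumptions made
up for illustration. In particular, it assumes hexadecimal numbers carry
a 0x prefix, and it does not check that IPv4 octets are at most 255.

```python
import re

# Hypothetical detectors, for illustration only; ARFE's own detectors
# may recognize these kinds of data quite differently.
DETECTORS = [
    ("mac", re.compile(r"\b[0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5}\b")),
    ("ipv4", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ("hex", re.compile(r"\b0[xX][0-9A-Fa-f]+\b")),  # assumes a 0x prefix
]


def detect(text):
    """Return (kind, value) pairs, left to right, skipping overlaps."""
    found = []
    for kind, pattern in DETECTORS:
        for m in pattern.finditer(text):
            found.append((m.start(), m.end(), kind, m.group()))
    found.sort()

    result = []
    last_end = 0
    for start, end, kind, value in found:
        # Keep a match only if it starts after the previous kept match.
        if start >= last_end:
            result.append((kind, value))
            last_end = end
    return result


if __name__ == "__main__":
    line = "inet 0.0.0.0 netmask 0xffffffff broadcast 255.255.255.255"
    for kind, value in detect(line):
        print(kind, value)
```

Run on the inet line above, the sketch reports an IPv4 address, a
hexadecimal number, and another IPv4 address, in that order.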
There is a new program in the ARFE tool suite called TT, for (t)ransform
(t)ext. TT transforms its input based on a match-template/transform-template
pair that exemplifies the changes to be made. The match template shows
what a sample input looks like before TT transforms it, and the
transform template shows what it looks like after; you could call the
templates a before/after pair. I have attached a match/transform pair
that produces a digest of ifconfig(8) output.
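The before/after idea can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not how TT works inside:
FIELD, compile_match(), and transform() are hypothetical names, the
sketch handles a single line, and it requires the literal text between
fields to match exactly, whereas ARFE is meant to tolerate irregular
input. Fields detected in the match template become capture groups, and
each sample value that reappears in the transform template is replaced
by the value captured from the real input.

```python
import re

# Hypothetical field pattern: anything it matches in a template is
# treated as variable data, everything else as literal text.
FIELD = re.compile(
    r"[0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5}"  # MAC address
    r"|(?:\d{1,3}\.){3}\d{1,3}"              # IPv4 address (octets unchecked)
    r"|0[xX][0-9A-Fa-f]+"                    # hexadecimal number, 0x prefix
    r"|\d+"                                  # decimal number
)


def compile_match(template):
    """Build a regex from a sample line; return it with the sample values."""
    parts, samples, pos = [], [], 0
    for m in FIELD.finditer(template):
        parts.append(re.escape(template[pos:m.start()]))  # literal text
        parts.append("(" + FIELD.pattern + ")")           # one group per field
        samples.append(m.group())
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts)), samples


def transform(match_template, transform_template, text):
    """Rewrite text as shown by the before/after pair; None if no match."""
    regex, samples = compile_match(match_template)
    m = regex.fullmatch(text)
    if m is None:
        return None
    # Map each sample value to the value found in the same position of the
    # input. Limitation: if a sample value repeats, the last one wins.
    actual = dict(zip(samples, m.groups()))
    return FIELD.sub(lambda f: actual.get(f.group(), f.group()),
                     transform_template)


if __name__ == "__main__":
    before = "wm0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1492"
    after = "wm0 mtu 1492"
    line = "wm1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500"
    print(transform(before, after, line))
```

With the first line of the attached templates as the before/after pair,
the sketch turns the wm1 line into "wm1 mtu 1500". Note that nobody had
to name the fields or count them off: the sample values themselves tie
the output positions to the input positions.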
ARFE is in the NetBSD CVS repository. The path to ARFE is
othersrc/external/bsd/arfe/.
Next steps for ARFE include adding more data detectors (floating-point
numbers, IPv6 addresses), making ARFE prefer matches that are "compact",
and adding support for multiple records per input. After ARFE supports
multiple records per input, a couple of really interesting programs
should be possible. Alas, it may be at least as difficult to program an
algorithm for identifying record boundaries as to program everything in
ARFE that came before!
Dave
--
David Young
dyoung%pobox.com@localhost Urbana, IL (217) 721-9981
wm0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1492
capabilities=2bf80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Tx,UDP6CSUM_Tx>
enabled=2bf80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Tx,UDP6CSUM_Tx>
address: aa:ab:ac:ad:ae:af
media: Ethernet autoselect (1000baseT full-duplex)
status: active
input: 745177 packets, 455120099 bytes, 74377 multicasts
output: 540500 packets, 121290742 bytes, 253 multicasts
inet 0.0.0.0 netmask 0xffffffff broadcast 255.255.255.255
inet6 fe80::10f:20f:30f1:40f%wm0 prefixlen 79 scopeid 0x7

wm0 mtu 1492
link aa:ab:ac:ad:ae:af
inet 0.0.0.0 netmask 0xffffffff broadcast 255.255.255.255
inet6 fe80::10f:20f:30f1:40f%wm0 / 79 scopeid 0x7