tech-userlevel: Re: XML config file

Subject: Re: XML config file
To: Iain Hibbert <plunky@rya-online.net>
From: Jachym Holecek <freza@dspfpga.com>
List: tech-userlevel
Date: 07/22/2006 18:21:40
# Iain Hibbert 2006-07-22:
> On Fri, 21 Jul 2006, Jason Thorpe wrote:
> > Because I modeled this somewhat after the CFPropertyList / NSDictionary
> > stuff in OS X.  Specifically, NSDictionary has -writeToFile:atomically:,
> > which takes a dictionary and serializes it to an XML plist file, as well
> > as +dictionaryWithContentsOfFile: and -initWithContentsOfFile:.  NSObject
> > itself has no such method.
> 
> Ok, is it worth having a *ternalize_to_file functionality in the library?
> Although it's fairly trivial to implement, having a standard function would
> be useful if usage becomes widespread.

*ternalize_{to,from}_file(int fd), under #ifndef _KERNEL, would be useful. One
other thing that would be nice is the ability to parse/dump dictionaries
across buffer boundaries. I have two specific use cases in mind:

  * The kernel may want to present a stream of ASCII-encoded events
    over a character device. In this case, we can never tell how large
    a buffer the user will pass next. This is what I'd like to use for
    devmon -- if events can be just read(2) from a device, the user is
    free to use awk/whatever to handle events (and hence "has enough
    rope to hang himself").

  * Passing streams of dictionaries over nonblocking sockets. I guess
    everybody is tired of inventing private ASCII-based communication
    protocols to talk between processes. It would be a huge win if we
    could use proplib for this.

If we can agree the above makes sense, I could probably volunteer some
time to work on it. But maybe I'm just pushing proplib beyond its intended
usage.
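
To make the buffer-boundary idea a bit more concrete, here is a minimal
sketch of the framing half of the problem only: finding where one
serialized dictionary ends when the input arrives in arbitrary chunks
(read(2) on a device, or a nonblocking socket). Nothing here is proplib
API; struct frame_scanner and both functions are made-up names, and I
assume each document ends with a literal "</plist>" that never appears
inside a comment or CDATA section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner state: bytes of the end tag matched so far. */
struct frame_scanner {
	size_t matched;
};

/* Assumption: every serialized document ends with this tag. */
static const char end_tag[] = "</plist>";
#define END_TAG_LEN (sizeof(end_tag) - 1)

void
frame_scanner_init(struct frame_scanner *fs)
{
	assert(fs != NULL);
	fs->matched = 0;
}

/*
 * Feed one chunk of input.  Returns the number of bytes of this chunk
 * that belong to the current document (up to and including the end
 * tag) once the tag is complete, or 0 if more input is needed.  The
 * match state survives between calls, so the tag may be split across
 * any number of chunks.
 */
size_t
frame_scanner_feed(struct frame_scanner *fs, const char *buf, size_t len)
{
	size_t i;

	assert(fs != NULL && (buf != NULL || len == 0));
	for (i = 0; i < len; i++) {
		if (buf[i] == end_tag[fs->matched])
			fs->matched++;
		else if (buf[i] == '<')
			fs->matched = 1;	/* '<' only starts the tag */
		else
			fs->matched = 0;

		if (fs->matched == END_TAG_LEN) {
			fs->matched = 0;	/* ready for next document */
			return i + 1;
		}
	}
	return 0;
}
```

The caller would keep appending chunks to its own buffer until the
scanner reports a complete document, hand that span to the existing
internalize routine, and carry any leftover bytes over to the next
document. A real incremental parser would do better than this, but the
state-kept-between-calls shape is the part I care about.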

> > I would be happy to add internalize / externalize routines for
> > prop_array_t (NSArray in OS X has similar methods to NSDictionary's
> > serialization support), but I don't want to add generic
> > prop_object_externalize() / prop_object_internalize() because it's
> > rather important, I think, to explicitly state in the code which you're
> > expecting and let the parser return an error when it gets the wrong one.
> 
> I'm easy since I have no need for the array anymore, but I don't see any
> big deal checking the object type - after all, you must do that most of
> the time anyway.
> 
> 	dict = prop_object_internalize(xml);
> 	if (dict == NULL || prop_object_type(dict) != PROP_TYPE_DICTIONARY)
> 		errx(...);
> 
> 	obj = prop_dictionary_get(dict, "count");
> 	if (obj == NULL || prop_object_type(obj) != PROP_TYPE_NUMBER)
> 		errx(...);

It would also be sweet if one could check dictionary key names and
types while the thing is internalized -- at least I prefer to be strict
when parsing a configuration file, i.e. warnx/errx when (syntactically
correct) garbage is found.
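
A rough sketch of what I mean by strict checking, assuming the parser
can hand over each key together with the type of its value as it goes.
All names below (val_type, key_rule, parsed_entry, check_entries) are
hypothetical and not part of proplib; the point is only that the caller
supplies a table of allowed keys and expected types up front.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical value types; a real parser would use its own tags. */
enum val_type { VT_STRING, VT_NUMBER, VT_BOOL };

/* One allowed key and the type its value must have. */
struct key_rule {
	const char	*key;
	enum val_type	 type;
};

/* Stand-in for a key/value pair as the parser encounters it. */
struct parsed_entry {
	const char	*key;
	enum val_type	 type;
};

/*
 * Return -1 if every entry names a known key with the expected type,
 * otherwise the index of the first offending entry, so the caller can
 * warnx/errx with a useful message.
 */
int
check_entries(const struct parsed_entry *ents, size_t nents,
    const struct key_rule *rules, size_t nrules)
{
	size_t i, j;

	assert(ents != NULL && rules != NULL);
	for (i = 0; i < nents; i++) {
		/* Look up the rule for this key, if there is one. */
		for (j = 0; j < nrules; j++)
			if (strcmp(ents[i].key, rules[j].key) == 0)
				break;
		if (j == nrules || ents[i].type != rules[j].type)
			return (int)i;
	}
	return -1;
}
```

With a table like this, both an unknown key and a known key with the
wrong type are rejected in one place instead of after every lookup.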

> > > As to the utility of using XML for config files, it is handy having the
> > > internalize/externalize function and I am going to use that, but after
> > > having looked at the output, my opinion is that it's not very human
> > > friendly at all so better suited for private database files rather than
> > > configuration files.
> >
> > Perhaps I've simply grown so used to it... I now actively DISLIKE the flat
> > config files that BSD uses, because of their lack of structure :-)
> 
> Although I think a standardised configuration format and parser is a
> desirable thing, and I'm happy with the tree structure, I think that the
> lack of readability in XML is a major concern.
> 
> Simplistically,
> 
> item1 {
> 	key1	"string";
> 	key2	0x33;
> 	key3	[
> 		0x33;
> 		"string";
> 		true;
> 		true;
> 		<junk>;
> 	];
> 	key4	false;
> 	key5	<datablock>;
> 	key6	{
> 		key6.1		"string";
> 		key6.2		0xabcd;
> 	};
> };
> 
> would seem to be just as parseable, and much more readable though the
> contents of keys might need to be limited to an ascii word (no bad thing?)

I like this. Limiting key names to ASCII is not a problem, IMO.

> Another issue I had was that an internal data type might want to be an
> external string type - eg, an IP address is a 4 byte array but in a config
> file you would rather it show up as "127.0.0.1".

For userland use, one could prefix strings with a hint on how to interpret
them. Something like writing 'ip4#"127.0.0.1"' and then registering a
string-to-datablock converter for the "ip4" prefix. But this might just be
overcomplicating the whole thing...
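
For what it's worth, a minimal sketch of the prefix idea, assuming the
value arrives as prefix#"text" and that converters live in a small
static table. The names str_conv, conv_ip4 and convert_hinted are made
up for illustration; only inet_pton(3) is a real library call.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical converter: text in, raw bytes out, returns length or -1. */
typedef int (*str_conv_fn)(const char *text, unsigned char *out,
    size_t outlen);

struct str_conv {
	const char	*prefix;
	str_conv_fn	 fn;
};

/* "127.0.0.1" -> 4 bytes in network byte order. */
static int
conv_ip4(const char *text, unsigned char *out, size_t outlen)
{
	struct in_addr a;

	if (outlen < sizeof(a) || inet_pton(AF_INET, text, &a) != 1)
		return -1;
	memcpy(out, &a, sizeof(a));
	return (int)sizeof(a);
}

/* The "registry"; a real one would presumably be extensible. */
static const struct str_conv convs[] = {
	{ "ip4", conv_ip4 },
};

/* Parse prefix#"text" and dispatch to the converter for that prefix. */
int
convert_hinted(const char *value, unsigned char *out, size_t outlen)
{
	const char *hash, *open, *close;
	char text[64];
	size_t i, plen, tlen;

	assert(value != NULL && out != NULL);
	hash = strchr(value, '#');
	if (hash == NULL || hash[1] != '"')
		return -1;
	open = hash + 2;
	close = strchr(open, '"');
	if (close == NULL || close[1] != '\0')
		return -1;
	tlen = (size_t)(close - open);
	if (tlen >= sizeof(text))
		return -1;
	memcpy(text, open, tlen);
	text[tlen] = '\0';

	plen = (size_t)(hash - value);
	for (i = 0; i < sizeof(convs) / sizeof(convs[0]); i++)
		if (strlen(convs[i].prefix) == plen &&
		    strncmp(value, convs[i].prefix, plen) == 0)
			return convs[i].fn(text, out, outlen);
	return -1;	/* no converter registered for this prefix */
}
```

An unknown prefix or malformed address simply fails, which fits the
strict-parsing preference above.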

	-- Jachym