[CODE] inetd

To: tech-userlevel%NetBSD.org@localhost
Subject: [CODE] inetd
From: tlaronde%polynum.com@localhost
Date: Thu, 8 Jun 2023 17:36:22 +0200
For the ones who want to look at it, the current state of my work is
here:

http://downloads.kergis.com/misc/inetd.tar.gz

You will at least see that I have rewritten a lot of parse.c and
parse_v2.c (and will in fact rewrite almost everything about the
parsing), inetd.c being the legacy and "easy" part (there are very
probably blunders added by me, since, if the code compiles, I haven't
run it yet: I always write everything and start debugging only when I'm
at least satisfy theoretically with what I have done).

(I have put assertions so no need to tell me: "it doesn't work", it is
not meant to at the present stage.)

And yes, there are still scalffolders, ladders, etc.: WIP.

So the code is here and it compiles but it is not functional yet.

My next timeline is 2023-06-19 for a functional version and 2023-06-26
for a fully functional one with "enough" testing done.

For the record, I have finally managed to understand what the parsing
code was doing and realized that what I thought initially were bugs,
were not: it is only that the way the code is written, it is almost
impossible for someone having not studied it carefully, to grasp at once
what is going on.

So here is how the old (present) code worked; and I will sketch after,
what the one I'm actually writing does:

Old code:

The manual page states that the config has to be given as absolute path
in normal mode, but that it can be a relative path in debug mode.

	=> This is so only by side effect. If in debugging mode, the process
	does not daemonize. The call daemon(0,0) does change CWD to
	root. This is why in normal mode, even if one has not specified a
	rooted path, it is always rooted, because it is always taken
	relative to CWD that happens to be '/' when not in debugging mode.

	New: the path is verified in every case to be rooted (leading '/')
	so that the result of checking or running does not depend on CWD.

The man page states that a lock is written. 

	=> It is not accurate: a lock is attempted but the return value is
	not used and hence several processes can be running in parallel
	until a fatal error stop one of them. The problem is that they can
	pollute the logs.

	New code: if there is a lock, it is /var/run so we are running
	against root: the process exits. If there is no lock but we can't
	write to /var/run, syslog is not used and messages go to stderr
	(this is allowed not for the checked, that runs before daemonization
	and before lock, but in case of a testing mode using builtins and
	not privilege ports).

When parsing, the code construct a linked list of servtab structure.
The implementation of the ".include" directive has not be made so that
it behaves like a C programmer or a sh(1) user expects: the effect is
not equivalent (minus continuations lines---escaped lines) to having
directly written the chunk in the file.

The implementation in fact is using config() with recursive calls. This
is what I absolutely missed when I glanced through the code. Hence such
"horrors" as:

	defhost = newstr(defhost);

(making a copy of the string and "loosing" the initial pointer) was
increasing my blood pressure.  But in fact, in another routine, far from
this one, the current value was saved, before, after a convoluted
succession of routines, config() was called again with a new config
(the included file). The copied string was just so that on "poping", it
will be fread, and the previous value restored. But it took me quite
some time to understand because the names or the comments were no help,
or a brief explanation of the process.

	=> But the result is that if someone wants to "include" a file with
	IPSEC policies: ".include ipsec.conf", assuming that it will apply,
	it was totally lost: when the task pops off, the preceding policy is
	restored. The inheritance is always from parent to child, and the
	"included" file is not considered a chunk of the config, but a
	child.

	New: in my new code, there is no recursion. The routine filling the
	line buffer simply switches to the new file to read the next line
	from the included file. The ".include" directive is handled as
	a sourced file in shell (or in #include in C). But the structure
	put in place could allow another directive to implement the previous
	behavior (but not with the "include" name).

In the old code, the loops were indeed prevented, contrary to what I
stated having not understood the recursion. And my initial bug report:

	save_defhost = defhost;
	new_file.abs = realpath(CONFIG, NULL); <--- here
	new_file.next = file_list_head;
#ifdef IPSEC
	save_policy = policy;
#endif
	/* Put new_file at the top of the config stack */
	file_list_head = &new_file;
	read_glob_configs(pattern);
	free(new_file.abs); 	<--- here
	/* Pop new_file off the stack */

was because "read_glob_configs()" didn't ring a bell: recursion was
going there:
	read_glob_configs() -> include_matched_path() -> config() /*
recursion */

	New code: the loops are detected too. But the list of structures
	allow to keep the information about the path from file to file,
	and line in the file, leading to the offendive reinclusion.

In the old code, since config() is recursively invoked, the code
"merges" the services. In fact, when reloading, first the linked list of
existing services were "unchecked" (.se_checked set to false) and the
linked list passed to config. When reading a new entry, the linked list
was walked down to find a same service and if found, the new definition
took precedence. Finally, when everything had been pop'ped out, the
remaining unchecked services were suppressed from the linked list.

	=> The problem there is that services could be run and stop several
	times during the process depending on how many times the service was
	redefined. Passing the current list to config() was also increasing
	the time to parse since there were supplementary structure to
	compare. And, finally, this was mixing parsing with execution while
	what was discussed: that with the new syntax, an error somewhere
	could have repercussions to following directives totally changing
	what was served from what was wanted.

	New code: there is one thing that could be taken. The new config
	is passed without stopping, before the new is validated, the
	services run from the old config. But on successful check, instead
	of stopping all, switching the config, and reloading, if a listening
	daemon is already spawned, and there is no change in options, just
	keep the running one without interrupting and respawning for
	nothing.

And for the rest, there are various corner cases with the syntax and one
or two bugs (in corner cases). But I will have to amend the man page,
and with the checker function it will be easy to put the syntax to a
strain test.
-- 
        Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
                     http://www.kergis.com/
                    http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
Prev by Date: Re: How to profile NSS code?
Next by Date: inetd(8): cmdif as builtin
Previous by Thread: How to profile NSS code?
Next by Thread: [CODE] inetd
Indexes:
Home | Main Index | Thread Index | Old Index