tech-userlevel archive
Re: strftime(3) oddities with %s, %z
> Date: Wed, 2 Nov 2022 15:59:00 +0300
> From: Valery Ushakov <uwe%stderr.spb.ru@localhost>
>
> In other words, class tm doesn't have a public constructor that
> provides a way to specify TZ info. There are other factory methods
> that allow one to obtain an instance of tm that has the TZ info (in
> its private parts). ...
Suppose you create a struct tm _without_ gmtime(3) or localtime(3),
using designated initializers (or memset followed by assignments) so
that every other member is zero, and set only the members that POSIX
specifies:
	struct tm tm = {
		.tm_sec = 56,
		.tm_min = 34,
		.tm_hour = 12,
		.tm_mday = 1,
		.tm_mon = 12 - 1,	/* December */
		.tm_year = 2021 - 1900,
		.tm_wday = 3,		/* Wednesday */
		.tm_yday = 334,		/* zero-based day of year (%j - 1) */
		.tm_isdst = 0,
	};
Nothing I've found in POSIX suggests you can't construct a struct tm
like this and use it with mktime, and the EXAMPLES section of
<https://pubs.opengroup.org/onlinepubs/009695399/functions/mktime.html>
certainly suggests you can -- indeed, tm_wday and tm_yday could even
be omitted. (If you think otherwise: Why do you think you can't
construct a struct tm like this?)
This struct tm doesn't specify a time zone in which to interpret the
calendar date. So what time_t do you get out of mktime(&tm), or what
number is represented by the string you get out of strftime(..., "%s",
&tm)?
First, in any particular context, I would hope these are the same!
(If you think otherwise: Why should they be different?)
If TZ=UTC, I think we can all agree that the answer should be
1638362096. (If you think otherwise: What should the answer be?)
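For what it's worth, the UTC figure can be checked by hand against the
formula POSIX gives for Seconds Since the Epoch. Here is a minimal
sketch of that check. The names utc_epoch_seconds and example_tm are
made up for illustration and are not part of any library. The formula
reads tm_yday rather than tm_mon and tm_mday, and it does not
normalize out-of-range fields the way mktime does.

```c
#include <time.h>

/*
 * POSIX "Seconds Since the Epoch" formula, applied to a broken-down
 * time that is taken to be in UTC.  Meant for years from 1970 on.
 * It never looks at TZ, tm_isdst, tm_mon, or tm_mday.
 */
long long
utc_epoch_seconds(const struct tm *tm)
{
	long long y = tm->tm_year;	/* years since 1900 */

	return tm->tm_sec + tm->tm_min * 60LL + tm->tm_hour * 3600LL +
	    tm->tm_yday * 86400LL + (y - 70) * 31536000LL +
	    ((y - 69) / 4) * 86400LL - ((y - 1) / 100) * 86400LL +
	    ((y + 299) / 400) * 86400LL;
}

/* The struct tm from the example above: 2021-12-01 12:34:56. */
struct tm
example_tm(void)
{
	struct tm tm = {
		.tm_sec = 56,
		.tm_min = 34,
		.tm_hour = 12,
		.tm_mday = 1,
		.tm_mon = 12 - 1,
		.tm_year = 2021 - 1900,
		.tm_wday = 3,
		.tm_yday = 334,
		.tm_isdst = 0,
	};

	return tm;
}
```

For the example this works out to 45296 + 334*86400 + 51*31536000 +
13*86400 - 86400 + 86400 = 1638362096.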
Now what if TZ is not UTC, say TZ=Europe/Berlin? It obviously depends
on whether mktime and strftime examine TZ or tm_gmtoff. Here are some
possible rules to answer this:
1. mktime and strftime ignore tm_gmtoff and respect TZ, as if
tm_gmtoff did not exist.
In that case, we should get 1638358496, which is 1638362096 - 3600
because Europe/Berlin is +0100 at that calendar date, 1h ahead of
UTC.
This is the semantics that portable applications currently rely on,
so whatever the rule is had better agree with this!
2. mktime and strftime respect tm_gmtoff and ignore TZ.
In that case, we should get 1638362096 because tm_gmtoff=0 in this
code. But suddenly this is different from what portable
applications can rely on, so this can't be the right rule.
3. mktime and strftime respect tm_gmtoff if it is nonzero, meaning it
has been initialized by something not currently portable in POSIX,
and use TZ if tm_gmtoff=0.
In that case, we should get 1638358496, because tm_gmtoff=0 in this
code, so it interprets the struct tm in TZ=Europe/Berlin.
However, this has a funny side effect. Suppose we get struct tm
values from the following pseudocode:
	TZ=Atlantic/Reykjavik  localtime_r(1638362096, &tm_is);
	TZ=Europe/Rome         localtime_r(1638362096, &tm_it);
	TZ=Israel              localtime_r(1638362096, &tm_il);
	TZ=Asia/Baghdad        localtime_r(1638362096, &tm_iq);
Since these have filled in the time zones, it shouldn't matter what
TZ is set to when we feed tm_is/it/il/iq into mktime or
strftime("%s"), right?
Unfortunately, it _does_ matter. If we have, say, TZ=Europe/Rome,
then under this rule we would get:
	tm_is: 1638358496
	tm_it: 1638362096
	tm_il: 1638362096
	tm_iq: 1638362096
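That's because with TZ=Atlantic/Reykjavik, tm_gmtoff=0, so under this
rule tm_is looks like it was never given a time zone and gets
reinterpreted in TZ=Europe/Rome. (Same with some others like
TZ=Europe/London, at least during the winter.)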
So although this rule preserves the semantics of portably
constructed struct tm, it has wacky semantics for struct tm
constructed with a tm_gmtoff-aware localtime(3) -- I think we can
all agree this is obviously wrong.
4. mktime and strftime respect tm_gmtoff if tm_zone is nonnull, and
use TZ if tm_zone is null.
First, I'm not sure if tm_zone is always initialized to something
nonnull by localtime and gmtime -- it's unclear to me what naming
standard it follows, and I wouldn't be surprised if some
implementations sometimes leave it null. But let's suppose it is
always initialized to nonnull.
In that case, we should get 1638358496, because tm_gmtoff=0 in this
code, so it interprets the struct tm in TZ=Europe/Berlin.
But this avoids conflating zero-initialized tm_gmtoff with a
baked-in time zone of UTC, so with the various localtime calls in
case (3) we would always get 1638362096 out of mktime.
I think this might be closer to what uwe@ and dholland@ want: if
you didn't specify a time zone, then mktime uses TZ, but you can
specify a time zone or let localtime record what TZ was and it will
be passed on to mktime no matter what TZ is later.
However, this still changes semantics that portable applications
can currently rely on _even if they don't construct their own
struct tm objects_.
For example, POSIX currently guarantees that the following program
prints 1638362096 -- but under this rule, it would print
1638358496:
	#include <stdio.h>
	#include <stdlib.h>
	#include <time.h>

	int
	main(void)
	{
		time_t t = 1638358496;	/* 2021-12-01 12:34:56 +0100 */
		struct tm tm;

		/* Break t down as Europe/Berlin local time. */
		setenv("TZ", "Europe/Berlin", 1);
		tzset();
		localtime_r(&t, &tm);

		/* Reinterpret the same calendar fields under UTC. */
		setenv("TZ", "UTC", 1);
		tzset();
		if ((t = mktime(&tm)) == (time_t)-1) {
			perror("mktime");
			return 1;
		}

		printf("%lld\n", (long long)t);
		fflush(stdout);
		return ferror(stdout);
	}
(This program may be a little silly, but it could be used to find
how long you have to wait from when it's a certain local time in
one place to when it is the `same' local time in another place.)
So while this rule might be a more sensible API design, it still
substantively changes the semantics of portable programs.
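To make the comparison concrete, the four rules can be sketched as a
toy model. This is not how any libc implements mktime. The type
struct ztm and the functions epoch_at_offset, model_localtime, and
model_mktime are hypothetical names, each zone is modeled as a fixed
offset with no DST transitions, and the zone fields are kept outside
struct tm so the sketch does not depend on tm_gmtoff or tm_zone being
present.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical stand-in for a struct tm that carries zone fields.
 * gmtoff and zone mimic tm_gmtoff and tm_zone.
 */
struct ztm {
	struct tm tm;		/* calendar fields only */
	long gmtoff;		/* seconds east of UTC; 0 if never set */
	const char *zone;	/* abbreviation; NULL if never set */
};

/*
 * POSIX "Seconds Since the Epoch" formula (years from 1970 on),
 * with the calendar fields taken to be off seconds east of UTC.
 */
long long
epoch_at_offset(const struct tm *tm, long off)
{
	long long y = tm->tm_year;

	return tm->tm_sec + tm->tm_min * 60LL + tm->tm_hour * 3600LL +
	    tm->tm_yday * 86400LL + (y - 70) * 31536000LL +
	    ((y - 69) / 4) * 86400LL - ((y - 1) / 100) * 86400LL +
	    ((y + 299) / 400) * 86400LL - off;
}

/* Model of a zone-aware localtime for a zone at fixed offset off. */
struct ztm
model_localtime(long long t, long off, const char *zone)
{
	time_t shifted = (time_t)(t + off);
	struct ztm z = { .gmtoff = off, .zone = zone };

	z.tm = *gmtime(&shifted);	/* UTC fields of t + off */
	return z;
}

/*
 * Model of mktime under rule 1, 2, 3, or 4.  tz_off is the offset
 * the current TZ would assign to the calendar date.
 */
long long
model_mktime(int rule, const struct ztm *z, long tz_off)
{
	int use_field;

	switch (rule) {
	case 2:
		use_field = 1;
		break;
	case 3:
		use_field = (z->gmtoff != 0);
		break;
	case 4:
		use_field = (z->zone != NULL);
		break;
	default:		/* rule 1: always use TZ */
		use_field = 0;
		break;
	}
	return epoch_at_offset(&z->tm, use_field ? z->gmtoff : tz_off);
}
```

With tz_off = 3600 standing in for TZ=Europe/Rome, model_mktime(3, ...)
on the result of model_localtime(1638362096, 0, "GMT") gives
1638358496, while rule 4 gives 1638362096. With tz_off = 0 standing in
for TZ=UTC, the result of model_localtime(1638358496, 3600, "CET")
maps to 1638362096 under rule 1 but to 1638358496 under rule 4, which
is the change in behavior the program above would see.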
If you want a map from struct tm to time_t that recognizes the
difference between an input obtained by localtime and an input
obtained by gmtime, I don't think you can do that with mktime or
strftime("%s") without changing the semantics that existing programs
might rely on, silly as the original semantics may seem.
It seems to me either we need a new API, or we risk breaking existing
programs.