aboutsummaryrefslogtreecommitdiff
path: root/tzfile.5.txt
diff options
context:
space:
mode:
Diffstat (limited to 'tzfile.5.txt')
-rw-r--r--tzfile.5.txt463
1 files changed, 363 insertions, 100 deletions
diff --git a/tzfile.5.txt b/tzfile.5.txt
index 6e563e6bc6f5..945b7793da42 100644
--- a/tzfile.5.txt
+++ b/tzfile.5.txt
@@ -1,106 +1,369 @@
+TZFILE(5) File Formats Manual TZFILE(5)
+
NAME
+ tzfile - timezone information
+
+DESCRIPTION
+ The timezone information files used by tzset(3) are typically found
+ under a directory with a name like /usr/share/zoneinfo. These files
+ use the format described in Internet RFC 8536. Each file is a sequence
+ of 8-bit bytes. In a file, a binary integer is represented by a
+ sequence of one or more bytes in network order (bigendian, or high-
+ order byte first), with all bits significant, a signed binary integer
+ is represented using two's complement, and a boolean is represented by
+ a one-byte binary integer that is either 0 (false) or 1 (true). The
+ format begins with a 44-byte header containing the following fields:
- tzfile - time zone information
+ * The magic four-byte ASCII sequence "TZif" identifies the file as a
+ timezone information file.
-SYNOPSIS
- #include <tzfile.h>
+ * A byte identifying the version of the file's format (as of 2021,
+ either an ASCII NUL, "2", "3", or "4").
-DESCRIPTION
- The time zone information files used by tzset(3) begin with
- the magic characters "TZif" to identify them as time zone
- information files, followed by a character identifying the
- version of the file's format (as of 2005, either an ASCII
- NUL or a '2') followed by fifteen bytes containing zeroes
- reserved for future use, followed by six four-byte values of
- type long, written in a ``standard'' byte order (the high-
- order byte of the value is written first). These values
- are, in order:
-
- tzh_ttisgmtcnt
- The number of UTC/local indicators stored in the file.
-
- tzh_ttisstdcnt
- The number of standard/wall indicators stored in the
- file.
-
- tzh_leapcnt
- The number of leap seconds for which data is stored in
- the file.
-
- tzh_timecnt
- The number of "transition times" for which data is
- stored in the file.
-
- tzh_typecnt
- The number of "local time types" for which data is
- stored in the file (must not be zero).
-
- tzh_charcnt
- The number of characters of "time zone abbreviation
- strings" stored in the file.
-
- The above header is followed by tzh_timecnt four-byte values
- of type long, sorted in ascending order. These values are
- written in ``standard'' byte order. Each is used as a
- transition time (as returned by time(2)) at which the rules
- for computing local time change. Next come tzh_timecnt one-
- byte values of type unsigned char; each one tells which of
- the different types of ``local time'' types described in the
- file is associated with the same-indexed transition time.
- These values serve as indices into an array of ttinfo
- structures (with tzh_typecnt entries) that appears next in
- the file; these structures are defined as follows:
-
- struct ttinfo {
- long tt_gmtoff;
- int tt_isdst;
- unsigned int tt_abbrind;
- };
-
- Each structure is written as a four-byte value for tt_gmtoff
- of type long, in a standard byte order, followed by a one-
- byte value for tt_isdst and a one-byte value for tt_abbrind.
- In each structure, tt_gmtoff gives the number of seconds to
- be added to UTC, tt_isdst tells whether tm_isdst should be
- set by localtime (3) and tt_abbrind serves as an index into
- the array of time zone abbreviation characters that follow
- the ttinfo structure(s) in the file.
-
- Then there are tzh_leapcnt pairs of four-byte values,
- written in standard byte order; the first value of each pair
- gives the time (as returned by time(2)) at which a leap
- second occurs; the second gives the total number of leap
- seconds to be applied after the given time. The pairs of
- values are sorted in ascending order by time.
-
- Then there are tzh_ttisstdcnt standard/wall indicators, each
- stored as a one-byte value; they tell whether the transition
- times associated with local time types were specified as
- standard time or wall clock time, and are used when a time
- zone file is used in handling POSIX-style time zone
- environment variables.
-
- Finally there are tzh_ttisgmtcnt UTC/local indicators, each
- stored as a one-byte value; they tell whether the transition
- times associated with local time types were specified as UTC
- or local time, and are used when a time zone file is used in
- handling POSIX-style time zone environment variables.
-
- Localtime uses the first standard-time ttinfo structure in
- the file (or simply the first ttinfo structure in the
- absence of a standard-time structure) if either tzh_timecnt
- is zero or the time argument is less than the first
- transition time recorded in the file.
-
- For version-2-format time zone files, the above header and
- data is followed by a second header and data, identical in
- format except that eight bytes are used for each transition
- time or leap second time. After the second header and data
- comes a newline-enclosed, POSIX-TZ-environment-variable-
- style string for use in handling instants after the last
- transition time stored in the file (with nothing between the
- newlines if there is no POSIX representation for such
- instants).
+ * Fifteen bytes containing zeros reserved for future use.
+
+ * Six four-byte integer values, in the following order:
+
+ tzh_ttisutcnt
+ The number of UT/local indicators stored in the file. (UT is
+ Universal Time.)
+
+ tzh_ttisstdcnt
+ The number of standard/wall indicators stored in the file.
+
+ tzh_leapcnt
+ The number of leap seconds for which data entries are stored
+ in the file.
+
+ tzh_timecnt
+ The number of transition times for which data entries are
+ stored in the file.
+
+ tzh_typecnt
+ The number of local time types for which data entries are
+ stored in the file (must not be zero).
+
+ tzh_charcnt
+ The number of bytes of time zone abbreviation strings stored
+ in the file.
+
+ The above header is followed by the following fields, whose lengths
+ depend on the contents of the header:
+
+ * tzh_timecnt four-byte signed integer values sorted in ascending
+ order. These values are written in network byte order. Each is used
+ as a transition time (as returned by time(2)) at which the rules for
+ computing local time change.
+
+ * tzh_timecnt one-byte unsigned integer values; each one but the last
+ tells which of the different types of local time types described in
+ the file is associated with the time period starting with the same-
+ indexed transition time and continuing up to but not including the
+ next transition time. (The last time type is present only for
+ consistency checking with the POSIX-style TZ string described below.)
+ These values serve as indices into the next field.
+
+ * tzh_typecnt ttinfo entries, each defined as follows:
+
+ struct ttinfo {
+ int32_t tt_utoff;
+ unsigned char tt_isdst;
+ unsigned char tt_desigidx;
+ };
+
+ Each structure is written as a four-byte signed integer value for
+ tt_utoff, in network byte order, followed by a one-byte boolean for
+ tt_isdst and a one-byte value for tt_desigidx. In each structure,
+ tt_utoff gives the number of seconds to be added to UT, tt_isdst
+ tells whether tm_isdst should be set by localtime(3) and tt_desigidx
+ serves as an index into the array of time zone abbreviation bytes
+ that follow the ttinfo entries in the file; if the designated string
+ is "-00", the ttinfo entry is a placeholder indicating that local
+ time is unspecified. The tt_utoff value is never equal to -2**31, to
+ let 32-bit clients negate it without overflow. Also, in realistic
+ applications tt_utoff is in the range [-89999, 93599] (i.e., more
+ than -25 hours and less than 26 hours); this allows easy support by
+ implementations that already support the POSIX-required range
+ [-24:59:59, 25:59:59].
+
+ * tzh_charcnt bytes that represent time zone designations, which are
+ null-terminated byte strings, each indexed by the tt_desigidx values
+ mentioned above. The byte strings can overlap if one is a suffix of
+ the other. The encoding of these strings is not specified.
+
+ * tzh_leapcnt pairs of four-byte values, written in network byte order;
+ the first value of each pair gives the nonnegative time (as returned
+ by time(2)) at which a leap second occurs or at which the leap second
+ table expires; the second is a signed integer specifying the
+ correction, which is the total number of leap seconds to be applied
+ during the time period starting at the given time. The pairs of
+ values are sorted in strictly ascending order by time. Each pair
+ denotes one leap second, either positive or negative, except that if
+ the last pair has the same correction as the previous one, the last
+ pair denotes the leap second table's expiration time. Each leap
+ second is at the end of a UTC calendar month. The first leap second
+ has a nonnegative occurrence time, and is a positive leap second if
+ and only if its correction is positive; the correction for each leap
+ second after the first differs from the previous leap second by
+ either 1 for a positive leap second, or -1 for a negative leap
+ second. If the leap second table is empty, the leap-second
+ correction is zero for all timestamps; otherwise, for timestamps
+ before the first occurrence time, the leap-second correction is zero
+ if the first pair's correction is 1 or -1, and is unspecified
+ otherwise (which can happen only in files truncated at the start).
+
+ * tzh_ttisstdcnt standard/wall indicators, each stored as a one-byte
+ boolean; they tell whether the transition times associated with local
+ time types were specified as standard time or local (wall clock)
+ time.
+
+ * tzh_ttisutcnt UT/local indicators, each stored as a one-byte boolean;
+ they tell whether the transition times associated with local time
+ types were specified as UT or local time. If a UT/local indicator is
+ set, the corresponding standard/wall indicator must also be set.
+
+ The standard/wall and UT/local indicators were designed for
+ transforming a TZif file's transition times into transitions
+ appropriate for another time zone specified via a POSIX-style TZ string
+ that lacks rules. For example, when TZ="EET-2EEST" and there is no
+ TZif file "EET-2EEST", the idea was to adapt the transition times from
+ a TZif file with the well-known name "posixrules" that is present only
+ for this purpose and is a copy of the file "Europe/Brussels", a file
+ with a different UT offset. POSIX does not specify this obsolete
+ transformational behavior, the default rules are installation-
+ dependent, and no implementation is known to support this feature for
+ timestamps past 2037, so users desiring (say) Greek time should instead
+ specify TZ="Europe/Athens" for better historical coverage, falling back
+ on TZ="EET-2EEST,M3.5.0/3,M10.5.0/4" if POSIX conformance is required
+ and older timestamps need not be handled accurately.
+
+ The localtime(3) function normally uses the first ttinfo structure in
+ the file if either tzh_timecnt is zero or the time argument is less
+ than the first transition time recorded in the file.
+
+ Version 2 format
+ For version-2-format timezone files, the above header and data are
+ followed by a second header and data, identical in format except that
+ eight bytes are used for each transition time or leap second time.
+ (Leap second counts remain four bytes.) After the second header and
+ data comes a newline-enclosed, POSIX-TZ-environment-variable-style
+ string for use in handling instants after the last transition time
+ stored in the file or for all instants if the file has no transitions.
+ The POSIX-style TZ string is empty (i.e., nothing between the newlines)
+ if there is no POSIX-style representation for such instants. If
+ nonempty, the POSIX-style TZ string must agree with the local time type
+ after the last transition time if present in the eight-byte data; for
+ example, given the string "WET0WEST,M3.5.0,M10.5.0/3" then if a last
+ transition time is in July, the transition's local time type must
+ specify a daylight-saving time abbreviated "WEST" that is one hour east
+ of UT. Also, if there is at least one transition, time type 0 is
+ associated with the time period from the indefinite past up to but not
+ including the earliest transition time.
+
+ Version 3 format
+ For version-3-format timezone files, the POSIX-TZ-style string may use
+ two minor extensions to the POSIX TZ format, as described in
+ newtzset(3). First, the hours part of its transition times may be
+ signed and range from -167 through 167 instead of the POSIX-required
+ unsigned values from 0 through 24. Second, DST is in effect all year
+ if it starts January 1 at 00:00 and ends December 31 at 24:00 plus the
+ difference between daylight saving and standard time.
+
+ Version 4 format
+ For version-4-format TZif files, the first leap second record can have
+ a correction that is neither +1 nor -1, to represent truncation of the
+ TZif file at the start. Also, if two or more leap second transitions
+ are present and the last entry's correction equals the previous one,
+ the last entry denotes the expiration of the leap second table instead
+ of a leap second; timestamps after this expiration are unreliable in
+ that future releases will likely add leap second entries after the
+ expiration, and the added leap seconds will change how post-expiration
+ timestamps are treated.
+
+ Interoperability considerations
+ Future changes to the format may append more data.
+
+ Version 1 files are considered a legacy format and should not be
+ generated, as they do not support transition times after the year 2038.
+ Readers that understand only Version 1 must ignore any data that
+ extends beyond the calculated end of the version 1 data block.
+
+ Other than version 1, writers should generate the lowest version number
+ needed by a file's data. For example, a writer should generate a
+ version 4 file only if its leap second table either expires or is
+ truncated at the start. Likewise, a writer not generating a version 4
+ file should generate a version 3 file only if TZ string extensions are
+ necessary to accurately model transition times.
+
+ The sequence of time changes defined by the version 1 header and data
+ block should be a contiguous sub-sequence of the time changes defined
+ by the version 2+ header and data block, and by the footer. This
+ guideline helps obsolescent version 1 readers agree with current
+ readers about timestamps within the contiguous sub-sequence. It also
+ lets writers not supporting obsolescent readers use a tzh_timecnt of
+ zero in the version 1 data block to save space.
+
+ When a TZif file contains a leap second table expiration time, TZif
+ readers should either refuse to process post-expiration timestamps, or
+ process them as if the expiration time did not exist (possibly with an
+ error indication).
+
+ Time zone designations should consist of at least three (3) and no more
+ than six (6) ASCII characters from the set of alphanumerics, "-", and
+ "+". This is for compatibility with POSIX requirements for time zone
+ abbreviations.
+
+ When reading a version 2 or higher file, readers should ignore the
+ version 1 header and data block except for the purpose of skipping over
+ them.
+
+ Readers should calculate the total lengths of the headers and data
+ blocks and check that they all fit within the actual file size, as part
+ of a validity check for the file.
+
+ When a positive leap second occurs, readers should append an extra
+ second to the local minute containing the second just before the leap
+ second. If this occurs when the UTC offset is not a multiple of 60
+ seconds, the leap second occurs earlier than the last second of the
+ local minute and the minute's remaining local seconds are numbered
+ through 60 instead of the usual 59; the UTC offset is unaffected.
+
+ Common interoperability issues
+ This section documents common problems in reading or writing TZif
+ files. Most of these are problems in generating TZif files for use by
+ older readers. The goals of this section are:
+
+ * to help TZif writers output files that avoid common pitfalls in older
+ or buggy TZif readers,
+
+ * to help TZif readers avoid common pitfalls when reading files
+ generated by future TZif writers, and
+
+ * to help any future specification authors see what sort of problems
+ arise when the TZif format is changed.
+
+ When new versions of the TZif format have been defined, a design goal
+ has been that a reader can successfully use a TZif file even if the
+ file is of a later TZif version than what the reader was designed for.
+ When complete compatibility was not achieved, an attempt was made to
+ limit glitches to rarely used timestamps and allow simple partial
+ workarounds in writers designed to generate new-version data useful
+ even for older-version readers. This section attempts to document
+ these compatibility issues and workarounds, as well as to document
+ other common bugs in readers.
+
+ Interoperability problems with TZif include the following:
+
+ * Some readers examine only version 1 data. As a partial workaround, a
+ writer can output as much version 1 data as possible. However, a
+ reader should ignore version 1 data, and should use version 2+ data
+ even if the reader's native timestamps have only 32 bits.
+
+ * Some readers designed for version 2 might mishandle timestamps after
+ a version 3 or higher file's last transition, because they cannot
+ parse extensions to POSIX in the TZ-like string. As a partial
+ workaround, a writer can output more transitions than necessary, so
+ that only far-future timestamps are mishandled by version 2 readers.
+
+ * Some readers designed for version 2 do not support permanent daylight
+ saving time with transitions after 24:00 - e.g., a TZ string
+ "EST5EDT,0/0,J365/25" denoting permanent Eastern Daylight Time (-04).
+ As a workaround, a writer can substitute standard time for two time
+ zones east, e.g., "XXX3EDT4,0/0,J365/23" for a time zone with a
+ never-used standard time (XXX, -03) and negative daylight saving time
+ (EDT, -04) all year. Alternatively, as a partial workaround a writer
+ can substitute standard time for the next time zone east - e.g.,
+ "AST4" for permanent Atlantic Standard Time (-04).
+
+ * Some readers designed for version 2 or 3, and that require strict
+ conformance to RFC 8536, reject version 4 files whose leap second
+ tables are truncated at the start or that end in expiration times.
+
+ * Some readers ignore the footer, and instead predict future timestamps
+ from the time type of the last transition. As a partial workaround,
+ a writer can output more transitions than necessary.
+
+ * Some readers do not use time type 0 for timestamps before the first
+ transition, in that they infer a time type using a heuristic that
+ does not always select time type 0. As a partial workaround, a
+ writer can output a dummy (no-op) first transition at an early time.
+
+ * Some readers mishandle timestamps before the first transition that
+ has a timestamp not less than -2**31. Readers that support only
+ 32-bit timestamps are likely to be more prone to this problem, for
+ example, when they process 64-bit transitions only some of which are
+ representable in 32 bits. As a partial workaround, a writer can
+ output a dummy transition at timestamp -2**31.
+
+ * Some readers mishandle a transition if its timestamp has the minimum
+ possible signed 64-bit value. Timestamps less than -2**59 are not
+ recommended.
+
+ * Some readers mishandle POSIX-style TZ strings that contain "<" or
+ ">". As a partial workaround, a writer can avoid using "<" or ">"
+ for time zone abbreviations containing only alphabetic characters.
+
+ * Many readers mishandle time zone abbreviations that contain non-ASCII
+ characters. These characters are not recommended.
+
+ * Some readers may mishandle time zone abbreviations that contain fewer
+ than 3 or more than 6 characters, or that contain ASCII characters
+ other than alphanumerics, "-", and "+". These abbreviations are not
+ recommended.
+
+ * Some readers mishandle TZif files that specify daylight-saving time
+ UT offsets that are less than the UT offsets for the corresponding
+ standard time. These readers do not support locations like Ireland,
+ which uses the equivalent of the POSIX TZ string
+ "IST-1GMT0,M10.5.0,M3.5.0/1", observing standard time (IST, +01) in
+ summer and daylight saving time (GMT, +00) in winter. As a partial
+ workaround, a writer can output data for the equivalent of the POSIX
+ TZ string "GMT0IST,M3.5.0/1,M10.5.0", thus swapping standard and
+ daylight saving time. Although this workaround misidentifies which
+ part of the year uses daylight saving time, it records UT offsets and
+ time zone abbreviations correctly.
+
+ * Some readers generate ambiguous timestamps for positive leap seconds
+ that occur when the UTC offset is not a multiple of 60 seconds. For
+ example, in a timezone with UTC offset +01:23:45 and with a positive
+ leap second 78796801 (1972-06-30 23:59:60 UTC), some readers will map
+ both 78796800 and 78796801 to 01:23:45 local time the next day
+ instead of mapping the latter to 01:23:46, and they will map 78796815
+ to 01:23:59 instead of to 01:23:60. This has not yet been a
+ practical problem, since no civil authority has observed such UTC
+ offsets since leap seconds were introduced in 1972.
+
+ Some interoperability problems are reader bugs that are listed here
+ mostly as warnings to developers of readers.
+
+ * Some readers do not support negative timestamps. Developers of
+ distributed applications should keep this in mind if they need to
+ deal with pre-1970 data.
+
+ * Some readers mishandle timestamps before the first transition that
+ has a nonnegative timestamp. Readers that do not support negative
+ timestamps are likely to be more prone to this problem.
+
+ * Some readers mishandle time zone abbreviations like "-08" that
+ contain "+", "-", or digits.
+
+ * Some readers mishandle UT offsets that are out of the traditional
+ range of -12 through +12 hours, and so do not support locations like
+ Kiritimati that are outside this range.
+
+ * Some readers mishandle UT offsets in the range [-3599, -1] seconds
+ from UT, because they integer-divide the offset by 3600 to get 0 and
+ then display the hour part as "+00".
+
+ * Some readers mishandle UT offsets that are not a multiple of one
+ hour, or of 15 minutes, or of 1 minute.
SEE ALSO
- newctime(3)
+ time(2), localtime(3), tzset(3), tzselect(8), zdump(8), zic(8).
+
+ Olson A, Eggert P, Murchison K. The Time Zone Information Format
+ (TZif). 2019 Feb. Internet RFC 8536 <https://datatracker.ietf.org/
+ doc/html/rfc8536> doi:10.17487/RFC8536 <https://doi.org/10.17487/
+ RFC8536>.
+
+ TZFILE(5)