(2023-04-09) On reliable timekeeping on slow networks
-----------------------------------------------------

I am very passionate about timekeeping. I have a nice collection of watches, some of which are capable of syncing via longwave or even Bluetooth LE, and the protocol to do this has already been reverse-engineered, at least to the extent of performing basic time synchronization tasks ([1]). I'll probably write a non-GUI tool for it as well once I figure out the optimal stack to write it on top of. But, of course, these tools also need some source of truth. Something needs to set the time on our own client devices before we can pass it further or display it to the user.

Nowadays, time synchronization and coordination over the Internet is usually done via the NTP protocol. It's really well-engineered, takes a lot of factors into account and makes it possible to receive accurate time all over the world. Can't really complain about that. One thing I can complain about, though, is that it's too complex to reimplement from scratch, and some CVE reports about NTP server or client vulnerabilities just confirm that. On top of that, it involves a lot of overhead data in every synchronization packet, which might not matter much in modern conventional networks, but poses a significant problem once we're talking about something like GPRS at 30 kbps, PSTN or CSD dialup at 9600 bps, AX.25 at 1200 bps or even slower transmission modes at 300 bps. In these conditions, every extra byte matters.

The solution? Ye goode olde Time protocol (RFC 868), which some timekeeping networks (like time.nist.gov) still gracefully run on port 37 of their servers. It returns a 4-byte (32-bit) timestamp on a TCP connection or as a response to any UDP datagram, and that's it. The timestamp is expected to represent the number of seconds since the beginning of the year 1900 UTC, and is supposed to roll over roughly every 136 years. Now, I encourage you to only use the UDP mode of this protocol whenever you need it, as TCP connections made just to retrieve a 4-byte payload pose significant overhead for our purposes and don't make server admins happy either. And, just like with NTP, you still need some way to measure elapsed time locally with around a millisecond of precision. Once that is sorted out, though, the algorithm to get more or less accurate time (although with whole-second resolution) is very simple and straightforward:

1. Prepare a tool for time measurement of steps 2 and 3 combined.
2. Send a random 32-bit datagram to the Time server.
3. Receive the 32-bit timestamp datagram FTIME from the Time server.
4. Record the execution time (in milliseconds) of steps 2 and 3 as ETIME.
5. Obtain the true Unix timestamp (in seconds) using the following formula, where |x| denotes the integer part of x:
   TRUETIME = |(1000*FTIME + ETIME/2 - 2208988799500) / 1000|
6. Emit the TRUETIME value for further processing.

End of algorithm.
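To make the steps above concrete, here is a minimal sketch of the whole exchange, assuming Python 3 and a Time server reachable in UDP mode; the time37_query name is arbitrary, and time.nist.gov is only used as an example host:

    import os
    import socket
    import struct
    import time

    def time37_query(host="time.nist.gov", port=37, timeout=10.0):
        """One UDP round trip of the RFC 868 Time protocol -> Unix timestamp (s)."""
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            start = time.monotonic()                    # step 1: elapsed-time tool only
            sock.sendto(os.urandom(4), (host, port))    # step 2: 4 random payload bytes
            data, _ = sock.recvfrom(4)                  # step 3: 4-byte big-endian reply
            etime = (time.monotonic() - start) * 1000.0 # step 4: ETIME in milliseconds
            ftime = struct.unpack("!I", data[:4])[0]    # seconds since 1900-01-01 UTC
            # step 5: halve the round trip, shift to the Unix epoch, truncate to seconds
            return int((1000 * ftime + etime / 2 - 2208988799500) // 1000)

Note that time.monotonic() is only used to measure the round trip, so the result doesn't depend on whatever the local wall clock currently thinks.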
There are a couple of things that might need explanation here. First, the RFC says the Time protocol expects an empty datagram in UDP mode, but IRL most, if not all, implementations accept any datagram and just discard its contents. The 4-byte length of the outgoing datagram was chosen to make the resulting IP packet have exactly the same length as the one we're going to receive, so we can then safely divide the elapsed time by 2 to get a more or less accurate correction.

Second, the constant 2208988799500 is the number of milliseconds between the start of the year 1900 (where Time protocol timestamps start) and the start of the year 1970 (where the Unix epoch starts), minus the 500 milliseconds used for rounding the final result properly. So, starting from the moment in February 2036 when the 32-bit Time protocol counter rolls over, we will be adding 2085978496500 (that's (2^32 - 2208988800) * 1000 + 500: the rollover moment expressed in milliseconds since the Unix epoch, plus the same 500 ms rounding term) instead of subtracting 2208988799500. And this is something we need to know before applying the formula, but I hope no one will ever find themselves in a situation where they don't know whether or not the year 2036 has already come. But just in case, here is a more future-proof version of the same algorithm with a larger safety margin (until the year 2106):

1. Prepare a tool for time measurement of steps 2 and 3 combined.
2. Send a random 32-bit datagram to the Time server.
3. Receive the 32-bit timestamp datagram FTIME from the Time server.
4. Record the execution time (in milliseconds) of steps 2 and 3 as ETIME.
5. Obtain the true Unix timestamp (in seconds) using the following formula:
   TRUETIME = |(1000*FTIME + ETIME/2 - 2208988799500) / 1000| if FTIME > 2208988800,
   TRUETIME = |(1000*FTIME + ETIME/2 + 2085978496500) / 1000| otherwise.
6. Emit the TRUETIME value for further processing.

End of algorithm.

Somewhere between the year 2036 and 2106, you can switch to using the second formula unconditionally, and this will prolong the algorithm's validity to the year 2172. After that, you adjust the offset accordingly (500 plus the number of milliseconds that actually passed between the start of 1970 and the second rollover in 2172), and so on. This way, the 32-bit second counter can be reused forever.

This approach has the following advantages: only a single send-receive round is required, there is no transactional overhead thanks to UDP (4 bytes out, 4 bytes in), the local clock is not relied upon for anything but elapsed time measurement, and the round-trip compensation is simple but accurate enough. For really slow networks, I can't think of anything better at the moment. This is why I hope that even when NIST stops serving time using this protocol, someone still will. Maybe I'll set it up right here on hoi.st, who knows.

--- Luxferre ---

[1]: https://git.sr.ht/~luxferre/RCVD
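P.S. For completeness, here is a sketch of the epoch handling from the future-proof variant as a standalone conversion routine, again assuming Python 3; the constant and function names are arbitrary, and ETIME defaults to zero if you only care about whole-second accuracy:

    SECS_1900_TO_1970 = 2208988800               # Time protocol epoch -> Unix epoch, in seconds
    ROLLOVER_UNIX = 2**32 - SECS_1900_TO_1970    # the February 2036 rollover as a Unix timestamp

    def time37_to_unix(ftime, etime_ms=0):
        """Convert a 32-bit RFC 868 timestamp to Unix seconds (valid until ~2106)."""
        if ftime > SECS_1900_TO_1970:
            offset_ms = -SECS_1900_TO_1970 * 1000  # pre-rollover: still counting from 1900
        else:
            offset_ms = ROLLOVER_UNIX * 1000       # post-rollover: counter restarted in 2036
        return int((1000 * ftime + etime_ms / 2 + offset_ms + 500) // 1000)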