(2023-04-15) On sound in the world of minimalist computing
----------------------------------------------------------

Yes, I know, I know. I promised to continue the stream compression topic. Well, I'm in the process of writing some C code to support my further research (if I can even call it that) in this area, and the results will follow once it's complete. No rush, but I'm not going to abandon this midway: I've already written too much code to just throw it away.

Today, however, I want to talk about computer audio in general and computer music in particular. While I believe in the predominant role of fully analogue media for storing sound and music in case our computers become small and weak and truly LPC devices, there's still some room for possibilities and exploration. Because, as we know, the greatest hits among tracks fully created on computers were made on, or at least for, old 8-bit machines like the Famicom and C64.

The beauty of chiptunes was in pushing the capability limits of the soundchips of the time. While the soundchip essentially defined the basics of how you designed your sounds, giving you a very limited number of PSG channels, each with its own restrictions, you could unleash your creativity not only in the music itself but also in your own techniques for bypassing these restrictions so as not to sound like everyone else. This is why, for instance, when I hear music from some Famicom/NES game, I can tell for sure whether it's Sunsoft, Natsume, HAL or Konami, even if I'm hearing that music or seeing the game for the first time in my life. You just can't confuse their sound engines, that's how distinctive the sound they produce on the same five standard PSG channels is.

But then, not every machine of the time even had a soundchip or any sound output capabilities at all, except a simple piezo buzzer or a single-frequency speaker. Aha! What if we turn this buzzer on and off fast enough to simulate any frequency we want?
This is how the PFM (pulse frequency modulation) technique was invented, now often referred to as "1-bit music". The "1-bit" here merely refers to the fact that we operate an output device that can only be in one of two states, on or off. In practice, this also meant that everything soundchips did in hardware had to be done in software here. That's why all PFM music authors also had their own speaker drivers bundled with the album, the game or whatever else their music was created for.

From Amiga computers and IBM PCs with dedicated sound cards to this very day, audio is now generally output using the technique popularized by Audio CDs, PCM (pulse-code modulation), where the signal is quantized to a finite number of levels and sampled at some rate per second. The higher the rate and the number of levels, the better the sound quality, but the more processing power is required too.

Nowadays, sound generation is fully abstracted from the hardware layer, and composers no longer have to adapt to the chips of the machines they work with. They just output sound in some format that can be represented as PCM data at the end of the day, and the system takes care of the rest. Most of them don't even create music and effects in the form of pure PCM data: it is either decompressed from a more compact source (like MP3, OGG or FLAC) or synthesized (in which case the composer only deals with the notes, instruments and effects that a particular DAW or tracker can offer, not to mention even higher layers of abstraction like the Web Audio API). So even pure PCM, despite its ubiquity, isn't something everyone touches directly these days. It is, however, still there.
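As a bridge between the two worlds, the old buzzer trick itself can be simulated in PCM terms: a signal that only ever takes two of the 256 levels, toggled at the right rate. Here's a small sketch of my own (not any historical PFM driver; the toggle period of 9 samples, giving roughly 444 Hz, is an arbitrary choice of mine):

```shell
# Simulating a 1-bit speaker in PCM terms: only two levels (0 and 255),
# toggled every 9 samples -> a square wave of about 444 Hz at 8000 Hz.
# One second of audio = 8000 one-byte samples; LC_ALL=C keeps printf "%c"
# emitting single raw bytes regardless of locale.
seq 0 7999 | LC_ALL=C awk '{ printf("%c", (int($1 / 9) % 2) * 255) }' > square.raw
# Listen with sox, using the same raw parameters as elsewhere in this post:
# play -traw -r8000 -b8 -e unsigned-integer square.raw
```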
And every modern OS in existence offers a more or less straightforward way to output raw PCM data directly to the audio device (and if you can't find a more straightforward way, you can always use less straightforward ones like the aforementioned Web Audio API, at the cost of higher resource usage, of course). A special case arises when your signal is quantized to 256 levels and every level is represented with a single integer from 0 to 255 (or from -128 to 127, depending on how you look at it, but it's usually treated as unsigned), that is, a byte. Combined with the default PSTN-compatible sampling rate of most sound adapters, 8000 Hz, we get the default "raw" PCM mode, unsigned 8-bit 8 kHz PCM, which requires no additional preconfiguration of the adapter to emit sound, at least on Linux-based OSes.

Which is why, when OSS was a major thing and the audio device was represented with a single file in /dev/ (/dev/audio or /dev/dsp), people had fun redirecting the contents of various (non-sound) files directly into this file and listening to what came out as a result. With ALSA, PulseAudio or whatever other abstraction layer you have, you can still pipe the output into something like a properly parameterized sox play command to achieve the same effect:

cat somefile | play -traw -r8000 -b8 -e unsigned-integer -

Some files gave interesting sound patterns, others just gave noise, so experienced listeners could even guess the file type by what they heard. But then, some people started thinking a step further: "What if, instead of just catting existing files, we generate the raw sound data programmatically?" Among those people was the Finnish guy I already mentioned on this phlog, viznut.
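Generating the raw data programmatically can be as simple as a single awk invocation computing each sample from its index. A sketch of mine (the 440 Hz frequency and one-second duration are arbitrary choices) producing a sine tone in that same unsigned 8-bit 8 kHz raw format:

```shell
# One second of a 440 Hz sine tone as unsigned 8-bit 8000 Hz PCM:
# 8000 samples, each quantized to one of 256 levels centered at 128.
seq 0 7999 | LC_ALL=C awk \
    '{ printf("%c", int(128 + 127 * sin(2 * 3.14159265 * 440 * $1 / 8000))) }' > tone.raw
# play -traw -r8000 -b8 -e unsigned-integer tone.raw
```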
He compiled a bunch of short C programs where every output byte was defined as a function of a single, eternally incrementing integer variable t, and the standard output of these programs was unsigned 8-bit 8 kHz PCM data to be piped into /dev/audio, aplay or sox play. He called this concept "bytebeat", and if you've read my previous explanation carefully, you can see where this name came from.

Originally, bytebeat gained popularity in this very form, but then, as more bytebeat players got ported to different environments, including browser-based ones (with the Web Audio API), it was no longer limited to outputting single bytes or even integers (so the so-called "floatbeat" spawned as well), no longer limited to bitwise and basic arithmetic operators (because JS already has trigonometry and everything else in its standard library), and no longer limited to outputting samples at 8 kHz. On one hand, the removal of these limitations allowed people to transcend into other spaces of exploration; on the other hand, the same people started abusing JS's capabilities and built entire trackers inside bytebeat expressions, returning to traditional rather than generative music, which kinda defeated the whole initial purpose: the idea of finding music in purely mathematical expressions, preferably as short as possible.

I understand why the general public went the WAAPI route. Browsers are easy, they allow quick exploration and on-the-fly changes to the sound, without even having to recompile the expression every time you change it. Less getting in the way between you and the formula, more possibilities for customization and general convenience. However, using a whole friggin' browser for this task is about as far from the KISS way as possible. I'd rather use pure shell scripting, but I know bitwise operations in Bash are a bitch; it's a command language, not a general-purpose programming language, after all.
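That said, the bitwise math itself doesn't strictly require a compiler or fancy builtins: even a plain POSIX awk with no bitwise functions at all can emulate AND with a small helper. A sketch of mine (band() is just a helper name I made up; the formula t & (t >> 8) is a classic bytebeat pattern) rendering eight seconds of it into a raw file:

```shell
# Pure POSIX awk: emulate bitwise AND bit by bit, no and()/rshift() needed.
# Renders the classic bytebeat formula t & (t >> 8) for 64000 samples
# (8 seconds at 8000 Hz) as unsigned 8-bit raw PCM.
seq 0 63999 | LC_ALL=C awk '
function band(a, b,    r, p) {   # bitwise AND of two non-negative integers
    r = 0; p = 1
    while (a > 0 && b > 0) {
        if (a % 2 == 1 && b % 2 == 1) r += p
        a = int(a / 2); b = int(b / 2); p *= 2
    }
    return r
}
{ printf("%c", band($1, int($1 / 256)) % 256) }' > beat.raw
# play -traw -r8000 -b8 -e unsigned-integer beat.raw
```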
So, we need something no less ubiquitous, something that would ideally be present even in Busybox but would save us from all the compilation hassle. Enter AWK. Yes, it's a full-featured programming language, and yes, it's present in Busybox, although that version is probably not as powerful as GAWK. I don't have any good knowledge of AWK right now; I probably need to learn it at least as well as Bash and start using it on a daily basis. But, you know what, this example worked on my Arch:

seq 11111111 | busybox awk '{printf("%c",and($1,rshift($1,8)))}' | play -traw -r8000 -b8 -e unsigned-integer -

Which means that yes, we do have bitwise operators in Busybox AWK, and we can port any classic C bytebeat formula to this language. I'll definitely experiment with this more and will probably create a separate document, linked from the main hoi.st map, containing ports of some music formulas I liked or created myself. A synthesizer and a tracker, all in a tiny formula, now in AWK. If AwkBeat isn't yet a thing, now is the time.

--- Luxferre ---