(2023-04-15) On sound in the world of minimalist computing
----------------------------------------------------------

Yes, I know, I know. I promised to continue the stream compression topic. Well, I'm in the process of writing some C code to support my further research (if I can even call it that) in this area, and the results will follow once it's complete. No rush, but I'm not going to abandon this midway: I've already written too much code to just throw it away.

Today, however, I want to talk about computer audio in general and computer music in particular. While I believe in the predominant role of fully analogue media for storing sound and music in case our computers become small and weak and truly LPC devices, there's still some room for possibilities and exploration. Because, as we know, the greatest hits among tracks fully created on computers were made on, or at least for, old 8-bit machines like the Famicom and C64.

The beauty of chiptunes was in pushing the capability limits of the soundchips of the time. While the soundchip essentially defined the basics of how you designed your sounds, giving you a very limited number of PSG channels, each with its own restrictions, you could unleash your creativity not only in the music itself but also in your own techniques for bypassing these restrictions so as not to sound like everyone else. This is why, for instance, when I hear music from some Famicom/NES game, I can tell for sure whether it's Sunsoft, Natsume, HAL or Konami, even if I'm hearing that music or seeing the game for the first time in my life. You just can't confuse their sound engines, that's how distinctive the sound they produce on the same five standard PSG channels is.

But then, not every machine of the time even had a soundchip or any sound output capabilities at all, except a simple piezo buzzer or a single-frequency speaker. Aha! What if we turn this buzzer on and off fast enough to simulate any frequency we want?
This is how the PFM (pulse frequency modulation) technique was invented, now often referred to as "1-bit music". The "1-bit" here merely refers to the fact that we operate an output device that can only be in one of two states, on or off. In practice, this also meant that everything soundchips did in hardware had to be done in software here. That's why all PFM music authors also had their own speaker drivers bundled with the album, the game or whatever else their music was created for.

From Amiga computers and IBM PCs with dedicated sound cards to this very day, audio is now generally output using the technique popularized by Audio CDs, PCM (pulse-code modulation), where the signal is quantized to a finite number of levels and sampled at some rate per second. The higher the rate and the number of levels, the better the sound quality, but the more processing power is required too.

Nowadays, sound generation is fully abstracted from the hardware layer, and composers no longer have to adapt to the chips of the machines they work with. They just output sound in some format that can be represented as PCM data at the end of the day, and the system takes care of the rest. Most of them don't even create music and effects in the form of pure PCM data: it is either decompressed from a more compact source (like MP3, OGG or FLAC) or synthesized (in which case the composer only deals with the notes, instruments and effects that a particular DAW or tracker can offer, not to mention even higher layers of abstraction like the Web Audio API). So even pure PCM, despite its ubiquity, isn't something everyone touches directly these days. It is, however, still there.
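As a bridge between the two worlds, the old buzzer trick itself can be simulated in PCM terms: a signal that only ever takes two of the 256 levels, toggled at the right rate. Here's a small sketch of my own (not any historical PFM driver; the toggle period of 9 samples, giving roughly 444 Hz, is an arbitrary choice of mine):

```shell
# Simulating a 1-bit speaker in PCM terms: only two levels (0 and 255),
# toggled every 9 samples -> a square wave of about 444 Hz at 8000 Hz.
# One second of audio = 8000 one-byte samples; LC_ALL=C keeps printf "%c"
# emitting single raw bytes regardless of locale.
seq 0 7999 | LC_ALL=C awk '{ printf("%c", (int($1 / 9) % 2) * 255) }' > square.raw
# Listen with sox, using the same raw parameters as elsewhere in this post:
# play -traw -r8000 -b8 -e unsigned-integer square.raw
```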
And every modern OS in existence offers a more or less straightforward way to output raw PCM data directly to the audio device (and if you can't find a more straightforward way, you can always use less straightforward ones like the aforementioned Web Audio API, at the cost of higher resource usage, of course). A special case arises when your signal is quantized to 256 levels and every level is represented with a single integer from 0 to 255 (or from -128 to 127, depending on how you look at it, but it's usually treated as unsigned), that is, a byte. Combined with the default PSTN-compatible sampling rate of most sound adapters, 8000 Hz, we get the default "raw" PCM mode, unsigned 8-bit 8 kHz PCM, which requires no additional preconfiguration of the adapter to emit sound, at least on Linux-based OSes.

Which is why, when OSS was a major thing and the audio device was represented with a single file in /dev/ (/dev/audio or /dev/dsp), people had fun redirecting the contents of various (non-sound) files directly into this file and listening to what came out as a result. With ALSA, PulseAudio or whatever other abstraction layer you have, you can still pipe the output into something like a properly parameterized sox play command to achieve the same effect:

cat somefile | play -traw -r8000 -b8 -e unsigned-integer -

Some files gave interesting sound patterns, others just gave noise, so experienced listeners could even guess the file type by what they heard. But then, some people started thinking a step further: "What if, instead of just catting existing files, we generate the raw sound data programmatically?" Among those people was the Finnish guy I already mentioned on this phlog, viznut.
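Generating the raw data programmatically can be as simple as a single awk invocation computing each sample from its index. A sketch of mine (the 440 Hz frequency and one-second duration are arbitrary choices) producing a sine tone in that same unsigned 8-bit 8 kHz raw format:

```shell
# One second of a 440 Hz sine tone as unsigned 8-bit 8000 Hz PCM:
# 8000 samples, each quantized to one of 256 levels centered at 128.
seq 0 7999 | LC_ALL=C awk \
    '{ printf("%c", int(128 + 127 * sin(2 * 3.14159265 * 440 * $1 / 8000))) }' > tone.raw
# play -traw -r8000 -b8 -e unsigned-integer tone.raw
```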
He compiled a bunch of short C programs where every output byte was defined as a function of a single, eternally incrementing integer variable t, and the standard output of these programs was unsigned 8-bit 8 kHz PCM data to be piped into /dev/audio, aplay or sox play. He called this concept "bytebeat", and if you've read my previous explanation carefully, you can see where this name came from.

Originally, bytebeat gained popularity in this very form, but then, as more bytebeat players got ported to different environments, including browser-based ones (with the Web Audio API), it was no longer limited to outputting single bytes or even integers (so the so-called "floatbeat" spawned as well), no longer limited to bitwise and basic arithmetic operators (because JS already has trigonometry and everything else in its standard library), and no longer limited to outputting samples at 8 kHz. On one hand, the removal of these limitations allowed people to transcend into other spaces of exploration; on the other hand, the same people started abusing JS's capabilities and built entire trackers inside bytebeat expressions, returning to traditional rather than generative music, which kinda defeated the whole initial purpose: the idea of finding music in purely mathematical expressions, preferably as short as possible.

I understand why the general public went the WAAPI route. Browsers are easy, they allow quick exploration and on-the-fly changes to the sound, without even having to recompile the expression every time you change it. Less getting in the way between you and the formula, more possibilities for customization and general convenience. However, using a whole friggin' browser for this task is about as far from the KISS way as possible. I'd rather use pure shell scripting, but I know bitwise operations in Bash are a bitch; it's a command language, not a general-purpose programming language, after all.
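That said, the bitwise math itself doesn't strictly require a compiler or fancy builtins: even a plain POSIX awk with no bitwise functions at all can emulate AND with a small helper. A sketch of mine (band() is just a helper name I made up; the formula t & (t >> 8) is a classic bytebeat pattern) rendering eight seconds of it into a raw file:

```shell
# Pure POSIX awk: emulate bitwise AND bit by bit, no and()/rshift() needed.
# Renders the classic bytebeat formula t & (t >> 8) for 64000 samples
# (8 seconds at 8000 Hz) as unsigned 8-bit raw PCM.
seq 0 63999 | LC_ALL=C awk '
function band(a, b,    r, p) {   # bitwise AND of two non-negative integers
    r = 0; p = 1
    while (a > 0 && b > 0) {
        if (a % 2 == 1 && b % 2 == 1) r += p
        a = int(a / 2); b = int(b / 2); p *= 2
    }
    return r
}
{ printf("%c", band($1, int($1 / 256)) % 256) }' > beat.raw
# play -traw -r8000 -b8 -e unsigned-integer beat.raw
```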
So, we need something no less ubiquitous, something that would ideally be present even in Busybox but would save us from all the compilation hassle. Enter AWK. Yes, it's a full-featured programming language, and yes, it's present in Busybox, although that version is probably not as powerful as GAWK. I don't have any good knowledge of AWK right now; I probably need to learn it at least as well as Bash and start using it on a daily basis. But, you know what, this example worked on my Arch:

seq 11111111 | busybox awk '{printf("%c",and($1,rshift($1,8)))}' | play -traw -r8000 -b8 -e unsigned-integer -

Which means that yes, we do have bitwise operators in Busybox AWK, and we can port any classic C bytebeat formula to this language. I'll definitely experiment with this more and will probably create a separate document, linked from the main hoi.st map, containing ports of some music formulas I liked or created myself. A synthesizer and a tracker, all in a tiny formula, now in AWK. If AwkBeat isn't yet a thing, now is the time.

--- Luxferre ---