(2023-05-04) If stuck with Busybox but need to run Forth, use... Subleq
-----------------------------------------------------------------------
It might sound even crazier than running Brainfuck on top of VTL-2 on
top of AWK, but yeah, there is an implementation of eForth ([1]) for a
specific variant of the Subleq architecture. I'm going to share the
details of this variant here shortly. Programs for it are distributed
as .dec files, where a .dec file is just a list of signed decimal
integers separated by any whitespace.

And yes, this file format looks like a perfect target for AWK, so I
couldn't resist writing my own Subleq implementation in this language.
And, to be honest, considering all the quirks AWK requires where plain
C offers straightforward solutions, 26 SLOC is not a lot for such a
useful single-instruction computer architecture.

Now, I am going to share the ready-made .awk file as usual on the main
page, but I want to go through every significant line of it here to
explain what's going on. The program accepts a single .dec file as
input in the form

[busybox] awk -f subleq.awk program.dec

and processes it line by line. But first, we need to define two helper
functions:

function L(v) { # cast any value to unsigned 16-bit integer
  v = int(v)
  while(v < 0) v += 65536
  return int(v%65536)
}

function getchar(c, cmd) { # POSIX-compatible getchar emulation with sh read
  (cmd="c='';IFS= read -r -n 1 -d $'\\0' c;printf '%u' \"'$c\"") | getline c
  close(cmd)
  return int(c)
}

These functions are mostly self-explanatory, but I have something to
add. The L() function could be simplified if we could use bitwise
operations, but I decided to stay on the POSIX side and emulated
everything with conditions and the % operator. It also ensures the
value is an integer both before and after the conversion.

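To see what L() does with a few edge values, here is a minimal sketch
that can be pasted into a shell. The u16 wrapper is a hypothetical name
of mine and not part of subleq.awk; only the body of L() is taken from
the listing above.

```shell
# u16: hypothetical wrapper (not part of subleq.awk) that feeds
# one value to the L() function from the post and prints the result
u16() {
  awk -v v="$1" '
    function L(v) { # cast any value to unsigned 16-bit integer
      v = int(v)
      while(v < 0) v += 65536
      return int(v%65536)
    }
    BEGIN { print L(v) }'
}

u16 -1      # 65535
u16 -32768  # 32768
u16 65536   # 0
u16 70000   # 4464
```

Negative inputs end up in the upper half of the 16-bit range, and
anything at or above 65536 wraps around to the start.
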
The getchar() function is necessary to emulate Subleq's input logic
and, as you can see, unlike the C version, here it requires some
external shell processing, so it is quite slow already. I tried to
stick to POSIX-compatible options for read and printf, though strictly
speaking -n is a shell extension (one that Busybox ash and bash
support) rather than part of POSIX read. Here, we read a single
character (which can be a newline, hence the null delimiter) and then
print its decimal character code, which is cast to integer on the AWK
side and returned to the caller after the subprocess is closed.

Now, we can initialize our 64K virtual Subleq memory:

BEGIN {
  for(pc=0;pc<65536;pc++) MEM[pc] = 0 # init the memory array
  pc = a = b = c = 0 # reset the program counter and other vars
}

Once we've done this, we can start matching on the integers within the
file and filling our memory with the actual values:

{ for(i=1;i<=NF;i++) if($i ~ /^[-0-9][0-9]*$/) MEM[pc++] = L($i) }

Here, the logic is like this. We go through every input line, which can
contain any number of fields. The default delimiters are fine, but we
need to iterate over every field we encounter. If the field matches the
regex for _signed_ integers (the first character can be either - or a
digit, any following ones, if present, can only be digits), we cast it
to a 16-bit unsigned value using our L() function, store it in the
current memory cell and advance the pointer to the next one.

Finally, once the entire file has been read and parsed, we can start
the actual execution process in the END block:

END {
  for(pc=0;pc<32768;) {
    a = MEM[pc++]; b = MEM[pc++]; c = MEM[pc++] # fill the cell addresses
    if(a == 65535) MEM[b] = L(getchar())
    else if(b == 65535) printf("%c", MEM[a]%256)
    else {
      MEM[b] = L(MEM[b] - MEM[a]) # subtract the first 2 cells and cast
      if(MEM[b] == 0 || (MEM[b] > 32767)) pc = c # jump if result <=0
    }
  }
}

Here's how it works. First, we reset our program counter PC once again
to 0 and start sequentially reading three values in a loop: A, B and C.
These are the cell addresses our OISC operates on every cycle.

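To make the loading and fetch steps concrete, here is a small sketch,
assuming a hypothetical helper named fetch_first (my own name, not part
of subleq.awk). It reads .dec text from standard input, fills MEM with
the same rule as the main script and prints the A, B, C triple of the
first instruction only, without executing anything.

```shell
# fetch_first: hypothetical helper (not part of subleq.awk) that loads
# .dec text from stdin and prints the operands of the first instruction
fetch_first() {
  awk '
    function L(v) { # same cast as in the post
      v = int(v)
      while(v < 0) v += 65536
      return int(v%65536)
    }
    # same loader rule as in the post: every integer field fills one cell
    { for(i=1;i<=NF;i++) if($i ~ /^[-0-9][0-9]*$/) MEM[pc++] = L($i) }
    END {
      pc = 0
      a = MEM[pc++]; b = MEM[pc++]; c = MEM[pc++]
      print "A=" a, "B=" b, "C=" c, "next pc=" pc
    }'
}

# the three integers may be split across lines in any way
printf '9 -1\n3\n' | fetch_first   # A=9 B=65535 C=3 next pc=3
```

Since the loader works field by field, line breaks in the .dec file
carry no meaning, and fields that are not integers are simply skipped.
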
Now, and this is the first quirk of this particular Subleq variant, we
have two special cases. If A is set to -1 (which is cast to 65535 as an
unsigned 16-bit value), we read a character from standard input and
store it in the cell at address B. If B is set to -1, we write the
contents of the cell at address A as a character to standard output
(which is much easier in AWK and doesn't need a special function). If
neither of these special cases applies, we run the general Subleq
logic: subtract the cell at address A from the cell at address B and
write the result to the cell at address B, then jump to address C if
this result is less than or equal to zero.

However, you may notice that the code doesn't specify the condition
exactly like this. What's the matter? The thing is, and here is the
second quirk, that this specific implementation does all the casting
beforehand and requires mapping all negative subtraction results (from
-32768 to -1) to the upper half of the 16-bit range (from 32768 to
65535). And programs like eForth actually do check this to determine
whether they are running on the correct Subleq VM version. For the same
reason, we only iterate our program counter from 0 to 32767, as the
program can only be loaded into the lower 32K of virtual memory. Higher
PC values would be internally treated as negative and thus invalid. So,
we change our <=0 condition to check if the result is zero or above
32767. This way, everything works as expected.

And guess what, the .dec file of eForth does run under Busybox AWK too.
Extremely slowly but surely. I recommend the version of subleq.dec
found on the JS version of howerj's project, because the one from the
repo is even slower despite being smaller.

So, even if you can't compile anything for the target system (whatever
it might be) but have an awk command there, you can run eForth programs
just via this Subleq emulator. Ain't it wonderful?

--- Luxferre ---

[1]: https://howerj.github.io/subleq.htm