[Feb 16 2025]

  fpga-accelerated alu that recognizes common instruction sequences (e.g. "add, mul, sub")
  at runtime and fuses them when possible and sensible.
  there would still be a regular alu, but some combinations would be routed to the fpga.
  we might need something like a fast instruction buffer in front of the pipeline, though;
  the pattern matcher would work on that buffer.
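
  rough software sketch of the buffer + pattern matcher, just to make the idea concrete.
  everything here is made up for illustration: the pattern table, the fused op names,
  the buffer size and the dispatch() function. it only looks at opcodes and ignores
  operands and data dependencies, which a real fusion check would have to consider.

```python
from collections import deque

# hypothetical table: fusable opcode sequence -> name of the fused fpga operation
FUSABLE = {
    ("add", "mul", "sub"): "fused_add_mul_sub",
    ("mul", "add"): "fused_mul_add",
}
MAX_LEN = max(len(pattern) for pattern in FUSABLE)


def dispatch(instructions, buffer_size=4):
    """Route an opcode stream to 'fpga' (fused run) or 'alu' (single op).

    buffer_size is an assumed lookahead window; it should be >= MAX_LEN,
    otherwise the longest patterns can never match.
    """
    stream = iter(instructions)
    buf = deque()
    out = []
    exhausted = False

    while True:
        # refill the small instruction buffer that sits in front of the pipeline
        while not exhausted and len(buf) < buffer_size:
            try:
                buf.append(next(stream))
            except StopIteration:
                exhausted = True
        if not buf:
            break

        # try the longest fusable pattern first, anchored at the buffer head
        for n in range(min(MAX_LEN, len(buf)), 1, -1):
            key = tuple(buf[i] for i in range(n))
            if key in FUSABLE:
                out.append(("fpga", FUSABLE[key]))
                for _ in range(n):
                    buf.popleft()
                break
        else:
            # no pattern matched: the head instruction goes to the regular alu
            out.append(("alu", buf.popleft()))

    return out


if __name__ == "__main__":
    print(dispatch(["add", "mul", "sub", "xor", "mul", "add"]))
```

  longest-match-first is an arbitrary choice here; a learned version would presumably
  replace the fixed table with patterns picked up from what actually runs.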

  could be pretty dumb and counterproductive;
  this just came to mind while thinking about learned optimization at the hardware level