Why general-purpose programming languages suck for hardware design

I already talked about what's wrong with Register Transfer Level and why you should use a higher-level domain-specific language, because High-Level Synthesis cannot work with arbitrary programs. Yet I am still surprised when I read about or hear people who still firmly believe that using a general-purpose programming language (such as C, C++, C#, Java, Javascript, PHP, Python, Ruby... <insert here your favorite language>) for hardware design is a good idea. Let me try to prove you that this is not the case :-)

Proof

We'll make the assumption that there exists a system than can transform any simple program written in a general-purpose programming language to an efficient hardware implementation. This means that if I write a simple program, for example one that creates a list of random numbers and outputs them in reverse order, the system should produce efficient hardware. Below is the program I wrote in C++; output is emulated with printing on the screen. I used C++ because a lot of people like (or are used to) this language, especially in the embedded systems domain, so this is a good starting point I believe. Here is the code:

#include <list>
#include <iostream>
#include <cstdio>

using std::list;

class RandList {
private:
  list<int> data;
public:
  void update(int size) {
    for (int i = 0; i < size; i++) {
      data.push_back(rand() % size);
    }
  }

  void reverseWalk() {
    for(list<int>::reverse_iterator it = data.rbegin(); it != data.rend();     ++it) {
      std::cout << *it << std::endl;
    }
  }
};

int main(int argc, char* argv[]) {
  if (argc == 0) {
    fprintf(stderr, "expected one arg\n");
    return -1;
  }

  RandList *list = new RandList();
  list->update(atoi(argv[1]));
  list->reverseWalk();
  return 0;
}

This is a pretty standard implementation, I wrote a class that uses a linked list implemented by the "list" class from the STL (Standard Template Library). So what piece of hardware do you expect to be synthesized from this code? Before you answer, note that including a class from the STL means that behind the scenes your program actually includes a huge bunch of code. Also, creating the list and calling "push_back" allocates memory dynamically using std::allocator and "new". The list size is given at runtime, so the only thing you know is that it could be as large as 2^31 - 1 = 2147483647. And because "list" is a linked list, each "int" is actually stored in a node, which on a 32-bit system means at least 12 bytes : two 4 bytes pointers plus the 4 bytes "int"; and you must count headers added by the dynamic memory allocation system.

To create hardware that implements the same behavior as this program, you would need: a dynamic memory allocator, a virtual memory system (or dozens of gigabytes of RAM, just in case), and a very complex Finite State Machine to implement the behavior of all the methods that are called directly and indirectly (so this includes methods from "list", "reverse_iterator", "allocator", etc.). Our hypothetical system cannot create an efficient hardware implementation from this simple C++ program. Because it was its only condition of existence, such a system cannot exist for any simple program written in a general-purpose programming language, though it may exist for some class of programs.

Conclusion

I implemented an algorithm as a simple program with standard features used in many programs. The truth is that not all features are good, and not all semantics make sense in a general-purpose programming language if you want to target hardware. Maybe I should have used an array instead of a list, probably with a maximum size, and without dynamic memory allocation. If I wanted higher performance, I could have chosen to implement my own "list" class to use several memories in parallel, or a single memory with large numbers.

The conclusion is that if you want to use a programming language for efficient hardware design, you will have to strip all the features that are ill-fitted and remove/change its semantics accordingly. In the end you will no longer have something general, instead you end up with a specific programming language better known as a Domain-Specific Language.