Building a New Database Management System in Academia
When we started the Peloton project we decided to fork Postgres and then cut out the parts that we wanted to rewrite. Postgres' code is beautiful. It's well-documented. It's portable. It's a textbook implementation of a relational DBMS. But it is a bit dated and the overall architecture has some issues. The first problem that we encountered was that we had to convert it from ANSI-95 C to modern C++11 to make it work with our new storage manager. My PhD student Joy Arulraj did this with some summer interns in about a month (see his C++ Postgres fork on GitHub). We then spent another month converting its runtime architecture from a multi-process, shared-memory model to a single-process, multi-threaded model. We deemed that this was necessary to support better single-node scale-up now and eventually go distributed in the future. One surprising thing that we found was that Postgres' WIN32 code was easier to convert to pthreads than the Linux-specific portions of the code.
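To make the multi-process versus multi-threaded distinction concrete, here is a minimal C++11 sketch of the threaded shape: one worker thread per client, with the state that would otherwise sit in a shared-memory segment kept as an ordinary object in the process heap. This is not code from Postgres or Peloton. The names SharedState, BackendMain, and RunBackends are hypothetical, and the single mutex stands in for whatever latching a real system would use.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for state that every backend must see (think of a
// buffer pool or lock table). In a multi-process design this would live in a
// shared-memory segment; here it is a plain object owned by one process.
struct SharedState {
  std::mutex latch;
  long committed_txns = 0;
};

// Hypothetical per-connection worker: one thread per client rather than one
// forked process per client. Each "transaction" just bumps a counter.
void BackendMain(SharedState &state, int txns_to_run) {
  for (int i = 0; i < txns_to_run; ++i) {
    std::lock_guard<std::mutex> guard(state.latch);
    ++state.committed_txns;
  }
}

// Spawns num_backends worker threads, waits for all of them to finish, and
// returns the total number of transactions they recorded.
long RunBackends(int num_backends, int txns_per_backend) {
  assert(num_backends >= 0 && txns_per_backend >= 0);
  SharedState state;
  std::vector<std::thread> backends;
  for (int i = 0; i < num_backends; ++i) {
    backends.emplace_back(BackendMain, std::ref(state), txns_per_backend);
  }
  for (auto &backend : backends) {
    backend.join();
  }
  return state.committed_txns;
}
```

Because every worker shares one address space, handing it the state is just passing a reference. The cost is that globals and static variables that were safely per-process before the conversion now have to be audited for thread safety.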