Mapping Programming Language Complexity

Aleksander Demko

February, 2014

Software development is fundamentally an exercise in managing complexity. Programmers juggle and connect algorithms, their states and the data they operate on. These algorithms are further grouped into modules or programs, which then connect users and computers together. This virtual world is unlike any other construction project in that all of it remains malleable at any time.

As computers get more powerful, users expect more functionality, requiring programmers to build larger and more complex programs. Under short deadlines and frequently changing requirements, teams of programmers have to work together, each writing code and integrating it into the application, hoping that the whole mess actually runs and works. Whether this complexity will be understood by future programmers coming to the project is often an afterthought, if it's considered at all.

Programming languages can help manage this complexity. They can do this with:

compile-time type checking, which helps programmers structure their code so that it can be checked for type errors. Type errors are very easy to make, and unless programmers constantly check their types by hand, they'll get crashes at run-time.
compile-time optimizations, which can then use the type information to automatically emit optimized machine code, giving programmers faster code with no additional work.
a supplemental run-time system, which can perform further optimizations, code checking or memory management (garbage collection).
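As a small illustration of the first point, here is a minimal C++17 sketch of a type error being caught before the program runs. The Meters and Seconds types and the to_feet function are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical unit types, invented for this illustration.
struct Meters  { double value; };
struct Seconds { double value; };

// Accepts only Meters; any other argument type is a compile error.
constexpr double to_feet(Meters m) { return m.value * 3.28084; }

// Both checks are evaluated by the compiler. Calling to_feet with a
// Seconds value is simply not a valid program, so that mistake can
// never turn into a run-time crash.
static_assert(std::is_invocable_v<decltype(to_feet), Meters>);
static_assert(!std::is_invocable_v<decltype(to_feet), Seconds>);
```

Writing to_feet(Seconds{2.0}) anywhere in the program would stop the build with a type error, which is the kind of checking the programmer would otherwise have to do by hand.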

With these factors in mind, I'd like to classify programming languages by where they generally push the complexity. One way to do this is to map them onto a triangle, with the three corners being:

Heavily typed languages with lots of opportunities for type enforcement and optimization at compile-time push the complexity to the compiler
Languages that leave it up to the runtime to check for errors push complexity to the run-time
Languages that do neither leave it up to the developer
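The same three choices can be seen inside a single language. The following is a rough C++ sketch, using an out-of-range index as the error; the three function names are my own and exist only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Compiler corner: the index is a template argument, so an out-of-range
// access is rejected before the program ever runs.
template <std::size_t I, typename T, std::size_t N>
constexpr T checked_by_compiler(const std::array<T, N>& a) {
    static_assert(I < N, "index out of range");
    return a[I];
}

// Run-time corner: vector::at checks the index on every call and throws
// std::out_of_range when it is invalid.
inline bool checked_by_runtime(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);
        return true;   // index was valid
    } catch (const std::out_of_range&) {
        return false;  // the error was detected while running
    }
}

// Developer corner: operator[] performs no check at all. Keeping i in
// range is entirely the caller's responsibility.
inline int checked_by_developer(const std::vector<int>& v, std::size_t i) {
    return v[i];  // undefined behavior if i >= v.size()
}
```

With a three-element array, checked_by_compiler<5>(a) fails to build, checked_by_runtime(v, 5) reports the problem while running, and checked_by_developer(v, 5) silently does something undefined.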

Laid out in a text diagram, this triangle would look like:

            compiler
               /\
              /  \
             /    \
            /      \
           /        \
  run-time ---------- developer

Superimposing some mainstream languages over this gets you:

              C++
               /\
              /  \
             /    \
            /      \
           /        \
   Java/C# ---------- C
               |
           JavaScript

These are the major mainstream languages that best represent each style of complexity management:

The C language, which was invented as a "portable assembly" language, has decidedly few modern abstractions and no run-time checking. C programmers are forced to mock up their own abstractions, as the language lacks even basic string and collection types. Except on the smallest embedded platforms or in specialized programs (like OS kernels or drivers), C has little to offer for building larger software systems.
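To make the point about strings concrete, here is a minimal sketch of joining two strings. The first function is written in the C style (compiled here as C++, hence the cast on malloc), and the second uses a library string type. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// C style: the caller measures, allocates and copies by hand, and must
// remember to free() the result later.
inline char* concat_c_style(const char* a, const char* b) {
    const std::size_t la = std::strlen(a);
    const std::size_t lb = std::strlen(b);
    char* out = static_cast<char*>(std::malloc(la + lb + 1));
    if (out == nullptr) return nullptr;  // allocation failure is also the caller's problem
    std::memcpy(out, a, la);
    std::memcpy(out + la, b, lb + 1);    // +1 copies the terminating '\0'
    return out;
}

// With a string type, sizing, allocation and cleanup are handled by the
// library rather than by the developer.
inline std::string concat_with_string(const std::string& a, const std::string& b) {
    return a + b;
}
```

Every line of bookkeeping in the first function is complexity that lands on the developer; forgetting the +1 or the free() compiles without complaint.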

C++, on the other hand, offers many standard paradigms (such as object-oriented, generic and functional). Although perhaps daunting to the beginning programmer, it offers all the features needed to build large type-safe programs that, through heavy compiler optimizations and inlining, can compile down to high-performance native code. This is where you ideally want to push complexity: to the compiler. It's essentially "free", bothering neither the user with run-time costs nor the developer with menial work.
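A small sketch of what pushing work to the compiler can look like: a generic function that the compiler type-checks for each element type and, in this case, evaluates entirely during compilation. The names sum and kPrimes are invented for the example, and it assumes C++14 or later.

```cpp
#include <cassert>
#include <cstddef>

// Generic: one definition, instantiated and type-checked by the compiler
// for each element type T and array length N it is used with.
template <typename T, std::size_t N>
constexpr T sum(const T (&values)[N]) {
    T total{};
    for (std::size_t i = 0; i < N; ++i) total += values[i];
    return total;
}

constexpr int kPrimes[] = {2, 3, 5, 7, 11};

// static_assert forces the call to be computed at compile time, so this
// particular result costs nothing when the program runs.
static_assert(sum(kPrimes) == 28, "evaluated during compilation");
```

The same function still works on values known only at run-time, where an optimizing compiler is free to inline it, though how much it actually optimizes depends on the compiler and its settings.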

Java and C# provide decidedly fewer abstractions than C++, simplifying the language at the cost of expressive power. Applications written in them also run under run-time systems that perform constant safety checking and memory management. This run-time and simplified language greatly aid programmers in producing software that is easier to test and debug; however, this comes at the expense of run-time performance.

Between Java and C I've placed JavaScript, which stands in for scripting languages in general, including Python and Ruby. These languages have full run-times, but lack a compiler pass to help with basic type checking, resulting in more cognitive load on the developer.

What are the trends?

Large line-of-business database applications - those that make up the bulk of software development - tend to be in the Java/C# family. Performance is rarely a concern in these situations, as modern servers can easily service these workloads.

For even smaller applications, performance is even less of a concern. Scripting, without the requirements of type safety, allows developers to "ramp up" quickly. However, as the application reaches a modest size, refactoring and general code maintenance become very difficult without type safety. This results in complexity being pushed towards the developer.

For applications that are both large and performance-critical, pushing complexity to the compiler is required. In fact, this is the ideal location for complexity in all types of applications, as the compiler saves developers time and users run-time costs. With the recent surge in native language development and growing concern over battery life and data center costs, more applications might start using a compiler-centric programming language.