Thursday, March 28, 2013

Is your language strongly, weakly, statically or dynamically typed?

Would you describe a programming language as strongly typed, weakly typed, statically typed or dynamically typed? People often use the terms "strongly typed" and "statically typed" interchangeably, and likewise "weakly typed" and "dynamically typed" are often used to mean the same thing. I think this imprecision of terminology is unfortunate. The definition of strong vs. weak typing is vague, but I would claim that the strong/weak axis can be seen as completely orthogonal to the static/dynamic one.

For example, here is how I would classify 4 languages I have worked with:

                     Strongly typed    Weakly typed
Statically typed     Java              C
Dynamically typed    Groovy            JavaScript

Whether a language's type system is static or dynamic is normally pretty clear: a statically typed language does type checking at compile time, whereas a dynamically typed language does type checking at run time. Whether a language is strongly typed, however, is more vague. My preferred definition is that a strongly typed language enforces type safety, meaning that data can only be used in ways that are correct for its type.

Java is obviously a statically typed language: data types are stated explicitly in the code, and your code won't compile if the declared types conflict. You can fool the compiler by casting to Object and such, but the JVM won't be fooled. The moment you try something naughty with your object, such as casting it to a type it is not, you will get a ClassCastException.
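To make the Java case concrete, here is a minimal sketch of fooling the compiler but not the JVM. The class and method names (CastDemo, describeBadCast) are made up for illustration; only the cast itself matters.

```java
// Hypothetical demo class; the names are illustrative, not from any library.
class CastDemo {
    // Upcast a String to Object, then try to use it as an Integer.
    static String describeBadCast() {
        Object o = "A string";           // compiles: every String is an Object
        try {
            Integer n = (Integer) o;     // also compiles, but the JVM checks the real type here
            return "cast succeeded: " + n;
        } catch (ClassCastException e) {
            return "ClassCastException"; // the runtime refuses to treat a String as an Integer
        }
    }

    public static void main(String[] args) {
        System.out.println(describeBadCast());
    }
}
```

The compiler accepts the downcast because an Object could, in principle, be an Integer. The check that actually protects you happens at run time, which is the sense in which I call Java strongly typed.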

Groovy is a dynamically typed language. The compiler will let you write code like the following (if you don't enable static type checking), dividing a number by a string:

def a = 4.6
def b = "A string"
print a / b

But while it compiles, the JVM knows what data types you're trying to divide, and slaps you around with an exception at run time. In other words, both Java and Groovy are strongly typed: you can fool the compiler, but you cannot fool the runtime.

Not everyone might agree, but I'd argue that C (and C++) is a weakly typed language. While types are declared explicitly to the compiler, the language is not at all type safe. You can write code that treats data as the wrong type:

printf("%d\n", (4.6 / *((int*) "A string")));

This divides a floating-point number by the bytes of a string (masquerading as an integer), and then outputs the result as an integer. That the runtime would let you do such meaningless operations without complaint leads me to classify this language as weakly typed. Needless to say, the output is nonsensical, assuming your program did not crash in the process.

Finally, JavaScript's type conversion rules are so permissive as to make many operations unpredictable and unreliable. It will happily let you do similarly nonsensical operations like:

var x = 4.6 / "A string"

Depending on whether the string can be parsed into a number, you could end up with a NaN or a number in x. The runtime will silently comply with your instructions, leading to unpredictable and sometimes perplexing results.

If I had my way, we would not be describing dynamically typed languages like Groovy or Ruby as "weakly typed" (they are not!). The strong vs. weak terminology is not precise and should not be used at all. I don't think there are many language advocates who would say that they love poor type safety, which is what "weakly typed" implies. Let's just stick to arguing over static typing vs. dynamic typing, an argument in which there are legitimate points on either side.
