Types

Stronger Emphasis on Type Systems

As seen in fp, functions are building blocks. As in mathematics, their input and output must have a pre-determined set or type such as the integers. Types are the computer science equivalent of sets and determine what we can expect from a function. Predictability is important, so in functional languages the types of functions are quite restrictive.

Functions can only accept a restricted set of inputs. This is also called a strong static type system, since the restriction may happen statically, before the program is run. This helps to catch errors at compile time, before it used by actual users.

What are types

Examples:

  • Primitive types: numbers, arrays, integers, strings
  • User types: objects, classes

Type system

  • tool to specify behavior of computer program at compile-time
  • can be used as constraints for software

Role of types

In both mathematics and computer science, types:

  • Abstract away, simplify and model problems.
  • Add constraints to problem models.

Think about types as specifying lower bounds and upper bounds to problems.

Different levels of typing

There is distinction between the following types of languages:

  1. Weakly-typed languages: make conversions between unrelated types implicitly;
  2. Strongly-typed languages: don’t allow implicit conversions between unrelated types.

Primitive data-types

In weakly typed languages such as Python, R, JS, the following is possible:

a = 123456789
a = "abcdef"

It is also possible to use non-boolean data as a boolean.

For example

let n = 0;
if (!n) {
  // do something
}

In strongly typed languages such as Haskell or Rust there are more constraints. It is not possible to use a number and boolean interchangeably.

For example, the following is disallowed:

let mut a : i8 = 123
a = "abc" // compile error

Simple Generic Data types

Some types collect related data in a structure. These container types are generic in the sense that they contain generic data.

Examples of generic data types are:

  • Lists
  • Classes
  • DataFrames

To use a generic type, you have to specify the content type of the contained type. For example in Rust, a vector or a list of number is denoted by Vec<u8>. In this case the type variable T in Vec<T> is set to u8

Function types

A special kind of generic type is a function type. In Rust, for example, a function that maps an integer on an integer is denoted by: fn(a: u8) -> u8. Function types such as this simple one are generic because it is possible to swap out the input or the output type.

For example, the identity function is defined in Rust as

fn id<T>(t: T) -> T {
    t
}

The type variable is T.

Reference types

The type of a reference to a variable of type u8 is written as &u8. The ampersand literally means a reference to a u8.

This simple type of references is restricted by the borrowing rules. If you want to escape the borrowing rules slightly and have more than a single owner and multiple references, you need a special type called an Arc. This is a reference counted pointer. When called like

let x = Arc::new(0)
let y = x.clone()

This will copy a reference x to 0 and create a copy of this reference. When it creates the copy reference, it increases an integer variable that keeps track of the number of references. When the number of references becomes zero, the integer being 0 is dropped.

Types = constraints

In general, a stronger type system means that more constraints can be imposed on your software. And what’s even better, is that these constraints can be verified at compile-time before the software is run.

A lack of constraints leads to unpredictable software. Unpredictable software leads to:

  • Angry users
  • Tired developers
  • Bad business

Languages such as C++ impose fewer constraints on the program, which makes them at first sight easier, because less rules have to be learned. But in the long term, you will encounter more bugs such as “data races” and others if you impose less contraints.

Algebraic Data Types (ADTs)

A special feature of Rust is algebraic data types which could also be found in Haskell.

Products

You can also bundle several values together into records. A record collects several items and gives them a name. The offical mathematical computer science name for a record is a product. A product is a type that can normally be created in any programming language, whether functional or not. They are known as products because the number of possible values they can have is the product of the number of possible values of the component parts, see What is an Algebraic Data Type (ADT)?.

Biology Example in Rust:

struct Cell {
    state: CellState,
    energy_level: u32,
}

In R, data-frames are lists of products. In JavaScript, JSON objects are also a form of products. C++ has structs as well. JavaScript and C++ have classes which, in some sense, are a kind of product since they bundle different items in one single entity.

Sums

We can consider a data type that can only take value in several disjoint instances For example, true or false but not both. This is called a discriminated union, disjoint union. An element of such a type is either one of the possible options.

Discriminated unions are known as sums because the number of possible values that can have is: the sum of the number of possible values of the parts that compose it.

For example, a cell would be in one of the following states:

enum CellState {
    Alive,
    Dead,
    Dormant,
}

Pattern matching

In Rust, (almost) all built-in data types can be matched against with pattern matching. In Python, you use the match keyword for this. Rust also uses the same keyword, but a difference with Rust is that it should be exhaustive. This means that all the possible arms of the match expression should be covered in Rust.

By combining sum types (enums) and product types (structs), we can model cell states and behaviors comprehensively. In Rust, we can implement a method for checking the cell state. This means functional programming is compatible with object-oriented programming.

impl Cell {
    fn check_cell_state(self) -> CellState {
        match cell.state {
            CellState::Alive if cell.energy_level > 10 => CellState::Alive,
            _ => CellState::Dead,
        }
    }
}