Functional programming in Rust

Written on 2023-12-18

Existing literature:

Functional programming as a concept dates back to abstract models for computing in 1940 such as lambda calculus. It was implemented for the first time in 1960 in Lisp. In functional programming all computations are treated as the result of applying mathematically sound functions without side effects.

Pure Functions

FP emphasizes pure functions, without side effects, which means that given the same input, a function will always produce the same output.

Consequences:

  1. Predictability: Since expressions can be substituted with their results, you know exactly what they will do based on their inputs. Also called referential transparency, meaning that a function call can be replaced with its resulting value without changing the program’s behavior
  2. Parallelization: functions can easily run in parallel because they don’t rely on shared mutable state.
  3. Cache-ability: The result of an expression can be cached and reused because it will always produce the same output for the same inputs.
  4. Ease of refactoring: You can freely refactor your code by replacing expressions with variables or other expressions without worrying about altering the behavior of your program.

Naming

Because functions are considered building blocks, they have to be properly named and have a clear declaration. A proper function name and declaration are a form of documentation. Using good names is like writing a love letter to the next person who will read your code. On top, functions should be simple to understand and take as few arguments as possible, certainly never more than a handful.

Immutability

A returning concept in functional programming are data structures that never change their state after creation, enabling safer concurrency and predictable code behavior. These are called immutable data-structures. Rust, despite not being a purely functional language, encourages the use of immutable data structures to ensure safety and concurrency without data races. All variables in Rust are immutable by default and have to be declared as mutable explicitly.

For example, the following will not compile

let x = 5;
x = 6; // This will cause a compile-time error because `x` is immutable.

In general, it is necessary, while modeling a system, to distinguish constant and mutable data from the very beginning.

This leads to:

  • Predictability: Immutable variables lead to more predictable code. When you see an immutable variable being used multiple times in a function or across functions, you can be confident that its value hasn’t been altered since initialization, which makes it easier to follow and reason about the program’s flow.
  • Concurrency safe: Rust aims to prevent data races at compile time. If multiple threads have access to immutable data, there’s no risk of one thread unexpectedly modifying the data while another is reading it. This reduces the need for locks and simplifies writing concurrent programs.
  • Compiler optimizations: Knowing that a variable won’t change allows the compiler to optimize code under the hood. It could cache the value or elide certain checks, leading to potentially better performance.

Function Composition

It is possible to build complex functions by composing simpler ones, each performing a discrete step in a sequence of operations. Because each step is pure, the composition is deterministic.

Method chaining

Imagine we have a list of Person structs (structs are a kind of objects), and we want to find the names of all people who are over 18 years old, collect them and sort them.

struct Person {
    name: String,
    age: u8,
}

let people = vec![
    Person { name: "Alice".to_string(), age: 30 },
    Person { name: "Bob".to_string(), age: 20 },
    Person { name: "Charlie".to_string(), age: 17 },
    Person { name: "Dave".to_string(), age: 23 },
];

We will do this by chaining methods. A method is a function associated with objects. To use it syntactically, you write the instance of the class, a dot and then the name of the function.

fn main() {
    let adult_names: Vec<String> = people.iter()
        .filter(|p| p.age > 18)
        .map(|p| p.name.clone())
        .collect()

    adult_names.sort()

    println!("Adults: {:?}", adult_names);
}

Let’s break down the method chaining above.

First, .iter() creates an iterator over the people vector. An iterator is a data structure that is used to iterate over a structure. Rust provides powerful iterator methods which can be chained together to perform complex operations in a clear and elegant manner.

  • .filter(|p| p.age > 18) filters out the Person instances that do not meet the condition (i.e., being older than 18).
  • .map(|p| p.name.clone()) transforms the iterator from Person instances to their associated name strings.
  • .collect() collects the transformed values into a vector Vec<String>.
  • with the addition of sort(), we sort the names

Method chaining allows for such transformations to be expressed succinctly, making the code both elegant and readable.

Recursion Over Loops

FP favors recursion for repetitive tasks instead of iterative loops. As in mathematics, think about differential equations, sometimes recursion can simplify the statement of problems and solutions to certain algorithms. This comes at the expense of potential performance drawbacks like stack overflow.

Example

An example is calculating the (n)th number in the Fibonacci sequence, which appears in various natural phenomena including branching in trees, the arrangement of leaves on a stem, the fruitlets of a pineapple, the flowering of an artichoke, and the unfurling of a fern.

fn main() {
    let n = 10;
    println!("The {}th Fibonacci number is {}", n, fibonacci(n));
}

// A recursive function to calculate the nth Fibonacci number
fn fibonacci(n: u32) -> u32 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

The 10th Fibonacci number is 55

Using recursion to solve such a problem can be clear and concise, especially when translating a naturally recursive mathematical definition into code. However, this naive recursive approach is inefficient for large (n) because it performs many redundant calculations. It has exponential time complexity due to the repeated work done at each level of recursion.

Tail-call optimization

In the case of the above algorithm, it would be more efficient to use an iterative approach. However, it can also be sufficient to use a tail-call. A tail call is a recursive call at the end of a function. A function written with a tail-call can be optimized by the compiler into a loop. This is done by just replacing the stack frame by itself, see What is tail call optimization?. In Rust, this is not done automatically for every function.

First-Class and Higher-Order Functions

Functions are pure and can be composed freely. On top of that, FP allows functions to be

  • assigned to variables,
  • passed as arguments,
  • returned by other functions.

The last two options allow for higher-order functions. These are functions that can take other functions as parameters or return them, leading to powerful abstractions. Many languages have a feature called decorators, which are in essence functions that take other functions as input and return modified versions of the functions. For example, such a decorator could be a timed version of the original function.