Misconceptions about Swift types

Suyash Srijan
7 min read · Apr 8, 2019

Most people are familiar with value types and reference types in Swift. For example, enums and structs are value types, whereas classes are reference types.

Most people are also familiar with the differences between the two. However, it’s not as simple as it looks. In this blog post, we will explore the topic from the implementation point of view, identify common misconceptions and present a different way of thinking about it.

Warning: A lot of it is implementation detail, which can always change, so keep that in mind. Also, there’s no guarantee that the compiler will always emit the code in the same way.

What are value and reference types?

Let’s start by first defining what a value is, by looking at an example:

var a = 42
var b = a

Now, many people will say that a and b have the same value. However, on an implementation level, a and b can be accessed and modified independently, so they cannot share one value. So, b must hold a copy of the value in a; in other words, b must be a new value instance.

Now, we can define what value types and reference types are:

A value type is a type which, when copied, results in a new value instance containing its byte representation. A reference type is a type which, when copied, results in a new value instance containing a reference to the instance of that type (note: things get a bit complicated when you mix them).

For example — structs, enums and tuples are value types, whereas classes, functions and closures are reference types.

Passing value types and reference types

Swift uses pass-by-value by default, which means regardless of whether you pass a value type or a reference type, you’re always passing it by value and not by reference — i.e. you’re passing a copy (or a new value instance, as explained above). For example:

let someKlass = Klass()
acceptClass(someKlass) // passing a copy of the reference to the instance of 'Klass'
let someSmallStruct = SmallStruct() // a small struct - 16 bytes
acceptStruct(someSmallStruct) // passing a copy of the instance of 'SmallStruct'
let someLargeStruct = LargeStruct() // a large struct - 32 bytes
acceptStruct(someLargeStruct) // passing a copy of the reference to the instance of 'LargeStruct'

The compiler tries extremely hard to avoid making copies unless there is mutation (or the possibility of it). Even then, it will often pass a new value containing a reference instead, for example when passing a struct that is too large to fit in registers.
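To make the snippet above concrete, here is a minimal sketch. The article does not show how Klass, SmallStruct and LargeStruct are defined, so the definitions and the helper functions bumped and bump below are assumptions made purely for illustration. The sizes in the comments assume a 64-bit platform, where Int is 8 bytes.

```swift
// Hypothetical definitions; the article does not show them.
final class Klass {
    var value = 0
}

struct SmallStruct {    // two Int fields: 16 bytes on a 64-bit platform
    var a = 0
    var b = 0
}

struct LargeStruct {    // four Int fields: 32 bytes on a 64-bit platform
    var a = 0, b = 0, c = 0, d = 0
}

// A parameter is an immutable copy, so changes go to a local copy instead.
func bumped(_ s: SmallStruct) -> SmallStruct {
    var local = s
    local.a += 1
    return local
}

// The reference is copied, but both copies refer to the same instance.
func bump(_ k: Klass) {
    k.value += 1
}

let small = SmallStruct()
let changed = bumped(small)
let klass = Klass()
bump(klass)

print(MemoryLayout<SmallStruct>.size, MemoryLayout<LargeStruct>.size)
print(small.a, changed.a, klass.value) // 0 1 1
```

The caller's struct is untouched, while the class instance is changed through the copied reference. Whether the compiler physically copies anything is a separate question; the observable behaviour is the same either way.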

A special case is function arguments that are marked as inout, which use copy-in copy-out. When you pass a variable as an inout argument, what you’re really passing is, again, a copy (allocated temporarily on the stack), which you’re free to use inside the function. When the function returns, the value is assigned back to the variable. This is why the variable has to be mutable: it cannot be a let!
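One way to observe copy-in copy-out from the outside is a property observer. In this sketch (Holder and addThree are hypothetical names), the function mutates its inout parameter three times, but the observer fires only once, when the final value is written back to the property.

```swift
// Hypothetical type used only to make the write-back visible.
struct Holder {
    var didSetCount = 0
    var value = 0 {
        didSet { didSetCount += 1 } // runs each time 'value' is assigned
    }
}

// Mutates the temporary copy three times before returning.
func addThree(_ number: inout Int) {
    number += 1
    number += 1
    number += 1
}

var holder = Holder()
addThree(&holder.value)

print(holder.value)       // 3
print(holder.didSetCount) // 1: a single write-back on return
```

The compiler is free to optimise the copy away when the argument is a plain stored variable, so this demonstrates the semantics rather than the generated code.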

Storing value types and reference types

If you ask someone “where are value and reference types stored?”, the common answer is going to be: “value types are stored on the stack, whereas reference types are stored on the heap”. However, this isn’t completely true. Whether a type is a value type or a reference type has absolutely nothing to do with where it is stored.

For example, let’s create a small struct (a value type) called Point, which stores two values — x and y.
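A minimal sketch of such a struct, assuming two Int fields and the synthesised memberwise initialiser (the field type is an assumption; only the names Point, x and y come from the description above):

```swift
// A sketch of the struct described above; the Int field type is assumed.
struct Point {
    var x: Int
    var y: Int
}

let point = Point(x: 0, y: 0)

// Two Int fields, so the struct is two machine words in size.
print(MemoryLayout<Point>.size)
```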

Any idea where point will be stored? The stack, right? Well, no. A small struct like this is actually stored in registers. A struct smaller than 3 pointers (this would be 24 bytes on a 64-bit system) is stored and passed around in registers, whereas a large struct is stored and passed around on the stack.

Let’s look at the assembly for our Point struct generated by the compiler:

When a struct is created, the struct itself is not being stored anywhere — a struct is typically treated as a collection of independent values. Let’s take a deeper look:

Line 13 defines the initialiser for Point. Looking at lines 20–21, the two values passed into the initialiser (stored in rdi and rsi, which Swift’s calling convention uses for the first and second arguments of a function) are moved over to the rax and rdx registers. Then, the initialiser returns. That’s it.

We store 0 in rdi and rsi on lines 38–39 and then call the initialiser, which moves the values into the two general purpose registers described above (rax and rdx). Then, we move the values onto the stack from rax and rdx for further use.

So, as you can see, the struct is broken down into two individual values (one that represents x and one that represents y) and is stored completely in registers.

Even a class (a reference type) can be stored completely on the stack, if the compiler can prove that the allocation does not escape. For example, below is the same Point, represented with a class:
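A minimal sketch of that class, assuming Int fields as before. The function sumOfCoordinates is hypothetical and the coordinate values are arbitrary. It creates the instance, uses it and drops it without storing or returning it, which is the kind of code where the optimiser may be able to prove the allocation does not escape. Whether it actually promotes the allocation depends on the compiler version and optimisation settings.

```swift
// A sketch of the class version; the Int field type is assumed.
final class Point {
    var x: Int
    var y: Int

    init(x: Int, y: Int) {
        self.x = x
        self.y = y
    }
}

// The instance never leaves this function: it is not stored anywhere,
// not captured by an escaping closure, and not returned.
func sumOfCoordinates() -> Int {
    let point = Point(x: 3, y: 4)
    return point.x + point.y
}

print(sumOfCoordinates()) // 7
```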

The assembly here is a little more tricky to dissect when the Swift runtime is involved, because there’s a lot of indirection, so we’re going to take one step back and look at the resulting SIL (Swift Intermediate Language — an intermediate language that sits between Swift and LLVM IR):

Looking at line 15, the compiler has promoted the allocation of Point to the stack. There is still reference counting involved; however, the allocation happens on the stack rather than the heap.

The compiler can promote heap allocations to stack allocations if it can prove (through a technique known as escape analysis) that the allocation of an object will not escape the scope in which it was allocated. For example, a non-escaping closure will be allocated on the stack, rather than the heap.
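The difference between a closure that cannot escape and one that can is visible in the type system. In this sketch, applyTwice and TransformStore are hypothetical names used only for illustration.

```swift
// 'transform' is non-escaping by default: it cannot outlive this call,
// so the compiler knows its captured context is only needed during the call.
func applyTwice(_ transform: (Int) -> Int, to value: Int) -> Int {
    return transform(transform(value))
}

// '@escaping' is required here because the closure is stored and may be
// called long after 'remember' has returned.
struct TransformStore {
    private(set) var transforms: [(Int) -> Int] = []

    mutating func remember(_ transform: @escaping (Int) -> Int) {
        transforms.append(transform)
    }
}

let offset = 10
let result = applyTwice({ $0 + offset }, to: 1)

var store = TransformStore()
store.remember { $0 * offset }

print(result)                // 21
print(store.transforms[0](2)) // 20
```

Removing @escaping from remember makes the code fail to compile, because a non-escaping parameter cannot be stored.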

As a bonus, types such as Data and String are also stored on the stack rather than the heap if they are small enough: smaller than 15 bytes in the case of Data, and up to 15 UTF-8 code units (on 64-bit) or 10 UTF-8 code units (on 32-bit) in the case of String. However, that is a design decision rather than an optimisation performed by the compiler.

You typically do not have to think about where the object is going to be allocated. However, there is no hard and fast rule in the compiler that says “all value types will always be stored on the stack and all reference types will always be stored on the heap”.

Value and reference semantics

A bit confused so far? It’s okay to be. This is where everything will start to make much more sense. Where a value or reference type is stored in memory and how it is passed around is really an implementation detail that is completely up to the compiler. Thinking of types in those terms might end up causing confusion in certain situations, such as when designing an interface.

So instead, what we should really be thinking about is the semantics of a type, i.e. what it means or how it behaves.

It’s best to explain with an example:

class Klass {
    let value: Int

    init(value: Int) { self.value = value }
}

let instanceOfKlass = Klass(value: 0)
instanceOfKlass = Klass(value: 1) // error: 'instanceOfKlass' is a 'let' constant
instanceOfKlass.value = 1 // error: 'value' is a 'let' constant
var copy = instanceOfKlass
copy.value = 2 // error: 'value' is a 'let' constant

Here, Klass is a reference type. However, it has value semantics: it’s immutable, so you cannot change it, and hence it behaves like a value type.

A simple definition of value semantics is that the only way to modify the value is through a single variable (or that it’s not possible to modify the value at all). Similarly, if it is possible to modify the value indirectly (through a reference), i.e. through multiple variables, then the type has reference semantics.
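The definition can be checked directly by copying a variable and mutating the copy. Temperature and Thermostat are hypothetical types used only for this sketch.

```swift
// Hypothetical types, named for illustration only.
struct Temperature {        // value semantics
    var celsius: Double
}

final class Thermostat {    // reference semantics
    var celsius: Double
    init(celsius: Double) { self.celsius = celsius }
}

let t1 = Temperature(celsius: 20)
var t2 = t1         // independent value
t2.celsius = 25     // only t2 changes

let s1 = Thermostat(celsius: 20)
let s2 = s1         // second reference to the same instance
s2.celsius = 25     // visible through s1 as well

print(t1.celsius, t2.celsius) // 20.0 25.0
print(s1.celsius, s2.celsius) // 25.0 25.0
```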

Here’s another example:

class AnotherKlass {
    var value: Int

    init(value: Int) { self.value = value }
}

let instanceOfAnotherKlass = AnotherKlass(value: 0)
instanceOfAnotherKlass = AnotherKlass(value: 1) // error: 'instanceOfAnotherKlass' is a 'let' constant
instanceOfAnotherKlass.value = 1 // okay
let copy = instanceOfAnotherKlass
copy.value = 2 // okay

Here, AnotherKlass has reference semantics. While the variable itself is immutable, the type has lost value semantics because value is now mutable, i.e. it is now possible to modify value through two different references.

Value semantics can also be achieved using access control, i.e. making a variable fileprivate so that nothing outside the file can modify it, or by implementing copy-on-write on your value type.
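As a rough sketch of the copy-on-write approach, a struct can wrap a class instance and copy it only when a mutation is about to happen while the storage is shared. Storage and NumberList are hypothetical names. The standard library function isKnownUniquelyReferenced does the sharing check. Array already behaves this way on its own, so the wrapper here exists only to show the pattern.

```swift
// Hypothetical names throughout; a sketch, not a production implementation.
final class Storage {
    var values: [Int]
    init(values: [Int]) { self.values = values }
}

struct NumberList {
    private var storage: Storage

    init(_ values: [Int]) {
        storage = Storage(values: values)
    }

    var values: [Int] { storage.values }

    mutating func append(_ value: Int) {
        // Copy the storage only if another NumberList also references it.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(values: storage.values)
        }
        storage.values.append(value)
    }
}

let first = NumberList([1, 2])
var second = first  // shares storage until one of them is mutated
second.append(3)    // triggers the copy, so 'first' is unaffected

print(first.values)  // [1, 2]
print(second.values) // [1, 2, 3]
```

Even though the struct holds a reference internally, it cannot be modified through more than one variable, so it keeps value semantics.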

It’s better to have semantics instead of types as the mental model because it makes thinking and reasoning about things easier — you’re no longer thinking about where something is stored in memory, rather you’re thinking about its behaviour.

I hope you enjoyed reading this blog post and the little deep dive into the implementation of value and reference types in Swift. There are a couple of other implementation details that I skipped to keep this post short. I also wanted to cover reference counting and more, but I’ll leave that for a later post on the Swift runtime, where we will explore these types (among other things) much more deeply.

Suyash Srijan

iOS Engineer at @theappbusiness. Swift compiler collaborator.