Sequences and Generators
If you’ve written code in Swift, you may have seen the for in
(or foreach) loop at some point. For those who aren’t acquainted with it, an simple example follows:
// prints "first", "second", and "third", each on its own line
for item in ["first", "second", "third"] {
println(item)
}
This loop runs once for each object in the array. Each time it runs, it binds the current object to the constant item
, allowing the code within the loop to perform some operation based on that object. Since in this case the array contains strings, item
is automatically inferred to be of type String
.
It turns out you can also write a foreach loop to iterate over the items in a dictionary…
for (number, name) in [1: "one", 2: "two", 3: "three"] {
println("The number \(number) is named \(name)")
}
…or a range.
var buffer = "0"
for index in 1..<10 {
buffer += " \(index)"
}
// buffer is now "0 1 2 3 4 5 6 7 8 9"
But Array
, Dictionary
, and Range
are Swift built-in types. What if we wanted to build our own collection and iterate over it using a foreach loop? Let’s try doing so and see what happens.
Building a Linked List
The linked list is a fundamental data structure, one of the first data structures taught to computer science students. It consists of a chain of nodes which each contain a value, as well as a reference to the next node in the chain.
One elegant way to reason about linked lists is to first imagine an empty list, a special list containing zero elements. We can then construct a list using the following technique:
- We start with the empty list:
[]
- We prepend the first item
A
to the empty list to form a list with one element:[A]--> []
- Then we prepend the second item
B
to that list to form a list with two elements:[B]--> [A]--> []
- …and so forth
A Swift linked-list implementation using classes and inheritance follows:
class List<T> { }
final class ConsList<T> : List<T> {
let value : T
let next : List<T>
init(_ value: T, next: List<T>) {
self.value = value
self.next = next
}
}
final class NilList<T> : List<T> { }
Linked lists are defined recursively. The NilList
class serves as our empty list and base case. All other lists are defined as a ConsList
containing a value and followed by a slightly smaller linked list.
Sequences and Generators
Now that we have a linked list type, we want to be able to iterate through the linked list nodes in order using a foreach loop.
The secret to doing so lies in the SequenceType
protocol. In Swift, protocols can confer special powers to types that choose to conform to them, and SequenceType
is no exception. Indeed, any type that adopts SequenceType
can be used with foreach loops.
The SequenceType
protocol defines a single function: generate()
, which takes no arguments and returns a Generator
. Let’s go ahead and begin extending our List
class:
// Initial attempt
extension List : SequenceType {
func generate() -> Generator {
// ???
}
}
In order to proceed, we need to know what generate()
is supposed to do, and what a Generator
is.
Generators
A generator is a helper object that is related to the sequence we want to iterate through. Generator
is not the name of a predefined Swift class, but rather an associated type belonging to the SequenceType
protocol.
You can think of associated types as a ‘wildcard’ type that you have to fill in when you write the code that makes your class or struct conform to the protocol. In this case, SequenceType
wants us to define our own generator type and then have our List
’s generate()
method return one of those generators.
Let’s sketch out a generator for our linked list type. If we look at the definition of Generator
, we notice that our new generator has to conform to the GeneratorType
protocol in order to qualify as a generator. GeneratorType
in turn defines a single function, next()
:
// Initial attempt
struct ListGenerator: GeneratorType {
mutating func next() -> Element? {
// ???
}
}
Note that Element
is another associated type, this one belonging to the GeneratorType
protocol. At this point, we have a bunch of interlocking components:
- We have our
List
, which needs to conform toSequenceType
… - …and to do so, we’ve defined a
ListGenerator
. - In turn, our
ListGenerator
has to conform toGeneratorType
… - …which means we need to figure out what
Element
is.
Purpose
To figure out how these components all fit together, we’ll step back and discuss the roles that our generator, generate()
, and next()
functions are intended to fulfill.
As mentioned before, a generator is an object associated with a sequence. At the beginning of the foreach loop, a generator for the sequence being iterated through is created using the generate()
method.
The job of the generator is to repeatedly vend out the ‘next’ item in the sequence whenever its next()
method is called. When the sequence is completely consumed, next()
returns nil
. In order to perform this job, the generator might need to keep track of state, such as its current position within the sequence.
Building ListGenerator
Let’s return to our ListGenerator
. We want next()
to return whatever type of element its parent List
contains. Since List
is generic, we should also make ListGenerator
generic, and replace Element
with T
:
struct ListGenerator<T> : GeneratorType {
mutating func next() -> T? {
// ???
}
}
Our ListGenerator
is going to walk through the list, advancing one node every time its next()
method is called. In order to keep track of this state, let’s add a property tracking the generator’s current position, currentNode
.
struct ListGenerator<T> : GeneratorType {
// Note: our ListGenerator is a generator for a List of type T...
var currentNode : List<T>
// ...and our next() function returns an element of type T
mutating func next() -> T? {
// ???
}
}
Now we can fill out our next()
function. We’ll also add an initializer for good measure.
struct ListGenerator<T> : GeneratorType {
var currentNode : List<T>
init(head: List<T>) {
currentNode = head
}
mutating func next() -> T? {
switch currentNode {
case let cons as ConsList<T>:
currentNode = cons.next
return cons.value
default: // e.g. nil node
return nil
}
}
}
next()
does one of two things. If the current node is a ConsList
, it advances the currentNode
to the next node and then returns the current value. If the current node is the NilList
, it just returns nil
.
Now let’s close the circle and finish implementing our generate()
function.
extension List : SequenceType {
func generate() -> ListGenerator<T> {
return ListGenerator(head: self)
}
}
All we’re doing here is creating a fresh ListGenerator
set to the first node in the list and returning it.
Trying it out
At this point, we’re done! Let’s try it out. (Don’t leave out the type annotation on myList
; Swift’s type inference engine still has issues.)
let myList : List<Int> = ConsList(10,
next: ConsList(15,
next: ConsList(18,
next: NilList())))
for item in myList {
println("the item is \(item)")
}
// Output:
// the item is 10
// the item is 15
// the item is 18
Pretty cool! Our foreach-compatible List
type takes its deserved place among the ranks of such exalted built-in types as Array
, Repeat
, and UnsafeBufferPointer
.
Lazy Sequences
What do collections like List
, Array
, and Dictionary
have in common? They all have a finite number of items. But sequences don’t need to be collections, nor do they even need to have a finite number of elements. A sequence can be something as unlikely as a constant flow of updates from the USGS’s live earthquake feed or the digits of pi.
It’s usually not feasible to pre-generate all the elements in an infinite sequence, so most such sequences are lazy; their elements are generated on-demand or as they become available. It turns out that the SequenceType
and GeneratorType
machinery we just learned about can be used to support lazy sequences in Swift.
Collatz sequence
The Collatz sequence, also known as the hailstone sequence, is a sequence defined by a simple algorithm that has a bizarre property: no matter what number you start off with, the sequence always seems to eventually reach 1. (Whether this is actually true for all numbers is not known.)
Let’s build the Collatz sequence in Swift. In this case, our generator will be doing all the work, so our actual Hailstone
struct is very simple:
struct Hailstone : SequenceType {
let start : Int
func generate() -> HailstoneGenerator {
return HailstoneGenerator(value: start)
}
}
The generator is responsible for advancing the sequence. Since the Collatz sequence is infinite in length, it probably won’t surprise you that next()
never returns nil
.
struct HailstoneGenerator : GeneratorType {
var value : Int
mutating func next() -> Int? {
let this = value
// If even, divide by 2. If odd, multiply by 3 and add 1.
value = (value % 2 == 0) ? (value / 2) : (3*value + 1)
return this
}
}
Now we can try it out! For fun, we’ll use the enumerate()
function (which works with any sequence type) to also get the indices of each hailstone number.
for (idx, hailstone) in enumerate(Hailstone(start: 91773)) {
println("hailstone #\(idx + 1) is: \(hailstone)")
if hailstone == 1 {
break
}
}
Can you build a Hailstone
instance such that this loop runs forever? Go study mathematics for a decade, prove that it’s possible or impossible before you turn 40, and win a Fields Medal!
Lazy sequences don’t need to be infinite, of course. With a little bit of work we could add a take
method to Hailstone
which would return another Hailstone
sequence consisting of the first n Collatz numbers.
let firstTwenty : Hailstone = Hailstone(start: 12345).take(20)
Or we could have our take
method return a different type like HailstoneSlice
, if we wanted to keep the distinction between infinite and finite sequences clear.
The map
Function
If you prefer functional programming, you may be familiar with the higher-order map function. This function takes in a collection of items of type T
, as well as a transforming function that maps inputs of type T
to outputs of type U
. It then returns another collection of items of type U
, created by applying the transformer to each item from the input collection.
Here is a very simple example of map
in action, taking in an array of integers and squaring each element:
func square(a: Int) -> Int { return a * a }
let output = map([1, 2, 3, 4, 5], square)
println("output is: \(output)")
// prints: output is: [1, 4, 9, 16, 25]
Conceptually, map
is similar to the foreach loop. Whereas the foreach loop contains code inside the loop body which has access to each element in turn, the map
function takes in a mapping function that is fed each element of the input collection in turn.
Swift’s standard library contains map
as both methods defined on certain types like Array
, as well as a map
function that works with any sequence type and returns an array:
/// Return an `Array` containing the results of mapping `transform`
/// over `source`.
func map<S : SequenceType, T>(source: S, transform: (S.Generator.Element) -> T) -> [T]
The function signature may look a bit intimidating, but it’s actually pretty straightforward:
source
is of typeS
, which is any type that’s a sequence (i.e. conforms toSequenceType
).transform
is a function that takes an input of typeS.Generator.Element
and returns an output of typeT
.S.Generator.Element
can be read as “the type of the elements returned whennext()
is called on the generator type returned byS
’sgenerate()
function”. Remember thatS
has already been defined to be a sequence.T
can be whatever you want it to be.- The function outputs an array of
T
-typed objects.
If we wanted our List
to have a map
method, we’d need to define it ourselves. However, since our List
conforms to SequenceType
, we can use it with the map
function without writing any additional code:
func square(a: Int) -> Int { return a * a }
let myList : List<Int> = ConsList(1,
next: ConsList(2,
next: ConsList(3,
next: ConsList(4,
next: ConsList(5, next: NilList())))))
let output = map(myList, square)
println("output is: \(output)")
// prints: output is: [1, 4, 9, 16, 25]
In fact, our List
(or any other SequenceType
that we might build in the future) is automatically compatible with any function that works with sequences. For example, if we have a list containing comparable elements such as integers, we can use the standard library’s minElement()
and maxElement()
functions to find the smallest or largest elements.
Conclusion
Sequences are an example of a powerful form of abstraction that Swift adopts to great effect. Rather than baking language functionality into the compiler and granting it to only a handful of privileged built-in types, Swift exposes language features as protocols that any type can choose to adopt:
- Equatable types can adopt
Equatable
and be checked for equality using==
. - Types with an ‘empty’ or ‘identity’ value can adopt
NilLiteralConvertible
to have the compiler automatically treatnil
as that identity value. - Types that can be abstracted as sequences can adopt
SequenceType
and be used in foreach loops (and many other places).
A final note: I am indebted to Colin Eberhardt’s blog post on sequences, which was one of the first good analyses of Swift’s sequence and generator system. It is somewhat outdated, but still worth a read.