Tour: Counting in Parallel

Starting with our program from the previous section, let's try to naïvely parallelize it.

The root of all parallelism in Guardian is the Pool::run() method. If we pass a lambda to this method, the lambda will be executed in parallel. Guardian uses a common thread pool for its parallelism, so when we call Pool::run(), we're not spinning up a new thread; rather, we're submitting a task to the thread pool. Remember this terminology: a unit of work submitted to the thread pool is called a task.

Since we want to count to 100, we can submit five tasks, each of which will increment the counter 20 times, for a total of 100.

Here's the code:

namespace program;

import gu4::Out;

value class Program {
    entry fn main() -> void {
        let mutable counter = 0;

        for i in 0..5 {             // Line A
            Pool::run() {           // Line B
                for j in 0..20 {    // Line C
                    counter = counter + 1;
                    if counter % 10 == 0 {
                       Out::println("counter reached: " + counter);
                    }
                }
            }
        }

        Out::println("Done!");
    }
}

The main thread executes the outermost loop on line A. This loop will take care of submitting our five tasks.

Line B contains the call to Pool::run(). Notice how it is followed by a block of code. This is an example of a suffix lambda, one of Guardian's special syntaxes. Under the hood, suffix lambdas are simply passed to the function they follow as the final argument.

The inner loop on line C forms the body of the task. It is executed in parallel by the thread pool.
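To make the suffix lambda on line B more concrete, here is a rough sketch of the desugaring described above. The explicit lambda-literal syntax used below, fn() -> void { ... }, is purely hypothetical notation for illustration; the tour otherwise uses only the suffix form.

// Suffix form, as written in our program:
Pool::run() {
    Out::println("hello from a task");
}

// Conceptually, the block is passed to Pool::run() as its final argument,
// roughly like this (the fn() -> void { ... } literal is hypothetical,
// shown only to illustrate the idea):
Pool::run(fn() -> void {
    Out::println("hello from a task");
});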

Unbounded Suffix Lambdas

One thing you might notice about this program is how deeply nested it is. This is a relatively mild example of a common problem in continuation-based programming. You may have heard the term "callback hell" if you have dabbled in JavaScript. In Guardian, we call this problem the Sideways Pyramid of Doom, named after the triangular shape created by deep nesting.

Guardian has a few ways to help you avoid this problem. Let's look at one now:

namespace program;

import gu4::Out;

value class Program {
    entry fn main() -> void {
        let mutable counter = 0;

        for i in 0..5 {             // Line A
            Pool::run():            // Line B /* Beginning of task body */      
            for j in 0..20 {        // Line C
                counter = counter + 1;
                if counter % 10 == 0 {
                   Out::println("counter reached: " + counter);
                }
            }                       /* End of task body */               
        }

        Out::println("Done!");
    }
}

We have converted the suffix lambda beginning on line B from bounded form to unbounded form, which is denoted by the colon :. The semantics are the same, but instead of enclosing the body of the lambda in curly braces, the body opens after the colon and closes at the end of the enclosing scope.

Capture Rules

Now that we've chipped away at the sideways pyramid, let's try compiling our program with grun:

[Type Inference Pass] ERROR IN scratch/Program.gu4: Cannot capture '::counter' in this context because it is mutable.
See near [12:20]

counter = counter + 1;
^^^^^^^               

So, we get a type error. This is because we're trying to capture a mutable binding in a lambda. Guardian imposes strict rules on what we are allowed to capture. This has to do with the concept of temporal scope, which you can read more about in the section on Guardian's safe concurrency model. The gist of it is that capturing something by reference in a lambda implies sharing that something between threads, and Guardian only allows us to share state which is either immutable or encapsulated in a thread-safe data structure. Our counter has a value type, int, whose values are immutable, but the binding was declared as mutable, which makes the binding itself mutable state that is not thread-safe.
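For contrast, here is a minimal sketch of a capture that the rules do allow, assuming string values behave like other immutable value types; the bindings are our own:

let greeting = "hello";            // immutable binding: fine to capture
let mutable attempts = 0;          // mutable binding: cannot be captured

Pool::run() {
    Out::println(greeting);        // ok: greeting is immutable
    // attempts = attempts + 1;    // would be rejected, just like our counter
}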

So, what can we do? We can't make the counter immutable, because then we wouldn't be able to increment it at all. Instead, let's enclose it in a GuardVar:

namespace program;

import gu4::Out;

value class Program {
    entry fn main() -> void {
        let counter = new GuardVar<int>(0);      // Line A

        for i in 0..5 {
            Pool::run() {
                for j in 0..20 {
                    Guard::runGuarded(counter) |cnt|:      // Line B
                    cnt.set(cnt.get() + 1);
                    if cnt.get() % 10 == 0 {
                       Out::println("counter reached: " + cnt.get());
                    }
                }
            }
        }

        Out::println("Done!");
    }
}

Line A creates a GuardVar which holds the counter. A GuardVar is simply a wrapper which employs a Guard to protect a mutable value from being accessed concurrently. GuardVar is considered a vault type in Guardian's type system, meaning it is eligible to be captured and thereby shared between threads.

To access the counter, we use the Guard::runGuarded() function, which takes a GuardVar and a task as its arguments. The task, when executed, will be passed a reference to the GuardVar's value (enclosed in a Var, which is why we need to use .get() and .set(); the current version of Guardian doesn't have any sugar for this yet). The |cnt| syntax simply declares the lambda's parameter.
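For reference, here is the same kind of call written as an isolated sketch using the bounded (curly-brace) form, assuming the bounded form accepts the |...| parameter syntax in the same way; counter is the GuardVar from line A:

Guard::runGuarded(counter) |cnt| {
    cnt.set(cnt.get() + 1);                          // go through get()/set()
    Out::println("current value: " + cnt.get());
}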

By the way, unlike many languages, Guardian allows us to shadow bindings. So, if we choose, we can reuse the name counter for the lambda parameter, instead of needing to invent a new name every time. Here's how that looks:

namespace program;

import gu4::Out;

value class Program {
    entry fn main() -> void {
        let counter = new GuardVar<int>(0);      // Line A

        for i in 0..5 {
            Pool::run() {
                for j in 0..20 {
                    Guard::runGuarded(counter) |counter|:      // Line B
                    counter.set(counter.get() + 1);
                    if counter.get() % 10 == 0 {
                       Out::println("counter reached: " + counter.get());
                    }
                }
            }
        }

        Out::println("Done!");
    }
}

Let's run this code with grun:

[Type Inference Pass] ERROR IN scratch/Program.gu4: Future returned by the call to '_::Guard::runGuarded()' is discarded.
Hint: You may want to pass this future to a function, call .then(), or assign with `let`.
See near [12:20]

Guard::runGuarded(counter) |counter|:      // Line B
^    

Okay, we have another compiler error. This time, Guardian is complaining about a Future being discarded.

This message tells us that we have run afoul of the must-consume rule, which you can read about in more detail here. Essentially, this rule stipulates that any time we receive a future from a function call, we must either consume it or pass it along, whether by returning it or by passing it to another function. When we pass the future along, the recipient becomes responsible for satisfying the must-consume rule. This rule exists to ensure that we always make progress towards the termination condition of the program.
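As a rough sketch of what satisfying the rule can look like, here are the three options the compiler hint mentions, applied to our runGuarded call; the binding names are ours:

// Option 1: consume the future by binding it with `let`:
let incremented = Guard::runGuarded(counter) |cnt| {
    cnt.set(cnt.get() + 1);
};

// Option 2: pass it along, for example by returning it to our caller
// (assuming the enclosing function's return type is a Fut):
return Guard::runGuarded(counter) |cnt| {
    cnt.set(cnt.get() + 1);
};

// Option 3: chain a continuation with then(); note that then() itself
// returns a future which must be consumed in turn (here, by `let`):
let chained = Guard::runGuarded(counter) |cnt| {
    cnt.set(cnt.get() + 1);
}.then() {
    Out::println("increment finished");
};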

You can see the value of the must-consume rule by carefully reading the program. We fire off five tasks with Pool::run(), but we never "wait" for the tasks to actually complete before printing the "Done!" message and exiting the program. If Guardian had accepted this program and compiled it as-is, it would be incorrect. The call to Guard::runGuarded() exhibits the same problem.

Futures and Latches

Since we want our program to wait for the completion of a finite number of tasks, this is a perfect use case for a countdown latch. This is a thread-safe (vault) data structure which holds a counter. We initially set the counter to the number of tasks we want to wait for, and then decrement it each time a task completes. When the counter reaches zero, the latch's Fut (future) completes. With futures, we can schedule tasks to be executed after a certain condition is met.
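Before wiring one into our counting program, here is a minimal, isolated sketch of the latch pattern on its own; the task bodies are placeholders of our own making:

let latch = new CountdownLatch(2);      // wait for two tasks

for i in 0..2 {
    Pool::run() {
        Out::println("task " + i + " finished");
        latch.signal();                 // counts the latch down by one
    }
}

latch.getFut().then():
Out::println("both tasks have finished");
// (in a full program we would return or otherwise consume the future
//  produced by then(), as we do below)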

Here's our code modified to use CountdownLatch:

namespace program;

import gu4::Out;

value class Program {
    entry fn main() -> Fut<void> {                 // Line A
        let counter = new GuardVar<int>(0);
        let outerLatch = new CountdownLatch(5);    // Line B

        for i in 0..5 {
            Pool::run():

            let innerLatch = new CountdownLatch(20);    // Line C

            for j in 0..20 {
                Guard::runGuarded(counter) |counter|:
                counter.set(counter.get() + 1);
                if counter.get() % 10 == 0 {
                   Out::println("counter reached: " + counter.get() + " in task " + i);   // Line D
                }
                
                innerLatch.signal();             // Line E
            }

            innerLatch.getFut().then():          // Line F
            outerLatch.signal();                 // Line G
        }

        return outerLatch.getFut().then():       // Line H
        Out::println("Done!");
    }
}

There is a lot going on here, so let's break it down gradually.

In line A, we changed the return type of main() from void to Fut<void>. Typically, when we want a parallel program to terminate only after some asynchronous work completes, we would block in the main method until the work is done, for example by calling join() on a future. Blocking is antithetical to Guardian's principles, and the language is designed to prevent you from writing any code that blocks. However, no matter how much blocking we eliminate elsewhere, something still has to hold the program open in the main method so that it doesn't terminate before all of our asynchronous work is complete. Fortunately, Guardian provides some sugar to hide this "necessary evil": if we return a Fut<void> from the entry function, the program will terminate when that future completes. Combined with the must-consume rule, this paradigm makes it harder to write programs with ill-formed termination conditions.

In line B, we create a CountdownLatch with the initial value of 5 (the number of tasks we want to wait for). This corresponds to the number of tasks we enqueue in the outer for loop.

In line C, we create a CountdownLatch with the initial value of 20 (the number of iterations we want to wait for in each task). This corresponds to the number of runGuarded tasks we enqueue in the inner for loop.

In line D, for illustrative purposes, we added the task number to the output.

In line E, after incrementing the counter a single time, we signal the inner latch. This causes the inner latch to decrement its counter. When the counter reaches zero, the inner latch is completed, and the Fut returned by innerLatch.getFut() on line F is completed.

In line F, we wait for the inner latch to complete. This is necessary because we want to wait for all of our increment operations to complete before we signal the outer latch. "Wait" is not the right word to use here, because we never block in Guardian; "waiting" for something to complete actually implies scheduling a continuation to be executed when a future completes. We do this by calling then() on the future, providing the continuation as a suffix lambda.

In line G, we signal the outer latch. This causes the outer latch to decrement its counter. When the counter reaches zero, the outer latch is completed, and the Fut returned by outerLatch.getFut() on line H is completed.

In line H, we wait for the outer latch to complete before printing the "Done!" message. The call to then() produces a future, which we need to consume in order to avoid violating the must-consume rule. Since this particular future is the final operation in this program, we can simply return it from the entry function to terminate the program.

Let's run the program:

counter reached: 10 in task 2
counter reached: 20 in task 1
counter reached: 30 in task 1
counter reached: 40 in task 4
counter reached: 50 in task 1
counter reached: 60 in task 3
counter reached: 70 in task 0
counter reached: 80 in task 3
counter reached: 90 in task 0
counter reached: 100 in task 2
Done!

Success!

Looping Constructs

As it is now, this program is a little ugly and verbose. Notice that we create two latches that simply track the completion of a specific for loop; we even have to redundantly pass in the number of loop iterations as an argument to the CountdownLatch constructor.

Fortunately, Guardian provides some looping constructs in the Loop class to make patterns like this more concise and expressive: Loop::marchingFor and Loop::parFor. These functions work similarly to a for loop, but the loop body is given by a suffix lambda which yields a Fut for each iteration. Once all these Futs have completed, the Fut returned by the call to Loop::marchingFor or Loop::parFor will complete.

Here are the differences between parFor and marchingFor:

Loop::parFor launches all of the loop iterations at once, so they may run concurrently and complete in any order.
Loop::marchingFor runs the iterations in sequence: each iteration starts only after the Fut yielded by the previous iteration has completed.

Here's our program refactored to use the looping constructs in Loop:

namespace program;

import gu4::Out;

value class Program {
    entry fn main() -> Fut<void> {
        let counter = new GuardVar<int>(0);

        return Loop::parFor(0, 5) |i| {                      // Line A
            yield Loop::marchingFor(0, 20) |j| {             // Line B
                yield Guard::runGuarded(counter) |counter|:  // Line C

                counter.set(counter.get() + 1);
                if counter.get() % 10 == 0 {
                   Out::println("counter reached: " + counter.get() + " in task " + i);
                }
            };
        }.then():                                            // Line D

        Out::println("Done!");
    }
}

In line A, we use Loop::parFor to run the body of the loop asynchronously, in parallel. All iterations of the loop are executed concurrently. We also hoisted the return statement from its previous location because both Loop methods return a Fut<void>.

In line B, we use Loop::marchingFor to run the body of the loop asynchronously, in sequence. Each iteration of the loop waits for the previous iteration to complete before executing. We could also have used Loop::parFor with the same effect, because the call to runGuarded() will serialize access to the counter variable, but we chose marchingFor for clarity of intent. We also add a yield clause, because the body of the suffix lambda we pass to parFor needs to return a Fut<void>. Why yield and not return? We will explain the reasoning in a moment; for now, just know that you need to use yield instead of return when you want to return from the body of a lambda. If you get this wrong, the compiler will help you.

In line C, we add a yield clause for the same reason as in line B; we want the Futs returned by each stage of the computation to "bubble up" to the top level so that we have something to return from the body of main().

In line D, we use then() to schedule a continuation to be executed when the future returned by Loop::parFor completes. This continuation simply prints the "Done!" message. Once we print the message, we want the program to terminate, so we return this continuation's future via the return statement on line A.

Let's compile and run this program:

counter reached: 10 in task 2
counter reached: 20 in task 1
counter reached: 30 in task 2
counter reached: 40 in task 1
counter reached: 50 in task 2
counter reached: 60 in task 1
counter reached: 70 in task 2
counter reached: 80 in task 1
counter reached: 90 in task 0
counter reached: 100 in task 0
Done!

Looking good!

This program is simple enough that breaking it up isn't really necessary, but in Guardian you will often find yourself wanting to decompose the different stages of an asynchronous operation into smaller functions. As a learning exercise, we will try this out with our counting program.

In another language, this would usually entail some tedious juggling with futures, but Guardian has another feature to make our lives easier: async functions.

Async Functions

In asynchronous programming, there is a common pattern wherein you have a function which performs some asynchronous work and returns a Fut representing the completion of that work. In Guardian, async functions help you write these functions more concisely. Instead of having to explicitly construct, complete, and return a Fut, you can simply mark the function as async. This will cause the function body to implicitly create and return a Fut of the appropriate type. To complete this implicit future, you can use async return, in the same way that you would use return in a normal function. The nice thing about async return is that it will work inside arbitrarily deeply nested lambdas, which return does not.

This distinction between return and async return is why Guardian chooses to conjugate return as yield inside a lambda: it preserves clarity of intent. Without yield, a plain return inside a lambda would return from the lambda, while async return would still complete the enclosing async function's future; two statements that both contain the keyword return would thus affect control flow in very different ways, which would be confusing.

With that context established, here are the different conjugations of return clauses:

return: returns a value from an ordinary function body.
yield: returns a value from the body of a lambda.
async return: completes the implicit future of the enclosing async function, and behaves the same no matter how deeply it is nested inside lambdas.

Again, without yield, return would behave differently inside a lambda than outside, while async return would always behave the same.
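Here is a small sketch showing yield and async return working together in one function; countTo() is our own invention, assembled from the same pieces as our program:

async static fn countTo(n: int, counter: GuardVar<int>) -> Fut<void> {
    Loop::marchingFor(0, n) |i| {
        // yield returns from this lambda, handing the iteration's Fut
        // back to marchingFor.
        yield Guard::runGuarded(counter) |cnt|:
        cnt.set(cnt.get() + 1);
    }.then():

    // async return completes countTo's implicit Fut<void>, even though
    // we are nested inside the then() continuation at this point.
    async return;
}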

Anyway, here's our counting program refactored to use async functions:

namespace program;

import gu4::Out;

value class Program {
    async entry fn main() -> Fut<void> {
        let counter = new GuardVar<int>(0);

        Loop::parFor(0, 5) |i| {
            yield addToCounter(counter, 20, i);
        }.then():

        Out::println("Done!");
        async return;
    }

    async static fn addToCounter(counter: GuardVar<int>, n: int, taskNum: int) -> Fut<void> {
        Loop::marchingFor(0, n) |i| {
            yield incrementCounter(counter, taskNum);
        }.then():

        async return;
    }

    async static fn incrementCounter(counter: GuardVar<int>, taskNum: int) -> Fut<void> {
        Guard::runGuarded(counter) |counter|:

        counter.set(counter.get() + 1);

        if counter.get() % 10 == 0 {
            Out::println("counter reached: " + counter.get() + " in task " + taskNum);
        }

        async return;
    }
}

We have converted the body of the inner loop into the function incrementCounter(), and the body of the outer loop into the function addToCounter(). These functions both return a Fut<void> to indicate completion, and each call site consumes that future with a yield; if we simply discarded it, the must-consume rule would bite us.

Each of these functions is marked as async. This allows us to use async return in the function body.

Note well that, unlike in some other languages which use the async keyword, Guardian has no concept of "colored functions", a popular name for the problem, seen in languages like JavaScript, where async functions can only be called from within other async functions. In Guardian, you can freely mix async and non-async functions in the same program: an async function and a non-async function which return the same type are indistinguishable to the caller.
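As a sketch of that claim, consider two hypothetical helpers with the same signature, one async and one not; the names and bodies are ours, built only from constructs shown above:

// Non-async: we obtain and return a future ourselves.
static fn notifyWhenDone(latch: CountdownLatch) -> Fut<void> {
    return latch.getFut().then():
    Out::println("latch completed (plain fn)");
}

// Async: the implicit future is completed by async return.
async static fn notifyWhenDoneAsync(latch: CountdownLatch) -> Fut<void> {
    latch.getFut().then():
    Out::println("latch completed (async fn)");
    async return;
}

// To a caller, the two are indistinguishable:
let someLatch = new CountdownLatch(1);
let a = notifyWhenDone(someLatch);        // both calls yield a Fut<void>
let b = notifyWhenDoneAsync(someLatch);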

Let's make sure our program still works:

counter reached: 10 in task 3
counter reached: 20 in task 4
counter reached: 30 in task 1
counter reached: 40 in task 4
counter reached: 50 in task 1
counter reached: 60 in task 4
counter reached: 70 in task 1
counter reached: 80 in task 0
counter reached: 90 in task 0
counter reached: 100 in task 3
Done!

Success!

Next

In the next section, we'll look at how to use async functions to write a parallel Fibonacci program.