Monday, December 23, 2013

Comparison of two worlds - Await and Yield in C# and JavaScript

A few months ago, I published the source code of NWarpAsync.Yield, a C# library that emulated "yield" (a C# language feature) on top of "await" (another C# language feature).
This project is possible because await and yield are essentially the same feature. This is something some members of the JavaScript developer community noticed before I did and have exploited it to their benefit: JavaScript (as of ECMAScript version 6, which is still not finalized) already enables C#-style simple asynchronous programming.

I'll cover the main differences between the two approaches here, as well as why the JavaScript approach wouldn't work as nicely in C#, even though something similar would be possible. I do have a working prototype showing how it can be done (check the first link in the "Further Reading" section).

Required Knowledge

In order to understand this post, you should understand what generator functions are. Knowing C# and JavaScript will help greatly.

Yield in C#

I'd like to start this post by pointing out that yield in C# is not identical to yield in JavaScript. Unfortunately, as we'll see, yield in C# is less flexible than yield in JavaScript.

In C#, generator functions are those that return IEnumerable, IEnumerable<T>, IEnumerator or IEnumerator<T>. For the purposes of this post, we'll focus on IEnumerable<T>.
For those who are not familiar with C# terminology, an IEnumerable<T> is an object that can be iterated on.

Here's what a typical C# generator function looks like:
public IEnumerable<int> GetValues(int count)
{
   for (int i = 0; i < count; ++i)
   {
      yield return i;
   }
}
This method can now be used like any other method.

Enumerables in C# are objects that declare a method named "GetEnumerator", which in turn returns an IEnumerator<T>. "Enumerator" is the C# name for "iterator".

It is possible to iterate multiple times on the same instance returned by our GetValues method. Every time GetEnumerator is called, the function starts from scratch.

IEnumerator itself is an interface. The most important members of this interface are MoveNext and Current.
When MoveNext is called, the function is executed up to the point of the next "yield return" (or the end of the method, if there are no more "yield return" statements).
MoveNext returns true if the generator reached a "yield return" statement, false if it reached the end.

The value yielded by the "yield return" statement can be read through the "Current" property.

This has two important implications:
  1. No code in the function is executed until "MoveNext" is first called;
  2. The code is lazily evaluated, meaning that if, after a yield return statement is executed, MoveNext is never called again, then no more code is executed.

The second implication means we can get away with generators that return infinite collections without having the application hang, provided that the caller stops calling "MoveNext" (which will always return true in this case) at some point.
public IEnumerable<int> GetAllIntegers()
{
   // This method is valid.
   // Just don't use it in a foreach loop without a break statement.
   int i = int.MinValue;
   for (;;)
   {
      // Wraps around back to negative numbers on overflow.
      yield return unchecked(i++);
   }
}
So let's review the contract associated with yield in C#:
  1. The caller calls a method that returns an IEnumerable
  2. The caller calls enumerable.GetEnumerator()
  3. The caller calls enumerator.MoveNext() followed by enumerator.Current as many times as it wants to.
To handle finally blocks, the IEnumerator<T> interface also defines a Dispose method. When the Dispose method is called, the code in the applicable "finally" blocks is executed and the enumerator is considered terminated.
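For comparison, the same contract can be sketched in JavaScript; asEnumerator and the moveNext/current/dispose names below are invented stand-ins for the C# members, not a real API:

```javascript
// Sketch: wrap a JavaScript generator in an object that follows the
// C#-style MoveNext/Current contract described above.
function asEnumerator(generatorFn) {
  const it = generatorFn();
  let current;
  return {
    moveNext() {
      const step = it.next();        // run until the next yield (or the end)
      if (step.done) return false;   // "MoveNext" returns false at the end
      current = step.value;          // "Current" holds the last yielded value
      return true;
    },
    get current() { return current; },
    dispose() { if (it.return) it.return(); } // runs pending finally blocks
  };
}

function* getValues(count) {
  for (let i = 0; i < count; ++i) yield i;
}

const e = asEnumerator(() => getValues(3));
const out = [];
while (e.moveNext()) out.push(e.current);
// out is [0, 1, 2]
```

Note that dispose maps onto the generator's return() method, which (like IEnumerator<T>.Dispose) runs the applicable finally blocks and terminates the iterator.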

Yield in JavaScript

Yield in JavaScript is recent. So recent, in fact, that the standard it is described in is still just a draft.

Let's take a look at what yield in JavaScript looks like:
function * generator(count) {
   for (var i = 0; i < count; ++i) {
      yield i;
   }
}
Dynamic typing aside, the main differences we can see so far between JavaScript and C# are:
  1. Generator functions in JavaScript must be explicitly marked as such with the "*" after the "function" keyword (in C# it is inferred by looking at the function body);
  2. "yield" is used instead of "yield return".

However, the two languages do not have identical semantics.
In C#, generator functions can either return an enumerable or an enumerator.
In JavaScript, the generator function always returns the iterator (enumerator) directly.
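One practical consequence: a JavaScript iterator cannot be restarted the way a C# enumerable can. To iterate again, the generator function must be called again to produce a fresh iterator. A small sketch:

```javascript
function* generator(count) {
  for (let i = 0; i < count; ++i) yield i;
}

const it = generator(2);
const first = [...it];   // [0, 1] - consumes the iterator
const second = [...it];  // [] - the same iterator is now exhausted

const again = [...generator(2)]; // [0, 1] - a fresh iterator starts over
```

This mirrors the C# split between IEnumerable<T> (restartable, via GetEnumerator) and IEnumerator<T> (one-shot): JavaScript generator functions hand you the one-shot object directly.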

The iterator is an object with a method called "next", which replaces "MoveNext" and "Current". The next() function returns an object that indicates the yielded value and whether the function is over.
function * generator() {
   yield 1;
   return 2;
}
var g = generator();
g.next(); //returns { value: 1, done: false }
g.next(); //returns { value: 2, done: true }
g.next(); //throws exception
Note that "return" is a valid statement in generator functions, as opposed to C#, where mixing the two styles in one function results in a compile error.

Another big difference is that yield in C# is a statement, but yield in JavaScript can be used in an expression. This means the following is valid:
function * generator() {
   return yield 1;
}
var g = generator();
g.next(); //returns { value: 1, done: false }
g.next(); //returns { value: undefined, done: true }
So now, what does the "yield" expression return? In this case, it returned undefined, but clearly it wouldn't be very useful if that was always the case.
Indeed, the yield expression returns the argument of the "next" function. Since we did not specify any arguments, undefined was returned, but that can change:
function * generator() {
   return yield 1;
}
var g = generator();
g.next(); //returns { value: 1, done: false }
g.next(42); //returns { value: 42, done: true }
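Because yield is an expression, the caller and the generator form a two-way channel. As an illustration, here is a (hypothetical) running-sum generator that receives each number through next():

```javascript
function* runningSum() {
  let total = 0;
  while (true) {
    // yield hands the current total out and receives the next number in
    total += yield total;
  }
}

const sum = runningSum();
sum.next();    // { value: 0, done: false } - runs to the first yield
sum.next(10);  // { value: 10, done: false }
sum.next(5);   // { value: 15, done: false }
```

The first next() call only runs the generator up to its first yield; arguments start mattering from the second call onwards.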
This isn't the only extra feature generator objects have. They also have a method called "throw". When "throw" is called, execution of the generator function resumes. However, instead of returning a value, yield throws an exception.
function * generator() {
   try {
      yield 1;
      return "success";
   } catch (e) {
      return e;
   }
}
var g = generator();
g.next(); // { value: 1, done: false }
g.next(); // { value: 'success', done: true }
g = generator();
g.next(); // { value: 1, done: false }
g.throw('my exception'); // { value: 'my exception', done: true }
As we'll see, these two differences -- yield being an expression and JavaScript iterators having a "throw" function -- are what make asynchronous programming easy with JavaScript yield but cumbersome with C# yield, which ultimately explains why C# needed something new.

Asynchronous programming

In many ways, asynchronous functions are a lot like generators. They perform operations, but then are paused waiting for other events to happen before they can be resumed.
page := Wait for Document to be retrieved from the Internet
words := page.Words
return words.Count
In languages such as Java, the easiest way to do this is to start a new thread and run the operations in that thread, blocking whenever a slow operation is found.
We'd not be using threads to take advantage of multi-core processors or anything like that. We'd be using threads to avoid blocking the main program thread, because the alternative - to manually set up events and listeners - is boring, ugly and error-prone. Of course, threads have their own overhead.

If we can "pause" and "resume" functions, we no longer need to use threads for basic asynchronous programming.
In C#, the chosen syntax for this was:
public async Task<int> GetWordCount(string url) {
   var page = await DownloadDocumentAsync(url);
   var words = page.Words;
   return words.Count();
}
Note that asynchronous methods are marked with the "async" keyword and return a Task instance.
In JavaScript, there are libraries (e.g. Q) that implement async/await using yield.
var getWordCount = Q.async(function*(url) {
   var page = yield downloadDocument(url);
   var words = page.words;
   return words.count();
});
The end result is similar. But let's see what Q does to transform yield into await.

Q is a library that implements promises in JavaScript. In Q, promises are objects with a few special functions. One of the most important is "then". The then function takes two callbacks as arguments, which it calls when the promise is successfully resolved (first callback) or rejected with an exception (second callback).

When next() is called, the generator can yield a promise, return a final value or throw an exception.
If it throws an exception, Q rejects the resulting promise, triggering the error callback.
If it returns a value, Q resolves the resulting promise with that value.
If it yields a promise, Q calls the then function on that promise to register two callbacks. The success callback goes back to the beginning of these steps to call next() again (with the argument of next being the result of the promise), and so on. The error callback uses the iterator's throw function to notify the generator function that an error occurred. As such, errors in the promise can be caught by a normal try...catch statement.
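That driving loop can be sketched with native Promises; runGenerator below is a hypothetical stand-in for Q.async, and real Q handles edge cases this sketch skips:

```javascript
// Minimal sketch: drive a generator that yields promises, following the
// steps described above.
function runGenerator(generatorFn) {
  return new Promise((resolve, reject) => {
    const it = generatorFn();
    function step(resume) {
      let result;
      try {
        result = resume(); // resume the generator (via next or throw)
      } catch (e) {
        reject(e);         // the generator threw: reject the outer promise
        return;
      }
      if (result.done) {
        resolve(result.value); // the generator returned: resolve with its value
        return;
      }
      // The generator yielded a promise: resume when it settles.
      Promise.resolve(result.value).then(
        value => step(() => it.next(value)),  // success: feed the value back in
        error => step(() => it.throw(error))  // failure: throw into the generator
      );
    }
    step(() => it.next());
  });
}

// Usage: awaiting a (fake) asynchronous download.
const getLength = () => runGenerator(function* () {
  const page = yield Promise.resolve("hello world"); // stands in for a download
  return page.length;
});
```

Notice how both yield-as-expression (feeding the resolved value back in) and iterator.throw (propagating rejections into try...catch) are essential to the loop.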

Emulating Await in C#

In C#, yield wouldn't work as nicely as a replacement for await.
For starters, yield is more restrictive than await: yield cannot be used in anonymous delegates/lambdas. But that's not the main reason.

"value = await promise;" is very useful, but it can only compile because "await promise" is an expression. Additionally, unlike await and JavaScript yield, C# yield offers no way to throw an exception into the generator function.

Still, it can be done. It's not pretty, but it can be done. I've updated my NWarpAsync project to include a library (EmulateAwait) that does exactly this. This is what it looks like:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using NWarpAsync.EmulateAwait;

class Program
{
    static IEnumerable<TaskIteration<int>> AsyncMethod(AsyncContext<int> context)
    {
        int arg1 = context.Argument<int>(0);
        string arg2 = context.Argument<string>(1);

        yield return context.Await(Task.FromResult(arg1 + arg2.Length));
        int result = context.GrabLastValue();

        yield return context.Return(result * 2);
    }

    public static void Main()
    {
        Task<int> task = AsyncBuilder.FromGenerator<int>(AsyncMethod, 10, "Hello");
        Console.WriteLine(task.Result); //Prints 30
    }
}
And that's it. I have reused the TPL here because it existed even before await was added to C#. The goal of this library is to let developers use yield in place of await, so whatever the library uses internally is fair.

Whereas NWarpAsync.Yield had a few practical uses, this one does not. It is merely an exercise in "Could X be done to do Y?"
Well, the answer is: Yes, it can. But it shouldn't.

If you're a C# user, stick to await.
But if you're someone developing a new programming language and deciding which features to include, I hope this post was helpful.

Merry Christmas.

Further Reading

luiscubal/NWarpAsync
The official github repository for NWarpAsync (including EmulateAwait).

Coroutine - Wikipedia, the free encyclopedia
Coroutines are special functions that can be suspended and resumed. Yield and Async/Await are two coroutine mechanisms.

luiscubal's code sketchbook: Introducing NWarpAsync - Emulating yield return using await
A previous blog post describing the reverse (implementing yield on top of await)

How yield will transform Node.js
A blog post describing how yield and promises can be used to simplify node.js development.

Saturday, October 26, 2013

Introducing NWarpAsync - Emulating yield return using await

When C# 5.0 arrived, it came with a brand new language feature - async functions.
Although asynchronous programming was already possible in previous versions of C#, it was cumbersome. Previous async patterns included the usage of methods such as BeginInvoke or Task.ContinueWith. These patterns were both based on callbacks and they were cumbersome to use.

With C# 5.0, those methods are (mostly) a thing of the past, and async/await is "The Next Big Thing".

I have designed a small library I called NWarpAsync that uses async/await for purposes that are not what it was primarily designed for. It's called "NWarpAsync" because it twists async. It starts with a "N" because, really, what open-source project for .NET doesn't?

Required Knowledge

In order to understand this post, you'll need to understand C#, including yield return and the new async/await features.

Implementing Yield with Await

The first (and so far only) feature of NWarpAsync is emulation of the yield return feature (available since C# 2.0) using await.
I got the idea from watching slides about using yield in node.js to emulate await and I wondered how easy it would be to implement the reverse.

This might not seem very compelling but this is mostly a proof-of-concept anyway and it does nicely support a few cases that C# yield does not.
Here is how it is used:
using System;
using System.Linq;
using NWarpAsync.Yield;

class Program
{
    static void Main()
    {
        foreach (int value in new Yielder(async yieldSink => {
            await yieldSink.Yield(1);
            await yieldSink.YieldAll(Enumerable.Range(2, 3));
            await yieldSink.Yield(5);
        }))
        {
            //Prints 1 2 3 4 5 (each number in a different line)
            Console.WriteLine(value);
        }
    }
}
As you see, using await as yield requires the use of a few helper classes/helper methods, but I believe the final result is still reasonably easy to understand.
YieldAll is equivalent to yielding all elements in an enumerable (or doing nothing if the enumerable is null). F# supports this (with the yield! operator), but C# has no short equivalent. Instead, we're stuck with if (x != null) foreach (var value in x) yield return value;
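For comparison, ES6 JavaScript does have a short equivalent: the yield* operator, which delegates to any iterable:

```javascript
function* inner() {
  yield 2;
  yield 3;
}

function* outer() {
  yield 1;
  yield* inner();  // delegation: the analogue of YieldAll / F#'s yield!
  yield* [4, 5];   // works with any iterable, not just generators
}

const values = [...outer()]; // [1, 2, 3, 4, 5]
```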

How it works

The most important class of NWarpAsync.Yield is the Yielder<T> class. A Yielder<T> takes an asynchronous function (the "generator" or "builder" function) and makes it available as a plain old IEnumerable<T>.
The function that Yielder<T> takes in turn receives a parameter of type YieldSink<T>, which exposes the yielding methods. Most of the magic is done in Yielder and YieldSink.

Yielder.GetEnumerator starts the whole process. It creates an enumerator that executes a step in the function Yielder was constructed with every time MoveNext is called. Once the function stops executing (either because it has really returned or because it is awaiting the result of the Yield method), MoveNext checks if the function has yielded any values. If so, then that is the value of IEnumerator.Current. If not, then the enumerator has finished, and MoveNext returns false.

YieldSink.GetAwaiter().OnCompleted registers the action that the next MoveNext will execute.

If the enumerator is disposed (e.g. with LINQ Take) before the function has finished executing, then an exception is thrown in the current execution point of the generator function to allow IDisposable resources to be cleaned up properly and finally clauses to execute.

Main Problems and Limitations

It took some effort to make exceptions work properly.
Initially, I made Yielder<T> receive an Action<YieldSink<T>> (as opposed to a function returning a Task). It turns out that unhandled exceptions are nasty for async void methods.
Switching to Task made this easier.

Unfortunately, the default behavior of Tasks means that the consumer of Yielder will never see "normal" exceptions. Instead, it'll always receive an AggregateException. This could probably be fixed, but I'm not sure it matters. I'll fix it when I'm convinced it's important.

The second problem occurs when the developer writes sink.Yield(value), but forgets to await it. The result is an ugly exception on the next Yield call which simply could not happen with the native yield. In other words, it's easier to get this approach wrong.

Advantages

If await was implemented back in C# 2.0 and there was a new proposal to add yield, I think this library would pretty much kill the arguments in favor of it. Yield can purely be a library feature if await is a language feature. Of course, it's the other way around and yield is here to stay, so that point does not hold.
That said, if language designers want to make a new language that includes await and are wondering if yield matters, I hope this can be helpful.

As I mentioned above, this library supports an F# yield!-like feature that C# does not have.

However, the biggest advantage is that, unlike yield, async/await can be used in lambdas. This means that, for the first time in C#, it is possible to make an anonymous generator function! This is what I did in the example I showed above.
You can even yield enumerations of enumerations using LINQ SelectMany.

Conclusion

Async/Await is a powerful language feature that allows a method to "pause" and then "resume" later.
Features like yield do the same. As such, it's not too hard to implement yield on top of async/await with the help of a proper library.

I'm curious to find out what other "missing" language features can be implemented using async/await. Perhaps some sort of asynchronous collection?

Further Reading

luiscubal/NWarpAsync
The official github repository.

Coroutine - Wikipedia, the free encyclopedia
Coroutines are special functions that can be suspended and resumed. Yield and Async/Await are two coroutine mechanisms.

yield (C# Reference)
The official documentation for yield in C#

await (C# Reference)
The official documentation for await in C#

await anything; - .NET Parallel Programming - Site Home - MSDN Blogs
A blog post explaining how to use await on objects that aren't Tasks.

How yield will transform Node.js
A blog post showing await on top of async in JavaScript.

Wednesday, October 16, 2013

Null Value Analysis

One of the features I implemented in NRefactory this summer of code was null value analysis. I'll describe here what it is, what it is useful for and what its limitations are.

Required Knowledge

  1. You'll need to understand basic programming concepts, such as how if statements work;
  2. It will help if you understand what directed graphs and control flow graphs (CFGs) are;

The Problem Null Analysis Tries to Solve

Null Value Analysis is a tool (meant to be used by code issues) that tries to predict if a given value is null or not.
Knowing if a value is null is necessary to detect some redundancies, null reference exceptions and unreachable code.

object o = null;
if (o == null) {
   //The if statement is redundant, this is always executed
} else {
   //Unreachable code. This never happens.
}
string s = o.ToString(); //NullReferenceException here
//Since the previous line throws an exception, the following
// code is unreachable.
Console.WriteLine("Unreachable");
Code issues can use this information to warn the user about potential bugs and redundancies.
There are currently two code issues that use null analysis: ConstantNullCoalescingConditionIssue (that detects code like always_null ?? something) and UseOfMemberOfNullReference (that detects always_null.SomeMember). There are other issues that could benefit from this, but that feature was not yet implemented.

Overview

Null analysis is based on the concept of a Control Flow Graph (CFG). [Diagram: the CFG for the above method, one numbered node per statement.]
What the null value analyser does is traverse this graph and figure out how each statement changes the local variables.
Let's see what happens:
  1. This method has no parameters, so it doesn't need to setup any parameter variables;
  2. It enters the method;
  3. It declares a new variable called "o" which is "definitely null";
  4. It checks the if (o == null) condition. Because both "o" and "null" are known to be "definitely null", the analyser concludes that the result of the expression is "known to be true";
  5. Because the expression is known to be true, it visits the if block, but not the else block;
  6. It leaves the if block (since it's empty);
  7. It checks the expression o.ToString(). Since ToString is not an extension method nor System.Nullable.HasValue, we know that for this statement to be successful, o must not be null. And yet, o is known to be null. So the null analysis considers this to throw an exception (and indeed it does - a NullReferenceException);
  8. The node is not in a try block, so we end the analysis.
Notice how nodes #8, #9 and #10 are not even visited.

Now, let's see how null analysis handles loops.

object o = null;
while (o == null) {
   Console.WriteLine(o == null ? "Foo" : "Bar");
}
//The end of the loop is never reached 
It can't just keep visiting the loop over and over again. It's an infinite loop, so the analysis would be infinite as well. Instead, it just revisits the loop if it hasn't already visited the loop once for the current set of variable states. But let's see how it behaves in this particular case:
  1. object o is defined as being "definitely null";
  2. Since o is definitely null (because it was declared as such in the previous line), and so is null (by definition), the loop condition is true. It hasn't visited the loop yet, so it'll do so for [o=DefinitelyNull].
  3. It visits Console.WriteLine. Because o is known to be null, the expression will always return "Foo".
  4. The loop ends. Because all variables (just o, in this case) are just the way they were when the loop began [o=DefinitelyNull], there's no need to visit it again.
  5. Note that the code after the loop is never visited, because the condition is always true.
Now consider the following:
object o = null;
object p = null;
while (o == null) {
   Console.WriteLine(p == null ? "Foo" : "Bar");
   p = 1;
}
//The end of the loop is never reached
In this case, each node in the loop is visited twice. The first time for "p is definitely null" and the second time for "p is definitely not null".

The analyser keeps a dictionary (map) containing the local variable states. For instance, in the example above, the contents of the loop are visited once with the state [o=DefinitelyNull p=DefinitelyNull] and a second time with the state [o=DefinitelyNull p=DefinitelyNotNull].
When the node is visited, the analysis tries to figure out what the state of the variables will be after it.

For instance, when the analyser finds the expression p=1, it modifies the state dictionary to have [p=DefinitelyNotNull] (while keeping the remaining variables unchanged).

Expressions

Initially, the plan was to have a pure null value analysis. That is, every value was treated as null, not-null or perhaps-null-not-really-sure.
For variables, this turned out to be reasonable in many cases. For expressions, it wasn't.
The problem is expressions such as (x == null ? x : 1). What does that do? Well, if x is null, it returns null. If x is not null, then it returns a value that isn't null. More to the point, the behavior depends on whether "x == null" is true or false.
The analyser can only handle an expression like this if it keeps track of what's true and what's false. As such, I eventually decided that even if only the null status of variables should be stored in the state, keeping track of the boolean value of expressions was essential.
Not only that, the analyser detects that expressions such as "x == null" can only be true if x is null. So even if the value of x is not known at all, it knows that the code in if(x == null) will only be executed if x is null. So the code embedded in the if statement will have [x=DefinitelyNull] even if x was not previously known.

Another problematic case are lambdas. In C#, lambdas are capable of capturing variables.
Consider this code:

object o = null;
Action action = () => { o = 1; };
Foo (action);
//Some more code goes here
So what will be the value of o later in this function? We don't know. It might no longer be null if action has been executed. But when will action be executed? We don't know. Maybe immediately, maybe after another method is called, maybe randomly, from another thread. Again, we don't know.

In these cases, the analyser pretty much gives up and classifies o as captured unknown. Not only does it not know the value, it won't ever know the value. Even if we set the variable later, it'll still remain captured.
The only way for a captured variable to become uncaptured is to go out of scope and then be declared again. Also, C# 5 foreach loops introduce a fresh loop variable on each iteration, uncapturing it every time.
Note that although variables can be captured unknown, expressions can not. The result of an expression with a captured variable is just unknown. If we have a variable named "x" that is captured and run "y = x;", then the value of y is unknown, but it isn't captured. We can always do "y = null;" and it'll become DefinitelyNull, unlike captured variables.

When the algorithm finds ||, &&, ?: or ?? conditions, it tries to exclude operands that are known to never be executed (e.g. y in x ?? y, when x is DefinitelyNotNull). When that is not possible, it tries to "merge" the results of the expression. For instance, (x == null ? null : (x = null)) gives us a result of null and [x=DefinitelyNull] because that's what happens in both outcomes.
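The merge step can be sketched with a toy model; the state names and mergeStates function below are illustrative, not NRefactory's actual API:

```javascript
// Toy sketch: combine the variable states produced by two reachable
// branches of an expression such as ?: or ||.
function mergeStates(a, b) {
  const merged = {};
  for (const v of new Set([...Object.keys(a), ...Object.keys(b)])) {
    // If the branches agree, keep the state; otherwise widen to MaybeNull.
    merged[v] = a[v] === b[v] ? a[v] : "MaybeNull";
  }
  return merged;
}

// Both outcomes of (x == null ? null : (x = null)) leave x null:
mergeStates({ x: "DefinitelyNull" }, { x: "DefinitelyNull" });
// When the branches disagree, the analyser can only say "maybe":
mergeStates({ x: "DefinitelyNull" }, { x: "DefinitelyNotNull" });
```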

Query expressions are more complicated, since we need to consider the possibility of "from x in emptyEnumerable".

Semantics and Annotations

Unfortunately, null value analysis works poorly across functions.

Consider the following code:
int? Foo() {
   return 1;
}
void Bar() {
   //Assume this is the function being analysed
   var x = Foo();
   var y = x ?? 1;
}
In this case, x is never null, so the null coalescing operator is useless.
However, null value analysis doesn't know that. It'll treat "x" as an unknown value.

Fortunately, there's a way around that. Although null value analysis can't figure out what Foo returns, we can help it with annotations.

Null Value Analysis supports a subset of JetBrains.Annotations. These annotations are C# attributes that can be used to provide hints about the possible return values of methods and delegates. Additionally, they can also be applied to fields, properties, delegates and parameters (including out parameters).
[JetBrains.Annotations.NotNull]
int? Foo() {
   return 1;
}
void Bar() {
   //Assume this is the function being analysed
   var x = Foo();
   var y = x ?? 1;
}
In this modified version, the analyser will trigger the warning.

You can use JetBrains' assembly or you can write your own attribute classes. All the analyser cares about is the full class name of the attribute. If you get that right, then it doesn't matter where it comes from.

Annotations can also be used to identify a method as an assertion. Assertions are used to indicate that the program can only continue operating correctly if certain conditions are true -- it is very common to use assertion methods in unit tests.

using JetBrains.Annotations;

class TestClass
{
   [AssertionMethod]
   void Assert([AssertionCondition(AssertionConditionType.IS_TRUE)] bool condition) {
   }
   void TestMethod(string x)
   {
      Assert(x != null);
   }
}
In this case, the analyser will assume that x is not null after the assertion. Normally, the assertion method will throw an exception or abort the program when the condition fails.

Main Limitations

Throughout this post I've mentioned some of the problems of null value analysis. Still, this is important enough to deserve a dedicated section.

First of all, as I mentioned above, null value analysis does not keep track of the value of variables aside from null/not-null.
In particular, this means the analyser is clueless when it comes to recognizing patterns such as this:
bool x = true;
object o = null;
if (x) {
   o = "";
}
In this example, o will always end up with the value "", but the analyser doesn't know that. It doesn't even recognize that o can't end up null: because it doesn't keep track of the value of x, it must assume that there's a chance the if statement won't be entered.
This limitation only applies to variables. The analyser does detect when specific expressions always return true.
I am convinced that boolean values could be added to the analyser. But the algorithm would be too slow for integers and would even occasionally get stuck in infinite loops for strings.
The reason this would happen is because every time a previously visited node is reached more than once (this happens e.g. in loops) the analyser must decide whether to re-visit the node. For that, it checks what values the variables in scope had the last times that node was visited. If the variables are the same, nothing changed, so there's no point in visiting the node again. If the variables changed, then revisiting the node is necessary for correctness.
This works for null/not-null because there are only two possible values. Even considering states such as "Maybe Null" or "Unknown, Captured by Lambda", the total number of states remains low. Moreover, many states "include" others. That is, if a node was visited for [x=MaybeNull], it won't be revisited for [x=DefinitelyNull], because [DefinitelyNull] is included in the [MaybeNull] scenario.
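The inclusion rule can be sketched with a toy model; the state names and functions here are illustrative, not NRefactory's actual API:

```javascript
// Toy model of the analyser's per-variable states.
// "MaybeNull" includes both "DefinitelyNull" and "DefinitelyNotNull".
const includes = (oldState, newState) =>
  oldState === newState || oldState === "MaybeNull";

// A node only needs revisiting if some variable's new state is not
// already covered by a state seen on a previous visit.
function needsRevisit(visitedStates, newStates) {
  return Object.keys(newStates).some(
    v => !includes(visitedStates[v], newStates[v])
  );
}

needsRevisit({ x: "MaybeNull" }, { x: "DefinitelyNull" });  // false: covered
needsRevisit({ x: "DefinitelyNull" }, { x: "MaybeNull" });  // true: wider state
```

Because the set of states per variable is small and inclusion only ever widens, this check is what bounds the number of visits per node.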
This means that there is an upper bound to the number of times a node can be visited (depending on the number of variables in scope and the possible values a variable can have). Most functions have few variables changing values all the time.
For booleans, everything is still okay. But for integers, the number of possible values a variable can have goes up and, with it, the number of times a node could be revisited. A lot.

int i = int.MinValue;
while (i != int.MaxValue) {
   ++i;
}

This small function triggers a sore spot in a hypothetical analyser that tracks the value of integers. The null analyser isn't too bothered by this loop; it only enters it once.
But if the analyser kept track of the value of i, then it would detect that the value of i changes every iteration. Because of that, it'd revisit the loop. But that would change the value again. So the analyser would visit the loop again. And again. And again. Around 2^32 times.

Strings have infinite-for-all-practical-purposes possible combinations, and as such would trigger infinite loops in the "smarter" analyser.

For integers and strings, a completely different algorithm would be required.

The second major limitation is the way expressions are handled when it comes to control flow.
The Null Value Analyser is based on NRefactory's built-in Control Flow Graph classes. Unfortunately, these classes are very statement-centric and do not handle expression-level control flow.
In other words, the CFGs know how to treat if statements, but not ? : expressions.

My solution was to patch a limited form of expression control flow detection on top of the statement CFG.
Whenever an if statement is encountered, the analyser handles the condition, then schedules two nodes to be visited -- the "true" case node, and the "false" case node. Two visits.
Whenever || is encountered, the analyser handles the left side, then handles the right side, then combines the result of both sides and continues. When the expression statement is over, the next node is scheduled to be visited based on the combined result. One visit.
This means the analyser is faster but less accurate for expressions than for statements.
if (x == null || y == null) {
   if (x != null) {
      // y is always null here, but the analyser doesn't know that,
      // because the top if statement only saw the combined result
      // of the || expression.
   }
}

The third major limitation is related to missing method annotations. Specifically, Null Value Analysis does not implement the complex ContractAnnotationAttribute. This attribute would allow programmers to give the analyser highly specialized hints, such as "this function never returns true if the parameter is null".
Because Null Value Analysis doesn't support any equivalent annotation, methods such as string.IsNullOrEmpty can't be supported properly.
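To illustrate, here's a sketch of the pattern the analyser cannot follow without such an annotation (the ternary is just an artificial way to give s an uncertain value):

```csharp
using System;

class Demo
{
    static void Main()
    {
        string s = DateTime.Now.Year > 2000 ? "hello" : null;
        if (!string.IsNullOrEmpty(s))
        {
            // A human (or an analyser with a ContractAnnotation-style hint)
            // knows s cannot be null on this branch. Without the hint, the
            // analyser only sees an opaque bool-returning method call and
            // must still treat s as potentially null here.
            Console.WriteLine(s.Length);
        }
    }
}
```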

The final major limitation is the lack of external annotations.
The analyser currently only recognizes annotations in the source code itself. For third-party code, the only way to add annotations is to modify the source code. This is undesirable (not to mention highly impractical) and external annotations are definitely on the wishlist, but nothing is currently available.
In short, most libraries (including standard .NET libraries) are currently hostile to the analyser.

Final Words

I hope null value analysis ends up being useful. It can certainly work reasonably well in theory and in a few test projects (notably NRefactory, MonoDevelop and xsp) but the world of code is a lot vaster than that, so it's still unknown if it'll survive under the harsh light of real-world usage. Time will tell, I guess.

This year's GSoC is over and I passed the final evaluation. Overall, I think null analysis was the hardest thing I wrote.
I enjoyed working on MonoDevelop and NRefactory and I'll certainly try to contribute again if I have some time to spare. I actually learned a few new C# features and I got exposed to some patterns and paradigms I hadn't given much thought before.

And that's it for today.

Further Reading

Official NRefactory github repository (icsharpcode/NRefactory)
This is where the upstream version lives.

Summer of Code 2013 NRefactory github repository (mono-soc-2013/NRefactory)
This is where the students push their code to before trying to get it upstream.

Control flow graph - Wikipedia, the free encyclopedia
Wikipedia explanation of what a CFG is.

Halting problem - Wikipedia, the free encyclopedia
A problem that is undecidable on computers (as in, a problem for which an exact solution is known not to exist). A perfect null value analyser would be capable of solving the halting problem and, as such, is impossible to make.

Thursday, August 8, 2013

Introducing C# Split Project

There are many good reasons to have many projects in one solution.
  1. Splitting code across more projects encourages looser coupling.
  2. If each project has an assigned purpose, this tends to ensure better cohesion.
  3. Finally, if the projects are libraries and their users only need a few of the provided features, then only the projects containing those features need to be referenced, so less code is imported (which can mean smaller application packages).
Unfortunately, we don't always have many projects in a solution. Sometimes we just have a few... or just one - and we'd like to take portions of those few projects and create new ones.
Doing so currently involves manually creating the projects, adding the proper references, moving the files and praying that the coupling of our files was sufficiently loose to allow us to get away with it.

Introducing Split Project, one of the features I've implemented as part of my Google Summer of Code project.
Let's start with a basic project with only the following files:

//Program.cs
using System;

namespace HelloWorld
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            PrintService.PrintMessage ("Hello, World!");
            PrintService.Printer = new SpecialConsolePrinter ();
            PrintService.PrintMessage ("Hello again");
        }
    }
}
//IPrinter.cs
using System;

namespace HelloWorld
{
    public interface IPrinter
    {
        void PrintMessage(string message);
    }
}
//DefaultConsolePrinter.cs
using System;

namespace HelloWorld
{
    public class DefaultConsolePrinter : IPrinter
    {
        public void PrintMessage(string message)
        {
            Console.WriteLine(message);
        }
    }
}
//SpecialConsolePrinter.cs
using System;

namespace HelloWorld
{
    public class SpecialConsolePrinter : IPrinter
    {
        public void PrintMessage(string message)
        {
            Console.WriteLine("[SPECIAL] {0}", message);
        }
    }
}
//PrintService.cs
using System;

namespace HelloWorld
{
    public static class PrintService
    {
        static IPrinter printer;
        public static IPrinter Printer
        {
            get {
                if (printer == null) {
                    printer = new DefaultConsolePrinter ();
                }
                return printer;
            }
            set {
                if (value == null) {
                    throw new ArgumentNullException ("value");
                }
                printer = value;
            }
        }

        public static void PrintMessage(string message)
        {
            Printer.PrintMessage(message);
        }
    }
}
And now, we want to see if we can move some portions of this project to a new one.
We can do this manually, of course. PrintService depends on IPrinter and DefaultConsolePrinter and Program depends on PrintService and SpecialConsolePrinter. But doing this manually is no fun and particularly error-prone, especially when we start considering bigger projects.

In this example, we chose to move PrintService to the new project. Because PrintService depends on IPrinter and DefaultConsolePrinter, both have been automatically selected.
Note how "PrintService.cs" appears in the "Dependants" column for IPrinter.cs. This means IPrinter.cs is required because PrintService.cs depends on it. There is no way to move PrintService without moving IPrinter, and attempting to uncheck IPrinter will also uncheck PrintService.
On the other hand, Program.cs, AssemblyInfo.cs and SpecialConsolePrinter.cs aren't selected because PrintService does not depend on those.
The "Ok" button appears disabled because we have not yet chosen a name for the project.

This time, we have also selected Program.cs. Program.cs depends on everything except on AssemblyInfo. Note that some files (such as IPrinter) show both Program.cs and PrintService.cs in "Dependants". This is because both files that were directly selected by the user depend on them.

Split Projects supports an additional feature. Instead of selecting files one by one, you can select whole folders to be moved.

In this case, we have selected the "Backend" folder to be moved. All files within have been marked as directly chosen to be moved by the user.

Let's see what happens when we click the "Ok" button:


And that's it. The project compiled and ran without any errors.

Future Work

There are a few additional features and cases I'd like to try to add to the dialog in the future:
  1. Sometimes, the dialog does not work in the presence of certain useless "using Some.Namespace;" declarations in a moved file. If no class in Some.Namespace is moved, then the useless "using" declarations will become compile errors. This can be easily fixed by removing the erroneous declarations, but it can be a nuisance anyway.
  2. If a file has many type declarations, then it would be nice having the option to move only certain type declarations to the new project.
I hope this improves the lives of programmers trying to organize their solutions better.

Monday, July 8, 2013

Google Summer of Code Week 3

Last week I got less work done than in the previous ones. That said, I still have some things to show.

First of all, I finally managed to complete Extract Enum, though I ended up moving it back to MonoDevelop.

The main problem of DoGlobalOperationOn is the lack of NRefactory unit testing infrastructure. Whereas most actions and issues are easily tested and, therefore, require only minimal testing in the IDE, DoGlobalOperationOn must be manually tested all the way, which is massively inconvenient.

Also, not all extract enum features are fully available yet. It is not yet possible to choose which fields to include in the new enum due to a bug in Xwt.Gtk. Once that's fixed, I believe everything will work as expected.

I also investigated ILSpy to see how easy it would be to reuse code from it to implement a new code action: Convert LINQ fluent syntax to LINQ query. This would essentially be the opposite action of what I made last week.

The portion of ILSpy I studied was the decompiler. The decompiler takes .NET code (e.g. an executable or class library) and generates C# code back from it. It includes a variety of features including the ability to convert generated LINQ code back to LINQ queries.
LINQ query decompilation seems to work in two phases: first it generates simple queries, then it combines those queries:
list.Where(x => x > 0).Select(y => y * 2);
//After phase 1
from y in from x in list
          where x > 0
          select x
select y * 2;
//After phase 2
from x in list
where x > 0
select x into y
select y * 2;

The code for the first phase (introduce queries) seems pretty solid, but the code for the second phase is far less so. It depends heavily on the particular output of the decompiler and, as such, is unsuitable for use in user code.

This week, I'll try to implement global operation unit testing and some other actions.

Monday, July 1, 2013

Code reuse and Global Operations - Google Summer of Code Week 2

This is the second week of my Google Summer of Code project. Last week, I implemented only two small and simple code actions. This week, I got a bit more ambitious.
Technically, some of this work was finished after midnight on Sunday; I'll leave it to the reader to decide whether that's "cheating".

Null coalescing and pattern matching

The first action I'll be talking about is the ConvertIfToNullCoalescing action.
//This action takes this:
object x = Foo();
if (x == null) x = Bar();

//And replaces it by this:
object x = Foo() ?? Bar();
This is a very simple action - similar to the stuff I wrote last week.
What changes this time is the heavy use of pattern matching.

With pattern matching, it's possible to define a certain pattern (e.g. if (someExpression) doSomething();) and check if a node matches that pattern. For instance, the pattern if (<Any>) { <Any> } recognizes if (true) { return 0; }, but not while (true) {} nor if (true) return 0;.

It's also possible to have named nodes. With named nodes, it's possible to know what matched certain parts of the pattern. For instance, using the pattern if (<condition> = <Any>) {} to match if (true) {}, named nodes tell us that the condition is true.

Of course, this is a very simple example. Pattern matching gets more useful in more complex cases.
Also, note that patterns are created as normal C# objects - the string representation I used above was merely intended as an example.

Dispose and IDisposable

Another simple issue I wrote detects types that have a void Dispose() method but don't implement IDisposable, and fixes the problem automatically for classes.
public class Test
{
    public void Dispose() {}
}
//Is converted to
public class Test : System.IDisposable
{
    public void Dispose() {}
}
The issue is disabled for interfaces. I initially implemented that feature but ultimately decided to remove it due to a problematic edge case with explicit implementation:
interface IA {
    void Dispose();
}
class B : IA {
    void IA.Dispose() {}
}
//Let's say the action converted this to:
interface IA : System.IDisposable
{
}
class B : IA {
    void IA.Dispose() {} //ERROR here
}
The only way for this action to be effective for explicit interface implementations would be to use a global operation, and I only figured out how to do that effectively on Sunday. More on global operations later.

LINQ Query and LINQ Fluent

One of the best features of .NET and C# is LINQ.
LINQ brings a declarative/functional taste to C# and greatly improves the readability of list operations - like SQL but better.

There are many ways to use LINQ in C#.
  1. One could use the methods of System.Linq.Enumerable, such as Enumerable.Where(myEnumerable, SomeFunction). This could be acceptable if there was no better way of doing things -- which there is;
  2. Use those same methods from Enumerable but in a different way. Those are extension methods and, therefore, can be used as myEnumerable.Where(SomeFunction). This is much better than the first option, especially because it's a Fluent Interface and therefore method calls can be chained -- myEnumerable.Where(Foo).Select(Bar)
  3. The last option is the query syntax. Instead of using the method calls, the programmer can use special syntax designed specifically for this purpose. More on this later
Query syntax is translated by the C# compiler to fluent syntax.
var seq = from item in myEnumerable
          where Foo(item)
          select Bar(item);

//Is converted to:
var seq = myEnumerable.Where(item => Foo(item)).Select(item => Bar(item));
Because query syntax is blindly converted to method calls, it can be used for a lot more than just Enumerables. Anything with the correct methods (e.g. Where and Select) can be used with query syntax - they don't need to be extension methods; normal instance methods work just fine.

And, in fact, the argument doesn't even need to be of type System.Func.
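To see this in action, here's a self-contained sketch with a hypothetical Box type whose ordinary instance methods (not extension methods, not IEnumerable) are picked up by query syntax:

```csharp
using System;

class Box
{
    public int Value;
    public Box(int v) { Value = v; }
    // Plain instance methods with the shapes the compiler looks for.
    public Box Where(Func<int, bool> pred) => pred(Value) ? this : new Box(0);
    public Box Select(Func<int, int> map) => new Box(map(Value));
}

class Demo
{
    static void Main()
    {
        var box = new Box(21);
        // The compiler blindly rewrites this to box.Where(...).Select(...)
        var result = from v in box
                     where v > 0
                     select v * 2;
        Console.WriteLine(result.Value); // 42
    }
}
```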

LINQ is not just used for enumerables. Some of the best uses of LINQ are related to databases.
int idToFind = 10;
var userWithId = (from user in database.Users
                  where user.Id == idToFind
                  select new { user.Name, user.Score }).First();
Console.WriteLine("User {0} has {1} point(s).", userWithId.Name, userWithId.Score);
//The query is translated to
int idToFind = 10;
var userWithId = database.Users.Where(user => user.Id == idToFind)
                     .Select(user => new { user.Name, user.Score }).First();
Console.WriteLine("User {0} has {1} point(s).", userWithId.Name, userWithId.Score);
In this case, database.Users is not an enumerable. It's an instance of IQueryable.

Instances of IQueryable actually read the contents of the passed lambdas -- they use Expression Trees. And it works just fine.
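A minimal sketch of the mechanism: assigning a lambda to an Expression<Func<...>> makes the compiler emit a data structure describing the code instead of compiled IL, which is what lets query providers translate it (e.g. to SQL):

```csharp
using System;
using System.Linq.Expressions;

class Demo
{
    static void Main()
    {
        // Not a delegate: an object tree describing the expression "x > 0".
        Expression<Func<int, bool>> expr = x => x > 0;
        Console.WriteLine(expr.Body); // (x > 0)

        // The tree can still be compiled into a runnable delegate on demand.
        Func<int, bool> compiled = expr.Compile();
        Console.WriteLine(compiled(5)); // True
    }
}
```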

So which is best? Query syntax or fluent syntax? It depends.
In some cases, fluent syntax is the simplest and most compact solution.
In other cases, query syntax is the most readable option - since joins and other complex queries do not involve complex method calls with anonymous types.

The action I implemented converts query syntax to fluent syntax. It should be capable of handling any query. So, whenever a query would be best written with fluent syntax, there's no need to manually convert it (and risk introducing new bugs!) - just let the code action do it.

This code action turned out to require less effort than I expected, since NRefactory already had a class to do what I needed - QueryExpressionExpander. Sadly, I didn't know about that class, so I ended up reading the relevant portion of the C# specification and writing my own code. Later, when I found out about QueryExpressionExpander, I rewrote a significant portion of my own work to use it - fixing a bug in NRefactory and adding a feature to QueryExpressionExpander along the way.

As it turns out, the C# specification is surprisingly readable. I was expecting an impenetrable wall of technical text, but instead I was greeted by a document full of examples and simple to understand.

Convert to Enum - Global Operations

The last action I wrote this week is again a tale of finding the right class to use.

Anyway, the story of this action started with an old bug report (and by old I mean half a year old). A user with lots of Java code automatically converted to C# had lots of fields like public const int SOME_PREFIX_SOME_NAME and wanted to convert all those fields to enumerations. Code actions to the rescue!
//We have this
public const int SOME_PREFIX_A = 1;
public const int SOME_PREFIX_B = 2;
//We want this
public enum SOME_PREFIX : int
{
    A = 1,
    B = 2
}
I think it's not an overstatement to say this was the hardest action to implement so far. And I'm not even sure I'm done yet.

The main problem with this action is that references to SOME_PREFIX_A become references to SOME_PREFIX.A. This means changing files besides the one with the field declaration.
It's not a simple find-and-replace either. We don't want to change strings nor comments. And we don't want to replace different fields with the same name.
As if things weren't bad enough, enumerations aren't implicitly cast to integers. The "perfect" version of this action would also replace method parameter types. This would soon become tremendously complex.
I opted to develop a more limited version of this conversion. Instead of replacing types everywhere, this just adds a type cast to the enum underlying type. So instead of SOME_PREFIX.A, we'd have ((int) SOME_PREFIX.A).
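Putting the post's own example together, the limited rewrite produces something like this (SOME_PREFIX and its members are the hypothetical names from the bug report):

```csharp
using System;

// The enum generated from the SOME_PREFIX_* constant fields:
enum SOME_PREFIX : int { A = 1, B = 2 }

class Demo
{
    static void Main()
    {
        // Enums don't implicitly convert to int, so instead of rewriting
        // the types of every variable and parameter involved, the action
        // just inserts an explicit cast at each former field reference:
        int value = ((int) SOME_PREFIX.A);
        Console.WriteLine(value); // 1
    }
}
```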

So, back to global operations. A global operation is an operation that changes files other than the current one. NRefactory has a method just for that: script.DoGlobalOperationOn.

Now, it would have been just fine if I had simply used that and been done with it. Unfortunately, I only discovered that method (and what it really did) very late. Before finding it, I went on a tour of the MonoDevelop source code. While the information I learned has helped me understand MonoDevelop better, I came full circle and ended up where I started - in NRefactory.

This code action is still not fully tested, so it might not work perfectly in some edge cases yet.

Conclusion

This week's work was harder than last week's. Still, I managed to get most of my planned work done.
Additionally, I'm gradually learning about new APIs of NRefactory, so I expect things to get easier as I get more experienced.

The main problem this week was spending time figuring out how to do things, only to find out there was an API just for that. Fortunately, the number of APIs in NRefactory is finite, so I'm hoping this situation won't last forever.

See Also

Google Summer of Code - Week 1
Last week's post

mono-soc-2013/NRefactory
As usual, my work is available in the mono-soc-2013 NRefactory github repository. Check the branches with names starting with "luiscubal-". The exception is code that's been merged back into the master branch of the official repository. Those branches have been deleted.

Download C# Language Specification 5.0 from Official Microsoft Download Center
The spec also comes bundled with Visual Studio (except Express). So if you already have it installed, just go to the VC#\Specifications\1033 folder. The exact location of this folder will depend on where you decided to install Visual Studio, but Program Files is a good place to start the search.

Sunday, June 23, 2013

Google Summer of Code - Week 1

First of all, the good news: I have been accepted into Google Summer of Code 2013. I'll be working with MonoDevelop and NRefactory to add new source analysis features.
I'll start by explaining what these projects are, then I'll show my work this week and finish with a commentary of what I saw and learned this week.

An Introduction

MonoDevelop is a cross-platform IDE that supports several programming languages. Among those languages, the most important one is C#.

NRefactory is a C# library that allows applications to parse and manipulate C# source code (it also supports a few other languages, but my project won't cover those).
With NRefactory, it is possible to analyze source code for patterns and programmatically modify the code. This can be extremely helpful when refactoring.

NRefactory includes code actions and code issues.

Actions are simple ways to modify code to something slightly different. For instance, there is a code action to turn a variable declaration with type inference into a variable declaration with the explicit type.

//This:
var x = new List<int>();
//Becomes:
List<int> x = new List<int>();

Issues detect anti-patterns: Code that is likely hard to maintain and a potential source of bugs, or just poor taste - often, issues can be automatically fixed.
For instance, there is an issue that detects when a variable is assigned to itself.

class SomeType
{
    int x;

    public SomeType(int x) {
        x = x; // <-- This is useless, the programmer probably meant this.x = x
    }
}

Actions and issues are not the only features of NRefactory. It can be used to implement code completion, format code and pretty much anything that requires code manipulation.

MonoDevelop integrates NRefactory to provide a first-class programming experience. Right-clicking the code shows the applicable issues and actions. Issues are harder to ignore than actions, since the IDE will show those more prominently - bad code usually appears with a curly green underline.

Before proceeding, it might be a good idea to read Using NRefactory for analysing C# code on CodeProject. This post will be easier to understand if you do.

My Project - Week 1

This summer, I will be adding new source analysis features to NRefactory and MonoDevelop. This is officially my first week of work. Because I've been busy with exams, I haven't been able to fully dedicate myself to this project. But still, I did manage to get some decent work done.

This week, I implemented one code issue, one code action and a bug fix.

lock(this) and MethodImplOptions.Synchronized


The code issue detects a synchronization anti-pattern.

Proper synchronization is a critical - yet hard - part of multi-threading. When synchronization is done right, threads work correctly. Poorly done synchronization can cause lots of headaches (see deadlocks and race conditions).

Because synchronization is so error-prone, it should be as surprise-free as possible. Synchronization code should be straightforward and have as few "gotchas" as possible - ideally, none.

If it's not obvious what code is locking what mutexes, then the code really isn't straightforward. If some random programmer using a library can lock a mutex used by the library and the original library author doesn't know about it - that can be a huge problem.

In C#, locking is done with the lock statement. Here's an example of what it looks like:

class SynchronizedExample
{
    string someString = string.Empty;
    public string SomeString {
        get { return someString; }
    }

    public void AddToString(string stringToAdd) {
        lock (this) { //this line is bad, we'll soon see why
            someString += stringToAdd;
        }
    }
}
The code above ensures there are no race conditions when appending data to a string. The this object is used as a mutex so two different objects can access their own strings simultaneously in different threads, but multiple threads can't append to the same string at the same time - if they try, one of them is forced to wait.

lock(this) is not a good idea.

The reason is that anyone with a SynchronizedExample instance can lock on it.
var example = new SynchronizedExample();
lock (example) { ... }

In practice, this means the original author of SynchronizedExample can never really know when the mutex is locked. This code is a ticking bomb. It isn't blowing up yet, but it could at any time.

Ticking bombs are good candidates for code issues. The new LockThisIssue detects this problem and suggests a fix. The code above is underlined and right-clicking shows a message warning the programmer that lock(this) is bad. Clicking on auto-fix turns it into this:

class SynchronizedExample
{
    string someString = string.Empty;
    public string SomeString {
        get { return someString; }
    }

    object locker = new object();
    public void AddToString(string stringToAdd) {
        lock (locker) {
            someString += stringToAdd;
        }
    }
}

Well, the comment wouldn't be deleted, but other than that, everything is fine now. Synchronization remains correct, but now it's gotcha-free. No unpleasant surprises waiting to happen.

This is not all this issue corrects.

.NET has an attribute called MethodImplAttribute. This attribute exposes several features. One of those features is synchronization.
Exactly why .NET has this "feature" is unclear but, in short, it is roughly equivalent to adding a lock (this) statement that includes the entire method.
It sucked before, it still sucks now. So the issue detects the problem, removes or fixes the attribute and adds the proper lock statement.
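For reference, this is what the attribute form looks like before the issue rewrites it (the class name here is made up for the example):

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

class SynchronizedAttributeExample
{
    // Roughly equivalent to wrapping the entire method body in lock (this)
    // (or a lock on the type object, for static methods):
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void DoWork() { }

    static void Main()
    {
        // The attribute shows up in the method's implementation flags:
        MethodImplAttributes flags = typeof(SynchronizedAttributeExample)
            .GetMethod("DoWork")
            .GetMethodImplementationFlags();
        Console.WriteLine((flags & MethodImplAttributes.Synchronized) != 0); // True
    }
}
```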

Auto-Linq Sum


As I said before, actions allow transforming code. The code doesn't have to be a red flag. Actions can just show an alternative. Use it if you like the alternative, ignore it if you don't.

The action I implemented converts some foreach loops to LINQ expressions.

int result = 0;
foreach (int x in intList) {
    result += x;
}
//Is converted to:
int result = intList.Sum();


It also supports if/else statements in the loop, multiple statements and more complex expressions.

int result = 0;
foreach (int x in intList) {
    if (x > 0) {
        result += x * 2;
    }
}
//Is converted to:
int result = intList.Where(x => x > 0).Sum(x => x * 2);
Using a loop is not wrong, so it's not a code issue. But I believe loop-to-LINQ is still a refactoring tool worth having.

Final Notes

Both MonoDevelop and NRefactory are interesting projects and I expect my work to make it to future versions. In fact, the two major features I described in this post have already been merged into master.

Here are a few of the things I've found:

git submodules: Essentially, when Project A depends on Project B, the git repository for Project A can include the git repository for Project B as a submodule. The submodules have their own separate commits and branches - they are essentially independent since they are a separate repository. This seems like a good way to handle dependencies.

Unit Tests: NRefactory has the most extensive unit test suite I've ever seen and it's pretty cool. So far, I haven't had many opportunities to see unit tests in action. Most of the unit tests I've seen were really just examples, but NRefactory is a real-world case that really proves how much tests can help.
For instance, when testing a code action, the programmer can say which action to test, the original source code and the intended output. If they match, the test passes. Otherwise, the test fails.
One can just add a bunch of normal cases, a few harder ones (for edge cases) and tests to detect potential problems. NRefactory was the first time I ever used Test-Driven Development in a real-world scenario. I've found myself writing tests when the code action was merely a stub and I'm satisfied with the results.

Linking: The API does not always behave the way I'd expect it to behave. I've been bitten in a few ways since I started, though I'm getting used to the details quickly.
One example is linking: script.Link is a neat feature. It tells the IDE that some identifiers are actually one and the same.
If an action creates lots of statements in the code and they all refer to the same variable, Link lets the user choose a name and the whole code will be fixed to use that name.
Now, Link is not supposed to be applied to code that's been removed. This is perfectly acceptable (why change what no longer exists?), but nothing in the code detects when it happens accidentally. It is what I'd call "undefined behavior".
The tests just silently ignore it. Everything will seem to work except for some puzzling extra whitespace that shouldn't exist. But in MonoDevelop, that same code will behave differently - and generate code that's just wrong. Of course, the solution is not to use it that way, but again that does not make debugging easy when it happens by accident.
See the bug report (pictures included).

Edge cases: C# is a big language with lots of features. This is great for the programmer, but it does mean that tools such as NRefactory must be tested against lots of edge cases. In fact, pretty much all bugs I found in NRefactory (the ones that weren't my fault, that is) were related to features the original author just didn't think of when writing the code: type inference, anonymous types, extension methods, generics and the many things that can be on the right side of a lambda expression (is it a statement? An expression? Is it a method with a void return type?).

This concludes my work for the week.

Further Reading

Official NRefactory github repository (icsharpcode/NRefactory)
This is where the code for NRefactory is.

Mono Summer of Code 2013 NRefactory github repository (mono-soc-2013/NRefactory)
The code for this year's mono summer of code projects. My branches are prefixed with "luiscubal-" so they should be easy to find. Once my pull requests are accepted (or no longer applicable), my corresponding branch is deleted, so the work I described here is found on the official NRefactory repository.

Mike's MonoDevelop Blog: How To: Write a C# Issue Provider
This post explains how to create a new code issue for NRefactory

Source Analysis Improvements in MonoDevelop - Project details
My summer of code project page. Very little to see here, move along. (I should probably improve it... one of these days)

.net - What does MethodImplOptions.Synchronized do? - Stack Overflow
A very brief and oversimplified explanation of what MethodImplOptions.Synchronized does.

Sunday, April 21, 2013

Asynchronous OpenGL texture loading with C# and OpenTK

Games often have a large quantity of resources to load. When streaming is not an option (or not a good option), loading screens are one common solution.
Loading screens should be as short as possible (to reduce player frustration) and should include some sort of animation to let the player know that the game isn't frozen.
This post focuses on two questions:
  • How can we take advantage of multi-core processors to reduce loading times?
  • How can we create/load OpenGL textures (and other OpenGL resources) and render something to the screen at the same time?
We'll be exploring the following methods to implement loading screens:
  • Synchronous texture loading.
  • Single-threaded alternating texture loading.
  • Helper loading thread.
  • Multi-threaded loading with Parallel.ForEach.
  • Task-based loading with TaskCompletionSource or a custom TaskScheduler.

Required Knowledge

  • The code is written in C#, so basic knowledge of C# is necessary.
  • Basic OpenGL knowledge would help.  We'll be using OpenTK in this post - a .NET library that allows C# programs to use OpenGL (and a few other APIs as well).

Synchronous Texture Loading

First, we'll explore how a synchronous texture loading system could work, its advantages and its shortcomings.
To create a texture in OpenGL, we need to follow these steps:
  1. Read the contents of the texture file
  2. Decode them
  3. Create a new OpenGL texture name (see glGenTextures).
  4. Send the texture data to OpenGL with glTexImage2D.
  5. Set the image parameters.
  6. Release the data of the texture file we loaded.
To load and decode textures, we'll use the System.Drawing.Bitmap class.
using (var bitmap = new Bitmap(filename)) {
    width = bitmap.Width;
    height = bitmap.Height;
    textureId = GL.GenTexture();
    GL.BindTexture(TextureTarget.Texture2D, textureId);
    var bitmapData = bitmap.LockBits(
        new Rectangle(0, 0, bitmap.Width, bitmap.Height),
        System.Drawing.Imaging.ImageLockMode.ReadOnly,
        System.Drawing.Imaging.PixelFormat.Format32bppArgb);
    try
    {
        GL.TexImage2D(TextureTarget.Texture2D, 0, PixelInternalFormat.Rgba,
            bitmap.Width, bitmap.Height, 0, PixelFormat.Bgra, PixelType.UnsignedByte,
            bitmapData.Scan0);
    }
    finally
    {
        bitmap.UnlockBits(bitmapData);
    }
    GL.TexParameter(TextureTarget.Texture2D,
        TextureParameterName.TextureMinFilter, (int)TextureMinFilter.Nearest);
    GL.TexParameter(TextureTarget.Texture2D,
        TextureParameterName.TextureMagFilter, (int)TextureMagFilter.Nearest);
}
This approach is simple. And simplicity is a feature.
The main problem with this approach is that it is blocking. While the textures are being loaded, nothing else can execute. That includes rendering.
Until all textures are loaded, the game freezes. This is undesirable.
I've tested this method to load 1500 textures and it took around 2200ms. That's two seconds waiting for something to happen with no visual feedback.

Loading Screen

So now the big question is how to render and load at the same time.
One possible answer is to load a few textures per frame.
For instance, if we have 1500 textures to load, we could load 50 each frame and render normally.
After 30 frames, the loading would be over.

This approach works because we never load too many textures in each frame (and each texture individually takes only a short time to load) so the loading screen is somewhat smooth.
Since it's just a loading screen, it probably won't have anything too heavy anyway and frame rate drops are acceptable.
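Here's a minimal sketch of this idea. The class and member names are mine, and LoadTexture stands in for the synchronous loading code shown above:

```csharp
using System.Collections.Generic;

public class LoadingScreen
{
    const int TexturesPerFrame = 50;
    readonly Queue<string> pendingTextures;

    public LoadingScreen(IEnumerable<string> filenames)
    {
        pendingTextures = new Queue<string>(filenames);
    }

    public bool IsDone
    {
        get { return pendingTextures.Count == 0; }
    }

    //Called once per frame, before rendering.
    public void UpdateFrame()
    {
        for (int i = 0; i < TexturesPerFrame && pendingTextures.Count > 0; ++i)
        {
            LoadTexture(pendingTextures.Dequeue());
        }
    }

    void LoadTexture(string filename)
    {
        //Stand-in for the synchronous loading code shown above.
    }
}
```

With 1500 queued textures and 50 per frame, IsDone becomes true after 30 UpdateFrame calls, and every frame in between still gets to render.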

I haven't measured this method, but I'd expect it to take a bit longer than the blocking one.

But we still have one remaining question: How can we use multi-core processors to speed this up?

Concurrency

There are many ways to implement concurrency.
The best known method is probably multi-threading.
Essentially, if you want to execute three tasks at once you create three threads. Each thread then executes a task.
All running programs have at least one thread.
The thing about threads is that aside from sharing memory, they are essentially independent of each other. This is, of course, a feature. But thinking about multiple pieces of code executing at the same time tends to be harder than thinking about the equivalent single-threaded code, since threads introduce several brand new types of errors.
As a result, it is currently harder to write correct multi-threaded programs than to write correct single-threaded programs.
So why bother?
While the usefulness of threads is somewhat limited (but still existent) in single-core processors, it's on multi-core (or at least hyper-threaded) processors that threads really shine.
Typically, the operating system will assign threads to processor cores. Often, we'll have different threads running on different cores. It's when this happens that we get true parallelism (as opposed to preemptive multitasking). In theory, parallelism can speed up programs up to N times (where N is the number of cores). EDIT: In some rare cases, it can be higher than N - See super-linear speedup.
In practice, however, not all tasks can be parallelized and multithreading adds some overhead - so the speed up is typically smaller than that. How much smaller? It depends. In some cases, the multi-threaded program might actually be slower than the single-threaded one.
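To put a number on "typically smaller", Amdahl's law gives an upper bound: if a fraction p of the work can be parallelized, N cores can speed it up by at most 1 / ((1 - p) + p / N). A quick sketch (the class and method names are mine):

```csharp
public static class AmdahlDemo
{
    //Upper bound on speedup when a fraction parallelFraction of the work
    //runs on 'cores' cores and the rest stays serial.
    public static double Speedup(double parallelFraction, int cores)
    {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / cores);
    }
}
```

Even if 90% of the work parallelizes perfectly, four cores give at most about a 3.08x speedup - and no number of cores can ever beat 10x, because the serial 10% is always there.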

So how do we use threads to improve our loading screen?
The first option is to create a helper thread that does the loading, while our main thread is rendering.
This approach essentially offers very little advantage over our load-a-few-textures-per-frame approach.
  • It can only use up to two cores - so if we have more than that, tough luck.
  • Only one core does the loading. We can expect one of the cores to be significantly busier than the other one. So our program won't be anywhere near twice as fast as the first one. I haven't really tested this approach, but I expect it to take around 2200ms as well (the time the blocking method takes). Perhaps a bit longer, due to the overhead of multi-threading.
The first issue we'd encounter if we tried this method is that OpenGL would refuse to load our textures from the loader thread.
To understand why, we need to understand what OpenGL contexts are.

OpenGL contexts

A context represents the state of OpenGL and all sorts of info that OpenGL needs to know in order to do anything.
That means no OpenGL command can be executed without a context.
Many libraries hide the creation of a context from the programmer so that OpenGL can be used in blissful ignorance.
All that bliss comes crashing down when threads are introduced.

Each thread can only have one context (called the current context) and each context can only be current to one thread. A context doesn't necessarily have to be the current context of any thread, but a context that's never made current is worthless.
Changing the current context of a thread is an expensive operation and, as such, should be avoided as much as possible.

Contexts can be independent of each other or they can share resources such as textures. By default, context sharing is enabled in OpenTK.

So for our loader thread proposal to work, we'd have to create a new OpenGL context and make it current for the loader thread. Then it could work.
But at this point, we still don't take full advantage of multi-threading.

Parallel.ForEach

Threads incur overhead. That much is a fact. This overhead can be mitigated but never eliminated.
This overhead is acceptable when the gains are enough to exceed it. But the gains of multi-threading are limited.
Let's simplify the problem and ignore the whole mess of I/O bound vs CPU bound and hyperthreading. Then we can say that if there are more threads than processor cores, the threads can't be all run in parallel. We get some parallelism, but the operating system has to use other methods to ensure that all threads get to run.
We still pay the cost of the extra threads, and performance starts to degrade.
So let's say using 10 threads gives us the best result, but we have 1000 textures to load. How do we split work across the 10 threads?
The obvious answer is "let each thread load 100 textures each".

But not all textures necessarily take the same time to load. One small texture will typically be faster to load than one large texture. What if one thread gets 100 fast textures and another thread gets 100 slow textures? Clearly, we'd not be taking proper advantage of threads.
And how do we even know 10 threads are the best?

These are all hard problems.
In .NET, this problem was solved with the introduction of the Task Parallel Library. There's no need to reinvent the wheel in this case.

Here's how the code would look:
//Sequential code
foreach (var textureToLoad in texturesToLoad) {
    LoadTexture(textureToLoad);
}

//Parallel code
Parallel.ForEach(texturesToLoad, textureToLoad => {
    LoadTexture(textureToLoad);
});
This example assumes, of course, that LoadTexture is thread-safe. In my own code, I decided to store all loaded textures in a Dictionary. However, Dictionary is not thread-safe, so I had to replace it with a ConcurrentDictionary. Always be careful about that sort of thing when writing multithreaded code.
Although we could rewrite this example using PLINQ Select, I've decided to stick to Parallel.ForEach because we're not done yet. We're going to use a different Parallel.ForEach overload - and I'm unaware of any PLINQ equivalent.
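To illustrate the thread-safety point, here's a sketch of a texture store built on ConcurrentDictionary. The texture ids here are fake placeholders generated with Interlocked.Increment - there's no real OpenGL call - but GetOrLoad is safe to call from Parallel.ForEach worker threads:

```csharp
using System.Collections.Concurrent;
using System.Threading;

public class TextureStore
{
    readonly ConcurrentDictionary<string, int> textures =
        new ConcurrentDictionary<string, int>();
    int nextId;

    //Safe to call concurrently; each filename ends up with exactly one entry.
    public int GetOrLoad(string filename)
    {
        return textures.GetOrAdd(filename, key => Interlocked.Increment(ref nextId));
    }

    public int Count
    {
        get { return textures.Count; }
    }
}
```

One caveat: under contention, GetOrAdd may invoke the value factory more than once for the same key (only one result is kept), so a real loader shouldn't create the actual OpenGL texture inside it.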

The first problem is, once again, OpenGL contexts. Since Parallel.ForEach may use multiple threads, we need to set a new context for each thread. And, of course, destroy the context when we're done.

The second problem is that Parallel.ForEach is blocking. Even though it runs in parallel, it still makes our main thread wait for all the worker threads to finish their work. We do not want this.
Here's the fixed code:
//We'll be covering what the Task class is and what it does soon.
Task.Factory.StartNew(() =>
{
    Parallel.ForEach(texturesToLoad,
    () =>
    {
        //This is the code to initialize each new thread
        var window = new GameWindow(width, height, graphicsMode, "", gameWindowFlags);
        window.MakeCurrent();

        return window;
    },
    (textureToLoad, state, window) =>
    {
        //This is the code to load each texture
        LoadTexture(textureToLoad);

        return window;
    },
    (window) =>
    {
        //This is the code to finalize each thread
        window.Dispose();
    });
});
This solves our problem and takes advantage of parallelism to achieve faster results.
In my own tests, loading 1500 textures (the same textures we've loaded before) takes almost 2000ms. This is faster than our previous code but still somewhat disappointing.
We can do better than this.

Tasks and continuations

We've used Task.Factory before, but now it's time to really cover what it is.

Essentially, a Task is a unit of work. The best part about using tasks is that it opens up the way to use C# 5 syntax sugar that makes asynchronous programming less painful to use.
Though the prospect of using syntax sugar in the future is appealing, we will not actually be doing that today.
Today, we're going low-level.

So, how do we create a task anyway?
We've created a task in our example above using Task.Factory.StartNew. This is the simplest way to do it.
Task.Factory.StartNew(() => {
    Console.WriteLine("Hello from task");
});
The example above "just works". Once that code is executed, it'll create a task that's (potentially) executed in parallel.
Notice that the task is started as soon as it's created. This behavior is not strictly required. It's possible to construct a Task that's not immediately started with "new Task(action)", but by convention all functions that return tasks start them unless there's a really good reason not to.
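A small sketch of that difference (the class name is mine):

```csharp
using System.Threading.Tasks;

public static class TaskStartDemo
{
    public static TaskStatus[] Run()
    {
        var task = new Task(() => { });

        //The task exists, but nothing has been scheduled yet.
        var statusBeforeStart = task.Status; //TaskStatus.Created

        task.Start(); //Queue it to the default scheduler.
        task.Wait();  //Block until the action has run.

        return new[] { statusBeforeStart, task.Status };
    }
}
```

Run() returns Created followed by RanToCompletion: a constructor-made task does nothing at all until Start is called.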

At this point, Tasks don't do much we couldn't already do with Parallel.ForEach, with the added inconvenience that we can't set the OpenGL context of each thread.

For tasks to become more useful, we first need to introduce the concept of task continuations.
The purpose of task continuations is to allow performing some action only after a task has been completed.
Task.Factory.StartNew(() => 3)
    .ContinueWith(task => Console.WriteLine(task.Result));
In the example above, first we launch a task that returns 3 and we indicate that when that task is completed, we want to print the result in the screen.
With this, we can create tasks that depend on other tasks.
Task.Factory.StartNew(() => 3)
    .ContinueWith(task => {
        Task.Factory.StartNew(() => {
            Console.WriteLine(task.Result);
        });
    });
This example launches a second task after the first one has been completed. We'll soon see why this matters.

Loading textures in two steps

We'll now be trying a different approach to load our textures.
Up until now, we've been trying to parallelize the entire loading process. Now, we'll instead parallelize only a portion of the work.
Because we're parallelizing less work, we could expect performance to degrade, but the opposite seems to happen.

Let's go back to the steps required to load a texture:
  1. Read the contents of the texture file
  2. Decode them
  3. Create a new OpenGL texture name.
  4. Send the texture data to OpenGL.
  5. Set the image parameters.
  6. Release the data of the texture file we loaded.
The most expensive steps are the first two. What if we parallelized only these and kept the remaining ones single threaded?
foreach (var textureToLoad in texturesToLoad)
{
    var baseTask = new Task<Bitmap>(() =>
    {
        var bitmap = new Bitmap(textureToLoad.AsFilename());
        return bitmap;
    });

    baseTask.ContinueWith(task =>
    {
        ScheduleToBeLoadedInMainThread(() => {
            var bitmap = task.Result;

            LoadTextureFromBitmap(bitmap);

            bitmap.Dispose();
        });
    });

    baseTask.Start();
}
For each texture to load, we launch a new task. Once that task is completed, we run a second operation, which schedules the bitmap to be loaded later by the main thread.

Now, we combine this approach with our alternating loading approach.
We go to the UpdateFrame method and execute up to N actions (load up to N textures). The exact value of N depends on how much work you're willing to let each individual frame do. In my tests, I've used N=50.
We'll soon see how we can schedule work to be run in main thread.

Right now, we'll focus on a different problem. How to tell when we've finished loading.
If we're in a loading screen, we want to know we've finished loading so we can move on to the next screen.
For this, I've used a counter that indicates how many operations are left. When this counter reaches 0, we know we're done. Because the counter is read and modified by multiple threads, we must be careful and use atomic operations as necessary.
int counter = 0;
var baseTasks = new List<Task>();
foreach (var textureToLoad in texturesToLoad)
{
    var baseTask = new Task<Bitmap>(() =>
    {
        var bitmap = new Bitmap(textureToLoad.AsFilename());
        return bitmap;
    });

    baseTask.ContinueWith(task =>
    {
        ScheduleToBeLoadedInMainThread(() => {
            var bitmap = task.Result;

            LoadTextureFromBitmap(bitmap);

            bitmap.Dispose();

            if (Interlocked.Decrement(ref counter) == 0)
            {
                //We have no further textures to load
                ready = true;
            }
        });
    });

    baseTasks.Add(baseTask);
}

counter = baseTasks.Count;

foreach (var baseTask in baseTasks)
{
    baseTask.Start();
}
This is good and all, but the ready variable does not really allow the user of this code to fully take advantage of tasks. For instance, what if we want to add a continuation to run after we've finished loading all our textures?

TaskCompletionSource

TaskCompletionSource (TCS, for short) enables a whole new way to create tasks.

Up until now, we would start a new task by giving .NET a new action to run. The code would execute and then the task continuations would be executed.

With TCS, we tell C# "Hey, we're doing some stuff. We'll let you know when we're done and what the result was."
So while the tasks we've used up until now were somewhat active - they decided when they were done - the TCS paradigm is more passive - we decide when they are done.

TCS allows us to run code using a completely custom mechanism, and notify C# when we're done to allow continuations to be executed as they normally would.
public static Task WaitOneSecond() {
    var tcs = new TaskCompletionSource<object>();

    new Thread(() => {
        Thread.Sleep(1000);
        tcs.SetResult(null);
    }).Start();

    return tcs.Task;
}
Let's study the example above.
This function creates a new pending task. Then, we create a new thread that sleeps for one second and lets C# know when the waiting time is over.
We use TaskCompletionSource<object> because .NET doesn't offer a non-generic TaskCompletionSource. TaskCompletionSource<object>.Task gives us an instance of Task<object> which, fortunately, derives from Task, so it can be used wherever a plain Task is expected.
Note that we don't really "start" the task because it's up to us to decide how and when the task is executed.

We could use this function like this:
WaitOneSecond().ContinueWith(task => { Console.WriteLine("We're done!"); });
This code snippet waits one second and then prints a message on the screen.

In order to use TCS in our texture loader code, we only need to make very small adjustments.
int counter = 0;

var tcs = new TaskCompletionSource<object>();

var baseTasks = new List<Task>();
foreach (var textureToLoad in texturesToLoad)
{
    var baseTask = new Task<Bitmap>(() =>
    {
        var bitmap = new Bitmap(textureToLoad.AsFilename());
        return bitmap;
    });

    baseTask.ContinueWith(task =>
    {
        ScheduleToBeLoadedInMainThread(() => {
            var bitmap = task.Result;

            LoadTextureFromBitmap(bitmap);

            bitmap.Dispose();

            if (Interlocked.Decrement(ref counter) == 0)
            {
                //We have no further textures to load
                tcs.SetResult(null); //Tell TCS we're done!
            }
        });
    });

    baseTasks.Add(baseTask);
}

counter = baseTasks.Count;

foreach (var baseTask in baseTasks)
{
    baseTask.Start();
}

return tcs.Task;
And we're done.

There's one last missing piece. The ScheduleToBeLoadedInMainThread function.
This function can be implemented in many ways. We'll see one way that's easily reusable by other related classes (that also need to run code in the OpenGL thread) and integrates nicely with other tasks.

TaskFactory and TaskScheduler

First of all, how would we implement ScheduleToBeLoadedInMainThread with TaskCompletionSource? We'd keep a list of actions to execute, then provide a method to add a new action to this list and return a TaskCompletionSource.
Our UpdateFrame method would take elements from this list, execute the action and then notify the TCS.
Now we need to keep track of actions and their respective completion sources.
This works and it's simple. However, we'll cover a different approach here.
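For reference, that simpler TCS-based version could look something like this (a sketch; MainThreadQueue and its members are hypothetical names):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class MainThreadQueue
{
    readonly ConcurrentQueue<Action> actions = new ConcurrentQueue<Action>();

    //Called from any thread. The returned Task completes
    //once the main thread has actually run the action.
    public Task Schedule(Action action)
    {
        var tcs = new TaskCompletionSource<object>();
        actions.Enqueue(() =>
        {
            try
            {
                action();
                tcs.SetResult(null);
            }
            catch (Exception e)
            {
                tcs.SetException(e);
            }
        });
        return tcs.Task;
    }

    //Called from UpdateFrame. Returns false when there's nothing left to run.
    public bool ExecuteNext()
    {
        Action action;
        if (actions.TryDequeue(out action))
        {
            action();
            return true;
        }
        return false;
    }
}
```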

So far, we've created new tasks with the Task constructor or Task.Factory.StartNew(action).
Let's see what factories are with further detail.

A TaskFactory is exactly what the name tells us it is: A factory that creates tasks.
This is useful when we want to create many tasks that share a bunch of special properties.
var taskFactory = new TaskFactory();
taskFactory.StartNew(() => { Console.WriteLine("Do something"); });
This example is not particularly impressive. We're not doing anything we couldn't do before.
The thing is, now we can change the properties of the factory and let all created tasks share those same properties.
The one we're going to change is TaskScheduler.

A scheduler is a class that queues tasks and controls when and how they are executed. We'll be subclassing TaskScheduler.

We want our TaskScheduler to store the pending tasks and then allow its owner to decide when to execute a task.
class UpdateTaskScheduler : TaskScheduler
{
    ConcurrentQueue<Task> tasks = new ConcurrentQueue<Task>();

    protected override void QueueTask(Task task)
    {
        tasks.Enqueue(task);
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return false;
    }

    protected override bool TryDequeue(Task task)
    {
        return false;
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return new List<Task>(tasks).AsReadOnly();
    }
    }

    public override int MaximumConcurrencyLevel
    {
        get
        {
            return 1;
        }
    }

    bool ExecuteNextTask()
    {
        Task task;
        if (tasks.TryDequeue(out task))
        {
            base.TryExecuteTask(task);
            return true;
        }
        return false;
    }

    public static TaskScheduler CreateNew(out Func<bool> executionFunction)
    {
        var scheduler = new UpdateTaskScheduler();
        executionFunction = scheduler.ExecuteNextTask;

        return scheduler;
    }
}
Let's see what this does.
First, we offer a mechanism to queue tasks. We disable task dequeuing so once a task is queued, it must run.
We also disable task inlining. Check the Further Reading section for more details.
The GetScheduledTasks() method is used for Visual Studio debugging. We just return a copy of our list of pending tasks.
MaximumConcurrencyLevel indicates that only one task from this scheduler is ever running at any given point in time.
Our ExecuteNextTask function executes a task. It returns true if it did manage to execute a task, false if there were no more tasks to execute.
We offer a CreateNew static method to create new schedulers. The out delegate allows the caller to use the ExecuteNextTask method of its scheduler.

To call the tasks in our main thread, we first create the scheduler and corresponding factory:
UITaskScheduler = UpdateTaskScheduler.CreateNew(out executeTask);
UITaskFactory = new TaskFactory(new CancellationToken(),
    TaskCreationOptions.HideScheduler,
    TaskContinuationOptions.HideScheduler, UITaskScheduler);
We add the following code in our UpdateFrame method to execute the tasks.
for (int i = 0; i < 50 && executeTask(); ++i)
{
}
Now, we're ready to complete our texture loader.
var uiFactory = Application.CurrentApplication.UITaskFactory;

int counter = 0;

var tcs = new TaskCompletionSource<object>();

var baseTasks = new List<Task>();
foreach (var textureToLoad in texturesToLoad)
{
    var baseTask = new Task<Bitmap>(() =>
    {
        var bitmap = new Bitmap(textureToLoad.AsFilename());
        return bitmap;
    });

    baseTask.ContinueWith(task =>
    {
        uiFactory.StartNew(() => {
            var bitmap = task.Result;

            LoadTextureFromBitmap(bitmap);

            bitmap.Dispose();

            if (Interlocked.Decrement(ref counter) == 0)
            {
                //We have no further textures to load
                tcs.SetResult(null); //Tell TCS we're done!
            }
        });
    });

    baseTasks.Add(baseTask);
}

counter = baseTasks.Count;

foreach (var baseTask in baseTasks)
{
    baseTask.Start();
}

return tcs.Task;
And that's it.

Final Adjustments

We have one final problem.
Running the code above may cause an OutOfMemoryException.
The problem is that we're loading lots of Bitmaps to memory, but we're not exporting them to OpenGL and disposing them fast enough.
We could try to set MaxDegreeOfParallelism, but that wouldn't solve the problem. MaxDegreeOfParallelism merely limits how many tasks can run at the same time, not how many tasks we can have pending in our UI scheduler.
One possible solution uses semaphores. We create a new semaphore (count=100) when we start the texture loading method, call semaphore.WaitOne() before allocating a new Bitmap and semaphore.Release() after bitmap.Dispose().
That way, there are never more than 100 bitmaps in memory waiting to be loaded to OpenGL.(EDIT: I'm not sure if this is a good idea. Semaphores are blocking, and tasks should generally avoid using blocking methods).
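A sketch of that idea using SemaphoreSlim (a lighter-weight alternative to Semaphore; the class name and the count of 100 are illustrative):

```csharp
using System.Threading;

public class BitmapThrottle
{
    //At most 100 decoded bitmaps alive at any moment.
    readonly SemaphoreSlim slots = new SemaphoreSlim(100);

    //Worker thread: call before allocating a new Bitmap.
    //Blocks while 100 bitmaps are already waiting for the main thread.
    public void AcquireSlot()
    {
        slots.Wait();
    }

    //Main thread: call right after bitmap.Dispose().
    public void ReleaseSlot()
    {
        slots.Release();
    }

    public int FreeSlots
    {
        get { return slots.CurrentCount; }
    }
}
```

Note the same caveat as above applies: AcquireSlot blocks the worker thread, which is generally something tasks should avoid.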
And that's it.

Conclusion

In my own tests, this final proposal takes 1100ms. That's twice as fast as the first approach.
While one second is not that long, this approach could be further tweaked to support slower use cases.
Also, we now have a framework to allow easily mixing worker threads and UI calls with C#'s TPL. Since we've used C# Tasks, our code is now compatible with the new C# 5 syntax sugar, so we could await our texture loading.
That's all.

Further Reading

No complete source code repository this time, sorry.
Still, the relevant code is all in the post, so you should still be able to use it. The code in this post was written by me (Copyright (C) Luís Reis, 2013) and it's MIT licensed.

Amdahl's law - Wikipedia, the free encyclopedia
The reason parallelism does not linearly improve the speed of most applications.

The Open Toolkit Library | OpenTK
The library the code in this post uses for OpenGL calls.

c# - What is the difference between task and thread? - Stack Overflow
If you're having trouble figuring out the difference.

Parallel.ForEach Method (System.Threading.Tasks)
The many ways to run a foreach loop in parallel in .NET.

Task Parallelism (Task Parallel Library)
Explains multiple task-related concepts, including some concepts not covered here such as task cancellation, child tasks, nested tasks, WhenAll and WhenAny.

Asynchronous Programming with Async and Await (C# and Visual Basic)
Shows how the new syntax sugar helps with task-based programming

Task.Wait and "Inlining" - .NET Parallel Programming - Site Home - MSDN Blogs
An overview of what task inlining is and how it works.