internal class Test
{
    private static IEnumerable<int> GetCounter()
    {
        return new <GetCounter>d__0(-2);
    }

    private sealed class <GetCounter>d__0 : IEnumerable<int>, IEnumerable, IEnumerator<int>, IEnumerator, IDisposable
    {
        // Fields
        private int <>1__state;
        private int <>2__current;
        private int <>l__initialThreadId;
        public int <count>5__1;

        public <GetCounter>d__0(int <>1__state)
        {
            this.<>1__state = <>1__state;
            this.<>l__initialThreadId = Thread.CurrentThread.ManagedThreadId;
        }

        private bool MoveNext()
        {
            switch (this.<>1__state)
            {
                case 0:
                    this.<>1__state = -1;
                    this.<count>5__1 = 0;
                    while (this.<count>5__1 < 10)
                    {
                        this.<>2__current = this.<count>5__1;
                        this.<>1__state = 1;
                        return true;
                    Label_0046:
                        this.<>1__state = -1;
                        this.<count>5__1++;
                    }
                    break;

                case 1:
                    goto Label_0046;
            }
            return false;
        }

        IEnumerator<int> IEnumerable<int>.GetEnumerator()
        {
            if ((Thread.CurrentThread.ManagedThreadId == this.<>l__initialThreadId) && (this.<>1__state == -2))
            {
                this.<>1__state = 0;
                return this;
            }
            return new Test.<GetCounter>d__0(0);
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return ((IEnumerable<Int32>) this).GetEnumerator();
        }

        void IEnumerator.Reset()
        {
            throw new NotSupportedException();
        }

        void IDisposable.Dispose()
        {
        }

        int IEnumerator<int>.Current
        {
            get
            {
                return this.<>2__current;
            }
        }

        object IEnumerator.Current
        {
            get
            {
                return this.<>2__current;
            }
        }
    }
}
I've shown the whole code again so that we can easily see the differences:
- Obviously, the iterator type now implements IEnumerable<int> but it also still implements IEnumerator<int>. It's very odd for something to be iterable and an iterator at the same time. It's an optimisation for a common case, as we'll see in a minute.
- The implementation of IEnumerator<int> remains exactly the same - Reset still throws an exception, Current still just returns the current value, and MoveNext() has the same logic in it.
- The method creating the iterator instance passes the constructor an initial state of -2 instead of 0.
- We have an extra private variable, <>l__initialThreadId, which is set in the constructor to reflect the thread which created the instance.
- GetEnumerator() either sets the state to 0 and returns this or it creates a new instance of the iterator which starts in state 0.
So, what's going on? Well, the most common use case (by far) is that an instance of IEnumerable<T> is created, then something (like a foreach statement) calls GetEnumerator() from the same thread, iterates through the data, and disposes of the IEnumerator<T> at the end. The original IEnumerable<T> is never used after the initial call to GetEnumerator(). Given the prevalence of that pattern, it makes sense for the C# compiler to choose a pattern which optimises towards that case. When that's the behaviour, we only create a single object even though we're using it to implement two different interfaces. The state of -2 is used to represent "GetEnumerator() hasn't been called yet" whereas 0 is used to represent "I'm ready to start iterating, although MoveNext() hasn't been called yet".
However, if you try to call GetEnumerator() either from a different thread, or when it's not in a state of -2, the code has to create a new instance in order to keep track of the different states. In the latter case you've basically got two independent counters, so they need independent data storage. GetEnumerator() deals with initializing the new iterator, and then returns it ready for action. The thread safety aspect is there to prevent two separate threads from independently calling GetEnumerator() at the same time, and both ending up with the same iterator (i.e. this).
That's the basic pattern when it comes to implementing IEnumerable<T>: the compiler implements all the interfaces in the same class, and the code lazily creates extra iterators when it has to. We'll see that there's more work to do when parameters are involved, but the basic principle is the same.
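To see the effect from the outside, here's a small sketch of my own (the SharedIteratorDemo class and its method names are hypothetical, not part of the original sample). It calls GetEnumerator() twice on the same sequence. What the language guarantees is that the two iterators advance independently. The reference comparisons in Main rely on the implementation detail described above, so going by the generated code I'd expect the first to print True and the second False, but a different compiler would be free to behave differently.

```csharp
using System;
using System.Collections.Generic;

public static class SharedIteratorDemo
{
    // Same shape as the GetCounter() method discussed above.
    static IEnumerable<int> GetCounter()
    {
        for (int count = 0; count < 10; count++)
        {
            yield return count;
        }
    }

    // Returns true if two iterators obtained from one sequence
    // keep track of their positions separately.
    public static bool IteratorsAreIndependent()
    {
        IEnumerable<int> sequence = GetCounter();
        using (IEnumerator<int> first = sequence.GetEnumerator())
        using (IEnumerator<int> second = sequence.GetEnumerator())
        {
            first.MoveNext();   // first.Current is now 0
            first.MoveNext();   // first.Current is now 1
            second.MoveNext();  // second.Current is now 0
            return first.Current == 1 && second.Current == 0;
        }
    }

    public static void Main()
    {
        IEnumerable<int> sequence = GetCounter();
        IEnumerator<int> first = sequence.GetEnumerator();
        IEnumerator<int> second = sequence.GetEnumerator();

        // Implementation detail, not a language guarantee: the first call
        // may hand back the sequence object itself, the second a new object.
        Console.WriteLine(ReferenceEquals(sequence, first));
        Console.WriteLine(ReferenceEquals(sequence, second));

        Console.WriteLine(IteratorsAreIndependent());
    }
}
```

The independence check is the part worth relying on in real code; the object identity is only interesting as a way of observing the optimisation.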
Choosing between interfaces to return
Normally, IEnumerable<T> is the most flexible interface to return. If your iterator block doesn't change anything, and your class isn't implementing IEnumerable<T> itself (in which case you'd have to return an IEnumerator<T> from your GetEnumerator() method, of course), it's a good choice. It allows clients to use foreach, iterate several times, use LINQ to Objects and general goodness. It's definitely worth using the generic interfaces instead of the nongeneric ones. From here on I'll only refer to the nongeneric interfaces in the text, but each time I'll mean both forms. (In other words, there's an important distinction between IEnumerable and IEnumerator, but from this point on I won't distinguish between IEnumerable and IEnumerable<T>.)
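As a rough illustration of the two situations, here's a sketch using names I've made up for the purpose (Countdown, InterfaceChoiceDemo, EvenNumbers and Sum are all hypothetical). The Countdown class is itself the IEnumerable<int>, so its iterator block has to return IEnumerator<int>; the standalone EvenNumbers method has no such constraint and returns IEnumerable<int>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection type, used only for illustration.
public sealed class Countdown : IEnumerable<int>
{
    private readonly int start;

    public Countdown(int start)
    {
        this.start = start;
    }

    // The class is the IEnumerable<int>, so the iterator block
    // implementing GetEnumerator() returns IEnumerator<int>.
    public IEnumerator<int> GetEnumerator()
    {
        for (int i = start; i > 0; i--)
        {
            yield return i;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class InterfaceChoiceDemo
{
    // A standalone method can return IEnumerable<int>, which lets
    // callers use foreach and iterate more than once.
    public static IEnumerable<int> EvenNumbers(int limit)
    {
        for (int i = 0; i < limit; i += 2)
        {
            yield return i;
        }
    }

    public static int Sum(IEnumerable<int> values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Sum(new Countdown(3)));   // 3 + 2 + 1
        Console.WriteLine(Sum(EvenNumbers(10)));    // 0 + 2 + 4 + 6 + 8
    }
}
```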
State management
There are up to five pieces of state that the iterator type needs to keep track of:
- Its "virtual instruction pointer" (i.e. where it's got to)
- Local variables
- Parameter initial values and this
- The creating thread (as shown above, and only in the IEnumerable case; I won't cover this further)
- The last yielded value (i.e. Current; this is trivial enough to not require separate attention)
We'll look at each of the first three in turn.
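To make the list a bit more concrete, here's a hand-written sketch of roughly how those pieces of state might map onto fields. This is my own approximation, not compiler output: the StateDemo, CountFrom, CountFromIterator and Drain names are hypothetical, and the real generated code uses different names and a slightly different shape. It returns IEnumerator<int>, so there's no creating-thread field.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class StateDemo
{
    // The iterator block version: two parameters and one local variable.
    public static IEnumerator<int> CountFrom(int start, int limit)
    {
        for (int value = start; value < limit; value++)
        {
            yield return value;
        }
    }

    // A hand-written rough equivalent (an assumption-laden sketch).
    public sealed class CountFromIterator : IEnumerator<int>
    {
        private int state;           // "virtual instruction pointer": 0 = before, 1 = suspended, -1 = running/after
        private int current;         // the last yielded value
        private int value;           // the local variable, hoisted into a field
        private readonly int start;  // copy of the first parameter
        private readonly int limit;  // copy of the second parameter

        public CountFromIterator(int start, int limit)
        {
            this.start = start;
            this.limit = limit;
            state = 0;
        }

        public bool MoveNext()
        {
            switch (state)
            {
                case 0:             // first call: start of the method
                    state = -1;
                    value = start;
                    break;
                case 1:             // resuming just after the yield return
                    state = -1;
                    value++;
                    break;
                default:            // already finished
                    return false;
            }
            if (value < limit)
            {
                current = value;
                state = 1;
                return true;
            }
            return false;
        }

        public int Current { get { return current; } }
        object IEnumerator.Current { get { return current; } }
        public void Reset() { throw new NotSupportedException(); }
        public void Dispose() { }
    }

    // Helper: runs an iterator to completion and joins the values.
    public static string Drain(IEnumerator<int> iterator)
    {
        List<int> values = new List<int>();
        while (iterator.MoveNext())
        {
            values.Add(iterator.Current);
        }
        return string.Join(",", values);
    }

    public static void Main()
    {
        Console.WriteLine(Drain(CountFrom(3, 6)));
        Console.WriteLine(Drain(new CountFromIterator(3, 6)));
    }
}
```

Both calls in Main should produce the same sequence, which is really the point: the iterator block is just a convenient way of asking the compiler to write something like CountFromIterator for you.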
Keeping track of where we've got to
The first piece of state in our state machine is the one which keeps track of how much code has executed from our original source. If you think of a normal state machine diagram (with circles and lines) this is which circle we're currently in. In many cases it's just referred to as the state - and indeed in our sample decompiled output so far we've seen it as <>1__state. (This is unfortunate as all of the rest of the data is state too, but never mind...) The specification refers to the states of before, running, suspended and after, but as we'll see suspended needs more detail - and we need an extra state for IEnumerable implementations.
Before I go any further, it's worth remembering that an iterator block doesn't just run from start to finish. When the method is originally called, the iterator is just created. It's only when MoveNext() is called (after a call to GetEnumerator() if we're using IEnumerable) that any of the code in the block starts to run. At that point, execution starts at the top of the method as normal, and progresses as far as the first yield return or yield break statement, or the end of the method. Then a Boolean value is returned to indicate whether or not the block has finished iterating. If/when MoveNext() is called again, the method continues executing from just after the yield return statement. (If the previous call finished for any other reason, we've finished iterating and nothing will happen.) Without looking at the generated code, let's write a small program to step through a simple iterator. Here's the code:
using System;
using System.Collections.Generic;

class Test
{
    static readonly string Padding = new string(' ', 30);

    static IEnumerator<int> GetNumbers()
    {
        Console.WriteLine(Padding + "First line of GetNumbers()");
        Console.WriteLine(Padding + "Just before yield return 0");
        yield return 10;
        Console.WriteLine(Padding + "Just after yield return 0");

        Console.WriteLine(Padding + "Just before yield return 1");
        yield return 20;
        Console.WriteLine(Padding + "Just after yield return 1");
    }

    static void Main()
    {
        Console.WriteLine("Calling GetNumbers()");
        IEnumerator<int> iterator = GetNumbers();
        Console.WriteLine("Calling MoveNext()...");
        bool more = iterator.MoveNext();
        Console.WriteLine("Result={0}; Current={1}", more, iterator.Current);

        Console.WriteLine("Calling MoveNext() again...");
        more = iterator.MoveNext();
        Console.WriteLine("Result={0}; Current={1}", more, iterator.Current);

        Console.WriteLine("Calling MoveNext() again...");
        more = iterator.MoveNext();
        Console.WriteLine("Result={0} (stopping)", more);
    }
}
I've included some padding for the output created in the iterator block to make the results clearer. The lines on the left are in the calling code; the lines on the right are in the iterator block:
Calling GetNumbers()
Calling MoveNext()...
                              First line of GetNumbers()
                              Just before yield return 0
Result=True; Current=10
Calling MoveNext() again...
                              Just after yield return 0
                              Just before yield return 1
Result=True; Current=20
Calling MoveNext() again...
                              Just after yield return 1
Result=False (stopping)
Now let's introduce the values that <>1__state can take on, and their meanings:
- -2: (IEnumerable only) Before the first call to GetEnumerator() from the creating thread
- -1: "Running" - the iterator is currently executing code; also used for "After" - the iterator has finished, either by reaching the end of the method or by hitting yield break
- 0: "Before" - MoveNext() hasn't been called yet
- Anything positive: indicates where to resume from; it's yielded at least one value, and there's possibly more to come. Positive states are also used when code is still running but within a try block with a corresponding finally block. We'll see why later.
It's interesting to note that the generated code doesn't distinguish between "running" and "after". There's really no reason why it should: if you call MoveNext() when the iterator's in that state (which may be due to it running in a different thread) then MoveNext() will just immediately return false. This state is also the one we end up in after an uncaught exception.
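The "after" behaviour is easy to observe without looking at any generated code. Here's a minimal sketch (AfterStateDemo and its methods are names I've invented for illustration): one iterator runs off the end of its block, the other throws, and in both cases a later call to MoveNext() simply returns false.

```csharp
using System;
using System.Collections.Generic;

public static class AfterStateDemo
{
    static IEnumerator<int> OneThenStop()
    {
        yield return 1;
    }

    static IEnumerator<int> OneThenThrow()
    {
        yield return 1;
        throw new InvalidOperationException("Bang");
    }

    // Keeps calling MoveNext() after the iterator has finished normally.
    public static string CallAfterCompletion()
    {
        IEnumerator<int> iterator = OneThenStop();
        bool first = iterator.MoveNext();   // runs as far as the yield return
        bool second = iterator.MoveNext();  // runs off the end of the block
        bool third = iterator.MoveNext();   // already finished
        return first + "," + second + "," + third;
    }

    // Calls MoveNext() again after an exception escaped the block.
    public static string CallAfterException()
    {
        IEnumerator<int> iterator = OneThenThrow();
        iterator.MoveNext();                // yields 1
        bool threw = false;
        try
        {
            iterator.MoveNext();            // the exception propagates to us
        }
        catch (InvalidOperationException)
        {
            threw = true;
        }
        bool afterwards = iterator.MoveNext();
        return threw + "," + afterwards;
    }

    public static void Main()
    {
        Console.WriteLine(CallAfterCompletion());   // True,False,False
        Console.WriteLine(CallAfterException());    // True,False
    }
}
```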
Now that we know what the states are for, let's look at what MoveNext() looks like for the above iterator. It's basically a switch statement that starts execution at a particular place in the code based on the state. That's always the case for MoveNext(), with the one exception of an iterator body which consists solely of a yield break.
PART 3: Continuous