projects ~ blog ~ archive ~ about

Forward-Referenced Recursive Types in Go

04 May 2011

Patch Accepted

I recently received some good news - after much code review (and much cleaning up of my messy first attempt :-) my go patch addressing issue 667 has been accepted into the mainline. Yay!

Recursive Types

The issue is related to invalid recursive types, so let’s cover that first - what exactly is a recursive type, and when is it invalid?

As to the first question - a recursive type is simply a type which, at some point or another references itself in its definition.

A simple example:-

type List struct {
        Value interface{}
        Next *List
}

A convoluted example (note the recursive types here are T2,T5-7) :-

type T1 struct { T2; T3 }
type T2 struct { T5; T7 }
type T3 T4
type T4 struct {}
type T5 T6
type T6 T7
type T7 struct { *T2 }

Note that both of these examples describe valid recursive types. So what is an invalid recursive type?

type Foo struct {
        foo Foo
}

What makes this invalid?

Simply put, determining the size required for storage of this type would (assuming we don’t check for this in advance) result in an infinite loop - there is a cycle in the type’s definition.

There are many ways in which you can achieve this in code, it doesn’t have to be as overt as the previous example, e.g.:-

type Foo struct {
        bar Bar
}
type Bar Foo

This second example can be made valid by making the bar field a pointer type - that way we have no problem determining the size of the type - the field’s size is the system pointer size, e.g.:-

type Foo struct {
        bar *Bar
}
type Bar Foo

The Bug

Now we get to the bug - the code above wrongly resulted in an ‘invalid recursive type’ error.

More generally - types which were validly recursive via a forward reference were wrongly reported as invalid recursive types.

Note that the following compiled just fine:-

type Bar Foo
type Foo struct {
        bar *Bar
}

The problem comes down to the way the compiler constructs its internal representation of declared types - it constructs underlying types first (assuming they haven’t been constructed already) before constructing the type itself.

A problem with this approach arises when you have a declaration like:-

type Bar Foo

In essence this says - ‘copy Foo into a new type called Bar’.

This is fine most of the time, but if the type you are copying is not constructed yet (remember I am talking about the construction of the internal compiler representation of the type here), then you get whatever the compiler defaults the type to before finishing its construction.

It turns out that the compiler defaults to classifying all types (for those that are interested in the internals - the etype field) as forward types (etype == TFORW), before setting their actual type once the underlying types are constructed. Later, the compiler checks to see whether the type is still classified this way, and if so an ‘invalid recursive type’ error occurs (this isn’t the only way invalid recursive types are picked up, incidentally).

This makes sense for situations like:-

type Foo Foo

Or:-

type Foo Bar
type Bar Foo

And is, in fact, a nice solution in these instances, however the fact that the language allows a pointer type to refer to a forward reference is the cause of our bug.

The Solution

The solution I came up with was to specifically look for these circumstances and defer the copying of type information until after the base type has been constructed.

So, at the point where we would otherwise simply go ahead and check the type, specifically check to see whether there is a forward-type in its underlying type graph. If so, defer the type copy.

We have to be careful to actively check for cycles in the underlying type list to prevent the compiler from disappearing into an infinite loop when we encounter:-

type Foo Foo

-type scenarios. I used ‘Floyd’s Cycle-Finding Algorithm’ which is O(n) - described here.

The Best Bit

My favourite part of it all is the fact I could refer to the solution as a ‘dance’:-

// a dance to handle forward-declared recursive pointer types.

The go team often uses the term which is quite appropriate for some of the things which are necessary to make sure stuff works :-)

$ cd $GOROOT/src
$ find . -iname *.c | xargs grep -i 'dance' | wc -l
17

There are obviously more details going on here, but I don’t want to go into unnecessary detail here.

Also, yeah - I know - the patch is far from perfect, and could well be improved, but in any case I’m proud to have been able to fix the problem and contribute to a project I believe in.

Go Open Source!

Comments

blog comments powered by Disqus