Mutable Structs are Evil. Yes. Really.

Of course, you know the rule, don’t you. If you want to define your own value types in .NET, make sure they are immutable. Don’t create mutable value types!

Unfortunately, today, I forgot that rule. And I paid the price in a couple of hours needless debugging. With hindsight, my experience was a rather salutary lesson in why creating mutable valuetypes is almost always a bad idea. So I thought I’d tell you about it!

I was working on a game (a spare time coding hobby that I sometimes indulge in). The details of the game aren’t important, but it consists of a landscape, a bit like this (And yes, I know the graphics are horribly out of date). My excuse is it’s only a spare time project.

milton-keynes
My game – working

Now the key thing to notice is that there are lots of squares on the landscape – which I represent by a class called Square (My inventiveness when it comes to imaginative class names amazes me sometimes). Some of the squares have features – like buildings and roads and so on. That’s represented by a class called FeatureSet – which is basically a list of feature instances. So the relevant bit of code is this:

public class Square
{
    // Lots of other fields too
    private FeatureSet _features;

    public FeatureSet Features
    {
        get { return _features; }
    }
}

Now I’d been pondering for a while about a problem: In the game, there will be tens of thousands, maybe even hundreds of thousands of squares, but only a very small proportion of them will actually have features on them. Now I don’t really fancy instantiating hundreds of thousands of FeatureSet instances that will almost all be empty, so up to now, I’ve just allowed the Square._features field to be usually null. But that leads to some pretty messy code where, every time you do anything with the features, you need to check for nulls first.

And yesterday, I hit on a brainwave. FeatureSet is a very lightweight type – it only contains one member – the list of features. Really, all FeatureSet is is a wrapper around the list of features, which lets me implement bits of higher level logic around things you might do with that list.

public class FeatureSet
{
    // This is the only field. There are lots of methods though
    private List<IFeature> _features;
}

So why don’t I make FeatureSet a struct, not a class. I can allow the FeatureSet._features member to be null unless there really are features – so I’ll still need the null-checking – but now that null-checking is all hidden inside FeatureSet, not spread out over all the rest of the code. That way, the FeatureSet wrapper essentially takes up no memory or resources, and so it doesn’t really matter if I instantiate hundreds of thousands of empty FeatureSets. This seems perfect: I have the best of both worlds: No horrible null-checking to worry about throughout the code, and I’m not wasting tons of memory.

It took me about 10 minutes to implement the change. And of course, the game promptly broke and I spent the next few hours trying to figure out why.

And here’s the reason. I had lots of code like this:

_currentSquare.Features.AddBuilding(BuildingType.TownHouse, Direction.NE);

Looks innocuous enough. Get the Features on a square tile, and add a building to it.

Can you see what the problem is? (Yes – go on – stop reading for a few minutes to see if you can work it out for yourself. I’ve given you all the information you need. And if you’re stuck, there’s a big clue in the ‘Mutable Structs are Evil’ title of this blog.)

The problem is this: The code looks up the Square.Features property – an instance of FeatureSet – of before attempting to modify that FeatureSet instance. That’s absolutely fine when FeatureSet is a reference type, but it doesn’t work when FeatureSet is a value type, because value types – when they are returned – are copied by – umm – value. So what the Square.Features property gives me back isn’t the actual Features instance inside the Square instance, but is a temporary copy of that instance. The code proceeds to add the building to that copy. The code then moves on, and that copy goes out of scope and is promptly discarded. Result: The original Square instance has been left unchanged. No building has been added to it.

The slightly amusing result was that my game kept generating maps that had no features anywhere. No buildings, no roads, no trees, nothing. Not even any beaches. Ahh, the sheer bliss. Perfect isolation and peace. Absolutely ideal if you want to get away from it all (and don’t care about beaches). But not very interesting if you’re writing a transport game.

milton-keynes-valuetype
Where are the features?

So what’s the real lesson here?

The problem is that, looking at the above code quickly, you just wouldn’t realise that the code takes a copy of the FeatureSet instance, and therefore that the changes you are making (adding the building) are being made to the copy, NOT to the original. And that’s a fundamental problem with value types: It’s very common that, under the hood, C# and the .NET framework take copies of value type instances, in places where most developers wouldn’t realise that that’s happening.   And that’s the fundamental reason why mutable value types are bad: It becomes so easy to introduce bugs where you don’t realize that you’re modifying the wrong instance (usually, a temporary copy that then gets thrown away, losing your modifications). That’s exactly what happened to me.

public struct FeatureSet
{
    // Because this is in a struct, this can break as soon as you
    // replace the value of _features
    private List _features;
}

The solution is of course to keep your value types immutable: In other words, make sure that they cannot be modified after instantiation. If a type can’t be modified, then clearly there is no danger of accidentally modifying the wrong instance by mistake! And a huge source of potential bugs vanishes. That’s exactly why good practice says, don’t create mutable value types. In fact Eric Lippert gave a very succinct example of this exact same problem nearly 9 years ago!

Of course the problem doesn’t apply to reference types because – for reference types – it’s very rare to be dealing with copies of instances and not be aware that you have copies. The .NET framework and the C# language just don’t cause copies of reference types to be taken under the hood in the way that routinely happens for value types. If you mofidy a reference type, you can usually be fairly sure which instance you’re modifying! That’s why mutable reference types are usually considered to be OK.

If only, at the point of thinking, “Hey, let’s make FeatureSet a struct”, I’d remembered that this meant I was creating a mutable valuetype, and that’s almost always a VERY BAD IDEA, I’d have saved myself several hours debugging. Admittedly, I’d have also have lost the chance to write a great blog post about how mutable structs are evil, but – even so – I think I’d rather have had my hours back.

And here’s my advice to you: If you’re like most developers, you don’t have a blog. There is therefore NO advantage to you in making the same mistake I made – because you won’t be able to recover your dignity by blogging about it. So, do yourself a favour: Don’t make the mistake. If you ever find yourself tempted to write a mutable struct… Just. Don’t.

Oh yes, and my game is working again. I changed FeatureSet back to being a class, and that fixed the problem instantly. I still haven’t figured out how to square the problem of either having messy null-checking or having the overhead of hundreds of thousands of empty FeatureSet instances, but hey, there are bigger things to worry about. Like not wasting time trying to optimise memory when I could have been spending the time adding some useful features! I’m sure there’s another lesson there somewhere…

Advertisements

7 thoughts on “Mutable Structs are Evil. Yes. Really.

  1. Hello, Mr. Robinson!

    I’ve read your post, it really interesting. But I can’t understand one thing, why you said that storing struct as part of object save you tons of memory. Shouldn’t it create default value(with null list inside) for every Square object ?

  2. Don’t implement “hundreds of thousands of FeatureSet instances that will almost all be empty”; allocate one empty static readonly FeatureSet and assign that to all new squares. When you add a FeatureSet to a square, replace the default with a new FeatureSet.

    1. That’s actually a pretty good solution. Thanks Corvus! It means you save a lot of memory, but don’t need to do any null-checking when you read the lists of features. It slightly complicates the code when you modify the features – because, as you say, before any modification, you need to check that you’re not dealing with the static Empty instance, and replace it if you are. But that’s not a huge problem (especially as modifications would be relatively rare). I may well try that out.

      I’m guessing an extension of that idea could be to define an IFeatureSet interface, and type the Square.Features property to the interface. That would allow the Empty FeatureSet to be a static instance of a different type, which would mean you could easily have it throw an exception on any attempted modification – making the code more robust.

  3. Hi Simon, interesting read.

    Perhaps you could store the List in the Square, and convert FeatureSet to a struct which references the Square.

    The only issue then would be that the List would either need to be made non-private in some way so that the FeatureSet can access it, or the FeatureSet would be a nested struct of Square.

    Any operations which modify the List will then internally modify Square, which is a class.

    1. That would certainly work, and I /think/ if I’ve understood you correctly, it would save memory. I guess the disadvantage would be that it arguably gives poorer class design, since you’re storing the list in the Square, while having a different class to operate on it – which runs against good practice on having each type responsible for just one thing. That could make the code more confusing to read – something that almost always trumps optimising the code for memory/performance!

      But thanks for the suggestion – I appreciate it. It’s always good to think of what different ways there might be to solve a problem.

    1. That’s certainly a novel suggestion! Good call on being innovative!

      I don’t think it would lead to a bigger waste of memory – it probably depends a bit on how you do it. However I suspect it wouldn’t gain anything compared to the simpler solutions: I can think of two ways you might do Lazy(initDelegate):

      1. You might only initialize the list of features (the FeatureSet) when something tries to add a feature to a square. That means you’d always have to do null checking when you read, so it works out basically the same as having all FeatureSets be null until something is added.

      2. You might initialize the list of features when something tries to read a feature. Unfortunately, the means that basically everything would get initialized almost as soon as the app starts – since so many operations on squares require checking to see if there’s anything on them. So it would work out basically the same as having lots of empty FeatureSet instances from the word go.

      Thanks for the idea!

      (Sorry for the long delay replying to your suggestion btw)

Comment on this article (your first comment on TechieSimon will be moderated)

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s