# Monday, May 26, 2008

First some basics: in .NET there are reference types and value types. Reference types are always accessed through a reference (a pointer actually) while value types, well, are not.

The .NET documentation tries to sum it up in one sentence:

A data type is a value type if it holds the data within its own memory allocation. A reference type contains a pointer to another memory location that holds the data.

This means several things:

  • You cannot set a variable of a value type to null. It's not a pointer. It's just there.
  • When returning a value type from a method, a copy is made and the whole value is returned, not just a reference.
  • A variable of a value type always holds "something": fields and array elements start out zero-initialized (0, false, and so on).
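A minimal sketch of these three points. The `Point` struct and the `ValueTypeBasics` class are hypothetical names, made up only for this illustration:

```csharp
using System;

// Hypothetical value type, used only for this illustration.
struct Point
{
    public int X;
    public int Y;
}

static class ValueTypeBasics
{
    // new Point() yields a value with all fields zeroed: (0, 0).
    static readonly Point origin = new Point();

    // Returning a value type hands the caller a copy of the whole value.
    public static Point GetOrigin() { return origin; }

    public static void Main()
    {
        // Point p = null;   // does not compile: a value type cannot be null

        Point copy = GetOrigin();
        copy.X = 42;         // changes only the local copy

        Console.WriteLine(copy.X);         // 42
        Console.WriteLine(GetOrigin().X);  // 0
    }
}
```

The second line of output stays 0 because `GetOrigin()` returned a copy, so the assignment to `copy.X` never touched the original value.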

All primitive types such as Int32, Int64, Double and Boolean are value types, as are DateTime, enums and custom structs. Examples of reference types are String, all arrays (even when they contain value types) and all classes.

So far nothing new, but to make it interesting, let's create a small sample application using a custom struct, which is a value type:

Struct declaration:

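struct MyStruct
{
  // No initializer needed (older C# compilers reject one in a struct): Value defaults to 0.
  public int Value;

  // Mutates this instance in place.
  public void Update(int i) { Value = i; }
}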


Code sample:

MyStruct[] list = new MyStruct[5];

for (int i=0;i<5;i++)
  Console.Write(list[i].Value + " ");
Console.WriteLine();

for (int i=0;i<5;i++)
  list[i].Update(i+1);

for (int i=0;i<5;i++)
  Console.Write(list[i].Value + " ");
Console.WriteLine();


The output of this code is:

0 0 0 0 0
1 2 3 4 5


Now let's do the same, but replace the array with a generic List<>:

List<MyStruct> list = new List<MyStruct>(new MyStruct[5]); 

for (int i=0;i<5;i++)
  Console.Write(list[i].Value + " ");
Console.WriteLine();

for (int i=0;i<5;i++)
  list[i].Update(i+1);

for (int i=0;i<5;i++)
  Console.Write(list[i].Value + " ");
Console.WriteLine();


The output is:

0 0 0 0 0
0 0 0 0 0


Surprised? I was too, at first, but the explanation is very simple. No, it's not boxing/unboxing...

When accessing elements of an array, the runtime works on the array elements directly, so the Update() method operates on the array item itself. This means that the structs stored in the array are the ones being updated.
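To see the difference between working on an array element and working on a copy of it, here is a small sketch. `Counter` is a hypothetical struct with the same shape as MyStruct, and `ArrayAccessDemo` is just an illustrative name:

```csharp
using System;

// Hypothetical mutable struct with the same shape as MyStruct.
struct Counter
{
    public int Value;
    public void Update(int i) { Value = i; }
}

static class ArrayAccessDemo
{
    public static int[] Run()
    {
        Counter[] items = new Counter[1];

        Counter local = items[0];   // assignment copies the struct
        local.Update(7);            // modifies the copy only
        int afterCopyUpdate = items[0].Value;     // still 0

        items[0].Update(7);         // the array element is accessed in place
        int afterInPlaceUpdate = items[0].Value;  // now 7

        return new[] { afterCopyUpdate, afterInPlaceUpdate };
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine(r[0] + " " + r[1]);  // 0 7
    }
}
```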

In the second example, we used a generic List<>. What happens when we access a specific element? Well, the indexer is called, and an indexer is really just a method. As I mentioned earlier, value types are copied when they are returned by a method, so this is exactly what happens: the list's indexer retrieves the struct from an internal array and returns it to the caller. Because the element is a value type, a copy is made, and the Update() method is called on that copy, which of course has no effect on the list's original items.
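If you do need to change a struct stored in a List<>, one way is to copy the element out, modify the copy and store it back through the indexer's setter. The following is only a sketch of that idea; `ListWriteBackDemo` is a hypothetical name, and MyStruct is the struct from this post with its field left at the default value:

```csharp
using System;
using System.Collections.Generic;

struct MyStruct
{
    public int Value;  // defaults to 0
    public void Update(int i) { Value = i; }
}

static class ListWriteBackDemo
{
    public static List<MyStruct> Run()
    {
        List<MyStruct> list = new List<MyStruct>(new MyStruct[5]);

        for (int i = 0; i < list.Count; i++)
        {
            MyStruct item = list[i];  // the indexer getter returns a copy
            item.Update(i + 1);       // modify the copy
            list[i] = item;           // the indexer setter stores the copy back
        }
        return list;
    }

    public static void Main()
    {
        foreach (MyStruct s in Run())
            Console.Write(s.Value + " ");
        Console.WriteLine();  // prints: 1 2 3 4 5
    }
}
```

Note that assigning to the field directly, as in `list[i].Value = 1;`, is rejected by the C# compiler (error CS1612) precisely because the indexer returns a copy. Calling a mutating method on that copy compiles fine, which is why the problem in this post goes unnoticed.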

What should we learn from this?

Always make sure your structs are immutable, because you are never sure when a copy will be made. Most of the time it is obvious, but in some cases it can really surprise you...
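What an immutable struct could look like is sketched below. `ImmutableValue`, `WithValue` and `ImmutableDemo` are hypothetical names chosen for this example; the point is only that there is no method that changes an existing instance, so every "change" has to be stored explicitly:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable counterpart of MyStruct.
struct ImmutableValue
{
    private readonly int value;  // can only be set in the constructor

    public ImmutableValue(int value) { this.value = value; }

    public int Value { get { return value; } }

    // Instead of mutating, return a new instance carrying the new value.
    public ImmutableValue WithValue(int newValue)
    {
        return new ImmutableValue(newValue);
    }
}

static class ImmutableDemo
{
    public static List<ImmutableValue> Run()
    {
        List<ImmutableValue> list =
            new List<ImmutableValue>(new ImmutableValue[5]);

        for (int i = 0; i < list.Count; i++)
            list[i] = list[i].WithValue(i + 1);  // the result must be stored back

        return list;
    }

    public static void Main()
    {
        foreach (ImmutableValue v in Run())
            Console.Write(v.Value + " ");
        Console.WriteLine();  // prints: 1 2 3 4 5
    }
}
```

With this design, forgetting the assignment (writing just `list[i].WithValue(i + 1);`) still does nothing, but it is much more obvious that a result is being thrown away than with a silent update of a temporary copy.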

Monday, May 26, 2008 1:27:05 AM (W. Europe Daylight Time, UTC+02:00) | Comments [3]

Tuesday, May 27, 2008 3:04:22 AM (W. Europe Daylight Time, UTC+02:00)
I should say that your code sample surprised me very much too. But once I understood what was really happening, I would say that it is the combination of using the indexer along with the "make-a-copy" characteristic of value types that makes this a tough nut. If we use a method for getting each struct instead of the indexer, then the flaw might be more obvious to the programmer.

Great post anyways.
Thursday, May 29, 2008 11:29:07 AM (W. Europe Daylight Time, UTC+02:00)
Is this actually a problem, or is it expected behaviour that we just didn't think of at first glance? I would say it's the latter.
JV
Thursday, May 29, 2008 12:50:50 PM (W. Europe Daylight Time, UTC+02:00)
It's expected behavior, but it may catch you by surprise if you don't think about what goes on "under the hood" with arrays, lists and value types.