(ZT)Memory in .NET - what goes where
http://www.yoda.arachsys.com/csharp/memory.html
A lot of confusion has been wrought by people explaining the difference between value types and reference types as "value types go on the stack, reference types go on the heap". This is simply untrue (as stated) and this article attempts to clarify matters somewhat.
What's in a variable?
The key to understanding the way memory works in .NET is to understand what a variable is, and what its value is. At the most basic level, a variable is just an association between a name (used in the program's source code) and a slot of memory. A variable has a value, which is the contents of the memory slot it's associated with. The size of that slot, and the interpretation of the value, depends on the type of the variable - and this is where the difference between value types and reference types comes in.
The value of a reference type variable is always either a reference or null. If it's a reference, it must be a reference to an object which is compatible with the type of the variable. For instance, a variable declared as Stream s will always have a value which is either null or a reference to an instance of the Stream class. (Note that an instance of a subclass of Stream, eg FileStream, is also an instance of Stream.) The slot of memory associated with the variable is just the size of a reference, however big the actual object it refers to might be. (On the 32-bit version of .NET, for instance, a reference type variable's slot is always just 4 bytes.)
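As a small illustration of the paragraph above, here is a minimal sketch of a reference type variable taking each kind of value. It uses MemoryStream rather than FileStream purely so that it runs without touching the file system; the class name ReferenceDemo is made up for this example.

```csharp
using System;
using System.IO;

class ReferenceDemo
{
    static void Main()
    {
        Stream s = null;                    // the slot holds null: no object at all
        Console.WriteLine(s == null);       // True

        s = new MemoryStream();             // the slot now holds a reference to an object on the heap
        Console.WriteLine(s.GetType().Name); // MemoryStream (a subclass of Stream)

        Stream other = s;                   // copies the reference, not the object
        Console.WriteLine(ReferenceEquals(s, other)); // True: both slots refer to one object

        s.Dispose();
    }
}
```

Assigning s to other copies only the contents of the slot (the reference), which is why both variables end up referring to the same object.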
The value of a value type is always the data for an instance of the type itself. For instance, suppose we have a struct declared as:
struct PairOfInts
{
    public int a;
    public int b;
}
The value of a variable declared as PairOfInts pair is the pair of integers itself, not a reference to a pair of integers. The slot of memory is large enough to contain both integers (so it must be 8 bytes). Note that a value type variable can never have a value of null - it wouldn't make any sense, as null is a reference type concept, meaning "the value of this reference type variable isn't a reference to any object at all".
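To make the contrast with reference types concrete, here is a minimal sketch showing that assigning one value type variable to another copies the data itself. The class name ValueDemo and the variable copy are made up for this example.

```csharp
using System;

struct PairOfInts
{
    public int a;
    public int b;
}

class ValueDemo
{
    static void Main()
    {
        PairOfInts pair;
        pair.a = 1;
        pair.b = 2;

        PairOfInts copy = pair;    // copies both integers into a separate slot
        copy.a = 10;               // changes only the copy

        Console.WriteLine(pair.a); // 1
        Console.WriteLine(copy.a); // 10

        // PairOfInts bad = null;  // would not compile: null is not a value of a value type
    }
}
```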
So where are things stored?
The memory slot for a variable is stored on either the stack or the heap. It depends on the context in which it is declared:
- Each local variable (ie one declared in a method) is stored on the stack. That includes reference type variables - the variable itself is on the stack, but remember that the value of a reference type variable is only a reference (or null), not the object itself. Method parameters count as local variables too, but if they are declared with the ref modifier, they don't get their own slot, but share a slot with the variable used in the calling code. See my article on parameter passing for more details.
- Instance variables for a reference type are always on the heap. That's where the object itself "lives".
- Instance variables for a value type are stored in the same context as the variable that declares the value type. The memory slot for the instance effectively contains the slots for each field within the instance. That means (given the previous two points) that a struct variable declared within a method will always be on the stack, whereas a struct variable which is an instance field of a class will be on the heap.
- Every static variable is stored on the heap, regardless of whether it's declared within a reference type or a value type. There is only one slot in total no matter how many instances are created. (There don't need to be any instances created for that one slot to exist though.) The details of exactly which heap the variables live on are complicated, but explained in detail in an MSDN article on the subject.
There are a couple of exceptions to the above rules - captured variables (used in anonymous methods and lambda expressions) are local in terms of the C# code, but end up being compiled into instance variables in a type associated with the delegate created by the anonymous method. The same goes for local variables in an iterator block.
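A minimal sketch of the captured variable case: the variable count below is written as a local, but it keeps its value after the method that declared it has returned, so it cannot simply live in that method's stack frame. The names CaptureDemo and MakeCounter are made up for this example.

```csharp
using System;

class CaptureDemo
{
    internal static Func<int> MakeCounter()
    {
        int count = 0;             // looks like a local variable, but is captured below
        return () => ++count;      // the lambda keeps using count after MakeCounter returns
    }

    static void Main()
    {
        Func<int> next = MakeCounter();
        Console.WriteLine(next()); // 1
        Console.WriteLine(next()); // 2
    }
}
```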
A worked example
The above may all sound a bit complicated, but a full example should make things a bit clearer. Here's a short program which does nothing useful, but should demonstrate the points raised above.
using System;

struct PairOfInts
{
    static int counter=0;
    public int a;
    public int b;

    internal PairOfInts (int x, int y)
    {
        a=x;
        b=y;
        counter++;
    }
}

class Test
{
    PairOfInts pair;
    string name;

    Test (PairOfInts p, string s, int x)
    {
        pair = p;
        name = s;
        pair.a += x;
    }

    static void Main()
    {
        PairOfInts z = new PairOfInts (1, 2);
        Test t1 = new Test(z, "first", 1);
        Test t2 = new Test(z, "second", 2);
        Test t3 = null;
        Test t4 = t1;
        // XXX
    }
}
Let's look at what's where in memory at the line marked with the comment "XXX". (Assume that nothing is being garbage collected.)
- There's a PairOfInts instance on the stack, corresponding with variable z. Within that instance, a=1 and b=2. (The 8 byte slot needed for z itself might then be represented in memory as 01 00 00 00 02 00 00 00.)
- There's a Test reference on the stack, corresponding with variable t1. This reference refers to an instance on the heap, which occupies "something like" 20 bytes: 8 bytes of header information (which all heap objects have), 8 bytes for the PairOfInts instance, and 4 bytes for the string reference. (The "something like" is because the specification doesn't say how it has to be organised, or what size the header is, etc.) The value of the pair variable within that instance will have a=2 and b=2 (possibly represented in memory as 02 00 00 00 02 00 00 00). The value of the name variable within that instance will be a reference to a string object (which is also on the heap) and which (probably through other objects, such as a char array) represents the sequence of characters in the word "first".
- There's a second Test reference on the stack, corresponding with variable t2. This reference refers to a second instance on the heap, which is very similar to the one described above, but with a reference to a string representing "second" instead of "first", and with a value of pair where a=3 (as 2 has been added to the initial value 1). If PairOfInts were a reference type instead of a value type, there would only be one instance of it throughout the whole program, and just several references to the single instance, but as it is, there are several instances, each with different values inside.
- There's a third Test reference on the stack, corresponding with variable t3. This reference is null - it doesn't refer to any instance of Test. (There's some ambiguity about whether this counts as a Test reference or not - it doesn't make any difference though, really - I generally think of null as being a reference which doesn't refer to any object, rather than being an absence of a reference in the first place. The Java Language Specification gives quite nice terminology, saying that a reference is either null or a pointer to an object of the appropriate type.)
- There's a fourth Test reference on the stack, corresponding with variable t4. This reference refers to the same instance as t1 - ie the values of t1 and t4 are the same. Changing the value of one of these variables would not change the value of the other, but changing a value within the object they both refer to using one reference would make that change visible via the other reference. (For instance, if you set t1.name="third"; then examined t4.name, you'd find it referred to "third" as well.)
- Finally, there's the PairOfInts.counter variable, which is on the heap (as it's static). There's only a single "slot" for the variable, however many (or few) PairOfInts values there are.
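The points in the list above can be checked with a sketch like the following, which is the same program with output statements in place of the "XXX" comment. The Counter property is a hypothetical helper added only so the static variable can be printed; it is not part of the original program.

```csharp
using System;

struct PairOfInts
{
    static int counter = 0;
    public int a;
    public int b;

    internal PairOfInts(int x, int y)
    {
        a = x;
        b = y;
        counter++;
    }

    // Hypothetical helper (not in the original program) exposing the static slot
    internal static int Counter { get { return counter; } }
}

class Test
{
    PairOfInts pair;
    string name;

    Test(PairOfInts p, string s, int x)
    {
        pair = p;       // copies the struct's data into this object's own slot
        name = s;
        pair.a += x;    // changes only this object's copy
    }

    static void Main()
    {
        PairOfInts z = new PairOfInts(1, 2);
        Test t1 = new Test(z, "first", 1);
        Test t2 = new Test(z, "second", 2);
        Test t3 = null;
        Test t4 = t1;

        Console.WriteLine(z.a);                     // 1 (z itself is unchanged)
        Console.WriteLine(t1.pair.a);               // 2
        Console.WriteLine(t2.pair.a);               // 3
        Console.WriteLine(t3 == null);              // True
        Console.WriteLine(ReferenceEquals(t1, t4)); // True

        t1.name = "third";
        Console.WriteLine(t4.name);                 // third

        Console.WriteLine(PairOfInts.Counter);      // 1 (the constructor ran once; copies don't call it)
    }
}
```

Note that the counter ends up as 1 even though there are three PairOfInts values (z and the two pair fields): copying a struct value doesn't run its constructor, and the static variable has a single slot regardless.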