String interning in C#
Last week a coworker sent me an interesting piece of code:
// Two pairs of compile-time constants with identical contents.
const string a = "";
const string b = "";
const string c = "hello";
const string d = "hello";

// ReferenceEquals is true only when both arguments are the very same object.
Console.WriteLine(ReferenceEquals(string.Empty, string.Empty)
    ? "Not so surprising…"
    : "Oh, the humanity!");
Console.WriteLine(ReferenceEquals(a, string.Empty)
    ? "But I thought string was a reference type!"
    : "Seems fair…");
Console.WriteLine(ReferenceEquals(a, b)
    ? "But I thought string was a reference type!"
    : "Seems fair…");
Console.WriteLine(ReferenceEquals(c, d)
    ? "But I thought string was a reference type!"
    : "Seems fair…");
Running the program produces, perhaps a bit surprisingly:
Not so surprising…
But I thought string was a reference type!
But I thought string was a reference type!
But I thought string was a reference type!
The reason for this is what is called string interning. The CLR basically holds a hash table (the intern pool) of all the string literals in the program, one entry per unique string. So two identical string literals will always have the same reference, even though they are defined in different places. String interning is mainly used to speed up string comparisons (if two strings have the same reference they are equal, so there is no need to compare them char by char), but also to reduce the memory footprint of the application.
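To make the pool a bit more tangible, here is a minimal sketch of how it can be poked at from code. string.Intern is the framework method that looks a value up in the pool; the class name InternPoolSketch and the helpers SameInstance and BuildAtRunTime are placeholders of my own, and the expected output assumes an ordinary runtime where literals are interned.

```csharp
using System;

static class InternPoolSketch
{
    // True when both arguments are the very same object, not merely equal text.
    public static bool SameInstance(string x, string y)
    {
        return object.ReferenceEquals(x, y);
    }

    // Builds "hello" at run time, so the result is a fresh object outside the pool.
    public static string BuildAtRunTime()
    {
        return new string(new[] { 'h', 'e', 'l', 'l', 'o' });
    }

    public static void Main()
    {
        const string literal = "hello";
        string built = BuildAtRunTime();

        // Same characters, but two different objects.
        Console.WriteLine(built == literal);              // expected: True
        Console.WriteLine(SameInstance(built, literal));  // expected: False

        // string.Intern returns the pooled instance for this value,
        // which is the one the literal already refers to.
        string pooled = string.Intern(built);
        Console.WriteLine(SameInstance(pooled, literal)); // expected: True
    }
}
```

Comparing by reference is only a safe shortcut when both strings are known to come from the pool; for anything else, == still compares the characters.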
Read more about string interning:
http://aspadvice.com/blogs/from_net_geeks_desk/archive/2008/12/25/String-Interning-in-C_2300_.aspx
http://en.wikipedia.org/wiki/String_interning
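N.B. This is true for string literals only (string literals = strings enclosed in double quotes). Strings created at run time, such as command line parameters, are not interned automatically. Try running a program that compares its first command line parameter with the literal "hello" using ReferenceEquals, pass "hello" as that parameter, and you will see the difference.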
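A minimal sketch of such a program might look like the following. The class name CommandLineSketch and the helper IsSameInstanceAsLiteral are placeholders of my own, so treat this as an illustration of the idea rather than the exact code.

```csharp
using System;

static class CommandLineSketch
{
    // Hypothetical helper: true only when arg is the very same object
    // as the interned literal "hello".
    public static bool IsSameInstanceAsLiteral(string arg)
    {
        return object.ReferenceEquals(arg, "hello");
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Pass \"hello\" as the first command line parameter.");
            return;
        }

        string arg = args[0];

        // Value comparison: looks at the characters.
        Console.WriteLine(arg == "hello"
            ? "Same characters"
            : "Different characters");

        // Reference comparison: the parameter is created at run time,
        // so it is not expected to be the pooled literal.
        Console.WriteLine(IsSameInstanceAsLiteral(arg)
            ? "But I thought string was a reference type!"
            : "Seems fair…");
    }
}
```

Run with hello as the first parameter, the value comparison should succeed while the reference comparison should take the "Seems fair…" branch, since the parameter is a separate string object that merely contains the same characters.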