Friday, May 29, 2009

Performance testing of Dictionary, List and HashSet

Submit this story to DotNetKicks

Update June 4, 2009
The original post favored Dictionary as the fastest data structure both for add and contains. After some debate and some re-runs of my test code, I found the result to be wrong. HashSet is faster. Probably the results were affected by the workload on my computer. I ran one List test, then a Dictionary test and finally a HashSet test - I should have run them multiple times and at the same workload (i.e. no programs or processes running). Anyway, on to the article.

Still working with high-volume realtime datafeeds, I'm struggling to understand where the bottleneck in my code is. It's not with the database, it's not with the network - it's somewhere in the code. Now that doesn't help a lot.

So I decided to have a look at the different data structures that could be usable for my needs. Right now I'm using a List to keep track of my orders, and each time I get a new order, I check that list for the given ID.

So I'm starting to wonder, maybe there is a performance issue with the List data structure. So I made a small test with the following code:

static void TestList()
{

var watch = new Stopwatch();

var theList = new List<int>();

watch.Start();

//Fill it
for (int i = 0; i <>

{

theList.Add(i);

}

watch.Stop();

Console.WriteLine("Avg add time for List: {0}", (watch.Elapsed.TotalMilliseconds / noOfIterations));

watch.Reset();

watch.Start();

//Test containsKey

for (int j = 0; j <>

theList.Contains(j);

}

watch.Stop();

Console.WriteLine("Avg Contains lookup time for List: {0}", (watch.Elapsed.TotalMilliseconds / noOfIterations));

}

I created similar test code for Dictionary and HashSet. In all of the tests I used a loop where the given key always existed in the data structure. I used Add instead of Insert, because of tests I've seen online show that this is much faster. For example, this one.

First run, I used 1,000 for the noOfIterations variable and ran it three times.

The horizontal scale here is in milliseconds, so things are reasonably fast. A value of 0.2 gives you a possible 200 adds per second. As you probably can see without checking the numbers, dictionary is faster. List is slower for lookup, but HashSet suprises a little with the slow add function. So what happens when we go to 100,000 items?

OK, List is a lot slower for lookup still. Add seems to compare ok though. Let's see how it compares if we remove the lookup-time from the chart: This is wrong, see "The aftermath"

Now that's a result! Turns out List is your winner if fast adding is all you care about. If you want to look up your values later though, dictionary is your winner. Hope this will save some time for other developers in the same situation. Also feel free to comment if you find different results, or a problem with my test!

By popular request: The rest of the code!

static void TestDictionary()

{

var watch = new Stopwatch();

//Create Dictionary

var theDict = new Dictionary<int, int>();

watch.Start();

//Fill it

for (int i = 0; i <noofiterations;i++)

{

theDict.Add(i, i);

}

watch.Stop();

Console.WriteLine("Avg add time for Dictionary: {0}", (watch.Elapsed.TotalMilliseconds/noOfIterations));

watch.Reset();

//Test containsKey

watch.Start();

for (int j = 0; j <noofiterations;j++)

{

theDict.ContainsKey(j);

}

watch.Stop();

Console.WriteLine("Avg Contains lookup time for Dictionary: {0}",

(watch.Elapsed.TotalMilliseconds/noOfIterations));

}

static void TestHashSet()

{

var watch = new Stopwatch();

//Create List

var hashSet = new HashSet<int>();

//Fill it

watch.Start();

for (int i = 0; i <>

{

hashSet.Add(i);

}

watch.Stop();

Console.WriteLine("Avg add time for HashSet: {0}", (watch.Elapsed.TotalMilliseconds / noOfIterations));

watch.Reset();

//Test contains

watch.Start();

for (int j = 0; j <>

{

hashSet.Contains(j);

}

watch.Stop();

Console.WriteLine("Avg Contains lookup time for HashSet: {0}", (watch.Elapsed.TotalMilliseconds / noOfIterations));

}

The aftermath
Because of the controversy involved in HashSet being slower than Dictionary, despite having only one value and the Dictionary two, I re-ran the test on my computer, and on a colleagues. Instead of running the tests one by one, I ran three times HashTest and three times Dictionary, and picked the averages. The result you can see below, HashSet is faster than Dictionary both for Add and Contains methods.


Monday, May 25, 2009

Exchange ActiveSync on my SE W715

Submit this story to DotNetKicks

I got a new phone! After three years of windoze mobile, I decided to go back to a regular phone. But I wanted e-mail, or rather, exchange activesync. So I landed on a Sony Ericsson W715. But after completing my setup, the only message I got was "Session Failed". Huh?

I found a promising blog from a Swede, but even that (allowing non-provisional devices to use ActiveSync on the Exchange Server) was not enough.

Finally we found that the connection attempts were being blocked by the ISA server (Microsoft firewall). By allowing All users, not just Authenticated ones, to connect to OWA, I was able to synch my emails.