Pro tip: Choosing the best collection for the job

You need to store a collection of data of some sort, so you stuff it in an array, problem solved, right? But is an array the best solution for your script? And should you care?

After strings (which, by the way, can be indexed like an array of characters), arrays are one of the first concepts you are likely to encounter when learning PowerShell. So let’s start by talking about the array, then I will go through some other useful collections that you might not have known about.

System.Array

When you declare an array in PowerShell, like this:

$myArray = @()

what you are really doing is creating an object array (System.Object[]), a type that derives from the System.Array class. To see that this is the case, check the resulting type with the GetType method, like so:

$myArray.GetType()
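
If the default table output is hard to read, a small sketch like this one (assuming the empty array declared above) pulls out just the two type names of interest:

```powershell
$myArray = @()

# The concrete type of the array object
$myArray.GetType().FullName           # System.Object[]

# The class it derives from
$myArray.GetType().BaseType.FullName  # System.Array
```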

One interesting thing to note about the System.Array class is that it has a fixed capacity. This means that once you have instantiated it, you can’t change its size! “But wait a minute!” I hear you say… I know. PowerShell, being the helpful interpreter it is, kind of smooths things over when dealing with arrays. This means you can do the following:

$myArray += 'item1'
$myArray += 'item2'

But didn’t I just increase the size of the array on the fly here? No, not really. Every time you think you are just adding an item to an existing array, PowerShell is actually creating a brand new array with the new size, copying over all the existing data plus the new item, and leaving the “old” array behind for the garbage collector. That copying adds overhead to every single addition, and the cost grows with the size of the array. Normally you would never notice this, but if you are building very large arrays one item at a time, you might notice it quite a bit.
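
One way to see the replacement happen is to keep a second reference to the array before adding to it. This is only a sketch, and the variable names $original and $sameArray are made up for the illustration:

```powershell
$original  = @('item1')
$sameArray = $original     # a second variable pointing at the same array object

$original += 'item2'       # builds a new array and stores it in $original

# The two variables no longer point at the same object
[object]::ReferenceEquals($original, $sameArray)   # False

$sameArray.Count   # 1, the old array was never changed
$original.Count    # 2, the new array holds both items
```

If += had grown the array in place, both variables would still refer to the same object and both counts would be 2.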

Btw., if you think I’m joking about arrays being of fixed size, try the following:

$myArray.Count        # 2, if you ran the two += lines above
$myArray.IsFixedSize  # True
$myArray += 'item3'
$myArray.Count        # 3
$myArray.IsFixedSize  # True, even though the count went up

Does this mean that you should stop using arrays in PowerShell? Not at all. Using arrays like this is by far the simplest, easiest and most readable way of working with collections in PowerShell, and my recommendation is that you use this method unless you start to feel the extra overhead and need to optimize your code for speed.

So what other options do we have then? If you are comfortable getting your feet wet with some .NET interaction, I will go through some of the members of the System.Collections namespace, which holds a bunch of interesting classes.

System.Collections.ArrayList

If you think that your arrays are causing slowdowns in your code, this is the class you will probably want to switch to in most cases. An object of this class will dynamically adjust its size as required.

This is how you use it:

$myArrayList = New-Object System.Collections.ArrayList
[void]$myArrayList.Add('item1')

First we create a new instance of the class, and then we use the Add method to add an item to it. The Add method returns the index at which the item was added, so to keep that number out of the output, I’m prefixing the command with [void].

If you want to add several items into the array list at the same time, you do it like this:

$myArrayList.AddRange(@('item2','item3'))

You have other neat methods as well, such as Clear, which removes all elements from the array list, and Clone, which creates a (shallow) copy of the list. Two of the more useful ones are Remove and RemoveAt, which let you remove elements from the list. This is something that is not easily done when using System.Array objects.

Let’s look at a couple of examples:

$myArrayList.Remove('item1')

This will remove the ‘item1’ item from the list (the first occurrence, if there are several).

$myArrayList.RemoveAt(1)

This will remove the item at index 1. Indexes are zero-based, so that is the second item in the list.
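
To get a feel for the overhead discussed earlier, you can time both approaches with Measure-Command. This is a rough sketch rather than a proper benchmark: the item count of 10000 is an arbitrary number picked for the illustration, and the results will vary with your machine and PowerShell version.

```powershell
$count = 10000   # arbitrary size, picked for illustration only

# Build an array one item at a time with +=
$arrayTime = Measure-Command {
    $array = @()
    foreach ($i in 1..$count) { $array += $i }
}

# Build an ArrayList one item at a time with Add
$listTime = Measure-Command {
    $list = New-Object System.Collections.ArrayList
    foreach ($i in 1..$count) { [void]$list.Add($i) }
}

'Array +=  : {0:N0} ms' -f $arrayTime.TotalMilliseconds
'ArrayList : {0:N0} ms' -f $listTime.TotalMilliseconds
```

Try raising $count and watch how the two timings grow relative to each other.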

Head on over to MSDN to read more about array lists.

System.Collections.Hashtable

You have probably used hash tables before, but I’m going through it here because it resides in the same namespace as the other collections (with the exception of System.Array). In PowerShell we can instantiate a hash table like this:

$myHashTable = @{}

But you could also do it like this if you want to:

$myHashTable = New-Object System.Collections.Hashtable

But I’d just stick with the shorter PowerShell way of doing it.

Hash tables are collections of key/value pairs, and you have probably used them for splatting parameters, for creating object properties or something similar.
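
In case those two uses are new to you, here is a minimal sketch of both. The hash table names and their values are made up for the example:

```powershell
# Splatting: each key is matched to a parameter of Get-Date
$dateParams = @{
    Year  = 2020
    Month = 1
    Day   = 15
}
$date = Get-Date @dateParams   # same as Get-Date -Year 2020 -Month 1 -Day 15

# Object creation: each key becomes a property on the new object
$properties = @{
    Name = 'item1'
    Size = 42
}
$object = New-Object PSObject -Property $properties
$object.Name   # item1
```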

One thing to note about hash tables though. If you find you are having a hard time iterating through it, try the following:

foreach ($item in $myHashTable.GetEnumerator()) {
    # do your thing here, using $item.Key and $item.Value
}
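
The reason this helps is that foreach treats a hash table as one single object instead of stepping through its entries. A small sketch of the difference, using a made-up $colors table:

```powershell
$colors = @{ apple = 'red'; banana = 'yellow' }

# Without GetEnumerator the loop body runs once, with the whole table as $item
$plainLoops = 0
foreach ($item in $colors) { $plainLoops++ }
$plainLoops   # 1

# With GetEnumerator the loop body runs once per key/value pair
foreach ($entry in $colors.GetEnumerator()) {
    '{0} is {1}' -f $entry.Key, $entry.Value
}

# Looping over the keys and looking up each value works as well
foreach ($key in $colors.Keys) {
    '{0} is {1}' -f $key, $colors[$key]
}
```

Keep in mind that a plain hash table makes no promise about the order in which the entries come back.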

Head on over to MSDN to learn more about hash tables.

The next two classes are not often seen in PowerShell scripts, but they can come in handy sometimes.

System.Collections.Queue

The Queue class represents a first-in, first-out (also called FIFO) collection. What does that mean? It means that when you add items to it, they are added at the bottom of the list, while when you remove an item, it is taken from the top. Let’s see it in action:

$myQueue = New-Object System.Collections.Queue
$myQueue.Enqueue('First item')
$myQueue.Enqueue('Second item')
$myQueue.Enqueue('Third item')
$myQueue

As you see, the items are added at the bottom as you would expect. The fun begins when you start to take items out of the collection again:

$myQueue.Dequeue()

Do you see what happened? The first item in the collection ‘First item’ was the one returned. Nifty eh? Let me show you something else that is cool about queues:

$myQueue.Peek()
$myQueue

The Peek method lets you take a peek at the item at the top of the queue, without actually removing it.
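
A typical use for a queue is working through a list of tasks in the order they arrived. The sketch below assumes three made-up task names and simply drains the queue until it is empty:

```powershell
$jobs = New-Object System.Collections.Queue

# Add three hypothetical tasks, in this order
'alpha', 'beta', 'gamma' | ForEach-Object { $jobs.Enqueue($_) }

# Keep taking the oldest task until nothing is left
while ($jobs.Count -gt 0) {
    $current = $jobs.Dequeue()
    "Processing $current, $($jobs.Count) left"
}
```

The tasks come out as alpha, beta, gamma, the same order they went in.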

For more information read the documentation at MSDN.

System.Collections.Stack

This class is kind of similar to a Queue, but it is a last-in, first-out (also called LIFO) collection. This means that when you add items to it, they will be added to the top of the list, and they will also be removed from the top. Let’s check it out:

$myStack = New-Object System.Collections.Stack
$myStack.Push('First Item')
$myStack.Push('Second Item')
$myStack.Push('Third Item')
$myStack

Here you clearly see that the last item we added (‘Third Item’) is shown to be at the top of the list. Let’s remove one:

$myStack.Pop()
$myStack

And again as expected, the top-most item was removed (‘Third Item’).

The stack also supports the Peek method, but since it works the same as for a queue, I won’t show it in use here. But test it out for yourself!
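
If you want something to start from, here is a small sketch with a made-up $history stack. It peeks at the top item and then pops everything off again:

```powershell
$history = New-Object System.Collections.Stack
$history.Push('step 1')
$history.Push('step 2')
$history.Push('step 3')

$history.Peek()    # step 3, the item added last
$history.Count     # 3, Peek did not remove anything

# Pop until the stack is empty; items come out in reverse order
while ($history.Count -gt 0) {
    $last = $history.Pop()
    "Undoing $last"
}
```

The loop prints step 3 first and step 1 last, which is why a stack is a natural fit for things like undo histories.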

More information about the Stack can be found over at MSDN.

There are also some other collections in the System.Collections namespace, but I think the ones mentioned here are the most useful of them.

I hope you have enjoyed this post, and perhaps even learnt something new? Let me know in the comments section below if you have any additional insights or questions.
