Python Data Structures - Dictionaries — Dr William Peter Nicholson

Dictionary: a Python built-in type

Python comes with many built-in object types; the first we usually learn about are the simple types, such as int, float, complex, bool, and str. However, Python also has several built-in compound types, each of which can be used as a container for those simpler types.

If you think of a Python list as an ordered collection of arbitrary objects accessed by position, then it helps to think of a dictionary as a collection of arbitrary objects accessed by key. Thus, the main distinction between lists and dictionaries is that in dictionaries, items are stored and fetched by key, instead of by positional offset. (Since Python 3.7, dictionaries preserve the order in which keys were inserted, but they still offer no positional indexing.)

While lists can serve roles similar to arrays in other languages, dictionaries take the place of records, search tables, and any other sort of aggregation where item names are more meaningful than item positions.

In addition, Python dictionaries are a highly optimized built-in type, and therefore indexing a dictionary is a very fast search operation. Consequently, dictionaries can replace many of the searching algorithms and data structures that you would otherwise have to implement manually in other lower-level languages.

And finally, in mathematics and data science, Python dictionaries can be used to represent sparse (mostly empty) data structures.
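
As a minimal sketch of the sparse idea, a mostly empty grid can be stored by keeping only its nonzero cells in a dictionary keyed by (row, column) tuples. The names `sparse` and `cell` and the sample values are illustrative only; the sketch relies on the dictionary get() method to supply 0.0 for cells that were never stored.

```python
# Only the nonzero cells of a large, mostly empty grid are stored.
sparse = {(0, 3): 7.5, (2, 1): -1.0, (1000, 1000): 42.0}

def cell(matrix, row, col):
    """Return the stored value, or 0.0 for any cell that was never set."""
    return matrix.get((row, col), 0.0)

print(cell(sparse, 2, 1))   # a stored cell
print(cell(sparse, 5, 5))   # an empty cell falls back to 0.0
print(len(sparse))          # number of cells actually held in memory
```

A full 1001 x 1001 list of lists would hold over a million numbers; here only three entries exist.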

Key Features

Python dictionaries are:

Accessed by key, not offset position

Dictionaries are sometimes called associative arrays or hashes.
They associate a set of values with keys, so you can fetch an item out of a dictionary using the key under which you originally stored it.
You use the same indexing operation to get components in a dictionary as you do in a list, but the index takes the form of a key, not a relative offset.

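Collections of arbitrary objects with no positional index

Unlike in a list, items stored in a dictionary cannot be fetched by position. In older versions of Python (before 3.7) the left-to-right order of items was not guaranteed; since Python 3.7, dictionaries preserve the order in which keys were inserted. Either way, there is no positional index.
Keys provide the symbolic (not physical) locations of items in a dictionary.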

Variable-length, heterogeneous, and arbitrarily nestable

Like lists, dictionaries can grow and shrink in place (without new copies being made), they can contain objects of any type, and they support nesting to any depth (they can contain lists, other dictionaries, and so on).
Lists would use the append() method (of the list class) … while dictionaries grow by assignment to new keys.
Each key can have just one associated value, but that value can be a collection of multiple objects if needed, and a given value can be stored under any number of keys.
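
The points above can be sketched as follows. The `record` contents are made-up sample data: its values are a string, a list, and another dictionary, and both nested containers are grown in place. The last few lines show one value object stored under two keys.

```python
# A record whose values are a string, a list, and a nested dictionary.
record = {
    'name': 'Bob',
    'jobs': ['dev', 'mgr'],
    'address': {'city': 'Leeds'},
}

record['jobs'].append('janitor')       # the nested list grows with append()
record['address']['country'] = 'UK'    # the nested dictionary grows by assignment
record['age'] = 40                     # the outer dictionary grows by assignment

# A single list object stored under two different keys.
shared = [1, 2, 3]
D = {'first': shared, 'second': shared}
print(D['first'] is D['second'])       # both keys refer to the same object
```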

Of the category “mutable mapping”

You can change dictionaries in place by assigning to indexes (they are mutable), but they don’t support the sequence operations that work on strings and lists.
Because dictionaries are accessed by key rather than by position, operations that depend on positional offsets (e.g., concatenation, slicing) don’t apply. Instead, dictionaries are the only built-in, core type in the mapping category: objects that map keys to values. Other mappings in Python are created by imported modules.
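
A short sketch of what this means in practice: assignment by key works in place, while the sequence operations are rejected. The exact exception raised by the slice attempt is an assumption that depends on the Python version, so the sketch accepts either TypeError or KeyError; concatenation with + raises TypeError.

```python
D = {'a': 1, 'b': 2}
D['c'] = 3                     # in-place change by key: fine

errors = []

try:
    D[0:2]                     # slicing is a sequence operation
except (TypeError, KeyError) as exc:
    errors.append(type(exc).__name__)

try:
    D + {'d': 4}               # so is concatenation
except TypeError as exc:
    errors.append(type(exc).__name__)

print(D)
print(errors)
```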

Tables of object references (hash tables)

Dictionaries are tables of object references that support access by key.
Internally, dictionaries are implemented as hash tables (data structures that support very fast retrieval), which start small and grow on demand.
Moreover, Python employs optimized hashing algorithms to find keys, so retrieval is quick.
Like lists, dictionaries store object references (not copies, unless you ask for them explicitly).
The time complexity for checking if a key is in a Python dictionary is, in the average case, O(1). For more, see the Python Wiki.
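
To make the hashing idea a little more concrete, here is a small sketch using the built-in hash() function. Keys are located by their hash value and then compared for equality, so two keys that are equal and hash equally (such as 1 and 1.0) address the same entry.

```python
D = {1: 'one'}

# 1 and 1.0 compare equal and have the same hash value ...
print(1 == 1.0, hash(1) == hash(1.0))

# ... so either one finds the same entry.
print(D[1.0])

# Assigning through 1.0 overwrites that entry instead of adding a second one.
D[1.0] = 'uno'
print(D)
print(len(D))
```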

Creating and Accessing Dictionaries

Before we go on - remember: to fetch a value, you supply the key with which that value is associated, not its relative position. Modern Python (3.7 and later) keeps items in insertion order, but operations that assume positional offsets (e.g., slicing, concatenation) still do not apply to dictionaries; you should fetch values only by key, not by position.

Traditional Literal Expression

This method is useful if you can spell out the entire dictionary ahead of time.

>>> myDict_A = {'name': 'Bob', 'age': 40}
>>> myDict_B = {'alpha' : 2, 'beta' : 5, 'charlie' : 69}
>>> myDict_C = {'textA' : 69, 'textB' : 180, 'william' : lambda a : a + 10}

>>> myVar = 5
>>> myAnswer = myDict_C['william'](myVar)
>>> myAnswer
15

Empty Dictionary: D = {}
Access items by key and the index operator (square brackets just like with a list … but once again, here it means access by key, not by position):
```
>>> myDict_B['beta']
5
```

Keys don’t have to be strings - they can be any hashable type (numbers, strings, tuples of immutable items, and so on)

For example:

D = {45 : 50}

and this will work in the same way as any other dictionary; for example, D[45] will return the associated value 50 just as you’d expect.
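
One caveat worth sketching: a key must be hashable, which in practice means immutable built-ins such as numbers, strings, and tuples of those. Mutable objects such as lists are rejected with a TypeError. The keys and values below are arbitrary sample data.

```python
# Integer, float, and tuple keys all work.
D = {45: 50, 2.5: 'float key', ('x', 'y'): 'tuple key'}
print(D[45])
print(D[('x', 'y')])

# A list is mutable and therefore unhashable, so it cannot be a key.
try:
    D[['x', 'y']] = 'list key'
except TypeError as exc:
    print('rejected:', exc)

print(len(D))   # still three entries
```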

Assign Keys Dynamically

This is useful if you need to create the dictionary one field at a time on the fly (for example within a loop):

D = {}
D['name'] = 'Bob'
D['age']  = 40
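
As a small sketch of the loop case mentioned above, the following builds a lookup table of squares one key at a time. The name `squares` and the range are arbitrary choices for illustration.

```python
squares = {}
for n in range(1, 6):
    squares[n] = n * n      # each assignment to a new key adds one entry

print(squares)
print(squares[4])
```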

Use of the `dict` constructor with keyword arguments (when all the keys are strings)

The advantage here is that it means less typing when compared with the traditional literal expression method. On the other hand, it requires that all keys be strings that are valid Python identifiers.

D = dict(name='Bob', age=40)

Use of `dict` and `zip` (for key/value tuples)

The dict constructor and the zip function are useful if you need to build up keys and values as sequences at runtime.

D = dict([('name', 'Bob'), ('age', 40)])

And this form is also commonly used in conjunction with the zip function, to combine separate lists of keys and values obtained dynamically at runtime: dict(zip(keyslist, valueslist)). For example, suppose you cannot predict the set of keys and values that will eventually appear in your code - by using dict and zip you can always build them up as lists first and then zip them together later on:

# Zip together keys and values:
>>> list(zip(['a', 'b', 'c'], [1, 2, 3]))
[('a', 1), ('b', 2), ('c', 3)]

Thus consider:

# Make a dict from zip result:
>>> D = dict(zip(['a', 'b', 'c'], [1, 2, 3]))
>>> D
{'a': 1, 'b': 2, 'c': 3}

Another example:

# Create a list of keys:
>>> keysList = [ "roger", "derek", "peter" ]

# Create a list of values:
>>> valuesList = [ 69, 180, 25 ]

# Zip them together to make a list of key/value tuples.
>>> key_value_tuples = list(zip(keysList, valuesList))
>>> key_value_tuples
[('roger', 69), ('derek', 180), ('peter', 25)]

# Then turn this list of key/value tuples into a dictionary.
>>> D = dict(key_value_tuples)
>>> D
{'roger': 69, 'derek': 180, 'peter': 25}

Use `dict.fromkeys()` method (to initialize all keys to the same value)

Provided all the keys’ values are the same initially, you can create a dictionary with this class method. Simply pass in an iterable of keys and an initial value for all of them (the default is None):

>>> dict.fromkeys(['a', 'b'], 0)
{'a': 0, 'b': 0}
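
One thing to watch, sketched below: fromkeys() stores the same initial object under every key. That is harmless for immutable values such as 0 or None, but with a mutable value such as a list, every key ends up sharing one list. The dictionary comprehension shown as the alternative is not covered in this article; it is included only as one common way to get a separate list per key.

```python
# Every key refers to the same list object.
shared = dict.fromkeys(['a', 'b'], [])
shared['a'].append(1)
print(shared)                # the change shows up under both keys

# A comprehension evaluates [] once per key, giving independent lists.
separate = {key: [] for key in ['a', 'b']}
separate['a'].append(1)
print(separate)

# With no second argument, every value defaults to None.
print(dict.fromkeys(['a', 'b']))
```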

Finding the Number of Entries in the Dictionary (the `len()` function)

The built-in len() function works on dictionaries, too; it returns the number of items stored in the dictionary or, equivalently, the length of its keys list.

>>> D = {'alpha' : 2, 'beta' : 5, 'charlie' : 69}
>>> len(D)
3

The `in` key membership operator

The in membership operator tests a dictionary for key existence.

>>> D = {'alpha' : 2, 'beta' : 5, 'charlie' : 69}
>>> 'charlie' in D
True

Again, that’s key membership … evaluating 69 in D returns False, because there is no key 69.
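
If you do need to test for a value rather than a key, one option is to apply in to the result of the values() method, as in this sketch. Note that this scans the values one by one, so it does not benefit from the fast key lookup.

```python
D = {'alpha': 2, 'beta': 5, 'charlie': 69}

print('charlie' in D)        # key membership
print(69 in D)               # 69 is a value, not a key
print(69 in D.values())      # value membership, via a linear scan
print('delta' not in D)      # the negated form works too
```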

The `keys()`, `values()`, and `items()` methods

`keys()` and `myDict.keys()`

The keys() method returns all the keys in the dictionary. This can be useful for processing dictionaries sequentially. The keys come back in insertion order (Python 3.7 and later), which is not necessarily sorted order.

>>> D = {'alpha' : 2, 'beta' : 5, 'charlie' : 69}
>>> D.keys()
dict_keys(['alpha', 'beta', 'charlie'])

The keys() result can be converted to a normal list, so it can always be sorted if order matters.

>>> list(D.keys())
['alpha', 'beta', 'charlie']

Note here how keys() returns an iterable view object, instead of a physical list.

The list call forces it to produce all its values at once so we can print them interactively.
This is similar to the range() iterable … it too would need to be enclosed within a list call (on those occasions where you’re using range() to create a list).

And so, again just like with range(), you might use keys() on its own within a for loop:

>>> D = {'alpha' : 2, 'beta' : 5, 'charlie' : 69}
>>> for key in D.keys():
...     print(key)
...
alpha
beta
charlie

But as always, remember that the keys come back in insertion order, not sorted order (and in Python versions before 3.7 the order was not guaranteed at all). If you want some other ordering you should first create a list of the keys (as shown above using myList = list(D.keys())) and then sort this list in whatever way you want … and then use the sorted list of keys to access the entries in the dictionary.
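
The sort-then-access procedure just described can be sketched as follows. The dictionary is deliberately created with its keys out of alphabetical order; the built-in sorted() function is shown at the end as a shorter route to the same list.

```python
D = {'charlie': 69, 'alpha': 2, 'beta': 5}

# Step 1: make a real list of the keys.
myList = list(D.keys())

# Step 2: sort it in place.
myList.sort()

# Step 3: use the sorted keys to access the entries.
for key in myList:
    print(key, '=>', D[key])

# sorted() accepts the dictionary directly and returns a sorted list of keys.
print(sorted(D))
```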

`values()` and `items()`

The dictionary values() method returns all of the dictionary’s values - while the dictionary items() method returns all of the dictionary’s (key, value) pair tuples. And just as was the case with the keys() method, the values() and items() methods also return iterable objects. Thus if you wrap them in a list call - just like with keys(), and range() - you get them all at once.

>>> D = {'spork': 2, 'knife': 1, 'fork': 3}
>>> list(D.values())
[2, 1, 3]
>>> list(D.items())
[('spork', 2), ('knife', 1), ('fork', 3)]
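
Because items() yields (key, value) tuples, it pairs naturally with tuple unpacking in a for loop. This sketch walks the same dictionary and totals its values; the names `name`, `count`, and `total` are arbitrary.

```python
D = {'spork': 2, 'knife': 1, 'fork': 3}

total = 0
for name, count in D.items():   # each tuple is unpacked into two names
    print(name, count)
    total += count

print('total:', total)
```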

Changing dictionaries in place

Dictionaries, like lists, are mutable, so you can change, expand, and shrink them in place without making new dictionaries.

>>> D
{'spork': 2, 'knife': 1, 'fork': 3}

Change a dictionary value in-place - and with any arbitrary object

Simply assign a value to a key to change or create an entry:

>>> D['knife'] = ['grill', 'bake', 'fry']
>>> D
{'spork': 2, 'knife': ['grill', 'bake', 'fry'], 'fork': 3}

Delete a dictionary entry associated with a key

The del statement deletes the entry associated with the key specified as an index:

>>> del D['fork']
>>> D
{'spork': 2, 'knife': ['grill', 'bake', 'fry']}

Add a new entry

Note how assigning to a new key expands the dictionary in place … this isn’t possible with a list, where assigning to an offset beyond the end raises an IndexError; to expand a list, you would need to use a method such as append() … or use a slice assignment instead:

>>> D['brunch'] = 'Bacon'
>>> D
{'spork': 2, 'knife': ['grill', 'bake', 'fry'], 'brunch': 'Bacon'}

Other Dictionary methods

`get()`

Returns the value that corresponds to the key provided. If that key does not exist in the dictionary it will return either None, or a default value that you have provided.

>>> D = {'spork': 2, 'knife': 1, 'fork': 3}

# Return the value associated with a key (one that actually exists in the dictionary):
>>> D.get('spork')
2

# Look up a key that does not exist in the dictionary, without giving a default:
>>> print(D.get('toast'))
None

# Return the value associated with a key - providing a default value to return in case that key does not actually exist in the dictionary:
>>> D.get('toast', "That key does not exist.")
'That key does not exist.'

# Note: despite all the above - dictionary 'D' remains the same:
>>> D
{'spork': 2, 'knife': 1, 'fork': 3}

This is an easy way to fill in a default for a key that isn’t present, and avoid a missing-key error when your program can’t anticipate contents ahead of time.
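
A common use of this, sketched below with made-up sample data, is counting: get() supplies 0 the first time a word is seen, so no separate existence test is needed. The second half shows the missing-key error (KeyError) that plain indexing raises instead.

```python
words = ['spork', 'fork', 'spork', 'knife', 'spork']

counts = {}
for word in words:
    # get() returns 0 for a word not yet in the dictionary.
    counts[word] = counts.get(word, 0) + 1

print(counts)

# Plain indexing on a missing key raises KeyError.
try:
    counts['toast']
except KeyError as exc:
    print('missing key:', exc)
```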

`update()`

Provides something similar to concatenation for dictionaries, though it works by key rather than by position. It merges the keys and values of one dictionary into another, blindly overwriting values of the same key if there’s a clash:

>>> D1 = {'spork': 2, 'knife': 1, 'fork': 3}
>>> D2 = {'toast':4, 'muffin':5}
>>> D1.update(D2)

# D1 now holds the entries of both D1 and D2.
>>> D1
{'spork': 2, 'knife': 1, 'fork': 3, 'toast': 4, 'muffin': 5}
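
The example above has no clashing keys, so here is a small sketch of the overwrite case. Both dictionaries contain 'knife'; after the update, D1 holds the value from D2, and D2 itself is left unchanged.

```python
D1 = {'spork': 2, 'knife': 1}
D2 = {'knife': 9, 'toast': 4}

D1.update(D2)    # 'knife' clashes, so D2's value replaces D1's

print(D1)        # {'spork': 2, 'knife': 9, 'toast': 4}
print(D2)        # {'knife': 9, 'toast': 4}
```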

`pop()`

Deletes a key from a dictionary and returns the value it represented. It is similar to the list type’s pop() method, but it takes a key instead of an optional position.

>>> D1 = {'spork': 2, 'knife': 1, 'fork': 3}
>>> D2 = {'toast':4, 'muffin':5}
>>> D1.update(D2)

# Delete and return the value associated with key 'muffin'.
>>> D1.pop('muffin')
5

# Delete and return the value associated with key 'toast'.
>>> D1.pop('toast')
4

# And therefore the dictionary `D1` will have been transformed into:
>>> D1
{'spork': 2, 'knife': 1, 'fork': 3}
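
One more detail, sketched here: popping a key that is not present raises KeyError, unless you pass a second argument, which pop() then returns as a default. The variable names are illustrative.

```python
D = {'spork': 2, 'knife': 1}

# Without a default, a missing key raises KeyError.
try:
    D.pop('toast')
except KeyError as exc:
    print('missing key:', exc)

# With a default, the default is returned and the dictionary is untouched.
value = D.pop('toast', 0)
print(value)
print(D)
```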

Review

Dictionaries are best suited when the data is labeled (a record with field names, for example).
Dictionary lookup is also usually quicker than searching a list, though this might vary per program.
Lists are best suited to collections of unlabeled items (such as all the files in a directory).
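
As a closing sketch of the first two points, the same labeled data is held below both as a list of tuples and as a dictionary. The list version has to scan for a matching name, while the dictionary version goes straight to the entry by key. The names, ages, and the helper `age_from_list` are made up for illustration.

```python
people = [('Bob', 40), ('Sue', 35), ('Tom', 28)]

def age_from_list(name):
    """Scan the list item by item until the name matches."""
    for person, age in people:
        if person == name:
            return age
    return None

# The same data as a dictionary: one keyed lookup, no scan.
ages = dict(people)

print(age_from_list('Sue'))
print(ages['Sue'])
```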