
Introduction
Python's standard library is extensive, offering a wide range of modules to perform common tasks efficiently.
Among these, the collections module is a standout example, which provides specialized container data types that can serve as alternatives to Python's general-purpose built-in containers like dict, list, set, and tuple. While many developers are familiar with some of its components, the module hosts a variety of functionalities that are surprisingly useful and can simplify code, improve readability, and boost performance.
This tutorial explores ten practical, and perhaps surprising, applications of the Python collections module.
1. Counting Hashable Objects Effortlessly with Counter
A common task in almost any data analysis project is counting the occurrences of items in a sequence. The collections.Counter class is designed specifically for this. It's a dictionary subclass where elements are stored as keys and their counts are stored as values.
from collections import Counter
# Count the frequency of words in a list
words = ['galaxy', 'nebula', 'asteroid', 'comet', 'gravitas', 'galaxy', 'stardust', 'quasar', 'galaxy', 'comet']
word_counts = Counter(words)
# Find the two most common words
most_common = word_counts.most_common(2)
# Output results
print(f"Word counts: {word_counts}")
print(f"Most common words: {most_common}")
Output:
Word counts: Counter({'galaxy': 3, 'comet': 2, 'nebula': 1, 'asteroid': 1, 'gravitas': 1, 'stardust': 1, 'quasar': 1})
Most common words: [('galaxy', 3), ('comet', 2)]
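Because Counter is a dictionary subclass, two further behaviors are worth seeing in code: looking up a key that was never counted returns 0 rather than raising KeyError, and update() adds to existing counts instead of replacing them. The following is a minimal sketch; the inventory data is made up for illustration.

```python
from collections import Counter

# Hypothetical data, used only for illustration
inventory = Counter(['apple', 'pear', 'apple'])

# A missing key returns 0 instead of raising KeyError
missing_count = inventory['kiwi']

# update() adds counts to the existing ones rather than replacing them
inventory.update(['pear', 'pear', 'kiwi'])

print(missing_count)      # 0
print(inventory['pear'])  # 3
print(inventory)          # Counter({'pear': 3, 'apple': 2, 'kiwi': 1})
```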
2. Creating Lightweight Classes with namedtuple
When you need a simple class just for grouping data, without methods, a namedtuple is a useful, memory-efficient option. It allows you to create tuple-like objects that have fields accessible by attribute lookup as well as being indexable and iterable. This makes your code more readable than using a standard tuple.
from collections import namedtuple
# Define a Book namedtuple
# Fields: title, author, year_published, isbn
Book = namedtuple('Book', ['title', 'author', 'year_published', 'isbn'])
# Create an instance of the Book
my_book = Book(
    title="The Hitchhiker's Guide to the Galaxy",
    author="Douglas Adams",
    year_published=1979,
    isbn='978-0345391803'
)
print("--- Accessing book data by field name ---")
print(f"Title (by field name): {my_book.title}")
print(f"Author (by field name): {my_book.author}")
print(f"Year Published (by field name): {my_book.year_published}")
print(f"ISBN (by field name): {my_book.isbn}")
print("\n--- Accessing book data by index ---")
print(f"Title (by index): {my_book[0]}")
print(f"Author (by index): {my_book[1]}")
print(f"Year Published (by index): {my_book[2]}")
print(f"ISBN (by index): {my_book[3]}")
Output:
--- Accessing book data by field name ---
Title (by field name): The Hitchhiker's Guide to the Galaxy
Author (by field name): Douglas Adams
Year Published (by field name): 1979
ISBN (by field name): 978-0345391803

--- Accessing book data by index ---
Title (by index): The Hitchhiker's Guide to the Galaxy
Author (by index): Douglas Adams
Year Published (by index): 1979
ISBN (by index): 978-0345391803
You can think of a namedtuple as similar to an immutable C struct, or as a data class without methods. They definitely have their uses.
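One property worth seeing in code is that namedtuple instances, like ordinary tuples, cannot be modified in place. The sketch below uses a hypothetical Point type (not part of the example above) to show that assigning to a field raises AttributeError, while _replace() returns a modified copy and _asdict() converts the instance to a dictionary.

```python
from collections import namedtuple

# Point is a hypothetical type used only for this illustration
Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)

# Fields cannot be reassigned: namedtuple instances are immutable
try:
    p.x = 10
    mutated = True
except AttributeError:
    mutated = False

# _replace() returns a new instance with the given fields changed
moved = p._replace(x=10)

# _asdict() converts the instance to a dictionary
as_dict = p._asdict()

print(mutated)   # False
print(p, moved)  # Point(x=1, y=2) Point(x=10, y=2)
print(as_dict)
```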
3. Handling Missing Dictionary Keys Gracefully with defaultdict
A common frustration when working with dictionaries is the KeyError that occurs when you try to access a key that doesn't exist. The collections.defaultdict is the perfect solution. It's a subclass of dict that calls a factory function to supply a default value for missing keys. This is especially useful for grouping items.
from collections import defaultdict
# Group a list of tuples by the first element
scores_by_round = [('contestantA', 8), ('contestantB', 7), ('contestantC', 5),
                   ('contestantA', 7), ('contestantB', 7), ('contestantC', 6),
                   ('contestantA', 9), ('contestantB', 5), ('contestantC', 4)]
grouped_scores = defaultdict(list)
for key, value in scores_by_round:
    grouped_scores[key].append(value)
print(f"Grouped scores: {grouped_scores}")
Output:
Grouped scores: defaultdict(<class 'list'>, {'contestantA': [8, 7, 9], 'contestantB': [7, 7, 5], 'contestantC': [5, 6, 4]})
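To make the contrast with a plain dictionary concrete, the sketch below increments a counter for a key that does not exist yet. The plain dict raises KeyError, while defaultdict(int) calls int() to supply 0 first. The sample string is chosen only for illustration.

```python
from collections import defaultdict

text = "mississippi"  # sample input chosen for illustration

# A plain dict raises KeyError when a missing key is incremented
plain = {}
try:
    plain['m'] += 1
    raised = False
except KeyError:
    raised = True

# defaultdict(int) supplies 0 for missing keys, so += 1 just works
letter_counts = defaultdict(int)
for ch in text:
    letter_counts[ch] += 1

print(raised)               # True
print(dict(letter_counts))  # {'m': 1, 'i': 4, 's': 4, 'p': 2}
```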
4. Implementing Fast Queues and Stacks with deque
Python lists can be used as stacks and queues, even though they are not optimized for these operations. Appending and popping from the end of a list is fast, but doing the same from the beginning is slow because all other elements have to be shifted. The collections.deque (double-ended queue) is designed for fast appends and pops from both ends.
First, here's an example of a queue using deque.
from collections import deque
# Create a queue
d = deque([1, 2, 3])
print(f"Original queue: {d}")
# Add to the right
d.append(4)
print("Adding item to queue: 4")
print(f"New queue: {d}")
# Remove from the left
print(f"Popping queue item (from left): {d.popleft()}")
# Output final queue
print(f"Final queue: {d}")
Output:
Original queue: deque([1, 2, 3])
Adding item to queue: 4
New queue: deque([1, 2, 3, 4])
Popping queue item (from left): 1
Final queue: deque([2, 3, 4])
And now let's use deque to create a stack:
from collections import deque
# Create a stack
d = deque([1, 2, 3])
print(f"Original stack: {d}")
# Add to the right
d.append(5)
print("Adding item to stack: 5")
print(f"New stack: {d}")
# Remove from the right
print(f"Popping stack item (from right): {d.pop()}")
# Output final stack
print(f"Final stack: {d}")
Output:
Original stack: deque([1, 2, 3])
Adding item to stack: 5
New stack: deque([1, 2, 3, 5])
Popping stack item (from right): 5
Final stack: deque([1, 2, 3])
5. Remembering Insertion Order with OrderedDict
Before Python 3.7, standard dictionaries were not guaranteed to preserve the order in which items were inserted. To solve this, the collections.OrderedDict was used. While standard dicts now maintain insertion order, OrderedDict still has unique features, like the move_to_end() method, which is useful for tasks like creating a simple cache.
from collections import OrderedDict
# An OrderedDict remembers the order of insertion
od = OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3
print(f"Start order: {list(od.keys())}")
# Move 'a' to the end
od.move_to_end('a')
print(f"Final order: {list(od.keys())}")
Output:
Start order: ['a', 'b', 'c']
Final order: ['b', 'c', 'a']
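The simple cache mentioned above could be sketched as a least-recently-used (LRU) cache. This is an illustrative sketch, not a production implementation: the LRUCache class and its method names are hypothetical. It relies on move_to_end() to mark a key as recently used and on popitem(last=False) to evict the oldest entry.

```python
from collections import OrderedDict

class LRUCache:
    """Hypothetical helper: keeps at most `capacity` most recently used items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Mark the key as most recently used
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            # Evict the least recently used item (the oldest entry)
            self._data.popitem(last=False)

    def keys(self):
        return list(self._data)

cache = LRUCache(capacity=2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')       # 'a' becomes the most recently used key
cache.put('c', 3)    # evicts 'b', the least recently used key
print(cache.keys())  # ['a', 'c']
```

For caching function results specifically, the standard library's functools.lru_cache decorator is usually the simpler choice.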
6. Combining Multiple Dictionaries with ChainMap
The collections.ChainMap class provides a way to link multiple dictionaries together so they can be treated as a single unit. Creating one is often much faster than building a new dictionary with multiple update() calls, because the underlying mappings are referenced rather than copied. Lookups search the underlying mappings one by one until a key is found.
Let's create a ChainMap named chain and query it for keys.
from collections import ChainMap
# Create dictionaries
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 3, 'c': 4}
# Create a ChainMap
chain = ChainMap(dict1, dict2)
# Print dictionaries
print(f"dict1: {dict1}")
print(f"dict2: {dict2}")
# Query ChainMap for keys and return values
print("\nQuerying ChainMap for keys")
print(f"a: {chain['a']}")
print(f"c: {chain['c']}")
print(f"b: {chain['b']}")
Output:
dict1: {'a': 1, 'b': 2}
dict2: {'b': 3, 'c': 4}

Querying ChainMap for keys
a: 1
c: 4
b: 2
Note that, in the above scenario, 'b' is found first in dict1, the first dictionary in chain, and so it is the value associated with this key in dict1 that is returned.
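A typical use of this lookup order is layering settings, with overrides placed ahead of defaults. The sketch below uses made-up configuration dictionaries. It also shows two details of ChainMap: writes go only to the first mapping, and because the underlying dictionaries are referenced rather than copied, later changes to them are visible through the chain.

```python
from collections import ChainMap

# Hypothetical configuration layers, for illustration only
defaults = {'theme': 'light', 'font_size': 12}
user_settings = {'theme': 'dark'}

settings = ChainMap(user_settings, defaults)
theme = settings['theme']          # found in user_settings
font_size = settings['font_size']  # falls through to defaults

# Writes go to the first mapping only; defaults is left untouched
settings['font_size'] = 14

# The chain holds references, so changes to a source dict show through
defaults['language'] = 'en'

print(theme, font_size)       # dark 12
print(user_settings)          # {'theme': 'dark', 'font_size': 14}
print(defaults['font_size'])  # 12
print(settings['language'])   # en
```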
7. Keeping a Limited History with deque's maxlen
A deque can be created with a fixed maximum length using the maxlen argument. If more items are added than the maximum length, the items from the opposite end are automatically discarded. This is perfect for keeping a history of the last N items.
from collections import deque
# Keep a history of the last 3 items
history = deque(maxlen=3)
history.append("cd ~")
history.append("ls -l")
history.append("pwd")
print(f"Start history: {history}")
# Add a new item, pushing out the left-most item
history.append("mkdir data")
print(f"Final history: {history}")
Output:
Start history: deque(['cd ~', 'ls -l', 'pwd'], maxlen=3)
Final history: deque(['ls -l', 'pwd', 'mkdir data'], maxlen=3)
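The same mechanism gives a compact way to take the last N items of any iterable, much like the Unix tail command. In the sketch below, tail is a hypothetical helper name. The second part shows what "the opposite end" means: appending on the left of a full deque discards from the right.

```python
from collections import deque

def tail(iterable, n=3):
    """Return the last n items of iterable as a list (hypothetical helper)."""
    return list(deque(iterable, maxlen=n))

last_three = tail(range(10))
print(last_three)  # [7, 8, 9]

# Appending on the left of a full deque discards from the right
recent = deque([1, 2, 3], maxlen=3)
recent.appendleft(0)
print(recent)      # deque([0, 1, 2], maxlen=3)
```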
8. Creating Nested Dictionaries Easily with defaultdict
Building on defaultdict, you can create nested or tree-like dictionaries with ease. By providing a factory function, either a lambda or a small named function, that returns another defaultdict, you can create dictionaries of dictionaries on the fly.
from collections import defaultdict
import json
# A function that returns a defaultdict whose default values are more trees
def tree():
    return defaultdict(tree)
# Create a nested dictionary
nested_dict = tree()
nested_dict['users']['user1']['name'] = 'Felix'
nested_dict['users']['user1']['email'] = 'user1@example.com'
nested_dict['users']['user1']['phone'] = '515-KL5-5555'
# Output formatted JSON to console
print(json.dumps(nested_dict, indent=2))
Output:
{
  "users": {
    "user1": {
      "name": "Felix",
      "email": "user1@example.com",
      "phone": "515-KL5-5555"
    }
  }
}
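One caveat of this pattern is that merely reading a missing key creates it as an empty subtree. The sketch below demonstrates that behavior and converts the finished tree to regular dictionaries so that later lookups of missing keys raise KeyError again. The to_plain_dict helper is hypothetical and written only for this illustration.

```python
from collections import defaultdict

def tree():
    return defaultdict(tree)

def to_plain_dict(node):
    """Recursively convert nested defaultdicts into regular dicts (hypothetical helper)."""
    if isinstance(node, defaultdict):
        return {key: to_plain_dict(value) for key, value in node.items()}
    return node

config = tree()
config['db']['host'] = 'localhost'

# Merely reading a missing key creates it as an empty subtree
_ = config['cache']['ttl']
created = 'cache' in config

frozen = to_plain_dict(config)
print(created)  # True
print(frozen)   # {'db': {'host': 'localhost'}, 'cache': {'ttl': {}}}
```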
9. Performing Arithmetic Operations on Counters
News flash: you can perform arithmetic operations, such as addition, subtraction, intersection, and union, on Counter objects. This is a powerful tool for comparing and combining frequency counts from different sources.
from collections import Counter
c1 = Counter(a=4, b=2, c=0, d=-2)
c2 = Counter(a=1, b=2, c=3, d=4)
# Add counters -> adds counts for common keys
print(f"c1 + c2 = {c1 + c2}")
# Subtract counters -> keeps only positive counts
print(f"c1 - c2 = {c1 - c2}")
# Intersection -> takes minimum of counts
print(f"c1 & c2 = {c1 & c2}")
# Union -> takes maximum of counts
print(f"c1 | c2 = {c1 | c2}")
Output:
c1 + c2 = Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2})
c1 - c2 = Counter({'a': 3})
c1 & c2 = Counter({'b': 2, 'a': 1})
c1 | c2 = Counter({'a': 4, 'd': 4, 'c': 3, 'b': 2})
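Since the operators drop any result that is zero or negative, it can help to compare them with the in-place subtract() method, which keeps negative counts. The sketch below uses made-up stock figures to show the difference.

```python
from collections import Counter

# Hypothetical stock figures, for illustration only
stock = Counter(apples=5, pears=2)
sold = Counter(apples=3, pears=4)

# The - operator drops zero and negative results
remaining = stock - sold

# subtract() works in place and keeps negative counts
balance = Counter(stock)
balance.subtract(sold)

print(remaining)  # Counter({'apples': 2})
print(balance)    # Counter({'apples': 2, 'pears': -2})
```

The negative value in balance makes the shortfall in pears visible, whereas the operator form silently removes that key.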
10. Efficiently Rotating Elements with deque
The deque object has a rotate() method that allows you to rotate the elements efficiently. A positive argument rotates elements to the right; a negative one, to the left. This is much faster than slicing and re-joining lists or tuples.
from collections import deque
d = deque([1, 2, 3, 4, 5])
print(f"Original deque: {d}")
# Rotate 2 steps to the right
d.rotate(2)
print(f"After rotating 2 to the right: {d}")
# Rotate 3 steps to the left
d.rotate(-3)
print(f"After rotating 3 to the left: {d}")
Output:
Original deque: deque([1, 2, 3, 4, 5])
After rotating 2 to the right: deque([4, 5, 1, 2, 3])
After rotating 3 to the left: deque([2, 3, 4, 5, 1])
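For comparison, here is the slicing approach that rotate() replaces. This minimal sketch, with sample values chosen for illustration, shows that both give the same ordering. The slicing version builds a new list each time, while rotate() modifies the deque in place.

```python
from collections import deque

items = [1, 2, 3, 4, 5]
k = 2  # number of steps to rotate to the right

# In-place rotation with deque
d = deque(items)
d.rotate(k)

# Equivalent result with list slicing, which creates a new list
sliced = items[-k:] + items[:-k]

print(list(d))  # [4, 5, 1, 2, 3]
print(sliced)   # [4, 5, 1, 2, 3]
```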
Wrapping Up
The collections module in Python is a killer collection of specialized, high-performance container datatypes. From counting items with Counter to building efficient queues with deque, these tools can make your code cleaner, more efficient, and more Pythonic. By familiarizing yourself with these surprising and powerful features, you can solve common programming problems in a more elegant and effective way.
Matthew Mayo (@mattmayo13) holds a master's degree in computer science and a graduate diploma in data mining. As managing editor of KDnuggets & Statology, and contributing editor at Machine Learning Mastery, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, language models, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.