Python's High-performance container datatypes
Written on March 31st, 2018 by @10000TB
I have been using defaultdict quite a lot as a substitute for dict for a while, and Counter as well. They both come from Python's standard module collections, which was first introduced in 2.4. Since I am very satisfied with the two, I thought to myself: why not check out the rest of the container types in the module? So here I compile this post on Python's collections module, the "High-performance container datatypes" as claimed.
This post will briefly introduce the container datatypes in collections and maybe explain why they are high-performance.
As of 2.7, there are five useful things in collections: namedtuple(), deque, Counter, OrderedDict, and defaultdict.
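As a quick orientation, here is a minimal sketch that touches each of the five types once. It is written to run on Python 3, and the names Point, groups, and the sample words are made up for illustration, not taken from the module.

```python
from collections import namedtuple, deque, Counter, OrderedDict, defaultdict

# namedtuple(): a tuple subclass whose fields can be read by name.
Point = namedtuple("Point", ["x", "y"])  # "Point" is a hypothetical example type
p = Point(x=1, y=2)

# deque: a double-ended queue that supports appends at both ends.
d = deque([1, 2, 3])
d.appendleft(0)
d.append(4)

# Counter: a dict subclass that maps each element to its count.
c = Counter("abracadabra")

# OrderedDict: a dict that remembers the order in which keys were inserted.
od = OrderedDict()
od["first"] = 1
od["second"] = 2

# defaultdict: a dict that builds a default value for a missing key.
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

print(p.x + p.y)        # 3
print(list(d))          # [0, 1, 2, 3, 4]
print(c["a"])           # 5
print(list(od.keys()))  # ['first', 'second']
print(groups["a"])      # ['apple', 'avocado']
```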
- Counter
In a nutshell, it helps you count things: each value and its count are stored as a key-value pair.
Note that Counter is a subclass of dict, which should clarify a bit how the counts are represented.
Notes:
- There are three ways to initialize a Counter object: Counter(), Counter({"k1": 3, "k2": 4}), and Counter(k1=3, k2=4).
- Counter has a dictionary-like interface, except that it returns zero for a missing key instead of raising a KeyError.
- elements(): returns an iterator that repeats each key as many times as its count. In comparison, keys() returns a list in which each key appears once (in Python 3 it returns a view object instead of a list).

    In [17]: from collections import Counter

    In [18]: c = Counter(ad=1, adas=3, adqw=23)

    In [19]: c.elements()
    Out[19]: <itertools.chain at 0x10f97b8d0>

    In [20]: for i in c.elements():
        ...:     print i
        ...:
    adas
    adas
    adas
    ad
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw
    adqw

    In [22]: c.keys()
    Out[22]: ['adas', 'ad', 'adqw']

- Since Counter is a subclass of dict, it has access to keys(), which returns a list of all its keys.
- most_common([n]): returns the n most common keys and their counts as a list of (key, count) tuples, ordered from the most common to the least.
- subtract([iterable-or-mapping]): subtracts the counts passed in from the current counter. Counts in the result can be zero or negative. In comparison, update([iterable-or-mapping]) adds the counts passed in on top of the existing counts. Neither of them replaces existing counts; they subtract from or add to them.
- We can also do mathematical operations on Counter objects: + (add counts), - (subtract counts, keeping only positive results), & (minimum of the counts), and | (maximum of the counts).
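The three construction forms from the notes, and the zero returned for a missing key, can be sketched as follows. This is a minimal illustration written to run on Python 3; the variable names are made up for the example.

```python
from collections import Counter

# The three construction forms mentioned in the notes.
empty = Counter()
from_mapping = Counter({"k1": 3, "k2": 4})
from_kwargs = Counter(k1=3, k2=4)

print(len(empty))                   # 0
print(from_mapping == from_kwargs)  # True

# Looking up a missing key returns 0 instead of raising KeyError,
# and the lookup does not add the key to the counter.
print(from_kwargs["missing"])       # 0
print("missing" in from_kwargs)     # False

# A plain dict with the same contents raises KeyError for the same lookup.
plain = dict(from_kwargs)
try:
    plain["missing"]
    raised = False
except KeyError:
    raised = True
print(raised)                       # True
```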
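The Counter methods and operators described in the notes can be tried out with a short sketch like the one below. It reuses the counts from the session above and is written to run on Python 3; the counters a and b are made-up examples.

```python
from collections import Counter

c = Counter(ad=1, adas=3, adqw=23)

# elements() repeats each key as many times as its count.
repeated = sorted(c.elements())
print(len(repeated))  # 27, i.e. 1 + 3 + 23

# most_common(n) returns the n highest counts as (key, count) tuples.
top_two = c.most_common(2)
print(top_two)        # [('adqw', 23), ('adas', 3)]

# subtract() and update() adjust existing counts rather than replace them.
c.subtract({"adas": 5})  # 3 - 5, so the count goes negative
c.update({"ad": 2})      # 1 + 2
print(c["adas"])      # -2
print(c["ad"])        # 3

# Operators build new counters and keep only positive counts.
a = Counter(x=3, y=1)
b = Counter(x=1, y=2)
print(sorted((a + b).items()))  # [('x', 4), ('y', 3)]
print(sorted((a - b).items()))  # [('x', 2)]
print(sorted((a & b).items()))  # [('x', 1), ('y', 1)]
print(sorted((a | b).items()))  # [('x', 3), ('y', 2)]
```

Note the difference between subtract() and the - operator: subtract() changes the counter in place and can leave zero or negative counts, while - returns a new counter with those entries dropped.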
- (TODO: 10000tb) more notes on the rest of the container types within collections.
Reference(s):
- https://docs.python.org/2/library/collections.html