Python memo: a collection of (perhaps) little-known techniques, part 2

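listtocommaseparated.py : Print a list as a comma-separated string
minmaxindex.py : Get the index of the minimum and maximum value in a list
removeduplicatefromlist.py : Remove duplicate elements from a list
reverselist.py : Reverse a list
reversestring.py : Reverse a string
flattenlist.py : Flatten a nested list
sortlistkeepindices.py : Keep the pre-sort indices
transpose.py : Transpose
copylist.py : Shallow and deep copies of a list
merge_dict.py : Merge dictionaries
dictionaryget.py : Use the get method when a key is missing from a dictionary
dictdefaultvalue.py : Default values for a dictionary
dictsortbyvalue.py : Sort a dictionary by value
dictswapkeysvalues.py : Swap the keys and values of a dictionary
loopoverlappingdicts.py : Get the keys shared by two dictionaries
keydefaultdict.py : Combining defaultdict with a function
metatable.py : Metatable
tree.py : A tree structure built with defaultdict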

The repository above collects 49 (perhaps) little-known Python techniques, so this post looks at the ones related to lists and dictionaries (parts 1 and 3 are at the link below).

wonderwall.hatenablog.com

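listtocommaseparated.py : Print a list as a comma-separated string

Uses join to print the elements of a list separated by commas. If the list contains numbers, convert them with str first.

Code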

#! /usr/bin/env python3
"""converts list to comma separated string"""

items = ['foo', 'bar', 'xyz']

print(','.join(items))

"""list of numbers to comma separated"""
numbers = [2, 3, 5, 10]

print(','.join(map(str, numbers)))

"""list of mixed data"""
data = [2, 'hello', 3, 3.4]

print(','.join(map(str, data)))

Output

foo,bar,xyz
2,3,5,10
2,hello,3,3.4
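
As a small supplementary sketch (not part of the repository), the following shows why the str conversion is needed: join accepts only strings and raises TypeError otherwise. The variable names `error` and `joined` are just for illustration.

```python
numbers = [2, 3, 5, 10]

# join accepts only strings, so passing the integers directly raises TypeError
error = None
try:
    ','.join(numbers)
except TypeError as exc:
    error = str(exc)

# A generator expression is an alternative to map(str, ...)
joined = ', '.join(str(n) for n in numbers)
print(joined)  # 2, 3, 5, 10
```

Whether to use map or a generator expression is mostly a matter of taste; both convert each element lazily.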

minmaxindex.py : Get the index of the minimum and maximum value in a list

How to get the index of the minimum and maximum value in a list.

Code

"""
Find Index of Min/Max Element.
"""

lst = [40, 10, 20, 30]


def minIndex(lst):
    return min(range(len(lst)), key=lst.__getitem__)  # use xrange on Python 2


def maxIndex(lst):
    return max(range(len(lst)), key=lst.__getitem__)  # use xrange on Python 2

print(minIndex(lst))
print(maxIndex(lst))

Output

1
0
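
An alternative sketch of the same idea, using enumerate instead of `lst.__getitem__`. The sample list with repeated values is my own, chosen to show that min and max return the first match when there are ties.

```python
from operator import itemgetter

lst = [40, 10, 20, 10, 40]

# Pair each value with its index, then compare on the value only
min_index = min(enumerate(lst), key=itemgetter(1))[0]
max_index = max(enumerate(lst), key=itemgetter(1))[0]

# With ties, min() and max() both return the first match
print(min_index)  # 1
print(max_index)  # 0

# list.index gives the same answer at the cost of a second pass over the list
print(lst.index(min(lst)))  # 1
```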

removeduplicatefromlist.py : Remove duplicate elements from a list

The set approach does not preserve the original order. Using OrderedDict preserves it.

Code

#! /usr/bin/env python3
"""remove duplicate items from list. note: does not preserve the original list order"""

items = [2, 2, 3, 3, 1]

newitems2 = list(set(items))
print(newitems2)

"""remove dups and keep order"""

from collections import OrderedDict

items = ["foo", "bar", "bar", "foo"]

print(list(OrderedDict.fromkeys(items).keys()))

Output

[1, 2, 3]
['foo', 'bar']
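
A sketch of two variations, under the assumption of Python 3.7 or later, where the built-in dict also preserves insertion order. The second half handles unhashable elements, which neither set nor dict keys accept; the sample data is my own.

```python
items = ["foo", "bar", "bar", "foo", "baz"]

# Assumes Python 3.7+, where a plain dict preserves insertion order
unique = list(dict.fromkeys(items))
print(unique)  # ['foo', 'bar', 'baz']

# Unhashable elements (e.g. lists) cannot be dict keys,
# so fall back to a linear scan (quadratic time, fine for short lists)
rows = [[1, 2], [3], [1, 2]]
unique_rows = []
for row in rows:
    if row not in unique_rows:
        unique_rows.append(row)
print(unique_rows)  # [[1, 2], [3]]
```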

reverselist.py : Reverse a list

A step of -1 reverses the list. You can also use reversed.

Code

#! /usr/bin/env python3
"""reversing list with special case of slice step param"""
a = [5, 4, 3, 2, 1]
print(a[::-1])

"""iterating over list contents in reverse efficiently."""
for ele in reversed(a):
    print(ele)

Output

[1, 2, 3, 4, 5]
1
2
3
4
5
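
For comparison, a short sketch of how the three ways of reversing differ: slicing builds a new list, reversed returns a lazy iterator, and list.reverse works in place.

```python
a = [5, 4, 3, 2, 1]

b = a[::-1]        # new list; a itself is unchanged
it = reversed(a)   # lazy iterator over a, no copy is made
first = next(it)   # 1

result = a.reverse()   # reverses a in place and returns None
print(b)       # [1, 2, 3, 4, 5]
print(a)       # [1, 2, 3, 4, 5]
print(result)  # None
```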

reversestring.py : Reverse a string

The same approach works for strings. For a number, convert it with str first.

Code

#! /usr/bin/env python3

"""reversing string with special case of slice step param"""

a = 'abcdefghijklmnopqrstuvwxyz'
print(a[::-1])


"""iterating over string contents in reverse efficiently."""

for char in reversed(a):
    print(char)

"""reversing an integer through type conversion and slicing."""

num = 123456789
print(int(str(num)[::-1]))

Output

zyxwvutsrqponmlkjihgfedcba
z
y
x
w
v
u
t
s
r
q
p
o
n
m
l
k
j
i
h
g
f
e
d
c
b
a
987654321
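
One caveat with the integer version: for a negative number the reversed string ends with the minus sign, and int() rejects it. A minimal sketch of one way around that, where `reverse_int` is a hypothetical helper of my own and not part of the repository:

```python
def reverse_int(num):
    """Hypothetical helper: reverse the decimal digits, keeping the sign."""
    sign = -1 if num < 0 else 1
    return sign * int(str(abs(num))[::-1])


print(reverse_int(123456789))  # 987654321
print(reverse_int(-120))       # -21 (the leading zero of '021' is dropped)
```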

flattenlist.py : Flatten a nested list

How to flatten a nested list. There are several ways to do it.

Code

#! /usr/bin/env python3
"""
Deep flattens a nested list

Examples:
    >>> list(flatten_list([1, 2, [3, 4], [5, 6, [7]]]))
    [1, 2, 3, 4, 5, 6, 7]
    >>> list(flatten_list(['apple', 'banana', ['orange', 'lemon']]))
    ['apple', 'banana', 'orange', 'lemon']
"""


def flatten_list(L):
    for item in L:
        if isinstance(item, list):
            yield from flatten_list(item)
        else:
            yield item

# In Python 2
from compiler.ast import flatten
flatten(L)


# Flatten list of lists

a = [[1, 2], [3, 4]]

# Solutions:

print([x for _list in a for x in _list])

import itertools
print(list(itertools.chain(*a)))

print(list(itertools.chain.from_iterable(a)))

# In Python 2
print(reduce(lambda x, y: x+y, a))

print(sum(a, []))

Output (run on Python 3, with the "In Python 2" parts commented out)

[1, 2, 3, 4]
[1, 2, 3, 4]
[1, 2, 3, 4]

Output (run on Python 2, executing only the "In Python 2" parts; flatten(L) raises an error, so a is flattened and printed instead)

[1, 2, 3, 4]
[1, 2, 3, 4]
[1, 2, 3, 4]
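
The approaches above differ in how deep they go. As a sketch, the recursive generator handles arbitrary nesting, while itertools.chain.from_iterable removes exactly one level. The sample inputs are my own.

```python
import itertools


def flatten_list(L):
    # Same recursive generator as in the repository code
    for item in L:
        if isinstance(item, list):
            yield from flatten_list(item)
        else:
            yield item


nested = [1, [2, [3, [4]]], [5]]
deep = list(flatten_list(nested))
print(deep)  # [1, 2, 3, 4, 5]

# chain.from_iterable removes only one level,
# and every top-level element must itself be iterable
one_level = list(itertools.chain.from_iterable([[1, [2]], [3]]))
print(one_level)  # [1, [2], 3]
```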

sortlistkeepindices.py : Keep the pre-sort indices

How to keep the indices the values had before sorting. enumerate supplies the indices. The results are tuples, so convert them with list() if you need lists.

Code

#! /usr/bin/env python3
"""Sort a list and store previous indices of values"""

# enumerate is a great but little-known tool for writing nice code

l = [4, 2, 3, 5, 1]
print("original list: ", l)

values, indices = zip(*sorted((a, b) for (b, a) in enumerate(l)))

# now values contains the sorted list and indices contains
# the indices of the corresponding value in the original list

print("sorted list: ", values)
print("original indices: ", indices)

# note that this returns tuples, but if necessary they can
# be converted to lists using list()

Output

original list:  [4, 2, 3, 5, 1]
sorted list:  (1, 2, 3, 4, 5)
original indices:  (4, 1, 2, 0, 3)
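
An alternative sketch that sorts the index positions directly (sometimes called an argsort), reusing the `__getitem__` key from minmaxindex.py. It returns lists rather than tuples.

```python
l = [4, 2, 3, 5, 1]

# Sort the index positions by the value each one points at
indices = sorted(range(len(l)), key=l.__getitem__)
values = [l[i] for i in indices]

print(indices)  # [4, 1, 2, 0, 3]
print(values)   # [1, 2, 3, 4, 5]
```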

transpose.py : Transpose

How to transpose. If you want lists rather than tuples, Stack Overflow suggests list(map(list, zip(*original))).

Code

#! /usr/bin/env python3
"""transpose 2d array [[a,b], [c,d], [e,f]] -> [[a,c,e], [b,d,f]]"""

original = [['a', 'b'], ['c', 'd'], ['e', 'f']]
transposed = zip(*original)
print(list(transposed))

Output

[('a', 'c', 'e'), ('b', 'd', 'f')]
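
A sketch of the list-of-lists variant mentioned above, plus one thing to watch for: zip stops at the shortest row, so ragged input loses data without any error. The `ragged` sample is my own; itertools.zip_longest pads with None instead.

```python
from itertools import zip_longest

original = [['a', 'b'], ['c', 'd'], ['e', 'f']]
as_lists = list(map(list, zip(*original)))
print(as_lists)  # [['a', 'c', 'e'], ['b', 'd', 'f']]

# zip stops at the shortest row, so the 3 is silently dropped here
ragged = [[1, 2, 3], [4, 5]]
truncated = list(zip(*ragged))
print(truncated)  # [(1, 4), (2, 5)]

# zip_longest keeps every element and pads missing ones with None
padded = list(zip_longest(*ragged))
print(padded)  # [(1, 4), (2, 5), (3, None)]
```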

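copylist.py : Shallow and deep copies of a list

There are two kinds of copy: shallow and deep. According to the documentation, "the difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances)". See the blog post below for details.

eibiisii.hateblo.jp

Code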

#! /usr/bin/env python3
"""a fast way to make a shallow copy of a list"""

a = [1, 2, 3, 4, 5]
print(a[:])


"""copy list by typecasting method"""

a = [1, 2, 3, 4, 5]
print(list(a))


"""using the list.copy() method (python3 only)"""

a = [1, 2, 3, 4, 5]

print(a.copy())


"""copy nested lists using copy.deepcopy"""

from copy import deepcopy

l = [[1, 2], [3, 4]]

l2 = deepcopy(l)
print(l2)

Output

[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
[[1, 2], [3, 4]]
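
The printed results above look identical for every method, so here is a small sketch of where the difference actually shows up: mutate an inner list after copying.

```python
from copy import deepcopy

l = [[1, 2], [3, 4]]
shallow = l[:]       # new outer list, but the inner lists are shared
deep = deepcopy(l)   # inner lists are copied as well

l[0].append(99)

print(shallow[0])          # [1, 2, 99]  (change is visible through the shallow copy)
print(deep[0])             # [1, 2]      (deep copy is unaffected)
print(shallow[0] is l[0])  # True
```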

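merge_dict.py : Merge dictionaries

Three ways to merge dictionaries. When both dictionaries have the same key, {**d1, **d2} and d1.update(d2) keep the value from the dictionary written later. With dict(d1.items() | d2.items()), however, the set union holds both pairs and set iteration order is not guaranteed, so which value survives appears to be unspecified; be careful (reference: Stack Overflow).

Code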

#! /usr/bin/env python3.5
"""merge dict's"""

d1 = {'a': 1}
d2 = {'b': 2}

#  python 3.5
print({**d1, **d2})

print(dict(d1.items() | d2.items()))

d1.update(d2)
print(d1)

Output

{'b': 2, 'a': 1}
{'b': 2, 'a': 1}
{'b': 2, 'a': 1}
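
The repository example has no overlapping keys, so here is a sketch with a collision (sample data is my own). It also shows a second limitation of the items() union: the values must be hashable.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'c': 3}

# The later dictionary wins for the shared key 'b'
merged = {**d1, **d2}
print(merged['b'])  # 20

# The union is a set of (key, value) pairs, so both 'b' pairs are in it;
# which one dict() keeps depends on set iteration order
pairs = d1.items() | d2.items()
print(('b', 2) in pairs and ('b', 20) in pairs)  # True

# The union also requires hashable values
union_failed = False
try:
    {'x': [1]}.items() | {'y': [2]}.items()
except TypeError:
    union_failed = True
print(union_failed)  # True
```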

dictionaryget.py : Use the get method when a key is missing from a dictionary

Using the get method avoids a KeyError. If no default is given, None is returned.

Code

#! /usr/bin/env python3
"""returning None or default value, when key is not in dict"""
d = {'a': 1, 'b': 2}

print(d.get('c', 3))

Output

3

dictdefaultvalue.py : Default values for a dictionary

To supply a default for a missing key, dict.setdefault and dict.get, or collections.defaultdict, keep the code concise.

Code

"""
When updating a value in a dict based on its old value, we usually check whether
the key is in the dict. But with `dict.setdefault`, `dict.get` and
`collections.defaultdict` the code can be shorter and cleaner.

Before:
    >>> d = {}
    >>> if 'a' not in d:  # update a list
    ...     d['a'] = []
    ...
    >>> d['a'].append(1)
    >>>
    >>> if 'b' not in d:  # update an integer
    ...     d['b'] = 0
    ...
    >>> d['b'] += 1

Now:
    >>> d = {}
    >>> d.setdefault('a', []).append(1)
    >>> d['b'] = d.get('b', 0) + 1
"""

""" builtin dict """
d = {}
d.setdefault('a', []).append(1)
d['b'] = d.get('b', 0) + 1

print(d)


""" with collections.defaultdict """
from collections import defaultdict

d = defaultdict(list)
d['a'].append(1)

print(d)

Output

{'a': [1], 'b': 1}
defaultdict(<class 'list'>, {'a': [1]})
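
A sketch of the integer counterpart using defaultdict(int), with a word list of my own. It also shows a side effect worth knowing: get() never inserts a key, whereas indexing a defaultdict does.

```python
from collections import defaultdict

words = ['a', 'b', 'a', 'c', 'a']

counts = defaultdict(int)   # int() returns 0 for a missing key
for w in words:
    counts[w] += 1
print(dict(counts))  # {'a': 3, 'b': 1, 'c': 1}

# get() does not insert the key...
missing = counts.get('z')
present_after_get = 'z' in counts
print(missing, present_after_get)  # None False

# ...but plain indexing on a defaultdict does
counts['z']
present_after_index = 'z' in counts
print(present_after_index)  # True
```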

dictsortbyvalue.py : Sort a dictionary by value

How to sort a dictionary by value. The documentation for itemgetter also includes a sorting example.

Code

#!/usr/bin/env python3
""" Sort a dictionary by its values with the built-in sorted() function and a 'key' argument. """

d = {'apple': 10, 'orange': 20, 'banana': 5, 'rotten tomato': 1}
print(sorted(d.items(), key=lambda x: x[1]))


""" Sort using operator.itemgetter as the sort key instead of a lambda"""


from operator import itemgetter


print(sorted(d.items(), key=itemgetter(1)))


"""Sort dict keys by value"""


print(sorted(d, key=d.get))

Output

[('rotten tomato', 1), ('banana', 5), ('apple', 10), ('orange', 20)]
[('rotten tomato', 1), ('banana', 5), ('apple', 10), ('orange', 20)]
['rotten tomato', 'banana', 'apple', 'orange']
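
Two small extensions as a sketch: descending order with reverse=True, and turning the sorted pairs back into a dictionary. The second part assumes Python 3.7 or later, where dict keeps insertion order.

```python
d = {'apple': 10, 'orange': 20, 'banana': 5, 'rotten tomato': 1}

# Largest value first
by_value_desc = sorted(d.items(), key=lambda kv: kv[1], reverse=True)
print(by_value_desc[0])  # ('orange', 20)

# Assumes Python 3.7+: the new dict iterates in sorted order
ordered = dict(by_value_desc)
print(list(ordered))  # ['orange', 'apple', 'banana', 'rotten tomato']
```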

dictswapkeysvalues.py : Swap the keys and values of a dictionary

How to swap the keys and values of a dictionary. It first checks that no values are duplicated.

Code

#! /usr/bin/env python3
"""Swaps keys and values in a dict"""

_dict = {"one": 1, "two": 2}
# make sure all of dict's values are unique
assert len(_dict) == len(set(_dict.values()))
reversed_dict = {v: k for k, v in _dict.items()}

Output (printing reversed_dict)

{1: 'one', 2: 'two'}
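
If the values are not unique, the assert above fails. One possible way to handle that case, sketched here with sample data of my own, is to collect every key that shares a value into a list.

```python
from collections import defaultdict

_dict = {"one": 1, "uno": 1, "two": 2}

# Group keys by value instead of letting later keys overwrite earlier ones
grouped = defaultdict(list)
for k, v in _dict.items():
    grouped[v].append(k)

print(dict(grouped))  # {1: ['one', 'uno'], 2: ['two']}
```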

loopoverlappingdicts.py : Get the keys shared by two dictionaries

How to get the keys shared by two dictionaries. Using items() gives the elements whose key and value both match.

Code

#! /usr/bin/env python3
"""loop over dicts that share (some) keys in Python2"""

dctA = {'a': 1, 'b': 2, 'c': 3}
dctB = {'b': 4, 'c': 3, 'd': 6}

for ky in set(dctA) & set(dctB):
    print(ky)

"""loop over dicts that share (some) keys in Python3"""
for ky in dctA.keys() & dctB.keys():
    print(ky)

"""loop over dicts that share (some) keys and values in Python3"""
for item in dctA.items() & dctB.items():
    print(item)

Output

b
c
b
c
('c', 3)
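
In Python 3 the key views support the other set operators as well. A short sketch with the same two dictionaries; the variable names are my own.

```python
dctA = {'a': 1, 'b': 2, 'c': 3}
dctB = {'b': 4, 'c': 3, 'd': 6}

only_in_a = dctA.keys() - dctB.keys()   # keys in A but not in B
in_one_only = dctA.keys() ^ dctB.keys() # keys in exactly one of the two

# Shared keys whose values differ (sorted for a stable order)
changed = sorted(k for k in dctA.keys() & dctB.keys() if dctA[k] != dctB[k])

print(only_in_a)            # {'a'}
print(sorted(in_one_only))  # ['a', 'd']
print(changed)              # ['b']
```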

keydefaultdict.py : Combining defaultdict with a function

Combining defaultdict with a function. In this example the stored value is 1 shifted left by the key.

Code

"""
keydefaultdict where the function receives the key.
"""
from collections import defaultdict


class keydefaultdict(defaultdict):
    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        else:
            ret = self[key] = self.default_factory(key)
            return ret


def pow2(n):
    return 1 << n

d = keydefaultdict(pow2)
print(d[1])
print(d[3])
print(d[10])
print(d)

Output

2
8
1024
defaultdict(<function pow2 at 0x7f176e2db048>, {1: 2, 10: 1024, 3: 8})
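
The subclass is needed because a plain defaultdict calls its default_factory with no arguments. The `__missing__` hook itself belongs to dict, so a sketch without defaultdict is also possible; `KeyDict` and its `factory` attribute are hypothetical names of my own.

```python
from collections import defaultdict

# A plain defaultdict calls the factory with no arguments, so this fails
plain = defaultdict(lambda n: n * n)
plain_failed = False
try:
    plain[4]
except TypeError:
    plain_failed = True
print(plain_failed)  # True


class KeyDict(dict):
    """Hypothetical plain-dict variant: compute missing values from the key."""

    def __init__(self, factory):
        super().__init__()
        self.factory = factory

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent
        value = self[key] = self.factory(key)
        return value


squares = KeyDict(lambda n: n * n)
print(squares[4])  # 16
print(squares)     # {4: 16}
```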

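metatable.py : Metatable

It is called a metatable, but the contents are almost the same as the combination of defaultdict and a function. This example computes the Fibonacci sequence; accessing d[10] also computes d[4] through d[9].

Code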

"""
metatable where the function receives the dictionary and key.
"""
from collections import defaultdict


class metatable(defaultdict):

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        else:
            ret = self[key] = self.default_factory(self, key)
            return ret


def fib(d, n):
    if n == 0 or n == 1:
        return n
    return d[n - 1] + d[n - 2]

d = metatable(fib)
print(d[1])
print(d[3])
print(d[10])
print(d)

Output

1
2
55
defaultdict(<function fib at 0x7f53a196c048>, {0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 5, 6: 8, 7: 13, 8: 21, 9: 34, 10: 55})
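
What the metatable does here is memoization: each Fibonacci number is stored the first time it is computed. For comparison only, and not as the repository's approach, the standard library's functools.lru_cache gives a similar effect:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


print(fib(10))  # 55

# One cache entry per argument from 0 to 10
print(fib.cache_info().currsize)  # 11
```

The difference is that the metatable keeps the cached values in a dictionary you can inspect and modify directly, while lru_cache hides them behind the function.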

tree.py : A tree structure built with defaultdict

A tree structure built with defaultdict. There is no need to create the lower levels in advance.

Code

#! /usr/bin/env python3
"""
See description here
https://gist.github.com/hrldcpr/2012250
"""

from collections import defaultdict

tree = lambda: defaultdict(tree)


users = tree()
users['harold']['username'] = 'chopper'
users['matt']['password'] = 'hunter2'

Output (with print statements added)

# print(users['harold'])
defaultdict(<function <lambda> at 0x7faf4e7f1048>, {'username': 'chopper'})
# print(users['harold']['username'])
chopper
# print(users)
defaultdict(<function <lambda> at 0x7faf4e7f1048>, {'harold': defaultdict(<function <lambda> at 0x7faf4e7f1048>, {'username': 'chopper'}), 'matt': defaultdict(<function <lambda> at 0x7faf4e7f1048>, {'password': 'hunter2'})})
# import json
# print(json.dumps(users))
{"matt": {"password": "hunter2"}, "harold": {"username": "chopper"}}
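
One thing to keep in mind with this tree: merely reading a missing key creates an empty subtree. A sketch of that behavior, together with `to_dict`, a hypothetical helper of my own that converts the tree back to plain dictionaries:

```python
from collections import defaultdict

tree = lambda: defaultdict(tree)


def to_dict(node):
    """Hypothetical helper: recursively convert a tree to plain dicts."""
    if isinstance(node, defaultdict):
        return {k: to_dict(v) for k, v in node.items()}
    return node


users = tree()
users['harold']['username'] = 'chopper'

# Reading a missing key with [] inserts an empty subtree
users['ghost']
print('ghost' in users)  # True

# .get() (or `in`) checks for a key without inserting it
print(users.get('nobody'))  # None
print('nobody' in users)    # False

plain = to_dict(users)
print(plain)  # {'harold': {'username': 'chopper'}, 'ghost': {}}
```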