[python] Speed up your dict get method

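While writing a WM algo in Python I used a plain dict.
Number of patterns: 9; shortest pattern length: 2.
Test file: 90M; result file: 4.8M.
Profiling showed that the time bottleneck was mostly the get call in the original version, i.e. the hash-value lookup in filter.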

$time python wm4pc.py > tmp.pcpc5

Time spent in user mode (CPU seconds) : 37.743s
Time spent in kernel mode (CPU seconds) : 0.521s
Total time : 0:43.77s
CPU utilisation (percentage) : 87.4%
Times the process was swapped : 0
Times of major page faults : 0
Times of minor page faults : 89885
36177119 function calls in 360.682 seconds
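
The function-call count above is the kind of summary line the Python profiler prints. Here is a minimal sketch of how such a profile could be collected with cProfile. Note that lookup_all, the table and the keys are hypothetical stand-ins I made up for illustration, not the code in wm4pc.py.

```python
import cProfile
import io
import pstats


def lookup_all(table, keys, default):
    """Sum the values of all keys, using dict.get for missing ones."""
    total = 0
    for key in keys:
        total += table.get(key, default)
    return total


def profile_lookup():
    """Profile lookup_all and return (result, text report)."""
    table = {"ab": 1, "cd": 2}          # hypothetical lookup table
    keys = ["ab", "cd", "zz"] * 10000   # hypothetical input, "zz" always misses
    profiler = cProfile.Profile()
    result = profiler.runcall(lookup_all, table, keys, 2)
    out = io.StringIO()
    # Sort by time spent inside each function and show the top 5 entries.
    pstats.Stats(profiler, stream=out).sort_stats("tottime").print_stats(5)
    return result, out.getvalue()


if __name__ == "__main__":
    result, report = profile_lookup()
    print(report)
```

In a report like this, every table.get call shows up as its own function call, which is how a hot lookup loop ends up with tens of millions of calls.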


Later (120618) I switched to collections defaultdict for a speedup:
$time python wm4pc.py > tmp.pctry

Time spent in user mode (CPU seconds) : 20.131s
Time spent in kernel mode (CPU seconds) : 0.307s
Total time : 0:21.07s
CPU utilisation (percentage) : 96.9%
Times the process was swapped : 0
Times of major page faults : 0
Times of minor page faults : 89889
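
To make the change concrete, here is a small sketch of replacing a get lookup with a defaultdict lookup. The table contents and DEFAULT_SHIFT are assumptions for illustration only; the real table in wm4pc.py is not shown in this post.

```python
from collections import defaultdict

DEFAULT_SHIFT = 2  # assumption: value to use for keys missing from the table


def build_plain(entries):
    """Plain dict: callers must pass the default on every lookup."""
    return dict(entries)


def build_default(entries):
    """defaultdict: the default is supplied once, when the table is built."""
    table = defaultdict(lambda: DEFAULT_SHIFT)
    table.update(entries)
    return table


entries = {"ab": 0, "bc": 1}      # hypothetical table entries
plain = build_plain(entries)
fast = build_default(entries)

hit_plain = plain.get("ab", DEFAULT_SHIFT)    # method call on every lookup
miss_plain = plain.get("zz", DEFAULT_SHIFT)   # missing key, falls back to 2

hit_fast = fast["ab"]     # plain subscript, no method call
miss_fast = fast["zz"]    # missing key: default is created and stored
```

One thing to keep in mind: a defaultdict stores the default under every missing key it is asked for, so after the lookup above "zz" is a real entry in fast. With many distinct missing keys the table grows, while the plain dict with get stays the same size.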


I then thought I could cache the SHIFT[key] lookup myself, so I added a SHIFTVALUE variable at that spot.
That actually made it about 10 seconds slower.


$time python wm5pc.py > tmp.pctry5

Time spent in user mode (CPU seconds) : 31.343s
Time spent in kernel mode (CPU seconds) : 0.268s
Total time : 0:32.54s
CPU utilisation (percentage) : 97.1%
Times the process was swapped : 0
Times of major page faults : 0
Times of minor page faults : 89889


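Conclusion: if you make heavy use of a dict, use a defaultdict with an initial (default) value.
An extra variable has to be initialized on every execution and may make things slower.
dict[key] is faster, but then you have to handle KeyError,
and try/except should be avoided for that because it is slower.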
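
The conclusions above can be checked on your own machine with a small timeit comparison. This is only a sketch: PLAIN, KEYS and DEFAULT are made-up test data (half of the lookups miss), and the actual numbers depend on the Python version and on how often keys are missing.

```python
import timeit
from collections import defaultdict

DEFAULT = 2
PLAIN = {"ab": 0, "cd": 1}                  # hypothetical table
KEYS = ["ab", "cd", "zz", "yy"] * 1000      # half hits, half misses
DEFAULTED = defaultdict(lambda: DEFAULT, PLAIN)


def with_get():
    return sum(PLAIN.get(k, DEFAULT) for k in KEYS)


def with_defaultdict():
    return sum(DEFAULTED[k] for k in KEYS)


def with_try():
    total = 0
    for k in KEYS:
        try:
            total += PLAIN[k]
        except KeyError:        # raised for every missing key
            total += DEFAULT
    return total


def with_in():
    total = 0
    for k in KEYS:
        total += PLAIN[k] if k in PLAIN else DEFAULT
    return total


def compare(number=20):
    """Return seconds taken by each access method for `number` runs."""
    methods = (with_get, with_defaultdict, with_try, with_in)
    return {f.__name__: timeit.timeit(f, number=number) for f in methods}


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(name, round(seconds, 4))
```

All four functions compute the same sum, so only the lookup style differs. The try/except version pays for raising and catching an exception on every miss, which is why it tends to lose when missing keys are common.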

Another Mapping : The defaultdict — Building Skills in Programming
http://www.itmaybeahack.com/homepage/books/nonprog/html/p10_set_map/p10_c04_defaultdict.html
algorithm - Python - Is a dictionary slow to find frequency of each character? - Stack Overflow
http://stackoverflow.com/questions/2522152/python-is-a-dictionary-slow-to-find-frequency-of-each-character
Speed of python dict access methods (direct vs. get) | Standard Thoughts
http://www.partofthething.com/thoughts/?p=513


Power by peicheng.pixnet.net