[python] Speed up your dict get method
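While writing the WM algo I used a Python dict.
9 patterns, shortest pattern length 2.
Test file 90M, result file 4.8M.
Profiling showed where the time goes: most of it is spent in the original version's get calls, i.e. where the filter looks up the hash value.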
$time python wm4pc.py > tmp.pcpc5
Time spent in user mode (CPU seconds) : 37.743s
Time spent in kernel mode (CPU seconds) : 0.521s
Total time : 0:43.77s
CPU utilisation (percentage) : 87.4%
Times the process was swapped : 0
Times of major page faults : 0
Times of minor page faults : 89885
36177119 function calls in 360.682 seconds
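For reference, a number like the one above can be collected with cProfile from the standard library. This is only a minimal sketch: lookup_shifts, the small shift table and the keys are made-up stand-ins, not the real wm4pc.py code.

```python
import cProfile
import io
import pstats


def lookup_shifts(shift, keys, default):
    """Hypothetical stand-in for the hot loop: one dict.get per key."""
    total = 0
    for key in keys:
        total += shift.get(key, default)
    return total


def profile_lookup():
    shift = {"ab": 1, "bc": 0}         # made-up SHIFT table
    keys = ["ab", "zz", "bc"] * 1000   # made-up input blocks
    profiler = cProfile.Profile()
    profiler.enable()
    result = lookup_shifts(shift, keys, 2)
    profiler.disable()
    # Render the five most expensive entries by cumulative time.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return result, out.getvalue()


result, report = profile_lookup()
print(report)
```

In the report, the dict lookups show up as their own row ({method 'get' of 'dict' objects}), which is how a get-heavy loop becomes visible. Keep in mind that the profiler itself adds overhead, so times measured under it are much larger than the plain `time` numbers.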
Later (120618) I switched to collections.defaultdict to speed things up.
$time python wm4pc.py > tmp.pctry
Time spent in user mode (CPU seconds) : 20.131s
Time spent in kernel mode (CPU seconds) : 0.307s
Total time : 0:21.07s
CPU utilisation (percentage) : 96.9%
Times the process was swapped : 0
Times of major page faults : 0
Times of minor page faults : 89889
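The change itself is small. A rough sketch of the two lookup styles, assuming a shift table keyed by short strings; DEFAULT_SHIFT, the table contents and the function names are hypothetical, not taken from wm4pc.py:

```python
from collections import defaultdict

DEFAULT_SHIFT = 2  # hypothetical default for keys not in the table

# Before: plain dict, every lookup goes through the get() method.
plain = {"ab": 1, "bc": 0}


def shift_with_get(key):
    return plain.get(key, DEFAULT_SHIFT)


# After: defaultdict, lookups use plain indexing and the default
# factory is only called when a key is missing.
fast = defaultdict(lambda: DEFAULT_SHIFT, plain)


def shift_with_defaultdict(key):
    return fast[key]


print(shift_with_get("ab"), shift_with_get("zz"))
print(shift_with_defaultdict("ab"), shift_with_defaultdict("zz"))
```

One thing to watch: indexing a defaultdict with a missing key stores that key with the default value, so the table grows with every new miss, while get() leaves the plain dict untouched.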
I then thought I could do my own caching at the SHIFT[key] lookup by adding a SHIFTVALUE variable,
but that turned out about 10 seconds slower~
$time python wm5pc.py > tmp.pctry5
Time spent in user mode (CPU seconds) : 31.343s
Time spent in kernel mode (CPU seconds) : 0.268s
Total time : 0:32.54s
CPU utilisation (percentage) : 97.1%
Times the process was swapped : 0
Times of major page faults : 0
Times of minor page faults : 89889
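Conclusion: if you hit a dict heavily, use a defaultdict with a default value.
Extra variables may make things slower, since they have to be initialized on every run.
dict[key] is faster, but you have to deal with KeyError,
and avoid try/except for that, it is slower.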
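To check these points on your own machine, something like the following timeit sketch could be used. The table, the keys (half hits, half misses) and DEFAULT are made up for illustration, and the numbers will differ by Python version and by how many keys are missing.

```python
import timeit
from collections import defaultdict

DEFAULT = 2
table = {"k%d" % i: i for i in range(1000)}
keys = ["k%d" % i for i in range(2000)]  # first half hits, second half misses
dd = defaultdict(lambda: DEFAULT, table)


def with_get():
    total = 0
    for k in keys:
        total += table.get(k, DEFAULT)
    return total


def with_in_check():
    total = 0
    for k in keys:
        total += table[k] if k in table else DEFAULT
    return total


def with_try_except():
    total = 0
    for k in keys:
        try:
            total += table[k]
        except KeyError:
            total += DEFAULT
    return total


def with_defaultdict():
    total = 0
    for k in keys:
        total += dd[k]  # a miss stores the default, later lookups are hits
    return total


variants = [with_get, with_in_check, with_try_except, with_defaultdict]
timings = {fn.__name__: timeit.timeit(fn, number=20) for fn in variants}
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("%-18s %.4fs" % (name, seconds))
```

All four variants compute the same sum, so only the lookup style differs between them.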
Another Mapping : The defaultdict — Building Skills in Programming
http://www.itmaybeahack.com/homepage/books/nonprog/html/p10_set_map/p10_c04_defaultdict.html
algorithm - Python - Is a dictionary slow to find frequency of each character? - Stack Overflow
http://stackoverflow.com/questions/2522152/python-is-a-dictionary-slow-to-find-frequency-of-each-character
Speed of python dict access methods (direct vs. get) | Standard Thoughts
http://www.partofthething.com/thoughts/?p=513
Power by peicheng.pixnet.net