I've been testing a cacheing system of my making. Its purpose is to speed up a Django web application. It stores everything in-memory. According to cProfile most of the time in my tests is spent inside QuerySet._clone() which turns out to be terribly inefficient (it's actually not that strange given the implementation).
I was having high hopes for using PyPy to speed things up. I've got a 64-bit machine. However after installing all the required libraries it turns out that PyPy compiled code runs about 2.5x slower than regular Python code, and I don't know what to make out of it. The code is CPU bound (there are absolutely no database queries, so IO-bounding is not an option). A single test runs for about 10 seconds, so I guess it should be enough for JIT to kick in. I'm using PyPy 1.5. One note - I didn't compile the sources myself, just downloaded a 64-bit linux version.
I'd like to know how frequent it is for a CPU intensive code to actually run slower under PyPy. Is there hopefully something wrong I could have done that would prevent PyPy from running at its best.
EDIT
Exact cPython output:
PyPy 1.5:
3439146 function calls (3218654 primitive calls) in 19.094 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
2/1 0.000 0.000 18.956 18.956 :1()
2/1 0.000 0.000 18.956 18.956 /path/to/my/project/common/integrity/models/transactions.py:200(newfn)
2/1 0.000 0.000 18.956 18.956 /path/to/my/project/common/integrity/models/transactions.py:134(recur)
2/1 0.000 0.000 18.956 18.956 /usr/local/pypy/site-packages/django/db/transaction.py:210(inner)
2/1 0.172 0.086 18.899 18.899 /path/to/my/project/common/integrity/tests/optimization.py:369(func_cached)
9990 0.122 0.000 18.632 0.002 /usr/local/pypy/site-packages/django/db/models/manager.py:131(get)
9990 0.127 0.000 16.638 0.002 /path/to/my/project/common/integrity/models/cache.py:1068(get)
9990 0.073 0.000 12.478 0.001 /usr/local/pypy/site-packages/django/db/models/query.py:547(filter)
9990 0.263 0.000 12.405 0.001 /path/to/my/project/common/integrity/models/cache.py:1047(_filter_or_exclude)
9990 0.226 0.000 12.096 0.001 /usr/local/pypy/site-packages/django/db/models/query.py:561(_filter_or_exclude)
9990 0.187 0.000 8.383 0.001 /path/to/my/project/common/integrity/models/cache.py:765(_clone)
9990 0.212 0.000 7.662 0.001 /usr/local/pypy/site-packages/django/db/models/query.py:772(_clone)
9990 1.025 0.000 7.125 0.001 /usr/local/pypy/site-packages/django/db/models/sql/query.py:226(clone)
129942/49972 1.674 0.000 6.021 0.000 /usr/local/pypy/lib-python/2.7/copy.py:145(deepcopy)
140575/110605 0.120 0.000 4.066 0.000 {len}
9990 0.182 0.000 3.972 0.000 /usr/local/pypy/site-packages/django/db/models/query.py:74(__len__)
19980 0.260 0.000 3.777 0.000 /path/to/my/project/common/integrity/models/cache.py:1062(iterator)
9990 0.255 0.000 3.154 0.000 /usr/local/pypy/site-packages/django/db/models/sql/query.py:1149(add_q)
9990 0.210 0.000 3.073 0.000 /path/to/my/project/common/integrity/models/cache.py:973(_query)
9990 0.371 0.000 2.316 0.000 /usr/local/pypy/site-packages/django/db/models/sql/query.py:997(add_filter)
9990 0.364 0.000 2.168 0.000 /path/to/my/project/common/integrity/models/cache.py:892(_deduct)
29974/9994 0.448 0.000 2.078 0.000 /usr/local/pypy/lib-python/2.7/copy.py:234(_deepcopy_tuple)
19990 0.362 0.000 2.065 0.000 /path/to/my/project/common/integrity/models/cache.py:566(__init__)
10000 0.086 0.000 1.874 0.000 /path/to/my/project/common/integrity/models/cache.py:1090(get_query_set)
19990 0.269 0.000 1.703 0.000 /usr/local/pypy/site-packages/django/db/models/query.py:31(__init__)
9990 0.122 0.000 1.643 0.000 /path/to/my/project/common/integrity/models/cache.py:836(_deduct_recur)
19980 0.274 0.000 1.636 0.000 /usr/local/pypy/site-packages/django/utils/tree.py:55(__deepcopy__)
9990 0.607 0.000 1.458 0.000 /path/to/my/project/common/integrity/models/cache.py:789(_deduct_local)
10020 0.633 0.000 1.437 0.000 /usr/local/pypy/site-packages/django/db/models/sql/query.py:99(__init__)
129942 0.841 0.000 1.191 0.000 /usr/local/pypy/lib-python/2.7/copy.py:267(_keep_alive)
9994/9992 0.201 0.000 1.019 0.000 /usr/local/pypy/lib-python/2.7/copy.py:306(_reconstruct)
Python 2.7:
3326403 function calls (3206359 primitive calls) in 12.430 CPU seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 12.457 12.457 :1()
1 0.000 0.000 12.457 12.457 /path/to/my/project/common/integrity/models/transactions.py:200(newfn)
1 0.000 0.000 12.457 12.457 /path/to/my/project/common/integrity/models/transactions.py:134(recur)
1 0.000 0.000 12.457 12.457 /usr/local/lib/python2.7/dist-packages/django/db/transaction.py:210(inner)
1 0.000 0.000 12.457 12.457 /path/to/my/project/common/integrity/models/transactions.py:165(recur2)
1 0.089 0.089 12.450 12.450 /path/to/my/project/common/integrity/tests/optimization.py:369(func_cached)
9990 0.198 0.000 12.269 0.001 /usr/local/lib/python2.7/dist-packages/django/db/models/manager.py:131(get)
9990 0.087 0.000 11.281 0.001 /path/to/my/project/common/integrity/models/cache.py:1068(get)
9990 0.040 0.000 8.161 0.001 /usr/local/lib/python2.7/dist-packages/django/db/models/query.py:547(filter)
9990 0.110 0.000 8.121 0.001 /path/to/my/project/common/integrity/models/cache.py:1047(_filter_or_exclude)
9990 0.127 0.000 7.983 0.001 /usr/local/lib/python2.7/dist-packages/django/db/models/query.py:561(_filter_or_exclude)
9990 0.100 0.000 5.593 0.001 /path/to/my/project/common/integrity/models/cache.py:765(_clone)
9990 0.122 0.000 5.125 0.001 /usr/local/lib/python2.7/dist-packages/django/db/models/query.py:772(_clone)
9990 0.405 0.000 4.899 0.000 /usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py:226(clone)
129942/49972 1.456 0.000 4.505 0.000 /usr/lib/python2.7/copy.py:145(deepcopy)
129899/99929 0.191 0.000 3.117 0.000 {len}
9990 0.111 0.000 2.968 0.000 /usr/local/lib/python2.7/dist-packages/django/db/models/query.py:74(__len__)
19980 0.070 0.000 2.843 0.000 /path/to/my/project/common/integrity/models/cache.py:1062(iterator)
9990 0.208 0.000 2.190 0.000 /path/to/my/project/common/integrity/models/cache.py:973(_query)
9990 0.182 0.000 2.114 0.000 /usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py:1149(add_q)
19984/9994 0.291 0.000 1.644 0.000 /usr/lib/python2.7/copy.py:234(_deepcopy_tuple)
9990 0.288 0.000 1.599 0.000 /usr/local/lib/python2.7/dist-packages/django/db/models/sql/query.py:997(add_filter)
9990 0.171 0.000 1.454 0.000 /path/to/my/project/common/integrity/models/cache.py:892(_deduct)
19980 0.177 0.000 1.208 0.000 /usr/local/lib/python2.7/dist-packages/django/utils/tree.py:55(__deepcopy__)
9990 0.099 0.000 1.199 0.000 /path/to/my/project/common/integrity/models/cache.py:836(_deduct_recur)
9990 0.349 0.000 1.040 0.000 /path/to/my/project/common/integrity/models/cache.py:789(_deduct_local)
Answer
Brushing aside the fact that PyPy might really be intrinsically slower for your case, there are some factors that could be making it unnecessarily slower:
- Profiling is known to slow PyPy a lot more than CPython.
- Some debugging/logging code can disable optimizations (by, e.g., forcing frames).
- The server you're using can be a dominant factor in performance (think about how awful classic CGI would be with a JIT: it would never warm up). It can also simply influence results (different WSGI servers have shown various speed-ups).
- Old-style classes are slower than new-style ones.
- Even if everything is in memory, you could be hitting e.g. slow paths in PyPy's SQLite.
You can also check the JIT Friendliness wiki page for more hints about what can make PyPy slower. A nightly build will probably be faster too, as there are many improvements relative to 1.5.
A more detailed description of your stack (server, OS, DB) and setup (how did you benchmark? how many queries?) would allow us to give better answers.
No comments:
Post a Comment