Part of bzrlib.tuned_gzip
Knit tuned version of GzipFile.

This is based on the following lsprof stats:

python 2.4 stock GzipFile write:
 58971   0   5644.3090   2721.4730   gzip:193(write)
+58971   0   1159.5530   1159.5530   +<built-in method compress>
+176913  0    987.0320    987.0320   +<len>
+58971   0    423.1450    423.1450   +<zlib.crc32>
+58971   0    353.1060    353.1060   +<method 'write' of 'cStringIO.StringO' objects>

tuned GzipFile write:
 58971   0   4477.2590   2103.1120   bzrlib.knit:1250(write)
+58971   0   1297.7620   1297.7620   +<built-in method compress>
+58971   0    406.2160    406.2160   +<zlib.crc32>
+58971   0    341.9020    341.9020   +<method 'write' of 'cStringIO.StringO' objects>
+58971   0    328.2670    328.2670   +<len>

Yes, it's only 1.6 seconds, but they add up.
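The stats above show the stock write path calling len three times per write (176913 calls for 58971 writes), while the tuned path calls it once. A minimal sketch of that idea follows, written for Python 3 rather than the Python 2.4 the stats were taken on. The class name TunedWriter and its finish() method are hypothetical stand-ins, not bzrlib's actual code; only the "compute the length once and reuse it" pattern is taken from the stats.

```python
import io
import zlib


class TunedWriter:
    """Hypothetical stand-in for a tuned write path (not bzrlib's class)."""

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.size = 0
        self.crc = zlib.crc32(b"")
        # Negative wbits selects a raw deflate stream, as used inside gzip.
        self.compress = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)

    def write(self, data):
        data_len = len(data)  # the only len() call per write
        if data_len > 0:
            self.size += data_len
            self.crc = zlib.crc32(data, self.crc)
            self.fileobj.write(self.compress.compress(data))

    def finish(self):
        # Flush whatever the compressor is still holding.
        self.fileobj.write(self.compress.flush())


out = io.BytesIO()
writer = TunedWriter(out)
for chunk in (b"first line\n", b"", b"second line\n"):
    writer.write(chunk)
writer.finish()
```

The saving per call is tiny; it only matters because write sits in an inner loop that runs tens of thousands of times.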
Method | __init__ | Undocumented |
Method | readline | Tuned to remove buffer length calls in _unread and... |
Method | readlines | Undocumented |
Method | write | Undocumented |
Method | writelines | Undocumented |
Method | close | Undocumented |
Method | _add_read_data | Undocumented |
Method | _write_gzip_header | A tuned version of gzip._write_gzip_header |
Method | _read | Undocumented |
Method | _read_eof | Tuned to reduce function calls and eliminate file seeking: |
Method | _read_gzip_header | Supply bytes if the minimum header size is already read. |
Method | _unread | Tuned to remove unneeded len calls. |
A tuned version of gzip._write_gzip_header

We have some extra constraints that plain Gzip does not:
1) We want to write the whole blob at once, rather than in multiple calls to fileobj.write().
2) We never have a filename.
3) We don't care about the time.
Parameters | bytes | 10 bytes of header data. |
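The three constraints listed for _write_gzip_header mean the header can be a fixed 10-byte constant written with a single fileobj.write() call. The sketch below illustrates this for Python 3 using the field layout from the gzip format (RFC 1952). The names MINIMAL_GZIP_HEADER, write_gzip_header and gzip_blob are hypothetical, and the XFL and OS byte values are assumptions chosen for illustration, not confirmed from bzrlib's source.

```python
import gzip
import io
import struct
import zlib

# Hypothetical constant: one fixed gzip header, no filename, zero timestamp.
MINIMAL_GZIP_HEADER = (
    b"\x1f\x8b"          # magic number
    b"\x08"              # compression method: deflate
    b"\x00"              # flags: none set, so no FNAME field follows
    b"\x00\x00\x00\x00"  # mtime: 0, i.e. no timestamp recorded
    b"\x02"              # XFL: maximum compression (assumed value)
    b"\xff"              # OS: unknown (assumed value)
)


def write_gzip_header(fileobj):
    """Write the whole header in one call instead of one call per field."""
    fileobj.write(MINIMAL_GZIP_HEADER)


def gzip_blob(data):
    """Build a complete gzip member around the fixed header."""
    out = io.BytesIO()
    write_gzip_header(out)
    compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    out.write(compressor.compress(data) + compressor.flush())
    # Trailer: CRC32 and uncompressed size, both little-endian 32-bit.
    out.write(struct.pack("<LL", zlib.crc32(data) & 0xFFFFFFFF,
                          len(data) & 0xFFFFFFFF))
    return out.getvalue()


blob = gzip_blob(b"hello knit")
```

Because nothing in the header varies between records, there is no per-call work left beyond the single write.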
Tuned to remove buffer length calls in _unread and...

Also removes multiple len(c) calls and inlines _unread; total savings: lsprof 5800 to 5300.

Phase 2:
  4168 calls in 2233
  8176 calls to read() in 1684

Changing the min chunk size to 200 halved all the cache misses, leading to a drop to:
  4168 calls in 1977
  4168 calls to read() in 1646

i.e. it just reduced the function call overhead. May be worth keeping.
Because this is such an inner routine in readline, and readline is in many inner loops, this has been inlined into readline().
The len_buf parameter combined with the reduction in len calls dropped the lsprof ms count for this routine on my test data from 800 to 200 - a 75% saving.
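A minimal sketch of the len_buf idea, assuming the pushback bookkeeping follows the extrabuf, extrasize and offset attributes of the stock Python 2 gzip module. The class PushbackBuffer is hypothetical and exists only to make the example self-contained. The caller already knows how many bytes it is pushing back, so it passes that count in rather than having the routine measure the buffer again.

```python
class PushbackBuffer:
    """Hypothetical holder for the read-ahead bookkeeping."""

    def __init__(self):
        self.extrabuf = b""   # bytes read from the stream but not yet consumed
        self.extrasize = 0    # cached length of extrabuf
        self.offset = 0       # current position in the uncompressed stream

    def unread(self, buf, len_buf=None):
        # Fall back to len() only when the caller did not supply the length.
        if len_buf is None:
            len_buf = len(buf)
        self.extrabuf = buf + self.extrabuf
        self.extrasize += len_buf
        self.offset -= len_buf


measured = PushbackBuffer()
measured.offset = 10
measured.unread(b"abc")             # length computed inside unread

supplied = PushbackBuffer()
supplied.offset = 10
supplied.unread(b"abc", len_buf=3)  # length supplied by the caller
```

Both calls leave the buffer in the same state; the second simply skips one len call, which is where the reported saving comes from when the routine runs inside readline's loop.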