Do not simply dump stuff in here. Think carefully as to whether it would be better as a method on an existing content object or IFooSet object.
Function | text_replaced | Return a new string with text replaced according to the dict provided. |
Function | backslashreplace | Return a copy of the string, with non-ASCII characters rendered as |
Function | string_to_tarfile | Convert a binary string containing a tar file into a tar file obj. |
Function | simple_popen2 | Run a command, give it input on its standard input, and capture its |
Class | ShortListTooBigError | This error is raised when the shortlist hardlimit is reached |
Function | shortlist | Return a listified version of sequence. |
Function | is_tar_filename | Check whether a filename looks like a filename that belongs to a tar file, |
Function | test_diff | Generate a string indicating the difference between expected and actual |
Function | filenameToContentType | Return a ContentType-like entry for arbitrary filenames. |
Function | intOrZero | Return int(value) or 0 if the conversion fails. |
Function | truncate_text | Return a version of string no longer than max_length characters. |
Function | english_list | Return all the items concatenated into an English-style string. |
Function | ensure_unicode | Return input as unicode. None is passed through unharmed. |
The keys of the dict are substrings to find, the values are what to replace found substrings with.
>>> text_replaced('', {'a':'b'})
''
>>> text_replaced('a', {'a':'c'})
'c'
>>> text_replaced('faa bar baz', {'a': 'A', 'aa': 'X'})
'fX bAr bAz'
>>> text_replaced('1 2 3 4', {'1': '2', '2': '1'})
'2 1 3 4'
Unicode strings work too.
>>> text_replaced(u'1 2 3 4', {u'1': u'2', u'2': u'1'})
u'2 1 3 4'
The argument _cache is used as a cache of replacements that were requested before, so we only compute regular expressions once.
Parameters | text | A unicode or str on which to do the replacement. |
replacements | A dictionary with the replacements that should be done. |
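The doctests above imply a single pass in which longer keys win over shorter ones and replaced text is not scanned again. One way to get that behaviour is sketched below; this is not the module's actual code, the name `text_replaced_sketch` is hypothetical, and the `_cache` of compiled expressions is left out for brevity.

```python
import re


def text_replaced_sketch(text, replacements):
    """Replace every key of `replacements` found in `text` in one pass.

    Hypothetical stand-in for text_replaced(); no regex cache is kept.
    """
    if not replacements:
        return text
    # Try longer keys first so 'aa' is preferred over 'a' at the same spot.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # One pass over the text: output is never rescanned, so
    # {'1': '2', '2': '1'} swaps the two values instead of chaining.
    return pattern.sub(lambda match: replacements[match.group(0)], text)
```

Building one alternation and substituting with a callback is what makes the swap case work; calling `str.replace` once per key would turn `'1 2 3 4'` into `'1 1 3 4'` or `'2 2 3 4'`, depending on the order.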
Returns the data from standard output.
This function is needed to avoid certain deadlock situations. For example, if you popen2() a command, write its standard input, then read its standard output, this can deadlock due to the parent process blocking on writing to the child, while the child process is simultaneously blocking on writing to its parent. This function avoids that problem by using subprocess.Popen.communicate().
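The deadlock-free pattern described above can be sketched as follows. The name `simple_popen2_sketch` and its two-argument signature are assumptions made for illustration; the real function may accept further arguments.

```python
import subprocess
import sys


def simple_popen2_sketch(command, input_bytes):
    """Run `command`, feed it `input_bytes`, and return its standard output.

    Hypothetical stand-in for simple_popen2().
    """
    process = subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # communicate() writes to stdin and reads from stdout concurrently,
    # so neither process can stall on a full pipe buffer.
    output, _ = process.communicate(input_bytes)
    return output


# Use the Python interpreter itself as a portable child process.
child = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
print(simple_popen2_sketch(child, b"hello"))  # prints b'HELLO'
```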
Return a listified version of sequence.
If <sequence> has more than <longest_expected> items, a warning is issued.
>>> shortlist([1, 2])
[1, 2]
>>> shortlist([1, 2, 3], 2) #doctest: +NORMALIZE_WHITESPACE
Traceback (most recent call last):
...
UserWarning: shortlist() should not be used here. It's meant to listify
sequences with no more than 2 items. There were 3 items.
>>> shortlist([1, 2, 3, 4], hardlimit=2)
Traceback (most recent call last):
...
ShortListTooBigError: Hard limit of 2 exceeded.
>>> shortlist(
...     [1, 2, 3, 4], 2, hardlimit=4) #doctest: +NORMALIZE_WHITESPACE
Traceback (most recent call last):
...
UserWarning: shortlist() should not be used here. It's meant to listify
sequences with no more than 2 items. There were 4 items.
It also works on iterables that don't support the extended slice protocol.
>>> iter(range(5))[:1] #doctest: +ELLIPSIS
Traceback (most recent call last):
...
TypeError: ...
>>> shortlist(iter(range(10)), 5, hardlimit=8) #doctest: +ELLIPSIS
Traceback (most recent call last):
...
ShortListTooBigError: ...
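A minimal sketch of the soft-limit and hard-limit behaviour shown in the doctests, using `itertools.islice` so that plain iterators are handled the same way as lists. The default of 15 for `longest_expected` and the local `ShortListTooBigError` class are assumptions for illustration, not taken from the source.

```python
import itertools
import warnings


class ShortListTooBigError(Exception):
    """Local stand-in for the module's exception of the same name."""


def shortlist_sketch(sequence, longest_expected=15, hardlimit=None):
    """Listify `sequence`, warning or raising when it is too long.

    Hypothetical stand-in for shortlist(); the default of 15 is assumed.
    """
    # Read at most hardlimit + 1 items: enough to detect an overflow
    # without draining a very long iterator.
    stop = None if hardlimit is None else hardlimit + 1
    results = list(itertools.islice(sequence, stop))
    if hardlimit is not None and len(results) > hardlimit:
        raise ShortListTooBigError("Hard limit of %d exceeded." % hardlimit)
    if len(results) > longest_expected:
        warnings.warn(
            "shortlist() should not be used here. It's meant to listify "
            "sequences with no more than %d items. There were %d items."
            % (longest_expected, len(results)),
            stacklevel=2)
    return results
```

In the doctests the `UserWarning` appears as a traceback, presumably because the test environment turns warnings into errors; under the default warning filter the sketch returns the list and only emits the warning.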
Return a ContentType-like entry for arbitrary filenames.
deb files
>>> filenameToContentType('test.deb')
'application/x-debian-package'
text files
>>> filenameToContentType('test.txt')
'text/plain'
Not recognized format
>>> filenameToContentType('test.tgz')
'application/octet-stream'
Build logs
>>> filenameToContentType('buildlog.txt.gz')
'text/plain'
Various compressed files
>>> filenameToContentType('Packages.gz')
'application/x-gzip'
>>> filenameToContentType('Packages.bz2')
'application/x-bzip2'
>>> filenameToContentType('Packages.xz')
'application/x-xz'
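The mapping exercised by these doctests can be approximated with an ordered suffix table. This is only a sketch: `SUFFIX_TYPES` and `filename_to_content_type_sketch` are hypothetical names, and the table lists only the suffixes that appear in the examples above, so the real function may recognise more.

```python
# Checked in order, so the more specific ".txt.gz" comes before ".gz".
SUFFIX_TYPES = (
    (".deb", "application/x-debian-package"),
    (".txt", "text/plain"),
    (".txt.gz", "text/plain"),
    (".gz", "application/x-gzip"),
    (".bz2", "application/x-bzip2"),
    (".xz", "application/x-xz"),
)


def filename_to_content_type_sketch(filename):
    """Return a content type string based on the filename's suffix."""
    for suffix, content_type in SUFFIX_TYPES:
        if filename.endswith(suffix):
            return content_type
    # Anything unrecognised is treated as opaque binary data.
    return "application/octet-stream"
```

Keeping the table as an ordered tuple rather than a dict makes the precedence between `.txt.gz` and `.gz` explicit.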
Return int(value) or 0 if the conversion fails.
>>> intOrZero('1.23')
0
>>> intOrZero('1.ab')
0
>>> intOrZero('2')
2
>>> intOrZero(None)
0
>>> intOrZero(1)
1
>>> intOrZero(-9)
-9
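The behaviour in these doctests matches a plain `int()` call guarded against the two exceptions it can raise for bad input. A sketch, with the hypothetical name `int_or_zero_sketch`:

```python
def int_or_zero_sketch(value):
    """Return int(value), or 0 if the conversion fails."""
    try:
        return int(value)
    except (ValueError, TypeError):
        # ValueError: strings such as '1.23' or '1.ab'.
        # TypeError: values such as None.
        return 0
```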
Tries not to cut off the text mid-word.
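One plausible way to truncate on word boundaries is sketched below. The source does not spell out the exact rules, so the details here (whitespace stays attached to the following word, and an over-long first word is cut anyway) are assumptions, as is the name `truncate_text_sketch`.

```python
import re


def truncate_text_sketch(text, max_length):
    """Return `text` cut to at most `max_length` characters.

    Hypothetical stand-in for truncate_text(); prefers to stop between
    words rather than in the middle of one.
    """
    if len(text) <= max_length:
        return text
    # Each match is a word plus the whitespace in front of it,
    # e.g. 'hello world' -> ['hello', ' world'].
    words = re.findall(r"\s*\S+", text)
    if not words:
        # Whitespace-only input: nothing to preserve, just cut.
        return text[:max_length]
    truncated = words[0]
    for word in words[1:]:
        if len(truncated) + len(word) > max_length:
            break
        truncated += word
    # A single first word longer than max_length still has to be cut.
    return truncated[:max_length]
```

With this sketch, `truncate_text_sketch('hello world', 8)` returns `'hello'` instead of `'hello wo'`.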
Follows the advice given in The Elements of Style, chapter I, section 2:
Beware that this is US English usage and is wrong for other varieties of English.
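Assuming the advice referred to is the serial comma rule (in a series of three or more terms, a comma follows every term except the last), the joining logic might look like the sketch below. The name `english_list_sketch` and the `conjunction` parameter are illustrative assumptions, not confirmed by the source.

```python
def english_list_sketch(items, conjunction="and"):
    """Join `items` into an English-style list with a serial comma.

    Hypothetical stand-in for english_list().
    """
    items = list(items)
    if len(items) <= 2:
        # '', 'a', or 'a and b': no commas needed.
        return (" %s " % conjunction).join(items)
    # Three or more items: a comma after every term except the last,
    # including the term just before the conjunction.
    return "%s, %s %s" % (", ".join(items[:-1]), conjunction, items[-1])
```

For example, `english_list_sketch(['a', 'b', 'c'])` gives `'a, b, and c'`, while two items give `'a and b'`.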
Return input as unicode. None is passed through unharmed.
Do not use this function. It exists only to help migration of legacy code where str objects were being passed into contexts where unicode objects are required. All invocations of ensure_unicode() should eventually be removed.
This differs from the builtin unicode() function, as a TypeError exception will be raised if the parameter is not a basestring or if a raw string is not ASCII.
>>> ensure_unicode(u'hello')
u'hello'
>>> ensure_unicode('hello')
u'hello'
>>> ensure_unicode(u'A'.encode('utf-16')) # Not ASCII
Traceback (most recent call last):
...
TypeError: '\xff\xfeA\x00' is not US-ASCII
>>> ensure_unicode(42)
Traceback (most recent call last):
...
TypeError: 42 is not a basestring (<type 'int'>)
>>> ensure_unicode(None) is None
True