Function | split_paragraphs | Split text into paragraphs. |
Function | re_substitute | Transform a string, replacing matched and non-matched sections. |
Function | next_word_chunk | Return the next chunk of the word of length between minlen and maxlen. |
Function | add_word_breaks | Insert manual word breaks into a string. |
Function | break_long_words | Add word breaks to long words in a run of text. |
Function | extract_bug_numbers | Unique bug numbers matching the "LP: #n(, #n)*" pattern in the text. |
Function | linkify_bug_numbers | Linkify to a bug if LP: #number appears in the (changelog) text. |
Function | extract_email_addresses | Unique email addresses in the text. |
Function | parse_diff | Parse a string into categorised diff lines. |
Class | FormattersAPI | Adapter from strings to HTML formatted text. |
Function | format_markdown | Return the HTML form of marked-up text. |
split_paragraphs yields one list of strings per paragraph; each string is a line of text in that paragraph.
Paragraphs are separated by one or more blank lines.
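The splitting rule described above can be sketched as a small generator. This is only an illustration: the name split_paragraphs_sketch is hypothetical, and treating whitespace-only lines as blank is an assumption the documentation does not confirm.

```python
def split_paragraphs_sketch(text):
    """Yield one list of lines per paragraph.

    Assumption: a line containing only whitespace counts as blank.
    """
    paragraph = []
    for line in text.splitlines():
        if line.strip():
            # Non-blank line: it belongs to the current paragraph.
            paragraph.append(line)
        elif paragraph:
            # Blank line after some content closes the paragraph.
            yield paragraph
            paragraph = []
    if paragraph:
        # Emit the final paragraph if the text did not end with a blank line.
        yield paragraph


paragraphs = list(split_paragraphs_sketch("first\nsecond\n\n\nthird"))
```

Here `paragraphs` is a list of two lists: the two lines of the first paragraph, then the single line of the second. Runs of several blank lines produce no empty paragraphs.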
re_substitute behaves like re.sub() called with a function as its second argument, except that the non-matching portions of the string can also be transformed, by a second function.
Parameters | pattern | a regular expression |
 | replace_match | a function used to transform matches |
 | replace_nomatch | a function used to transform non-matched text |
 | string | the string to transform |
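A minimal sketch of how such a substitution could work, built on re.finditer(). The name re_substitute_sketch is hypothetical. By analogy with re.sub(), it assumes replace_match receives a match object and replace_nomatch receives a plain string; neither detail is stated in the parameter table.

```python
import re


def re_substitute_sketch(pattern, replace_match, replace_nomatch, string):
    """Transform matched and non-matched sections of string separately."""
    parts = []
    position = 0
    for match in re.finditer(pattern, string):
        if match.start() > position:
            # Text between the previous match and this one did not match.
            parts.append(replace_nomatch(string[position:match.start()]))
        # Assumption: the match callback takes a match object, as in re.sub().
        parts.append(replace_match(match))
        position = match.end()
    if position < len(string):
        # Trailing text after the last match.
        parts.append(replace_nomatch(string[position:]))
    return "".join(parts)


result = re_substitute_sketch(
    r"\d+",
    lambda match: "<" + match.group() + ">",  # wrap each run of digits
    str.upper,                                # upper-case everything else
    "ab12cd3",
)
# result == "AB<12>CD<3>"
```

Handling the two kinds of section in a single pass is useful when the non-matching text needs escaping while the matches are turned into markup.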
next_word_chunk prefers shorter chunks, ideally ones that end in a non-alphanumeric character. It returns the index of the end of the chunk along with the chunk itself.
This function treats HTML entities in the string as single characters. The string should not include HTML tags.
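The chunking rule can be sketched as follows. The signature (word, pos, minlen, maxlen) and the name next_word_chunk_sketch are assumptions made for illustration, as is the choice to count a stray "&" with no closing ";" as one ordinary character.

```python
def next_word_chunk_sketch(word, pos, minlen, maxlen):
    """Return (chunk, end_index) for the chunk of word starting at pos.

    Each HTML entity such as "&amp;" counts as a single character.
    """
    count = 0
    end = pos
    while end < len(word):
        if word[end] == "&":
            # Consume the whole entity as one character.
            semicolon = word.find(";", end)
            # Assumption: an unterminated "&" is an ordinary character.
            end = semicolon + 1 if semicolon != -1 else end + 1
        else:
            end += 1
        count += 1
        if count >= maxlen:
            # Never exceed the maximum chunk length.
            break
        if count >= minlen and not word[end - 1].isalnum():
            # Prefer to stop early right after a non-alphanumeric character.
            break
    return word[pos:end], end


plain = next_word_chunk_sketch("abcdefghij", 0, 3, 5)     # ("abcde", 5)
punctuated = next_word_chunk_sketch("ab/cdefgh", 0, 3, 5)  # ("ab/", 3)
escaped = next_word_chunk_sketch("a&amp;b-cd", 0, 3, 5)    # ("a&amp;b-", 8)
```

In the last call the entity counts once, so the chunk holds four logical characters even though it spans eight positions in the string.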
The word may be entity-escaped, but is not expected to contain any HTML tags.
Breaks are inserted every 7 to 15 characters, preferably after punctuation.
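As a rough sketch of the break-insertion rule, a word can be cut into chunks of 7 to 15 characters and the chunks joined with a break marker. The names add_word_breaks_sketch and _next_chunk are hypothetical, and using "<wbr />" as the marker is an assumption; the documentation does not say which marker is inserted.

```python
def _next_chunk(word, pos, minlen, maxlen):
    """Hypothetical helper: return (chunk, end), counting entities as one char."""
    count = 0
    end = pos
    while end < len(word):
        if word[end] == "&":
            semicolon = word.find(";", end)
            end = semicolon + 1 if semicolon != -1 else end + 1
        else:
            end += 1
        count += 1
        if count >= maxlen:
            break
        if count >= minlen and not word[end - 1].isalnum():
            break
    return word[pos:end], end


def add_word_breaks_sketch(word, marker="<wbr />"):
    """Join chunks of 7 to 15 characters with a break marker.

    Assumption: "<wbr />" is the marker; the source does not name one.
    """
    chunks = []
    pos = 0
    while pos < len(word):
        chunk, pos = _next_chunk(word, pos, 7, 15)
        chunks.append(chunk)
    return marker.join(chunks)


alphabet = add_word_breaks_sketch("abcdefghijklmnopqrstuvwxyz")
path = add_word_breaks_sketch("foo/bar/baz/qux/quux")
```

With no punctuation, the 26-letter word is cut once, after the fifteenth letter. In the path-like word, each break falls directly after a "/" once at least seven characters have accumulated.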
The text passed to break_long_words may contain entity references or HTML tags.