l.t.u.g.POParser(object) : class documentation

Part of lp.translations.utilities.gettext_po_parser

Parser class for Gettext files.
Method __init__ Undocumented
Method parse Parse string as a PO file.
Method _emitSyntaxWarning Undocumented
Method _decode Undocumented
Method _getHeaderLine Undocumented
Method _storeCurrentMessage Undocumented
Method _parseHeader Undocumented
Method _unescapeNumericCharSequence Unescape leading sequence of escaped numeric character codes.
Method _parseQuotedString Parse a quoted string, interpreting escape sequences.
Method _dumpCurrentSection Dump current parsed content inside the translation message.
Method _parseFreshLine Parse a new line (not a continuation after escaped newline).
Method _parseLine Undocumented
def __init__(self, plural_formula=None):
Undocumented
def _emitSyntaxWarning(self, message):
Undocumented
def _decode(self):
Undocumented
def _getHeaderLine(self):
Undocumented
def parse(self, content_text):
Parse string as a PO file.
def _storeCurrentMessage(self):
Undocumented
def _parseHeader(self, header_text, header_comment):
Undocumented
def _unescapeNumericCharSequence(self, string):
Unescape leading sequence of escaped numeric character codes.

This is for characters given in hexadecimal or octal escape notation.

Returns a tuple: first, any leading part of string as an unescaped string (empty if string did not start with a numeric escape sequence), and second, the remainder of string after the leading numeric escape sequences have been parsed.
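The (unescaped prefix, remainder) contract described above can be illustrated with a minimal stand-alone sketch. It is not Launchpad's implementation: the function name unescape_leading_numeric, the charset parameter, and the assumption that escapes take the form \xHH (two hex digits) or \OOO (one to three octal digits) are all hypothetical choices made for the example.

```python
import re

# Hypothetical stand-alone helper, not Launchpad's implementation.
# Assumes escapes look like \xHH (two hex digits) or \OOO (1-3 octal digits).
_NUMERIC_ESCAPE = re.compile(r'\\x([0-9a-fA-F]{2})|\\([0-7]{1,3})')


def unescape_leading_numeric(text, charset='UTF-8'):
    """Return (decoded_prefix, remainder) for leading numeric escapes."""
    raw = bytearray()
    pos = 0
    while True:
        match = _NUMERIC_ESCAPE.match(text, pos)
        if match is None:
            break
        hex_digits, octal_digits = match.groups()
        if hex_digits is not None:
            value = int(hex_digits, 16)
        else:
            value = int(octal_digits, 8)
        if value > 0xFF:
            raise ValueError(
                'Escape does not fit in one byte: %s' % match.group(0))
        raw.append(value)
        pos = match.end()
    # The collected bytes are decoded together, so a multi-byte sequence
    # such as UTF-8 \302\253 becomes a single Unicode character.
    return bytes(raw).decode(charset), text[pos:]
```

Collecting the raw bytes first and decoding them in one step is what lets a multi-byte sequence be interpreted in the declared charset; a string with no leading escape comes back as an empty prefix plus the unchanged input.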
def _parseQuotedString(self, string):

Parse a quoted string, interpreting escape sequences.

>>> parser = POParser()
>>> parser._parseQuotedString(u'\"abc\"')
u'abc'
>>> parser._parseQuotedString(u'\"abc\\ndef\"')
u'abc\ndef'
>>> parser._parseQuotedString(u'\"ab\x63\"')
u'abc'
>>> parser._parseQuotedString(u'\"ab\143\"')
u'abc'

After the string has been converted to unicode, the backslash escaped sequences are still in the encoding that the charset header specifies. Such quoted sequences will be converted to unicode by this method.

We don't know the encoding of the escaped characters, so they cannot simply be recoded as Unicode; the result is a TranslationFormatInvalidInputError:

>>> utf8_string = u'"view \302\253${version_title}\302\273"'
>>> parser._parseQuotedString(utf8_string)
Traceback (most recent call last):
...
TranslationFormatInvalidInputError: Could not decode escaped string: (302253)

Now, we note the original encoding so we get the right Unicode string.

>>> class FakeHeader:
...     charset = 'UTF-8'
>>> parser._translation_file = TranslationFileData()
>>> parser._translation_file.header = FakeHeader()
>>> parser._parseQuotedString(utf8_string)
u'view \xab${version_title}\xbb'

A TranslationFormatInvalidInputError is also raised when an escaped character is not valid in the declared encoding of the original string:

>>> iso8859_1_string = u'"foo \\xf9"'
>>> parser._parseQuotedString(iso8859_1_string)
Traceback (most recent call last):
...
TranslationFormatInvalidInputError: Could not decode escaped string as UTF-8: (\xf9)

An error will be raised if the entire string isn't contained in quotes properly:

>>> parser._parseQuotedString(u'abc')
Traceback (most recent call last):
  ...
TranslationFormatSyntaxError: String is not quoted
>>> parser._parseQuotedString(u'\"ab')
Traceback (most recent call last):
  ...
TranslationFormatSyntaxError: String not terminated
>>> parser._parseQuotedString(u'\"ab\"x')
Traceback (most recent call last):
  ...
TranslationFormatSyntaxError: Extra content found after string: (x)
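The three quoting errors shown above can be reproduced with a simplified sketch. It is an illustration only, not the method's actual code: parse_quoted and QuotedStringError are hypothetical names (the latter standing in for TranslationFormatSyntaxError), and only simple backslash escapes are handled, with numeric escapes and charset decoding left out.

```python
class QuotedStringError(ValueError):
    """Hypothetical stand-in for TranslationFormatSyntaxError."""


# Simple one-character escapes only; numeric escapes are out of scope here.
_SIMPLE_ESCAPES = {'n': '\n', 't': '\t', 'r': '\r', '"': '"', '\\': '\\'}


def parse_quoted(text):
    """Return the content of a double-quoted string, or raise on bad quoting."""
    if not text.startswith('"'):
        raise QuotedStringError('String is not quoted')
    output = []
    pos = 1  # Skip the opening quote.
    while pos < len(text):
        char = text[pos]
        if char == '"':
            # Closing quote: nothing may follow it.
            extra = text[pos + 1:]
            if extra:
                raise QuotedStringError(
                    'Extra content found after string: (%s)' % extra)
            return ''.join(output)
        if char == '\\' and pos + 1 < len(text):
            escaped = text[pos + 1]
            # Unknown escapes fall back to the escaped character itself.
            output.append(_SIMPLE_ESCAPES.get(escaped, escaped))
            pos += 2
        else:
            output.append(char)
            pos += 1
    # Ran out of input without seeing the closing quote.
    raise QuotedStringError('String not terminated')
```

In this sketch the closing quote is detected inside the scanning loop, so an escaped quote (\") is consumed as content and never mistaken for the end of the string.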
def _dumpCurrentSection(self):
Dump current parsed content inside the translation message.
def _parseFreshLine(self, line, original_line):
Parse a new line (not a continuation after escaped newline).
Parameters
    line            Remaining part of input line.
    original_line   Line as it originally was on input.
Returns
    If there is one, the first line of a quoted string belonging to the line's section. Otherwise, None.
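As a rough illustration of what "the first line of a quoted string belonging to the line's section" can mean in a PO file, the sketch below splits a line into its section keyword and the quoted text that follows. The helper split_fresh_line and its regular expression are hypothetical and assume the standard gettext keywords (msgctxt, msgid, msgid_plural, msgstr, msgstr[N]); the real method may handle comments, obsolete entries, and other cases differently.

```python
import re

# Hypothetical pattern: a gettext section keyword, optionally followed by
# whitespace and the rest of the line (normally a quoted string).
_SECTION = re.compile(
    r'(msgctxt|msgid_plural|msgid|msgstr(?:\[\d+\])?)(?:\s+(.*))?$')


def split_fresh_line(line):
    """Return (section_keyword, quoted_text_or_None) for a PO line."""
    match = _SECTION.match(line.strip())
    if match is None:
        # Not the start of a section (for example a comment line).
        return None, None
    keyword, rest = match.groups()
    return keyword, (rest or None)
```

A line such as msgid "Hello" would yield the keyword msgid and the still-quoted text "Hello", which a quoted-string parser could then interpret; a bare keyword with nothing after it yields None for the text.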
def _parseLine(self, original_line):
Undocumented
API Documentation for Launchpad, generated by pydoctor at 2019-04-24 00:00:10.