gpf.common.textutils module

Module that contains helper functions to improve text handling and formatting.

gpf.common.textutils.get_alphachars(text: str) → str

Returns all alphabetic characters [a-zA-Z] in string text in a new (concatenated) string.

Example:

>>> get_alphachars('Test123')
'Test'
Parameters: text – The string to search.
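
The [a-zA-Z] range in the description means that digits, punctuation, whitespace and non-ASCII letters are all dropped. A minimal sketch of equivalent filtering, assuming a simple regular-expression approach (alphachars_sketch is a hypothetical name, not the library's implementation):

```python
import re


def alphachars_sketch(text: str) -> str:
    """Hypothetical stand-in: keep only the ASCII letters a-z and A-Z."""
    return ''.join(re.findall(r'[a-zA-Z]', text))


print(alphachars_sketch('Test123'))   # Test
print(alphachars_sketch('a-b_c 42'))  # abc
```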
gpf.common.textutils.get_digits(text: str) → str

Returns all numeric characters (digits) in string text in a new (concatenated) string.

Example:

>>> get_digits('Test123')
'123'
>>> int(get_digits('The answer is 42'))
42
Parameters: text – The string to search.
gpf.common.textutils.to_str(value: Any, encoding: str = 'UTF-8') → str

This function behaves similarly to the built-in str() function: it converts any value into a string. However, if value is a bytes object, it is decoded using the specified encoding.

Parameters:
  • value – The value to convert to string.
  • encoding – The encoding to use when value is a bytes object.

Note

The encoding defaults to UTF-8. If value cannot be decoded into str with the specified encoding, the default system encoding (often cp1252) is used instead. This fallback uses the ‘replace’ error handler, so it does not raise an error: bytes that fail to decode are replaced by a question mark.
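
The decode-with-fallback behavior described in this note could look roughly like the sketch below. It is not the library's code: to_str_sketch is a hypothetical name, and using locale.getpreferredencoding() as the "default system encoding" is an assumption.

```python
import locale
from typing import Any


def to_str_sketch(value: Any, encoding: str = 'UTF-8') -> str:
    """Hypothetical stand-in for the decode-with-fallback behavior."""
    if isinstance(value, bytes):
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            # Assumption: the system default is the locale's preferred encoding.
            fallback = locale.getpreferredencoding(False)
            # errors='replace' never raises; Python's own handler substitutes
            # U+FFFD (the replacement character) for undecodable bytes.
            return value.decode(fallback, errors='replace')
    return str(value)


print(to_str_sketch(b'Test123'))  # Test123
print(to_str_sketch(3.5))         # 3.5

# b'caf\xe9' is not valid UTF-8, so the fallback branch is taken here.
# The exact result depends on the system encoding.
fallback_result = to_str_sketch(b'caf\xe9')
```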

gpf.common.textutils.to_bytes(value: Any, encoding: str = 'UTF-8') → bytes

This function behaves similarly to the built-in bytes() function: it converts any value into a bytes object. However, if value is a str, it is encoded using the specified encoding.

Parameters:
  • value – The value to convert to bytes.
  • encoding – The encoding to use when value is a str.

Note

The encoding defaults to UTF-8. If value cannot be encoded into bytes with the specified encoding, the default system encoding (often cp1252) is used instead. This fallback uses the ‘replace’ error handler, so it does not raise an error: characters that fail to encode are replaced by a question mark.

Warning

Python 3 only!
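
The encode-with-fallback behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: to_bytes_sketch is a hypothetical name, locale.getpreferredencoding() stands in for the "default system encoding", and stringifying values that are neither str nor bytes is a guess.

```python
import locale
from typing import Any


def to_bytes_sketch(value: Any, encoding: str = 'UTF-8') -> bytes:
    """Hypothetical stand-in for the encode-with-fallback behavior."""
    if isinstance(value, bytes):
        return value
    # Assumption: anything that is not a str is stringified first.
    text = value if isinstance(value, str) else str(value)
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        # Assumption: the system default is the locale's preferred encoding.
        fallback = locale.getpreferredencoding(False)
        # errors='replace' never raises; unencodable characters become '?'.
        return text.encode(fallback, errors='replace')


print(to_bytes_sketch('Test123'))     # b'Test123'
print(to_bytes_sketch('caf\u00e9'))   # b'caf\xc3\xa9'

# 'ascii' cannot encode U+00E9, so the fallback branch is taken here.
# The exact bytes depend on the system encoding.
fallback_result = to_bytes_sketch('caf\u00e9', 'ascii')
```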

gpf.common.textutils.to_repr(value: Any, encoding: str = 'UTF-8') → str

This function behaves similarly to the built-in repr() function: it converts any value into its representation. However, if value is a bytes-like object, it is first decoded using the specified encoding (UTF-8 by default). Decoding uses the ‘replace’ error handler, so it does not raise an error if it fails. As a result, the representation of a bytes-like object no longer has the ‘b’ prefix.

Parameters:
  • value – The value for which to get its representation.
  • encoding – The encoding to use when value is a bytes or bytearray object.
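
The difference from the built-in repr() is easiest to see side by side. A minimal sketch, assuming the function simply decodes bytes-like input before calling repr() (to_repr_sketch is a hypothetical name, not the library's implementation):

```python
from typing import Any


def to_repr_sketch(value: Any, encoding: str = 'UTF-8') -> str:
    """Hypothetical stand-in: decode bytes-like values, then call repr()."""
    if isinstance(value, (bytes, bytearray)):
        # errors='replace' keeps this from raising on undecodable bytes.
        value = bytes(value).decode(encoding, errors='replace')
    return repr(value)


print(repr(b'abc'))            # b'abc'
print(to_repr_sketch(b'abc'))  # 'abc'
print(to_repr_sketch(42))      # 42
```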
gpf.common.textutils.capitalize(text: str) → str

Works like the built-in string method str.capitalize(), except that it only makes the first character uppercase and leaves all other characters unchanged.

Parameters: text – The string to capitalize.
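
The contrast with str.capitalize(), which also lowercases the rest of the string, can be illustrated with a small sketch (capitalize_sketch is a hypothetical name, not the library's implementation):

```python
def capitalize_sketch(text: str) -> str:
    """Hypothetical stand-in: uppercase the first character only."""
    # Slicing keeps this safe for empty strings.
    return text[:1].upper() + text[1:]


print('hello World'.capitalize())        # Hello world
print(capitalize_sketch('hello World'))  # Hello World
```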
gpf.common.textutils.unquote(text: str) → str

Strips trailing quotes from a text string and returns it.

Parameters: text – The string to strip.
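
The description does not say which quote characters are handled, or whether quotes at the start of the string are removed as well. The sketch below assumes that both single and double quotes are stripped from both ends; unquote_sketch is a hypothetical name and the library's actual behavior may differ.

```python
def unquote_sketch(text: str) -> str:
    """Hypothetical stand-in: strip quote characters from both ends."""
    # Assumption: single and double quotes are both treated as quotes.
    return text.strip('\'"')


print(unquote_sketch('"C:/data/roads.shp"'))  # C:/data/roads.shp
print(unquote_sketch("it's"))                 # it's
```

Quotes inside the string are left alone, since str.strip() only touches the ends.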
gpf.common.textutils.format_plural(word: str, number: numbers.Number, plural_suffix: str = 's') → str

Prefixes word with number and appends plural_suffix to it if number != 1. Note that this only works for words with regular plurals (where the base word does not change and a suffix is simply appended). Words like ‘sheep’ or ‘life’ will be pluralized incorrectly (‘sheeps’ and ‘lifes’ respectively).

Examples:

>>> format_plural('{} error', 42)
'42 errors'
>>> format_plural('{} bus', 99, 'es')
'99 buses'
>>> format_plural('{} goal', 1)
'1 goal'
>>> format_plural('{} regret', 0)
'0 regrets'
Parameters:
  • word – The word that should be pluralized if number != 1.
  • number – The numeric value for which word will be prefixed and pluralized.
  • plural_suffix – The suffix to append when the plural of word is not formed with ‘s’ (e.g. ‘es’). Defaults to ‘s’.
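
The examples above suggest that word carries a '{}' placeholder for the number. A minimal sketch of that logic, assuming str.format() does the substitution (format_plural_sketch is a hypothetical name, not the library's implementation):

```python
from numbers import Number


def format_plural_sketch(word: str, number: Number, plural_suffix: str = 's') -> str:
    """Hypothetical stand-in: fill in the number, add a suffix unless it is 1."""
    suffix = '' if number == 1 else plural_suffix
    return word.format(number) + suffix


print(format_plural_sketch('{} error', 42))      # 42 errors
print(format_plural_sketch('{} bus', 99, 'es'))  # 99 buses
print(format_plural_sketch('{} goal', 1))        # 1 goal
print(format_plural_sketch('{} sheep', 2))       # 2 sheeps
```

The last call shows the limitation mentioned above: irregular plurals are not detected.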
gpf.common.textutils.format_iterable(iterable: Union[list, tuple], conjunction: str = 'and') → str

Returns a pretty-printed string of the items in an iterable, separated by commas and with a conjunction before the last item.

Example:

>>> iterable = [1, 2, 3, 4]
>>> format_iterable(iterable)
'1, 2, 3 and 4'
Parameters:
  • iterable – The iterable (e.g. list or tuple) to format.
  • conjunction – The conjunction to use before the last item. Defaults to “and”.
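
The joining rule can be sketched as below. This is an illustration, not the library's code: format_iterable_sketch is a hypothetical name, and the handling of empty or single-item input is an assumption, since the description does not cover those cases.

```python
from typing import Iterable


def format_iterable_sketch(iterable: Iterable, conjunction: str = 'and') -> str:
    """Hypothetical stand-in: comma-join items, conjunction before the last."""
    items = [str(item) for item in iterable]
    if len(items) < 2:
        # Assumption: zero or one item needs no separator at all.
        return ''.join(items)
    return '{} {} {}'.format(', '.join(items[:-1]), conjunction, items[-1])


print(format_iterable_sketch([1, 2, 3, 4]))               # 1, 2, 3 and 4
print(format_iterable_sketch(('red', 'green'), 'or'))     # red or green
```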
gpf.common.textutils.format_timedelta(start: datetime.datetime, stop: datetime.datetime = None) → str

Calculates the time difference between the start and stop datetime objects and returns a pretty-printed time delta. If stop is omitted, the current time (now()) is used. The smallest unit expressed is seconds (as a floating-point value); the largest is days.

Example:

>>> t0 = _dt(2019, 1, 1, 1, 1, 1)  # where _dt = datetime
>>> format_timedelta(t0)
'1 day, 3 hours, 4 minutes and 5.2342 seconds'
Parameters:
  • start – The start time (t0) for the time delta calculation.
  • stop – The end time (t1) for the time delta calculation or now() when omitted.
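
Because the example above relies on now(), its output depends on when it is run. The sketch below passes an explicit stop value to get a reproducible result. It is not the library's code: format_timedelta_sketch is a hypothetical name, and it assumes that start is not later than stop, that zero-valued units are omitted, and that seconds are rounded to four decimals.

```python
from datetime import datetime


def format_timedelta_sketch(start: datetime, stop: datetime = None) -> str:
    """Hypothetical stand-in: express stop - start in days down to seconds."""
    stop = datetime.now() if stop is None else stop
    delta = stop - start  # assumes start <= stop
    hours, rest = divmod(delta.seconds, 3600)
    minutes, whole_seconds = divmod(rest, 60)
    seconds = round(whole_seconds + delta.microseconds / 1e6, 4)

    parts = []
    for value, unit in ((delta.days, 'day'), (hours, 'hour'), (minutes, 'minute')):
        if value:  # assumption: units with a zero value are left out
            parts.append('{} {}{}'.format(value, unit, '' if value == 1 else 's'))
    parts.append('{:g} second{}'.format(seconds, '' if seconds == 1 else 's'))

    if len(parts) == 1:
        return parts[0]
    return '{} and {}'.format(', '.join(parts[:-1]), parts[-1])


t0 = datetime(2019, 1, 1, 1, 1, 1)
t1 = datetime(2019, 1, 2, 4, 5, 6, 234200)
print(format_timedelta_sketch(t0, t1))
# 1 day, 3 hours, 4 minutes and 5.2342 seconds
```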