String Utilities

Toolkit for common string operations.

Changed in version 1.3.0: Dropped the get_* prefix from several function names. The old names still work, but are deprecated aliases.

Module Overview:

crop - shortens string to a given length

size_label - human readable label for a number of bytes
time_label - human readable label for a number of seconds
time_labels - human readable labels for each time unit
short_time_label - condensed time label output
parse_short_time_label - seconds represented by a short time label
stem.util.str_tools.crop(msg, size, min_word_length=4, min_crop=0, ending='Ellipse', get_remainder=False)

Shortens a string to a given length.

If content is cropped, the given ending is appended and counts toward the size limit. Cropping happens on word breaks, so a word is only included if at least min_word_length characters of it can be displayed.

If there isn't room for even a single truncated word (plus the ellipsis, if one is being included) then this provides an empty string.

If a cropped string ends with a comma or period, that trailing punctuation is stripped (unless we're providing the remainder back). For example...

>>> crop('This is a looooong message', 17)
'This is a looo...'
>>> crop('This is a looooong message', 12)
'This is a...'
>>> crop('This is a looooong message', 3)
''

The whole point of this method is to provide human friendly croppings, and as such details of how this works might change in the future. Callers should not rely on the details of how this crops.

New in version 1.3.0.

Parameters:
  • msg (str) -- text to be processed
  • size (int) -- space available for text
  • min_word_length (int) -- minimum characters before which a word is dropped, requires whole word if None
  • min_crop (int) -- minimum characters that must be dropped if a word is cropped
  • ending (Ending) -- type of ending used when truncating, no special truncation is used if None
  • get_remainder (bool) -- returns a tuple with the second part being the cropped portion of the message
Returns:

str of the text truncated to the given length
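
The word-break rule described above can be approximated in a few lines. This is a simplified sketch and not stem's implementation: crop_on_words is a hypothetical helper that always uses a '...' ending, ignores min_crop and get_remainder, and is only checked against the three examples shown earlier.

```python
def crop_on_words(msg, size, min_word_length=4, ending='...'):
    """Simplified, hypothetical stand-in for crop(); not stem's code."""
    if len(msg) <= size:
        return msg  # everything fits, so nothing is cropped

    budget = size - len(ending)  # the ending counts toward the size limit
    if budget <= 0:
        return ''

    kept = ''
    for word in msg.split(' '):
        prefix = kept + ' ' if kept else ''
        room = budget - len(prefix)

        if len(word) <= room:
            kept = prefix + word  # the whole word fits
        else:
            # Only show part of a word if enough of it would be visible.
            if room >= min_word_length:
                kept = prefix + word[:room]
            break

    kept = kept.rstrip(',.')  # drop trailing commas and periods
    return kept + ending if kept else ''


print(crop_on_words('This is a looooong message', 17))  # This is a looo...
print(crop_on_words('This is a looooong message', 12))  # This is a...
```

Because crop() may change how it picks croppings, a sketch like this is useful for understanding the idea rather than for predicting exact output.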

stem.util.str_tools.size_label(byte_count, decimal=0, is_long=False, is_bytes=True, round=False)

Converts a number of bytes into a human readable label in its most significant units. For instance, 7500 bytes would return "7 KB". If the is_long option is used this expands unit labels to be the properly pluralized full word (for instance 'Kilobytes' rather than 'KB'). Units go up through petabytes.

>>> size_label(2000000)
'1 MB'

>>> size_label(1050, 2)
'1.02 KB'

>>> size_label(1050, 3, True)
'1.025 Kilobytes'

Changed in version 1.6.0: Added round argument.

Parameters:
  • byte_count (int) -- number of bytes to be converted
  • decimal (int) -- number of decimal digits to be included
  • is_long (bool) -- expands units label
  • is_bytes (bool) -- provides units in bytes if True, bits otherwise
  • round (bool) -- rounds normally if True, otherwise rounds down
Returns:

str with human readable representation of the size
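
To illustrate the default round-down behaviour, here is a minimal sketch that reproduces the examples above using integer arithmetic. It is an assumption-laden illustration rather than stem's code: size_label_sketch and SIZE_UNITS are hypothetical names, units are taken to be 1024-based (as the examples imply), and the is_bytes and round options are left out.

```python
# Units from largest to smallest: (bytes per unit, short label, long label).
SIZE_UNITS = (
    (1024 ** 5, 'PB', 'Petabyte'),
    (1024 ** 4, 'TB', 'Terabyte'),
    (1024 ** 3, 'GB', 'Gigabyte'),
    (1024 ** 2, 'MB', 'Megabyte'),
    (1024, 'KB', 'Kilobyte'),
    (1, 'B', 'Byte'),
)


def size_label_sketch(byte_count, decimal=0, is_long=False):
    """Hypothetical helper: labels a non-negative int byte count, rounding down."""
    scale = 10 ** decimal

    for unit_size, short, long_name in SIZE_UNITS:
        if byte_count >= unit_size or unit_size == 1:
            # Floor division truncates, so 1.907 MB is displayed as '1 MB'.
            scaled = byte_count * scale // unit_size
            whole, fraction = divmod(scaled, scale)

            number = str(whole)
            if decimal:
                number += '.' + str(fraction).zfill(decimal)

            if not is_long:
                return '%s %s' % (number, short)

            # Singular only when the displayed value is exactly one.
            plural = '' if scaled == scale else 's'
            return '%s %s%s' % (number, long_name, plural)


print(size_label_sketch(2000000))        # 1 MB
print(size_label_sketch(1050, 3, True))  # 1.025 Kilobytes
```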

stem.util.str_tools.time_label(seconds, decimal=0, is_long=False)

Converts seconds into a time label truncated to its most significant units. For instance, 7500 seconds would return "2h". Units go up through days.

This defaults to presenting single character labels, but if the is_long option is used this expands labels to be the full word (space included and properly pluralized). For instance, "4h" would be "4 hours" and "1m" would become "1 minute".

>>> time_label(10000)
'2h'

>>> time_label(61, 1, True)
'1.0 minute'

>>> time_label(61, 2, True)
'1.01 minutes'

Parameters:
  • seconds (int) -- number of seconds to be converted
  • decimal (int) -- number of decimal digits to be included
  • is_long (bool) -- expands units label
Returns:

str with human readable representation of the time
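
The pluralization in the examples above depends on the displayed value: '1.0 minute' is singular while '1.01 minutes' is plural. The sketch below shows one way to get that result. It is not stem's implementation; time_label_sketch and TIME_UNITS are hypothetical names, and the input is assumed to be a non-negative int.

```python
# Units from largest to smallest: (seconds per unit, short label, long label).
TIME_UNITS = (
    (86400, 'd', 'day'),
    (3600, 'h', 'hour'),
    (60, 'm', 'minute'),
    (1, 's', 'second'),
)


def time_label_sketch(seconds, decimal=0, is_long=False):
    """Hypothetical helper: labels seconds in its most significant unit."""
    scale = 10 ** decimal

    for unit_seconds, short, long_name in TIME_UNITS:
        if seconds >= unit_seconds or unit_seconds == 1:
            # Floor division truncates, so 10000 seconds (2.78 hours) is '2h'.
            scaled = seconds * scale // unit_seconds
            whole, fraction = divmod(scaled, scale)

            number = str(whole)
            if decimal:
                number += '.' + str(fraction).zfill(decimal)

            if not is_long:
                return number + short

            # Pluralize unless the visible digits amount to exactly one.
            plural = '' if scaled == scale else 's'
            return '%s %s%s' % (number, long_name, plural)


print(time_label_sketch(61, 1, True))  # 1.0 minute
print(time_label_sketch(61, 2, True))  # 1.01 minutes
```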

stem.util.str_tools.time_labels(seconds, is_long=False)

Provides a list of label conversions for each time unit, starting with its most significant units on down. Any counts that evaluate to zero are omitted. For example...

>>> time_labels(400)
['6m', '40s']

>>> time_labels(3640, True)
['1 hour', '40 seconds']

Parameters:
  • seconds (int) -- number of seconds to be converted
  • is_long (bool) -- expands units label
Returns:

list of strings with human readable representations of the time
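
The decomposition into units can be sketched with repeated divmod calls, skipping any unit whose count is zero. As before, this is an illustrative sketch under assumptions, not stem's code: time_labels_sketch and TIME_UNITS are hypothetical names, and the input is assumed to be a non-negative int.

```python
# Units from largest to smallest: (seconds per unit, short label, long label).
TIME_UNITS = (
    (86400, 'd', 'day'),
    (3600, 'h', 'hour'),
    (60, 'm', 'minute'),
    (1, 's', 'second'),
)


def time_labels_sketch(seconds, is_long=False):
    """Hypothetical helper: one label per non-zero time unit."""
    labels = []
    remaining = seconds

    for unit_seconds, short, long_name in TIME_UNITS:
        # Peel off as many whole units as fit, carrying the rest forward.
        count, remaining = divmod(remaining, unit_seconds)

        if count == 0:
            continue  # zero counts are omitted

        if is_long:
            plural = '' if count == 1 else 's'
            labels.append('%d %s%s' % (count, long_name, plural))
        else:
            labels.append('%d%s' % (count, short))

    return labels


print(time_labels_sketch(400))         # ['6m', '40s']
print(time_labels_sketch(3640, True))  # ['1 hour', '40 seconds']
```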

stem.util.str_tools.short_time_label(seconds)

Provides a time in the following format: [[dd-]hh:]mm:ss

>>> short_time_label(111)
'01:51'

>>> short_time_label(544100)
'6-07:08:20'

Parameters:
  • seconds (int) -- number of seconds to be converted
Returns:

str with the short representation for the time

Raises:

ValueError if the input is negative
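
The [[dd-]hh:]mm:ss format can be produced by splitting the total into days, hours, minutes and seconds, then dropping the leading fields that are zero. The following is a minimal sketch of that idea; short_time_label_sketch is a hypothetical name, the error message is illustrative, and the input is assumed to be an int.

```python
def short_time_label_sketch(seconds):
    """Hypothetical helper: formats seconds as [[dd-]hh:]mm:ss."""
    if seconds < 0:
        raise ValueError('seconds must be non-negative, got %r' % seconds)

    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)

    if days:
        return '%d-%02d:%02d:%02d' % (days, hours, minutes, secs)
    if hours:
        return '%02d:%02d:%02d' % (hours, minutes, secs)

    return '%02d:%02d' % (minutes, secs)


print(short_time_label_sketch(111))     # 01:51
print(short_time_label_sketch(544100))  # 6-07:08:20
```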
stem.util.str_tools.parse_short_time_label(label)

Provides the number of seconds corresponding to the formatting used for the cputime and etime fields of ps: [[dd-]hh:]mm:ss or mm:ss.ss

>>> parse_short_time_label('01:51')
111

>>> parse_short_time_label('6-07:08:20')
544100

Parameters:
  • label (str) -- time entry to be parsed
Returns:

int with the number of seconds represented by the label

Raises:

ValueError if input is malformed
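
Parsing goes the other way: split off an optional day count at the '-', split the rest on ':', and sum the weighted fields. The sketch below is one possible approach and not stem's implementation; parse_short_time_label_sketch is a hypothetical name, the error messages are illustrative, and truncating the fractional seconds of the mm:ss.ss form is an assumption made so the result stays an int.

```python
def parse_short_time_label_sketch(label):
    """Hypothetical helper: seconds represented by [[dd-]hh:]mm:ss or mm:ss.ss."""
    days, rest = '0', label

    if '-' in rest:
        days, rest = rest.split('-', 1)

    parts = rest.split(':')

    if len(parts) == 3:
        hours, minutes, secs = parts
    elif len(parts) == 2:
        hours = '0'
        minutes, secs = parts
    else:
        raise ValueError('Invalid time format: %r' % label)

    try:
        total = int(float(secs))  # assumption: fractional seconds are dropped
        total += int(minutes) * 60
        total += int(hours) * 3600
        total += int(days) * 86400
    except ValueError:
        raise ValueError('Non-numeric value in time entry: %r' % label)

    return total


print(parse_short_time_label_sketch('01:51'))       # 111
print(parse_short_time_label_sketch('6-07:08:20'))  # 544100
```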
stem.util.str_tools.get_size_label(byte_count, decimal=0, is_long=False, is_bytes=True, round=False)

Deprecated alias for size_label() (see above), kept so that code written against the pre-1.3.0 get_* names still works.

stem.util.str_tools.get_time_label(seconds, decimal=0, is_long=False)

Deprecated alias for time_label() (see above), kept so that code written against the pre-1.3.0 get_* names still works.

stem.util.str_tools.get_time_labels(seconds, is_long=False)

Deprecated alias for time_labels() (see above), kept so that code written against the pre-1.3.0 get_* names still works.

stem.util.str_tools.get_short_time_label(seconds)

Deprecated alias for short_time_label() (see above), kept so that code written against the pre-1.3.0 get_* names still works.