alpm_mtree::path_decoder

Function decode_utf8_chars

pub fn decode_utf8_chars(input: &mut &str) -> ModalResult<String>

Decodes UTF-8 characters from a string using MTREE-specific escape sequences.

MTREE paths combine several encodings:

  1. the VIS_CSTYLE encoding of strsvis(3), which encodes a specific set of characters. Of these, only the following control characters are allowed in filenames:
    • \s Space
    • \t Tab
    • \r Carriage Return
    • \n Line Feed
  2. # is encoded as \# so that a literal # in a path is not mistaken for the start of a comment.
  3. All other characters are encoded byte by byte as octal triplets, for example \360\237\214\240. See unicode_char for details.
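
The three rules above can be illustrated with a small, std-only sketch. This is not the crate's implementation (which is built on winnow and takes `&mut &str`); `decode_mtree_path` is a hypothetical helper name, and the `String` error type is an assumption chosen to keep the example short.

```rust
/// Hypothetical helper, not part of alpm_mtree: decodes the escape
/// sequences listed above with a plain byte loop instead of winnow.
fn decode_mtree_path(input: &str) -> Result<String, String> {
    let bytes = input.as_bytes();
    let mut out: Vec<u8> = Vec::with_capacity(bytes.len());
    let mut i = 0;

    while i < bytes.len() {
        // Unescaped bytes are copied through unchanged.
        if bytes[i] != b'\\' {
            out.push(bytes[i]);
            i += 1;
            continue;
        }

        // Inspect the byte that follows the backslash.
        match bytes.get(i + 1) {
            // Rule 1: the allowed VIS_CSTYLE escapes.
            Some(b's') => { out.push(b' '); i += 2; }
            Some(b't') => { out.push(b'\t'); i += 2; }
            Some(b'r') => { out.push(b'\r'); i += 2; }
            Some(b'n') => { out.push(b'\n'); i += 2; }
            // Rule 2: an escaped comment character.
            Some(b'#') => { out.push(b'#'); i += 2; }
            // Rule 3: exactly three octal digits encode one byte.
            Some(b'0'..=b'7') => {
                let digits = bytes
                    .get(i + 1..i + 4)
                    .ok_or_else(|| format!("truncated octal escape at byte {i}"))?;
                if !digits.iter().all(|d| (b'0'..=b'7').contains(d)) {
                    return Err(format!("invalid octal escape at byte {i}"));
                }
                let value = digits
                    .iter()
                    .fold(0u32, |acc, d| acc * 8 + u32::from(d - b'0'));
                let byte = u8::try_from(value)
                    .map_err(|_| format!("octal escape out of range at byte {i}"))?;
                out.push(byte);
                i += 4;
            }
            // Anything else, including a trailing backslash, is rejected.
            _ => return Err(format!("invalid escape sequence at byte {i}")),
        }
    }

    // The decoded bytes must form valid UTF-8 once all escapes are resolved.
    String::from_utf8(out).map_err(|e| format!("malformed UTF-8 after decoding: {e}"))
}
```

With this sketch, the input `a\sb` decodes to `a b`, and `\360\237\214\240` decodes to the single character U+1F320, because the four octal triplets are the UTF-8 bytes F0 9F 8C A0. A lone `\360` is rejected, since one lead byte on its own is not valid UTF-8.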

§Solution

To decode this format, we use winnow instead of a handwritten parser, mainly for its convenient backtracking and for the error messages it produces when we encounter invalid escape sequences or malformed escaped UTF-8.