API

Tokenization

Tokenization is performed by the Tokeniser class. Its next method consumes characters from a feeder and returns a token if tokenization succeeds. If tokenization fails, an instance of TranslateError is raised.

The tokens returned by next are instances of the Token class:
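
For example, the sketch below drives the tokeniser over a single line of code. It assumes that Tokeniser, SingleLineFeeder and TranslateError are importable as shown, that SingleLineFeeder takes the source string plus an optional source name, and that the end of the input is signalled by a token whose tag is "END":

from mathics_scanner import SingleLineFeeder, Tokeniser
from mathics_scanner.errors import TranslateError

feeder = SingleLineFeeder("f[x_] := x ^ 2", "<example>")
tokeniser = Tokeniser(feeder)

try:
    token = tokeniser.next()
    while token.tag != "END":  # assumed end-of-input marker
        print(token.tag, repr(token.text))
        token = tokeniser.next()
except TranslateError as err:
    print("Tokenization failed:", err)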

Feeders

A feeder is an intermediary between the tokeniser and the source text being scanned (a string or a file). Feeders used by the tokeniser are instances of the LineFeeder class:
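
As a rough sketch of that interface, a custom feeder could look like the class below. This assumes that LineFeeder is importable from the package, that its constructor takes a source name, and that subclasses implement feed(), returning the next line of code (an empty string once the input is exhausted), and empty(), reporting whether all input has been consumed; ListLineFeeder itself is a hypothetical name, not part of the library:

from mathics_scanner import LineFeeder

class ListLineFeeder(LineFeeder):
    """Hypothetical feeder that serves lines from a Python list."""

    def __init__(self, lines, filename=""):
        super().__init__(filename)
        self.lines = list(lines)
        self.index = 0

    def feed(self):
        # Hand the tokeniser the next line, or "" when nothing is left.
        if self.index >= len(self.lines):
            return ""
        line = self.lines[self.index]
        self.index += 1
        return line

    def empty(self):
        # True once every line has been consumed.
        return self.index >= len(self.lines)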

Specialized Feeders

To read multiple lines of code at a time use the MultiLineFeeder class:

To read a single line of code at a time use the SingleLineFeeder class:

To read lines of code from a file use the FileLineFeeder class:
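
The sketch below constructs each of these feeders. It assumes the classes are importable directly from mathics_scanner, that MultiLineFeeder and SingleLineFeeder accept the source text plus an optional source name, and that FileLineFeeder wraps an already-open file object ("script.m" is a placeholder path):

from mathics_scanner import FileLineFeeder, MultiLineFeeder, SingleLineFeeder, Tokeniser

# Several lines of code, handed to the tokeniser one line at a time.
multi_feeder = MultiLineFeeder("x = 1\ny = x + 1\n", "<example>")

# The whole program handed over as a single line.
single_feeder = SingleLineFeeder("x = 1; y = x + 1", "<example>")

# Lines read on demand from an open file.
with open("script.m") as handle:
    file_feeder = FileLineFeeder(handle)
    tokeniser = Tokeniser(file_feeder)
    print(tokeniser.next().tag)  # first token of the file

Any of the three feeders can be passed to Tokeniser in the same way.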

Character Conversions

The mathics_scanner.characters module also exposes special dictionaries:

named_characters

Maps fully qualified names of named characters to their corresponding code points in Wolfram’s internal representation:

from mathics_scanner.characters import named_characters

for named_char, code in named_characters.items():
    print(f"The named character {named_char} maps to U+{ord(code):X}")

aliased_characters

Maps the ESC sequence aliases of all aliased characters to their corresponding code points in Wolfram’s internal representation:
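
For example (assuming, as with named_characters above, that the dictionary values are single characters in Wolfram’s internal representation):

from mathics_scanner.characters import aliased_characters

for alias, char in aliased_characters.items():
    # Each alias is the text typed between two ESC characters.
    print(f"The alias {alias} maps to U+{ord(char):X}")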

mathics_scanner.generate.rl_inputrc