ZGL String Literals
Context
ZGL is a data interchange language (also known as a data exchange language).
Although it lacks string processing functions (by design), ZGL aims to make it easy to include strings in various styles, ranging from short strings to large blocks of text.
Rationale
As you will see below, ZGL provides six options that can be mixed and matched using flags (e.g. -ecltaz). Learning six independent options is easier than learning 2^6 = 64 distinct string styles.
YAML serves as an illustrative counterexample to ZGL's rationale.
Encoding
Strings in ZGL use UTF-8 encoding. Any Unicode character is allowed.
Unicode characters can be included directly; e.g. "halló", "slán", or "奇怪环形".
Consistent with the above, strings may include newlines; for example:
"No person is an island,
Entire of itself,
Every being is a piece of the continent,
A part of the main."
Forms
String literals may take any of these forms:
name | example |
---|---|
0# | "···" |
1# | #"···"# |
2# | ##"···"## |
3# | ###"···"### |
The forms with one or more # are referred to as # forms; they can help reduce the amount of escaping needed.
For example, in the table below, the string literals in each column are equivalent:
form | x | "y" | "#z" |
---|---|---|---|
0# | "x" | "\"y\"" | "\"#z\"" |
1# | #"x"# | #""y""# | #"\"#z""# |
2# | ##"x"## | ##""y""## | ##""#z""## |
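The table suggests a simple rule for choosing a form: use enough # characters that the closing delimiter never appears inside the text. The Python sketch below illustrates that rule. It is not part of ZGL; the helper names min_hashes and quote are hypothetical, and it assumes that a literal ends at the first " followed by the same number of # characters that opened it, and that the text contains no backslashes (escaping is enabled by default) and nothing the other options would alter.

```python
def min_hashes(content: str) -> int:
    """Smallest number of # characters whose closing delimiter is absent from content."""
    n = 0
    # The closing delimiter for the n# form is assumed to be '"' followed by n '#'.
    while '"' + "#" * n in content:
        n += 1
    return n


def quote(content: str) -> str:
    """Wrap content in the shortest # form that needs no escaping of quotes."""
    n = min_hashes(content)
    return "#" * n + '"' + content + '"' + "#" * n


print(quote('x'))      # "x"
print(quote('"y"'))    # #""y""#
print(quote('"#z"'))   # ##""#z""##
```

The three printed literals match the unescaped cells on the diagonal of the table above.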
Also, forms may be prefixed with options, explained next.
Options
A string literal has six independent options:
option (enabled by default) | flag to disable |
---|---|
escaping | -e |
continuations | -c |
unindent leading whitespace | -l |
discard trailing whitespace | -t |
discard empty first line | -a |
discard empty last line | -z |
By default, all options are enabled; e.g. "simpla" or #"απλός"# has all options enabled.
Options can be disabled selectively by prefixing with flags. Option flags are case sensitive; only lowercase is allowed.
- -a"..." disables "discard empty first line".
- -z"..." disables "discard empty last line".
If more than one flag is used, only one - should be used. For example:
- -az"..." disables 2 options.
- -taz"..." disables 3 options.
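To make the flag rules concrete, here is a minimal Python sketch that turns a flag prefix into the set of options that remain enabled. The function name enabled_options is hypothetical. How a real ZGL parser reports errors or treats repeated flags is not stated here, so the sketch simply raises ValueError for anything outside the six lowercase flags and ignores repeats.

```python
# Flag letter -> option it disables, taken from the options table.
OPTIONS = {
    "e": "escaping",
    "c": "continuations",
    "l": "unindent leading whitespace",
    "t": "discard trailing whitespace",
    "a": "discard empty first line",
    "z": "discard empty last line",
}


def enabled_options(prefix: str) -> set[str]:
    """Return the options still enabled after applying a prefix such as '-az'."""
    enabled = set(OPTIONS.values())  # everything is on by default
    if prefix == "":
        return enabled
    if not prefix.startswith("-") or len(prefix) == 1:
        raise ValueError(f"flags must follow a single '-': {prefix!r}")
    for flag in prefix[1:]:
        if flag not in OPTIONS:  # also rejects uppercase letters
            raise ValueError(f"unknown flag {flag!r} in {prefix!r}")
        enabled.discard(OPTIONS[flag])
    return enabled


print(sorted(enabled_options("-taz")))
# ['continuations', 'escaping', 'unindent leading whitespace']
```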
The options are explained in detail below.
Escaping
Escaping is enabled by default. (Disable with the -e prefix.)
Here are the available types of escape codes:
pattern | label | escaped examples | resulting characters |
---|---|---|---|
\[\"'nrt0] | ASCII shortcut | \\ \" \' \n \r \t \0 | \, ", ', newline, carriage return, tab, null character |
\xHH | ASCII character | \x5e \x5f | ^ _ |
\u{H{1,6}} | Unicode character (1 to 6 digits) | \u{1ce} \u{2d44} \u{20d47} | ǎ ⵄ 𠵇 |
Notes:
- H means a hex digit; e.g. the [0-9a-fA-F] regex pattern.
- Backslashes must be escaped, unless you disable escaping with the -e flag.
- The single quote ' does not have to be escaped.
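The escape table can be read as a small decoder. The Python sketch below is one way to express it, not the ZGL implementation. The name decode_escapes is hypothetical, and three choices are assumptions: hex digits are accepted in either case, \xHH values above 0x7f are rejected because the table labels that form "ASCII character", and any backslash sequence not in the table raises ValueError.

```python
import re

# Single-character shortcuts from the first table row.
_SHORTCUTS = {"\\": "\\", '"': '"', "'": "'",
              "n": "\n", "r": "\r", "t": "\t", "0": "\0"}

_ESCAPE = re.compile(
    r"""\\(?:
        (?P<short>[\\"'nrt0])            # ASCII shortcut
      | x(?P<hex>[0-9a-fA-F]{2})         # two hex digits
      | u\{(?P<uni>[0-9a-fA-F]{1,6})\}   # braced code point, 1 to 6 hex digits
      | (?P<bad>[\s\S]?)                 # anything else after a backslash
    )""",
    re.VERBOSE,
)


def decode_escapes(body: str) -> str:
    """Replace escape codes in the body of a string literal."""
    def replace(match: re.Match) -> str:
        if match.group("short") is not None:
            return _SHORTCUTS[match.group("short")]
        if match.group("hex") is not None:
            value = int(match.group("hex"), 16)
            if value > 0x7F:  # assumption: this form is ASCII-only
                raise ValueError(f"non-ASCII value at offset {match.start()}")
            return chr(value)
        if match.group("uni") is not None:
            return chr(int(match.group("uni"), 16))
        raise ValueError(f"invalid escape at offset {match.start()}")

    return _ESCAPE.sub(replace, body)


print(decode_escapes(r"\x5e \x5f"))  # ^ _
```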
Continuations
Continuations are enabled by default. (Disable with the -c prefix.)
To split a string across multiple lines without a newline, end a line with \. This is called a string continuation.
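As an illustration, a continuation can be modeled as deleting each backslash that ends a line together with the newline after it. The Python sketch below does this in one pass so that an escaped backslash (\\) at the end of a line is not mistaken for a continuation. The name join_continuations is hypothetical, and the sketch assumes that leading whitespace on the following line is left for the other options to handle; the text above does not say whether a continuation also consumes it.

```python
def join_continuations(text: str) -> str:
    """Remove every backslash-newline pair, leaving other escape pairs intact."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt == "\n":
                i += 2  # continuation: drop the backslash and the newline
                continue
            out.append(ch + nxt)  # some other escape; keep it for later decoding
            i += 2
            continue
        out.append(ch)
        i += 1
    return "".join(out)


print(join_continuations("one \\\nline"))  # one line
```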
Make sure you understand this before going to the next section.
Continuations: disable
To disable continuations, prefix with -c. This is case sensitive.
This flag is useful in combination with other flags, but using it alone is not helpful. To put it another way, -c"..." does not provide any advantages over simply using "...".
Unindent leading whitespace
This is enabled by default. (Disable with the -l prefix.)
Each line of a string is unindented as follows:
- Count the leading spaces of each line, ignoring the first line and any lines that are empty or contain spaces only.
- Take the minimum.
- If the first line is empty i.e. the string begins with a newline, remove the first line.
- Remove the computed number of spaces from the beginning of each line.
This behavior matches that of indoc, a Rust crate; the steps above are copied from its documentation.
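The steps above can be sketched in Python as follows. This is an illustrative reading of the list, not code from ZGL or indoc; the function name unindent is chosen here. Two details are assumptions: the first line is never unindented (only removed when empty), and only space characters count as indentation.

```python
def unindent(text: str) -> str:
    """Apply the four unindent steps to a multi-line string."""
    lines = text.split("\n")

    # Steps 1 and 2: minimum leading-space count over the lines after the
    # first, skipping lines that are empty or contain spaces only.
    counts = [
        len(line) - len(line.lstrip(" "))
        for line in lines[1:]
        if line.strip(" ") != ""
    ]
    indent = min(counts, default=0)

    # Step 3: drop the first line when the string begins with a newline.
    result = [] if lines[0] == "" else [lines[0]]

    # Step 4: remove the computed number of spaces from each later line.
    # Slicing a shorter, spaces-only line simply yields an empty line.
    result.extend(line[indent:] for line in lines[1:])
    return "\n".join(result)


print(unindent("\n    a\n      b\n    c"))
# a
#   b
# c
```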
Discard trailing whitespace
This is enabled by default. (Disable with the -t prefix.)
This option discards trailing whitespace on each line.
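A minimal Python sketch of this option follows. The name discard_trailing_whitespace is hypothetical, and it assumes that "whitespace" means spaces and tabs and that lines are separated by a line feed; neither is specified above.

```python
def discard_trailing_whitespace(text: str) -> str:
    """Strip spaces and tabs from the end of every line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


print(repr(discard_trailing_whitespace("a  \nb\t\n c ")))  # 'a\nb\n c'
```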
Discard empty first line
This is enabled by default. (Disable with the -a prefix.)
Discard empty last line
This is enabled by default. (Disable with the -z prefix.)
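The two line-discarding options (-a and -z) can be sketched together in Python. The function names are hypothetical, and the sketch assumes that "empty" means the line has no characters at all, so a string that begins or ends with a line feed has an empty first or last line. Whether a spaces-only line also counts is not stated here.

```python
def discard_empty_first_line(text: str) -> str:
    """Drop the first line when the string begins with a newline."""
    return text[1:] if text.startswith("\n") else text


def discard_empty_last_line(text: str) -> str:
    """Drop the last line when the string ends with a newline."""
    return text[:-1] if text.endswith("\n") else text


body = "\nfirst\nsecond\n"
print(repr(discard_empty_last_line(discard_empty_first_line(body))))
# 'first\nsecond'
```

Together, these two options let the opening and closing quotes sit on their own lines without adding newlines to the resulting string.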
Translations
In case you are curious about some of the phrases used above:
word | language | English translation |
---|---|---|
halló | Icelandic | hello |
slán | Irish | goodbye |
奇怪环形 | Chinese | strange loop |
simpla | Esperanto | simple |
απλός | Greek | simple |
hai dòng | Vietnamese | two lines |
dies ist ungültig | German | this is invalid |
hem acabat aquí | Catalan | we're done here |
¿por qué sigues aquí? | Spanish | why are you still here? |
zoo, Kuv tsis tu | Hmong | fine, I don't care |