diff --git a/doc/manual/redirects.js b/doc/manual/redirects.js
index beef6ef4a..cb8cd18fa 100644
--- a/doc/manual/redirects.js
+++ b/doc/manual/redirects.js
@@ -344,6 +344,7 @@ const redirects = {
 },
 "language/syntax.html": {
 "scoping-rules": "scoping.html",
+ "string-literal": "string-literals.html",
 },
 "installation/installing-binary.html": {
 "linux": "uninstall.html#linux",
diff --git a/doc/manual/src/SUMMARY.md.in b/doc/manual/src/SUMMARY.md.in
index 7661f5f62..eef7d189c 100644
--- a/doc/manual/src/SUMMARY.md.in
+++ b/doc/manual/src/SUMMARY.md.in
@@ -29,6 +29,7 @@
 - [String context](language/string-context.md)
 - [Syntax and semantics](language/syntax.md)
 - [Variables](language/variables.md)
+ - [String literals](language/string-literals.md)
 - [Identifiers](language/identifiers.md)
 - [Scoping rules](language/scope.md)
 - [String interpolation](language/string-interpolation.md)
diff --git a/doc/manual/src/language/identifiers.md b/doc/manual/src/language/identifiers.md
index 861ee3e20..584a2f861 100644
--- a/doc/manual/src/language/identifiers.md
+++ b/doc/manual/src/language/identifiers.md
@@ -16,7 +16,7 @@ An *identifier* is an [ASCII](https://en.wikipedia.org/wiki/ASCII) character seq

 # Names

-A *name* can be written as an [identifier](#identifier) or a [string literal](./syntax.md#string-literal).
+A *name* can be written as an [identifier](#identifier) or a [string literal](./string-literals.md).

 > **Syntax**
 >
diff --git a/doc/manual/src/language/string-interpolation.md b/doc/manual/src/language/string-interpolation.md
index 1778bdfa0..27780dcbb 100644
--- a/doc/manual/src/language/string-interpolation.md
+++ b/doc/manual/src/language/string-interpolation.md
@@ -8,6 +8,10 @@ Such a construct is called *interpolated string*, and the expression inside is a
 [path]: ./types.md#type-path
 [attribute set]: ./types.md#attribute-set

+> **Syntax**
+>
+> *interpolation_element* → `${` *expression* `}`
+
 ## Examples

 ### String
diff --git a/doc/manual/src/language/string-literals.md b/doc/manual/src/language/string-literals.md
new file mode 100644
index 000000000..8f4b75f3e
--- /dev/null
+++ b/doc/manual/src/language/string-literals.md
@@ -0,0 +1,190 @@
+# String literals
+
+A *string literal* represents a [string](types.md#type-string) value.
+
+> **Syntax**
+>
+> *expression* → *string*
+>
+> *string* → `"` ( *string_char*\* [*interpolation_element*][string interpolation] )* *string_char*\* `"`
+>
+> *string* → `''` ( *indented_string_char*\* [*interpolation_element*][string interpolation] )* *indented_string_char*\* `''`
+>
+> *string* → *uri*
+>
+> *string_char* ~ `[^"$\\]|\$(?!\{)|\\.`
+>
+> *indented_string_char* ~ `[^$']|\$\$|\$(?!\{)|''[$']|''\\.|'(?!')`
+>
+> *uri* ~ `[A-Za-z][+\-.0-9A-Za-z]*:[!$%&'*+,\-./0-9:=?@A-Z_a-z~]+`
+
+Strings can be written in three ways.
+
+The most common way is to enclose the string between double quotes, e.g., `"foo bar"`.
+Strings can span multiple lines.
+The results of other expressions can be included into a string by enclosing them in `${ }`, a feature known as [string interpolation].
+
+[string interpolation]: ./string-interpolation.md
+
+The following must be escaped to represent them within a string, by prefixing with a backslash (`\`):
+
+- Double quote (`"`)
+
+> **Example**
+>
+> ```nix
+> "\""
+> ```
+>
+> "\""
+
+- Backslash (`\`)
+
+> **Example**
+>
+> ```nix
+> "\\"
+> ```
+>
+> "\\"
+
+- Dollar sign followed by an opening curly bracket (`${`) – "dollar-curly"
+
+> **Example**
+>
+> ```nix
+> "\${"
+> ```
+>
+> "\${"
+
+The newline, carriage return, and tab characters can be written as `\n`, `\r` and `\t`, respectively.
+
+A "double-dollar-curly" (`$${`) can be written literally.
+
+> **Example**
+>
+> ```nix
+> "$${"
+> ```
+>
+> "$\${"
+
+String values are output on the terminal with Nix-specific escaping.
+Strings written to files will contain the characters encoded by the escaping.
+
+The second way to write string literals is as an *indented string*, which is enclosed between pairs of *double single-quotes* (`''`), like so:
+
+```nix
+''
+  This is the first line.
+  This is the second line.
+    This is the third line.
+''
+```
+
+This kind of string literal intelligently strips indentation from
+the start of each line. To be precise, it strips from each line a
+number of spaces equal to the minimal indentation of the string as a
+whole (disregarding the indentation of empty lines). For instance,
+the first and second line are indented two spaces, while the third
+line is indented four spaces. Thus, two spaces are stripped from
+each line, so the resulting string is
+
+```nix
+"This is the first line.\nThis is the second line.\n  This is the third line.\n"
+```
+
+> **Note**
+>
+> Whitespace and newline following the opening `''` is ignored if there is no non-whitespace text on the initial line.
+
+> **Warning**
+>
+> Prefixed tab characters are not stripped.
+>
+> > **Example**
+> >
+> > The following indented string is prefixed with tabs:
+> >
+> > ''
+> > 	all:
+> > 		@echo hello
+> > ''
+> >
+> >
+> > "\tall:\n\t\t@echo hello\n"
+
+Indented strings support [string interpolation].
+
+The following must be escaped to represent them in an indented string:
+
+- `$` is escaped by prefixing it with two single quotes (`''`)
+
+> **Example**
+>
+> ```nix
+> ''
+> ''$
+> ''
+> ```
+>
+> "$\n"
+
+- `''` is escaped by prefixing it with one single quote (`'`)
+
+> **Example**
+>
+> ```nix
+> ''
+> '''
+> ''
+> ```
+>
+> "''\n"
+
+These special characters are escaped as follows:
+- Linefeed (`\n`): `''\n`
+- Carriage return (`\r`): `''\r`
+- Tab (`\t`): `''\t`
+
+`''\` escapes any other character.
+
+A "double-dollar-curly" (`$${`) can be written literally.
+
+> **Example**
+>
+> ```nix
+> ''
+> $${
+> ''
+> ```
+>
+> "$\${\n"
+
+Indented strings are primarily useful in that they allow multi-line
+string literals to follow the indentation of the enclosing Nix
+expression, and that less escaping is typically necessary for
+strings representing languages such as shell scripts and
+configuration files because `''` is much less common than `"`.
+Example:
+
+```nix
+stdenv.mkDerivation {
+  ...
+  postInstall =
+    ''
+      mkdir $out/bin $out/etc
+      cp foo $out/bin
+      echo "Hello World" > $out/etc/foo.conf
+      ${if enableBar then "cp bar $out/bin" else ""}
+    '';
+  ...
+}
+
+Finally, as a convenience, *URIs* as defined in appendix B of
+[RFC 2396](http://www.ietf.org/rfc/rfc2396.txt) can be written *as
+is*, without quotes. For instance, the string
+`"http://example.org/foo.tar.bz2"` can also be written as
+`http://example.org/foo.tar.bz2`.
diff --git a/doc/manual/src/language/syntax.md b/doc/manual/src/language/syntax.md
index 6108bacd6..daf073aef 100644
--- a/doc/manual/src/language/syntax.md
+++ b/doc/manual/src/language/syntax.md
@@ -6,175 +6,7 @@ This section covers syntax and semantics of the Nix language.
### String {#string-literal}
- *Strings* can be written in three ways.
-
- The most common way is to enclose the string between double quotes, e.g., `"foo bar"`.
- Strings can span multiple lines.
- The results of other expressions can be included into a string by enclosing them in `${ }`, a feature known as [string interpolation].
-
- [string interpolation]: ./string-interpolation.md
-
- The following must be escaped to represent them within a string, by prefixing with a backslash (`\`):
-
- - Double quote (`"`)
-
- > **Example**
- >
- > ```nix
- > "\""
- > ```
- >
- > "\""
-
- - Backslash (`\`)
-
- > **Example**
- >
- > ```nix
- > "\\"
- > ```
- >
- > "\\"
-
- - Dollar sign followed by an opening curly bracket (`${`) – "dollar-curly"
-
- > **Example**
- >
- > ```nix
- > "\${"
- > ```
- >
- > "\${"
-
- The newline, carriage return, and tab characters can be written as `\n`, `\r` and `\t`, respectively.
-
- A "double-dollar-curly" (`$${`) can be written literally.
-
- > **Example**
- >
- > ```nix
- > "$${"
- > ```
- >
- > "$\${"
-
- String values are output on the terminal with Nix-specific escaping.
- Strings written to files will contain the characters encoded by the escaping.
-
- The second way to write string literals is as an *indented string*, which is enclosed between pairs of *double single-quotes* (`''`), like so:
-
- ```nix
- ''
-   This is the first line.
-   This is the second line.
-     This is the third line.
- ''
- ```
-
- This kind of string literal intelligently strips indentation from
- the start of each line. To be precise, it strips from each line a
- number of spaces equal to the minimal indentation of the string as a
- whole (disregarding the indentation of empty lines). For instance,
- the first and second line are indented two spaces, while the third
- line is indented four spaces. Thus, two spaces are stripped from
- each line, so the resulting string is
-
- ```nix
- "This is the first line.\nThis is the second line.\n  This is the third line.\n"
- ```
-
- > **Note**
- >
- > Whitespace and newline following the opening `''` is ignored if there is no non-whitespace text on the initial line.
-
- > **Warning**
- >
- > Prefixed tab characters are not stripped.
- >
- > > **Example**
- > >
- > > The following indented string is prefixed with tabs:
- > >
- > > ''
- > > 	all:
- > > 		@echo hello
- > > ''
- > >
- > > "\tall:\n\t\t@echo hello\n"
-
- Indented strings support [string interpolation].
-
- The following must be escaped to represent them in an indented string:
-
- - `$` is escaped by prefixing it with two single quotes (`''`)
-
- > **Example**
- >
- > ```nix
- > ''
- > ''$
- > ''
- > ```
- >
- > "$\n"
-
- - `''` is escaped by prefixing it with one single quote (`'`)
-
- > **Example**
- >
- > ```nix
- > ''
- > '''
- > ''
- > ```
- >
- > "''\n"
-
- These special characters are escaped as follows:
- - Linefeed (`\n`): `''\n`
- - Carriage return (`\r`): `''\r`
- - Tab (`\t`): `''\t`
-
- `''\` escapes any other character.
-
- A "double-dollar-curly" (`$${`) can be written literally.
-
- > **Example**
- >
- > ```nix
- > ''
- > $${
- > ''
- > ```
- >
- > "$\${\n"
-
- Indented strings are primarily useful in that they allow multi-line
- string literals to follow the indentation of the enclosing Nix
- expression, and that less escaping is typically necessary for
- strings representing languages such as shell scripts and
- configuration files because `''` is much less common than `"`.
- Example:
-
- ```nix
- stdenv.mkDerivation {
-   ...
-   postInstall =
-     ''
-       mkdir $out/bin $out/etc
-       cp foo $out/bin
-       echo "Hello World" > $out/etc/foo.conf
-       ${if enableBar then "cp bar $out/bin" else ""}
-     '';
-   ...
- }
- ```
-
- Finally, as a convenience, *URIs* as defined in appendix B of
- [RFC 2396](http://www.ietf.org/rfc/rfc2396.txt) can be written *as
- is*, without quotes. For instance, the string
- `"http://example.org/foo.tar.bz2"` can also be written as
- `http://example.org/foo.tar.bz2`.
+See [String literals](string-literals.md).
### Number {#number-literal}
@@ -253,7 +85,7 @@ Attribute sets are written enclosed in curly brackets (`{ }`).
Attribute names and attribute values are separated by an equal sign (`=`).
Each value can be an arbitrary expression, terminated by a semicolon (`;`)
-An attribute name is a string without context, and is denoted by a [name] (an [identifier](./identifiers.md#identifiers) or [string literal](#string-literal)).
+An attribute name is a string without context, and is denoted by a [name] (an [identifier](./identifiers.md#identifiers) or [string literal](string-literals.md)).
[name]: ./identifiers.md#names
diff --git a/doc/manual/src/language/types.md b/doc/manual/src/language/types.md
index 229756e6b..82184a8b0 100644
--- a/doc/manual/src/language/types.md
+++ b/doc/manual/src/language/types.md
@@ -45,7 +45,7 @@ The function [`builtins.isBool`](builtins.md#builtins-isBool) can be used to det
A _string_ in the Nix language is an immutable, finite-length sequence of bytes, along with a [string context](string-context.md).
Nix does not assume or support working natively with character encodings.
-String values without string context can be expressed as [string literals](syntax.md#string-literal).
+String values without string context can be expressed as [string literals](string-literals.md).
The function [`builtins.isString`](builtins.md#builtins-isString) can be used to determine if a value is a string.
### Path {#type-path}