# String Building
This document explores whether there is value in distinguishing between different partially built string representations.
There seem to be several aspects to string building:
1. *Decomposability*: the ability to decompose string composition into simpler operations
2. *Delegation*: the ability to delegate composition of a substring to other functions in other modules
3. *Introspection*: the ability to read the partially built string as if it were a string. For example: if the last character (if any) is not a space, append a space and then append the rest.
4. *Transparency*: is it obvious to a code reader what's being appended?
5. *Interopiness*: does our translation connect types well? (See I/O interop below.)
In Java, there are several types that help to accumulate code-units. I'll use these to examine the dimensions above.
```mermaid
flowchart TD
StringBuilder --> CharSequence
StringBuilder --> Appendable
StringWriter --> Appendable
StringWriter --> Writer
ByteArrayOutputStream --> OutputStream
PrintStream --> Appendable
PrintStream --> OutputStream
CharSequence["CharSequence\n(Readable)"]
OutputStream["OutputStream\n(Byte sink, I/O)"]
Appendable["Appendable\n(UTF-16 sink)"]
Writer["Writer\n(UTF-16 sink, I/O)"]
```
- `new StringBuilder()` is an *Appendable* and a *CharSequence* but does not interoperate well with `java.io`.
- `new StringWriter()` is an *Appendable* but not a *CharSequence*, and interoperates with `java.io` by extending *Writer*.
- `java.io.PrintStream` blurs the distinction between byte sinks and UTF-16 character sinks. It interoperates with `java.io` as an *OutputStream*.
- More rarely, `new ByteArrayOutputStream()` is used to accumulate UTF-8 octets; it is a `java.io` *OutputStream*. That can be handy when composing a string from a mix of sources (strings, files, URLs, process stdout).
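As a rough illustration of the *ByteArrayOutputStream* case, here is a minimal sketch that mixes an in-memory string with bytes copied from a stream and decodes once at the end. The class name `MixedSourceDemo`, the method `compose`, and the use of a `ByteArrayInputStream` as a stand-in for a file or process stdout are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class MixedSourceDemo {
    /** Accumulates UTF-8 octets from a string and a stream, then decodes them in one step. */
    static String compose(String header, InputStream body) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Strings must be encoded explicitly before they can enter a byte sink.
        out.write(header.getBytes(StandardCharsets.UTF_8));
        // Stream contents are copied as raw bytes (InputStream.transferTo needs Java 9+).
        body.transferTo(out);
        // Decode once at the end (toString(Charset) needs Java 10+).
        return out.toString(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a file, URL, or process stdout.
        InputStream fake = new ByteArrayInputStream("body\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(compose("header\n", fake));
    }
}
```

Nothing in the signature of `compose` tells a reader that text is being built, which is one reason this approach scores low on transparency.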
In addition to those types, there are common APIs:
- The static method *String.format* and the instance method *String.formatted* do *sprintf*-like substitution based on positional parameters.
- *MessageFormat* similarly combines positional parameters into a template string using a different syntax. Its affordances for locale-aware interpolation make it a special case, but it is also used for simple concatenation that needs none of those semantics, so I don't treat it as distinct from *String.format* for the purposes of this document.
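To make the comparison concrete, the sketch below builds the same string with all three APIs. The class and method names (`FormatDemo`, `viaStringFormat`, and so on) are hypothetical; only the `String.format`, `String.formatted`, and `MessageFormat.format` calls come from the JDK.

```java
import java.text.MessageFormat;

class FormatDemo {
    /** sprintf-style: each %s consumes the next argument in order. */
    static String viaStringFormat(String greeting, String audience, String mood) {
        return String.format("%s, %s%s", greeting, audience, mood);
    }

    /** Same substitution, called on the template itself (Java 15+). */
    static String viaFormatted(String greeting, String audience, String mood) {
        return "%s, %s%s".formatted(greeting, audience, mood);
    }

    /** MessageFormat: placeholders name the argument index explicitly. */
    static String viaMessageFormat(String greeting, String audience, String mood) {
        return MessageFormat.format("{0}, {1}{2}", greeting, audience, mood);
    }

    public static void main(String[] args) {
        System.out.println(viaStringFormat("Hello", "World", "!"));
        System.out.println(viaFormatted("Hello", "World", "!"));
        System.out.println(viaMessageFormat("Hello", "World", "!"));
    }
}
```

For plain string arguments like these, all three produce the same text; the differences show up in the template syntax and in how non-string arguments are formatted.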
| Type or API | Decomposability | Delegation | Introspection | Transparency | Interopiness |
| -- | -- | -- | -- | -- | -- |
| *StringBuilder* | ✓ | ✓ | ✓ | ✓ | |
| *StringWriter* | ✓ | ✓ | | ✓ | ✓ |
| *PrintStream* | ✓ | ✓ | | ✓ | ✓ |
| *ByteArrayOutputStream* | ✓ | ✓ | | | ~ |
| *String.format* | ✓ | | | ✓ | |
*StringBuilder* acts as a *string-like* and allows delegation.
*StringWriter* and friends like *PrintStream* do not allow introspection but do allow delegation.
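The difference between delegation and introspection can be sketched with Java's own interfaces. In the example below, `DelegationDemo`, `appendGreeting`, and `appendWord` are hypothetical names; the point is only that a callee taking `Appendable` cannot look at what has been written, while one that also requires `CharSequence` can.

```java
import java.io.IOException;
import java.io.StringWriter;

class DelegationDemo {
    /** Delegation: any sink will do, but the callee cannot read what came before. */
    static void appendGreeting(Appendable out, String name) throws IOException {
        out.append("Hello, ").append(name);
    }

    /**
     * Introspection: the callee needs to read the last character, so the sink
     * must also be a CharSequence. StringBuilder qualifies; StringWriter does not.
     */
    static <T extends Appendable & CharSequence> void appendWord(T out, String word)
            throws IOException {
        int n = out.length();
        if (n > 0 && out.charAt(n - 1) != ' ') {
            out.append(' ');
        }
        out.append(word);
    }

    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder();
        appendGreeting(sb, "World");
        appendWord(sb, "again");
        System.out.println(sb);

        StringWriter sw = new StringWriter();
        appendGreeting(sw, "World");   // delegation works
        // appendWord(sw, "again");    // would not compile: StringWriter is not a CharSequence
        System.out.println(sw);
    }
}
```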
In Java, we could craft a *StringWriter*-like API that allows introspection via our own type that `extends Writer implements CharSequence`. Existing Java APIs would still see a plain *Writer*, so this avoids requiring them to be aware of our type, a requirement that would be super awkward.
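A minimal sketch of what such a type might look like follows. The name `ReadableWriter` and the choice to back it with a `StringBuilder` are assumptions for illustration, not a proposed design.

```java
import java.io.PrintWriter;
import java.io.Writer;

/** Hypothetical Writer that stays readable by keeping everything in a StringBuilder. */
class ReadableWriter extends Writer implements CharSequence {
    private final StringBuilder buffer = new StringBuilder();

    // Writer side: every other write/append overload funnels into this method.
    @Override public void write(char[] cbuf, int off, int len) {
        buffer.append(cbuf, off, len);
    }
    @Override public void flush() { /* nothing buffered elsewhere */ }
    @Override public void close() { /* no underlying resource */ }

    // CharSequence side: read access to what has been written so far.
    @Override public int length() { return buffer.length(); }
    @Override public char charAt(int index) { return buffer.charAt(index); }
    @Override public CharSequence subSequence(int start, int end) {
        return buffer.subSequence(start, end);
    }
    @Override public String toString() { return buffer.toString(); }
}

class ReadableWriterDemo {
    public static void main(String[] args) {
        ReadableWriter rw = new ReadableWriter();
        // A java.io API that only knows about Writer can still target it.
        PrintWriter pw = new PrintWriter(rw);
        pw.print("Hello");
        pw.flush();
        // Our own code can introspect before appending more.
        if (rw.length() > 0 && rw.charAt(rw.length() - 1) != ' ') {
            pw.print(' ');
        }
        pw.print("World");
        pw.flush();
        System.out.println(rw);
    }
}
```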
## Non-appending mutations
*StringBuilder* allows for (not super efficient) insertion and replacement by random-access index.
TODO: use cases for not-at-end reading and mutation.
Ben: I did a GitHub search... I mostly seem to see these in [solutions to homework](https://github.com/doocs/leetcode).
* Concatenation.
* ``` cs
indent = new StringBuilder(trail).Insert(0, spaces, recursion - 1).ToString();
```
* Build a string in reverse.
* ``` cs
var sb = new StringBuilder();
while (stk.Count > 0) {
sb.Insert(0, "/" + stk.Pop());
}
return sb.Length == 0 ? "/" : sb.ToString();
```
* Reverse in place.
* ``` java
while (start < end) {
char temp = sb.charAt(start);
sb.setCharAt(start, sb.charAt(end));
sb.setCharAt(end, temp);
start++;
end--;
}
```
* Something palindromey.
* ``` java
StringBuilder sb = new StringBuilder();
sb.append(i);
sb.append(new StringBuilder(i + "").reverse().substring(l & 1));
res.add(Long.parseLong(sb.toString()));
```
The only real-life use case seems to be `prepend()` for stringifying data structures that are naturally reversed, such as chains of `SomeNode.parent` fields. A custom string builder class could offer a cheap `prepend()` method if it's backed by a growable ring buffer. Java's `.insert(0, x)` shifts the existing contents each time, but C# seems to use a linked list of chunks, so `.Insert(0, x)` may be more performant there.
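To show that the ring-buffer idea is feasible, here is a rough sketch of a builder with cheap `prepend()`. The class name `PrependableBuilder`, the initial capacity, and the doubling growth policy are all assumptions; it is not tuned and ignores integer overflow on very large strings.

```java
/** Hypothetical builder backed by a growable ring buffer of chars. */
class PrependableBuilder {
    private char[] buf = new char[16];
    private int head = 0;    // index in buf of the first logical char
    private int length = 0;  // number of chars currently stored

    /** Writes s just before the current head, wrapping around the array end. */
    PrependableBuilder prepend(CharSequence s) {
        int n = s.length();
        ensureCapacity(length + n);
        head = Math.floorMod(head - n, buf.length);
        for (int i = 0; i < n; i++) {
            buf[(head + i) % buf.length] = s.charAt(i);
        }
        length += n;
        return this;
    }

    /** Writes s just after the current tail, wrapping around the array end. */
    PrependableBuilder append(CharSequence s) {
        int n = s.length();
        ensureCapacity(length + n);
        int tail = (head + length) % buf.length;
        for (int i = 0; i < n; i++) {
            buf[(tail + i) % buf.length] = s.charAt(i);
        }
        length += n;
        return this;
    }

    /** Grows by at least doubling, unwrapping the contents to start at index 0. */
    private void ensureCapacity(int needed) {
        if (needed <= buf.length) {
            return;
        }
        char[] grown = new char[Math.max(needed, buf.length * 2)];
        for (int i = 0; i < length; i++) {
            grown[i] = buf[(head + i) % buf.length];
        }
        buf = grown;
        head = 0;
    }

    @Override public String toString() {
        char[] out = new char[length];
        for (int i = 0; i < length; i++) {
            out[i] = buf[(head + i) % buf.length];
        }
        return new String(out);
    }
}
```

With this layout, a prepend only copies the new characters (plus an occasional regrow), rather than shifting everything already in the buffer.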
"Make a list of strings, reverse them, and concatenate" is not a terrible way to handle it and is reasonably idiomatic.
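For comparison, the list-based approach might look like the sketch below. `PathDemo`, `Node`, and `pathOf` are hypothetical names standing in for any structure with parent links.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PathDemo {
    /** Hypothetical tree node that only knows its parent. */
    static final class Node {
        final String name;
        final Node parent;

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /** Collects names leaf-to-root, reverses them, and joins with "/". */
    static String pathOf(Node node) {
        List<String> parts = new ArrayList<>();
        for (Node n = node; n != null; n = n.parent) {
            parts.add(n.name);
        }
        Collections.reverse(parts);
        return "/" + String.join("/", parts);
    }

    public static void main(String[] args) {
        Node usr = new Node("usr", null);
        Node local = new Node("local", usr);
        Node bin = new Node("bin", local);
        System.out.println(pathOf(bin));
    }
}
```

No not-at-end mutation of a string builder is needed here; the reversal happens on the list before any characters are concatenated.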
Looking at common String methods, Java's StringBuilder doesn't implement some obvious conveniences like `.trim()`. C#'s StringBuilder does let you do `.Replace('!', '?')`, which might be a handy finishing operation. It doesn't seem like established languages have found a lot of compelling use cases for internal mutation.
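For a sense of what those missing conveniences would involve, here is a small sketch of two finishing operations written by hand against Java's *StringBuilder*. The helper names `trimEnd` and `replaceChar` are made up for this example; neither exists on the JDK class.

```java
class FinishingOps {
    /** Drops trailing whitespace in place by shortening the builder. */
    static StringBuilder trimEnd(StringBuilder sb) {
        int end = sb.length();
        while (end > 0 && Character.isWhitespace(sb.charAt(end - 1))) {
            end--;
        }
        sb.setLength(end);
        return sb;
    }

    /** Replaces every occurrence of one char with another, in place. */
    static StringBuilder replaceChar(StringBuilder sb, char from, char to) {
        for (int i = 0; i < sb.length(); i++) {
            if (sb.charAt(i) == from) {
                sb.setCharAt(i, to);
            }
        }
        return sb;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("wow!!  \n");
        System.out.println(replaceChar(trimEnd(sb), '!', '?'));
    }
}
```

Both helpers rely on reading the buffer, so they are only possible on a readable builder.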
## Conclusion
It seems like the main bit of distinction between partially built strings is whether or not the string is readable.
We might preserve freedom for I/O interop by having a default *CharSink* that is not readable, plus a *ReadableCharSink* that must preserve enough information to allow some level of introspection. The readable variant might come at the cost of connecting to a type that does not allow for easy I/O interop with backend types.
```ts
class StringBuilder {
public append(suffix: String): Void;
public toString(): String;
}
class MutableString {
public append(suffix: String): Void;
public toString(): String;
}
```
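One way the *CharSink* / *ReadableCharSink* split could map onto Java is sketched below. Only the two type names come from this document; the method shapes, the `WriterSink` and `BuilderSink` adapters, and the `SinkDemo` helper are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Write-only sink: can be backed by anything, including java.io types. */
interface CharSink {
    void append(String suffix);
}

/** Sink that also exposes what has been written so far. */
interface ReadableCharSink extends CharSink, CharSequence {
}

/** Non-readable sink that connects directly to a java.io Writer. */
final class WriterSink implements CharSink {
    private final Writer out;

    WriterSink(Writer out) {
        this.out = out;
    }

    @Override public void append(String suffix) {
        try {
            out.write(suffix);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** Readable sink that keeps its content in memory. */
final class BuilderSink implements ReadableCharSink {
    private final StringBuilder sb = new StringBuilder();

    @Override public void append(String suffix) { sb.append(suffix); }
    @Override public int length() { return sb.length(); }
    @Override public char charAt(int index) { return sb.charAt(index); }
    @Override public CharSequence subSequence(int start, int end) {
        return sb.subSequence(start, end);
    }
    @Override public String toString() { return sb.toString(); }
}

class SinkDemo {
    /** Needs introspection, so it asks for the readable variant. */
    static void appendWord(ReadableCharSink sink, String word) {
        int n = sink.length();
        if (n > 0 && sink.charAt(n - 1) != ' ') {
            sink.append(" ");
        }
        sink.append(word);
    }

    public static void main(String[] args) {
        BuilderSink readable = new BuilderSink();
        readable.append("Hello,");
        appendWord(readable, "World");
        System.out.println(readable);

        StringWriter sw = new StringWriter();
        CharSink plain = new WriterSink(sw);
        plain.append("Hello, World");
        System.out.println(sw);
    }
}
```

Code that only appends can accept `CharSink` and work with either implementation, while code like `appendWord` documents its need for readability in its parameter type.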
## Aside on positional interpolation markers vs interstitial expressions
```python3
greeting = "Hello"
audience = "World"
mood = "!"
# Positional
"%s, %s%s" % (greeting, audience, mood)
# Interstitial
f"{greeting}, {audience}{mood}"
```
In the above, to understand the `sprintf`-style string you have to look at each `%s`, then over to the expression list, and then up to the definition of the value. That's a lot of back and forth.

In the interstitial expression case, the scanning is simpler.
And if you understand the expressions already, then you can scan it linearly.
