Read-once Objects
Hello! In the previous post, I covered tainted types as the second entry in a domain primitives series.
Tainted types helped us package dangerous user input into a value object so that it’s handled correctly everywhere it propagates. We’ll extend the value object concept to create a container for sensitive data. It’s one of my favorite domain primitives: fantastic for storing and tracking sensitive data, and keeping it from leaking into logs. (I’m sure you’ve been there.)
Implementation
The idea is simple: store the sensitive value in a value object, and unseal it only when it’s actually needed, once and only once. Any additional attempt to do so should fail loudly.
Coming back to our `register` example from the branded types post, the kind of code we want to avoid is this:
```typescript
async function register(tenant: Tenant, userId: string) {
  const url = `https://${tenant}.foobar.net`;
  const secret = getFooBarSecret(); // secret is a string
  // Call to external service
  const data = await callFooBar(secret, {...});
  // TODO: remove after testing
  console.debug(data, secret); // Woopsie, committed to source
  ...
}
```
To defend against this, we can retool `getFooBarSecret` to return a read-once object instead of a string primitive. Let’s implement that object first:
```typescript
class SensitiveValue<T = string> {
  private read: boolean = false;
  #value: T;

  constructor(value: T) {
    this.#value = value;
  }

  toString() {
    return "<SensitiveValue>";
  }

  value() {
    if (!this.read) {
      this.read = true;
      return this.#value;
    } else {
      throw new Error("Value already read");
    }
  }
}
```
The `#` in `#value` marks a true private field: unlike TypeScript’s `private` modifier, its privacy is enforced at runtime, so the only way for other code to access that property is through the accessor, `.value()`. The key piece here is that the accessor can be used only once, giving us that “read-once” guarantee. Any subsequent access will throw an error. You might also notice that the `.toString()` method automatically masks the value.
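To see both properties in action, here’s a quick self-contained demo. It re-declares the class from above so it runs on its own, and the secret string is made up:

```typescript
// Condensed re-declaration of the SensitiveValue class above,
// so this snippet runs on its own.
class SensitiveValue<T = string> {
  private read: boolean = false;
  #value: T;
  constructor(value: T) {
    this.#value = value;
  }
  toString() {
    return "<SensitiveValue>";
  }
  value() {
    if (!this.read) {
      this.read = true;
      return this.#value;
    }
    throw new Error("Value already read");
  }
}

const demo = new SensitiveValue("hunter2"); // made-up secret
console.log(`${demo}`);    // "<SensitiveValue>", masked by toString()
console.log(demo.value()); // "hunter2", the one allowed read
try {
  demo.value();            // second read...
} catch (e) {
  console.log((e as Error).message); // "Value already read"
}
```

Note that string interpolation goes through `toString()`, so accidentally dropping the object into a template literal still prints the mask, not the secret.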
Equipped with this new type, we can change `getFooBarSecret` to return a `SensitiveValue` (instead of a `string`) and change the `register` function to:
```typescript
async function register(tenant: Tenant, userId: string) {
  const url = `https://${tenant}.foobar.net`;
  const secret = getFooBarSecret(); // secret is a SensitiveValue
  // Call to external service
  const data = await callFooBar(secret.value(), {...});
  // TODO: remove after testing
  console.debug(data, secret); // Nothing sensitive leaked into logs!
  console.debug(data, secret.value()); // 💥 Error!
  ...
}
```
Notice that nothing is leaked into the logs and that the second call throws an error. Unfortunately, my TypeScript-fu isn’t strong enough to find a way to return something like a `never` type on the second invocation so we can catch this at compile time. I think it might be possible with function overloads, but if you know of a way, please let me know!
Alright, all this is great, but what if the developer calls `console.log` before the important business logic, causing that logic to fail?

```typescript
console.debug(secret.value()); // Sad logs
const data = await callFooBar(secret.value(), {...}); // 💥 Error!
```
Of course, this exception is Probably Bad! But… you do have test coverage for this, right? Right? This should absolutely fail in tests before reaching production. And even if it doesn’t, I’ll argue that it’s a better outcome than a data breach or disclosure. (Fight me!)
If we don’t have appropriate coverage, we still have a strongly typed `SensitiveValue` object that lets us write a taint tracking rule to catch double usage and flows into logging sinks. Having a domain primitive that represents a sensitive value helps you target the source correctly and predictably.
Another obvious workaround: a developer simply reads the object once, assigns the result to a variable, and then sends that variable to the logs:
```typescript
const secret = getFooBarSecret(); // SensitiveValue
const token = secret.value(); // The *actual* secret value
const data = await callFooBar(token, {...});
console.debug(token); // 🪦 Ha-ha!
```
Boo! These read-once objects should stay sealed until the last possible moment, and never be assigned to a variable or returned from a function. It’s like having a lead box with uranium inside: you want to keep it sealed, open it once and only once, and then throw it into the nuclear reactor. You don’t want to grab the contents and toss them around everywhere.
Semgrep can help keep that contract and make sure developers use that lead box properly.
Semgrep
A simple rule does the trick:
```yaml
rules:
  - id: sensitive-value-assignment
    message: Unsealed sensitive values should not be assigned or returned
    languages: [typescript]
    severity: WARNING
    pattern-either:
      - pattern: "$VAL = ($X: SensitiveValue).value()"
      - pattern: "return $X.value()"
```
Ta-da! Now this rule flags violations in your continuous integration pipeline before they reach production. We have a read-once object that keeps secrets out of logs, and a static analysis rule that keeps developers on the paved path.
I didn’t showcase a taint tracking rule to catch the first failure pattern of double usage because ~~I’m lazy~~ I wanted you to take a crack at it. If you haven’t checked out how easy and expressive Semgrep is, especially its taint tracking, you can start here.
Next
So far, I’ve covered three very powerful domain primitives that help you build software that’s secure-by-design. And along the way, we’ve used static analysis tools to help you stay there.
A recap:
- branded types make things that send data to you act safely
- tainted types make things that you send data to act safely
- read-once objects keep secrets safe
- static analysis augments all of the above
I don’t have a next post planned (yet), but this is just the tip of the iceberg! We’ve only touched on alternatives to string primitives. Domain primitives certainly don’t have to wrap strings and can represent all kinds of important, domain-specific data. There’s so much more to secure-by-design and domain primitives.
I’ll probably revisit this series some time in the future, but in the meantime, I highly recommend these resources for further reading:
Thanks! ✌️