This article is a comprehensive introduction to the TLS Notary (TLSN) Protocol, originally conceived of in 2014, and subsequently re-designed and re-implemented by the Ethereum Privacy, Scaling, and Exploration group with modern cryptographic components.
We at Pluto.xyz are productionizing TLSN to enable smart contract developers to take advantage of any off-chain data sources in their smart contracts in a service that we are calling Web Proofs.
If you were to share the contents of your TLS transcript with a third party, that party would have no way to detect whether the transcript had been forged with faulty data. TLSN is a protocol for Data Provenance — data with proof of origination from some particular server.
Why would you read this article instead of the well-written TLSN documentation? This article aims to cover TLSN first from a high level, then to give the requisite mathematical background to understand the protocol. We'll start with a basic overview of the TLSN architecture and its design motivation, before opening each box and visiting the cryptographic details.
We assume a working knowledge of the basics of TLS¹.
TLS Notary (TLSN) at a high level
TLS (or Transport Layer Security) is a protocol for encrypting and authenticating communication between two parties: a Client (described elsewhere in this post as the Prover) and a Server. TLS Notary is a protocol that allows the Client to prove Data Provenance to a third party; in other words, TLS Notary demonstrates that the Client honestly obtained the data from the Server, and did not interfere with the contents.
The TLS Notary protocol aims to achieve the following properties:
- Authenticity - the protocol should demonstrate that the TLS transcript is not a forgery by the Client
- Provenance - the protocol should authenticate the identity of the server via certificate chain verification--a valid transcript with a fake giithub.com should be rejected²
- Privacy - the Client should not have to sacrifice the privacy of their data or credentials to a third party
- Non-Proprietary³ - The protocol should avoid enshrining a single service provider as intermediary
We'll begin with a simplified picture of TLSN, and work towards the complete architecture.
A simplified TLSN architecture with minimal cryptography
In Figure 1, we introduce two new parties to the TLS Protocol: a Proxy, who observes and stores encrypted traffic between the Client and Server, and a Verifier, who wants to verify the contents of the Client-Server transcript.
In this flow, the Proxy observes traffic between the Client and Server, and saves the encrypted TLS transcript.
At the end of the session, the Client sends the Proxy:
- the certificate chain, demonstrating the server's identity is as claimed
- the symmetric encryption key for the session, held by Client and Server
The Proxy then performs the following:
- verifies the certificate chain, attesting to the Server's identity
- decrypts the TLS transcript with the Client's symmetric key
- signs the decrypted transcript, or signs the subset of the transcript that the client wants to share
The Proxy-signed TLS transcript data may then be passed to the Verifier. The Verifier checks the Proxy's signature, and is thereby convinced of authenticity and provenance. However, the Client had to give up both their secret session key and all transcript privacy to the Proxy.
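The simplified flow above can be sketched in a few lines of Python. This is a toy under loud assumptions, not how a real Proxy would be built: `keystream_xor` is a hypothetical stand-in for the TLS record cipher, an HMAC tag stands in for the Proxy's digital signature (a real deployment would use an asymmetric signature so the Verifier does not need the Proxy's key), and certificate chain verification is omitted entirely.

```python
import hashlib
import hmac
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (stand-in for the TLS record cipher): XOR with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

def proxy_attest(proxy_key: bytes, session_key: bytes, ciphertext: bytes) -> tuple:
    """Proxy decrypts the recorded ciphertext, then tags the plaintext (HMAC as a signature stand-in)."""
    plaintext = keystream_xor(session_key, ciphertext)
    tag = hmac.new(proxy_key, plaintext, hashlib.sha256).digest()
    return plaintext, tag

def verifier_check(proxy_key: bytes, plaintext: bytes, tag: bytes) -> bool:
    """Verifier recomputes the tag and compares in constant time."""
    expected = hmac.new(proxy_key, plaintext, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

# Client and Server share session_key; the Proxy records only ciphertext.
session_key = secrets.token_bytes(32)
proxy_key = secrets.token_bytes(32)
recorded = keystream_xor(session_key, b'HTTP/1.1 200 OK\r\n\r\n{"balance": 42}')

# At the end of the session the Client reveals session_key to the Proxy.
plaintext, tag = proxy_attest(proxy_key, session_key, recorded)

honest_ok = verifier_check(proxy_key, plaintext, tag)
forged_ok = verifier_check(proxy_key, plaintext.replace(b"42", b"99"), tag)
```

Note that `proxy_attest` receives `session_key` and returns the full plaintext: this is exactly the privacy cost described above.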
Besides the sacrifice of privacy, the Proxy is an enshrined Trusted Party in the proposed scheme, akin to a Certificate Authority. The intermediary role of the Proxy bottlenecks the service, potentially limiting overall network traffic. Finally, a compromised Proxy may collude with a client to deceive the verifier.
In the following sections, we will discuss how the TLS Notary architecture resolves these issues. First, a brief interlude to introduce the active cryptographic component in TLSN, multi-party computation.
Aside: Multi-Party Computation (MPC)
Multi-Party Computation is the cryptographic problem of how to compute a function $f(x_1, \dots, x_n)$ among some number of parties $P_1, \dots, P_n$, each with knowledge of their corresponding private input $x_i$ to the function. Parties are interested in the output $y = f(x_1, \dots, x_n)$, but would also like to avoid leaking information about their secret data.
MPC techniques generally involve:
- representing a program as a function $f$, composed of a circuit of boolean or arithmetic gates
- a setup phase, where each party obtains some "share" of a secret shared by everyone; e.g. additive shares where $s = s_1 + s_2$, where the two parties hold the shares $s_1$ and $s_2$, but neither party knows the value of $s$
- each party conceals and publishes their inputs, in some way that preserves the mathematical structure of the following step
- parties perform computation on the concealed inputs to arrive at some output
- if the output is not already in plaintext (e.g. if homomorphic encryption is employed), a final decryption step may be required to obtain the result
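As a concrete illustration of the additive shares mentioned above, here is a minimal two-party sketch that computes a sum without either party seeing the other's input. The modulus, the party names, and the helper functions `share` and `reconstruct` are all assumptions made for this example; real MPC protocols such as the garbled circuits used in TLSN are considerably more involved.

```python
import secrets

MODULUS = 2**61 - 1  # a prime; all share arithmetic is done modulo this value

def share(secret: int) -> tuple:
    """Split `secret` into two additive shares with s1 + s2 = secret (mod MODULUS)."""
    s1 = secrets.randbelow(MODULUS)
    s2 = (secret - s1) % MODULUS
    return s1, s2

def reconstruct(s1: int, s2: int) -> int:
    """Recombine two additive shares."""
    return (s1 + s2) % MODULUS

# Each party splits its private input and sends one share to the other party.
alice_input, bob_input = 1200, 345
a1, a2 = share(alice_input)  # Alice keeps a1, sends a2 to Bob
b1, b2 = share(bob_input)    # Bob keeps b2, sends b1 to Alice

# Each party adds the shares it holds locally. A single share is uniformly
# random, so it reveals nothing about the other party's input.
alice_partial = (a1 + b1) % MODULUS
bob_partial = (a2 + b2) % MODULUS

# Only the combined partial results reveal the output f(x1, x2) = x1 + x2.
total = reconstruct(alice_partial, bob_partial)
```

Addition works "for free" on additive shares because the sharing is linear; multiplications and boolean gates are where the heavier machinery comes in.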
In the context of TLSN, MPC is applied to constrain the Client's ability to exchange messages with the server. A secondary party, termed the Notary, performs MPC with the Client to encrypt, decrypt and authenticate messages. Because of the privacy preserving nature of MPC, the Notary prevents the Client from forging messages, without ever observing the messages the client sends.
In a few sections, we'll put our cryptography hats on to discuss the mathematics of Garbled Circuits MPC, but first we'll update our simplified architecture model of TLS Notary, employing MPC.
Full TLSN architecture with MPC
In the updated architecture picture, we replace the Proxy server with an MPC partner for the Client, who shall be called the Notary. The Notary performs MPC with the Client to carry out the TLS operations, including key exchange, encryption, authentication, and decryption. However, via MPC, the Notary never observes the plaintext of the Client's transcript, thereby preserving Client privacy.
Recall from prior discussion that the TLS Notary protocol aims to achieve:
- Authenticity - the protocol should demonstrate that the TLS transcript is not a forgery by the Client
- Provenance - the protocol should authenticate the identity of the server via certificate chain verification--a valid transcript with a fake giithub.com should be rejected
- Privacy - the Client should not have to sacrifice the privacy of their data or credentials to a third party
- Non-Proprietary - The protocol should avoid enshrining a single service provider as intermediary
We've already discussed how Client privacy is obtained. The protocol may be rendered non-proprietary by allowing the Verifier to accept transcripts produced with Notaries other than the Pluto Notary. We are also investigating the security-performance trade-off of further decentralizing the Notary, from 2-party Client-Notary computation to 3-or-more-party Notary computation.
Provenance of the Server's identity is obtained identically to our simplified protocol: by verification of the Certificate chain providing the Server's public key.
Finally, Authenticity is obtained by a cryptographic technique called blind signatures. Blind signatures allow the Notary to sign commitments to the Client's messages. This prevents the client from later forging the content of their message, while the Notary never observes message plaintexts.
Blind signatures are also constructed over commitments to the Server's identification; that is, the Notary also never directly observes the identity of the Server.
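The idea of signing a commitment rather than the message itself can be sketched as follows. This is not a formal blind signature scheme and is not taken from the TLSN implementation: it assumes a salted SHA-256 hash as the commitment, and an HMAC tag as a stand-in for the Notary's signature. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def commit(message: bytes) -> tuple:
    """Return (commitment, salt); the random salt hides the message from the Notary."""
    salt = secrets.token_bytes(32)
    return hashlib.sha256(salt + message).digest(), salt

def notary_sign(notary_key: bytes, commitment: bytes) -> bytes:
    """The Notary sees only the commitment, never the message (HMAC as a signature stand-in)."""
    return hmac.new(notary_key, commitment, hashlib.sha256).digest()

def verify_opening(notary_key: bytes, message: bytes, salt: bytes,
                   commitment: bytes, tag: bytes) -> bool:
    """Check the Notary's tag, and that (message, salt) opens the commitment."""
    tag_ok = hmac.compare_digest(notary_sign(notary_key, commitment), tag)
    opens = hmac.compare_digest(hashlib.sha256(salt + message).digest(), commitment)
    return tag_ok and opens

notary_key = secrets.token_bytes(32)
message = b"GET /account HTTP/1.1"

# Client commits; Notary signs the commitment without learning the message.
commitment, salt = commit(message)
tag = notary_sign(notary_key, commitment)

# Later, the Client opens the commitment for a Verifier.
valid = verify_opening(notary_key, message, salt, commitment, tag)
forged = verify_opening(notary_key, b"GET /other HTTP/1.1", salt, commitment, tag)
```

Because the commitment binds the Client to one message, swapping in different content after the fact fails the opening check even though the Notary's tag itself is genuine.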
We close our high-level discussion with a mini-spec of the TLS Notary protocol, which we will expand on in greater depth in the next post.
- Preprocessing step: Client and Notary perform Oblivious Transfer pre-processing, to allow for faster online proof computation.
- Prior to Key Generation: Client obtains Server's public key and certificate chain, proving Server public key authenticity.
- Key Generation: Client and Notary generate ephemeral key shares of the TLS shared secret (premaster key), for encryption and authentication with the server
- Encryption and Authentication: Client wants to send a message to the server. Client and Notary use 2-party computation to encrypt and authenticate the message. Notary blind-signs the message, and client sends the encrypted message to the server.
- Server issues a response.
- Client and Notary use 2-party computation again to decrypt and authenticate the Server's message. These two steps repeat until the client's session is complete.
- Proof generation: At the end of the protocol, the Client possesses a certificate chain of the Server's identity, and the TLS transcript with signatures from the Notary. A proof may be generated to succinctly verify the correctness of the signatures and certificate chain.
- We may discuss selective disclosure in a further post.
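To give some intuition for the Key Generation step in the mini-spec above, here is a toy sketch of how two parties can each hold a share of a Diffie-Hellman secret that neither knows in full. Everything here is an assumption for illustration: it uses a small multiplicative group with hypothetical parameters `P` and `G` rather than the elliptic curves typical of TLS, and it recombines the shares in the clear only to check the arithmetic, which the real protocol would never do outside of MPC.

```python
import secrets

P = 2**127 - 1  # a Mersenne prime, used here as a toy group modulus
G = 3           # toy base (assumption; not a vetted parameter choice)

def keygen() -> tuple:
    """Return (private exponent, public value G^private mod P)."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)

client_priv, client_pub = keygen()
notary_priv, notary_pub = keygen()
server_priv, server_pub = keygen()

# The Server sees one public key: the product of the two public values,
# which corresponds to the exponent client_priv + notary_priv.
joint_pub = (client_pub * notary_pub) % P

# Server's view of the shared secret.
server_secret = pow(joint_pub, server_priv, P)

# Each party derives only its own (multiplicative) share of that secret.
client_share = pow(server_pub, client_priv, P)
notary_share = pow(server_pub, notary_priv, P)

# Shown only to check the math: the two shares multiply to the Server's secret.
combined = (client_share * notary_share) % P
```

The Server cannot tell that the key it negotiated with is split between two parties, and neither the Client nor the Notary alone holds enough to derive the session keys.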
Wrap Up
In this post, we gave a high-level description of TLS Notary, starting with a simplified architecture and progressing toward an overview of the TLS Notary protocol that includes multi-party computation.
In a follow-on post, we will examine the protocol from a mathematical lens. If you’re interested in working on TLS Notary, apply to work with us at Pluto.
¹We may write a primer on TLS in the future, but have not yet.
²The term Data Provenance, used elsewhere, refers to the combination of what we term Authenticity and Provenance.
³Non-Proprietary gets at the point that the service provider should be replaceable; but the term Non-Proprietary is slightly misleading. I came up with these terms on the fly, don't read them as gospel.