An idea for public key authentication from a name without certificates or central authority

by Stephen Hewitt | Published 14 May 2021 | Last updated 23 May 2021

Here is an idea for authenticating a cryptographic public key from a name without a central authority to bind the key to the name. The difficulty this seeks to overcome has been called “Zooko's Triangle” - see external links below.

The classic problem

The discovery of public key cryptography meant that the difficulty of communicating privately with someone became the problem of reliably obtaining their public key. If an attacker can trick you into believing that you have your intended recipient's public key when in fact you have the attacker's public key then the attacker can mount a MITM (man-in-the-middle) attack.

In the MITM the attacker intercepts all the messages that you send and decrypts them and then encrypts them using the correct public key and forwards them to your intended recipient, who thinks they have come directly from you.

Alternatively, if you are using the public key to verify a signature, the attacker can induce you to accept something signed by the attacker if you are mistakenly using the attacker's public key instead of the authentic one.

A proposed solution

The solution proposed here is to make the name long enough that it can serve as a fingerprint of the cryptographic key. This solution squashes flat the metaphorical Zooko's triangle by coalescing its “human meaningful” and “secure” vertices into one point.

The name by itself will not tell you the public key, but it will allow you to authenticate a key that you obtain from elsewhere.

The idea is to make a universally unique name from a sequence of pronounceable pseudowords - words that are like English words in form but actually are not real words. My first experiments to create these using random parts of real words show that a sequence of just four such pseudowords, each of around three syllables, contains enough information to represent a binary string of around 128 bits. The following is an example.

foupouren tottymouge scourivynd flemiongunge

To illustrate relative complexity, here are two names from the translated novels of Dostoevsky:

Rodion Romanovich Raskolnikov

Parfyon Semyonovitch Rogozhin

The cryptographic hashes currently used for fingerprints (in X.509 certificates and PGP for example) are longer than 128 bits, but it is not clear to me that there is a fundamental reason why they need to be - but see the discussion below. What is needed is a suitable cryptographic hash function, preferably one that is provably secure in its desired properties and which is not called MD5.

The proposal is to first define a bijective mapping, which maps each possible value of a 128-bit binary string to a unique sequence of pseudowords. The inverse mapping takes the word sequence and maps it back to the original binary string, with no loss of information.
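Such a bijective mapping might look like the following minimal Python sketch. The syllable tables here are invented purely for illustration: 32 onsets times 8 vowels give 256 distinct syllables, so each syllable carries exactly one byte, each four-syllable word carries 32 bits, and four words carry 128 bits.

```python
# Hypothetical syllable tables (an assumption, for illustration only):
# 32 onsets x 8 vowels = 256 distinct syllables, one per byte value.
ONSETS = ["b", "br", "c", "ch", "cl", "d", "dr", "f", "fl", "g", "gr",
          "h", "j", "k", "l", "m", "n", "p", "pl", "pr", "qu", "r",
          "s", "sc", "sh", "sl", "sm", "sn", "st", "t", "tr", "v"]
VOWELS = ["a", "e", "i", "o", "ou", "y", "ea", "oo"]
SYLLABLES = [o + v for o in ONSETS for v in VOWELS]
assert len(set(SYLLABLES)) == 256  # all distinct, so the mapping is invertible

def encode(bits128: int) -> str:
    """Map a 128-bit integer to four 4-syllable pseudowords."""
    data = bits128.to_bytes(16, "big")
    return " ".join("".join(SYLLABLES[b] for b in data[i:i + 4])
                    for i in range(0, 16, 4))

def decode(name: str) -> int:
    """Inverse mapping: recover the original 128-bit integer."""
    table = {s: i for i, s in enumerate(SYLLABLES)}
    out = bytearray()
    for word in name.split():
        while word:
            # Longest-match splitting is unambiguous with these tables:
            # syllables are 2-4 letters and always start with a consonant.
            for length in (4, 3, 2):
                if word[:length] in table:
                    out.append(table[word[:length]])
                    word = word[length:]
                    break
            else:
                raise ValueError("not a valid pseudoword")
    return int.from_bytes(bytes(out), "big")
```

Round-tripping any 128-bit value through encode() and decode() returns it unchanged, which is the bijectivity the scheme requires.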

The mapping from binary to pseudowords will be called encoding. The complete system is then illustrated in Figure 1.

 +-----------------+    +---------------+
 | Public key,     |    | Encoding      |
 | Associated info,|    | Specification |
 | Nonce           |    | (bijective)   |
 +-----------------+    +---------------+
         |                     |     |
         V                     V     |
       hash1()               hash1() |
         |                     |     |
         +----+          +-----+     |
              |          |           |
              V          V           |
        +---------------------+      |
        | concatenation of    |      |
        | both hashes         |      |
        +---------------------+      |
                  |                  |
                  V                  |
                hash2()              |
                  |                  |
                  V                  |
        +---------------------+      |
        | key fingerprint     |      |
        | (binary string)     |      |
        +---------------------+      |
                  |                  |
                  V                  |
               encode()<-------------+
                  |
                  V
 +--------------------------------------------+
 | Key fingerprint encoded as pseudowords     |
 |  - publish as a name, a trademark,         |
 | a sign on the office building              |
 | eg with pseudowords from a 128-bit hash:   |
 |                                            |
 | clarion tottymouge scourivynd flemiongunge |
 +--------------------------------------------+

 Figure 1. The derivation of a human-friendly
 public key fingerprint.
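The pipeline in Figure 1 can be sketched in a few lines of Python. Taking both hash1() and hash2() to be SHA-256 and truncating the final digest to 128 bits are assumptions made for illustration; the system only requires suitable cryptographic hash functions.

```python
import hashlib

def fingerprint(public_key: bytes, associated_info: bytes, nonce: int,
                encoding_spec: bytes) -> bytes:
    """Derive a 128-bit key fingerprint as in Figure 1."""
    # hash1() over the key material, including the nonce
    h_key = hashlib.sha256(public_key + associated_info +
                           nonce.to_bytes(8, "big")).digest()
    # hash1() over the encoding specification, committing to it
    h_spec = hashlib.sha256(encoding_spec).digest()
    # hash2() over the concatenation, truncated to 128 bits
    return hashlib.sha256(h_key + h_spec).digest()[:16]
```

The 16-byte result is then passed to the encode() step defined by the encoding specification.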

The threat model is that the attacker can supply a false key and a false encoding specification, but they cannot manipulate the published name.

Suppose for example that the name is in raised letters on the outside of a business premises in central London. When you walk past you can make a note of the name and it would be very hard for the attacker to change it. Alternatives might include a registered company name or registered trademark. It might be possible for an attacker to interfere with these, but with some difficulty and likelihood of leaving a trail. An important point is that once the fingerprint exists as a name, it may be possible to insert it into a public record even though that public record was not primarily intended for authenticating cryptographic keys.

The system achieves its aim if you can verify the true key and encoding specification pair against the name while rejecting fake key and encoding specification pairs from an attacker. The intent of the design is that it will be intractable for the attacker to make another key map to an existing public name.

To verify the key, you need the key, its associated information and the encoding specification. First the encoding specification needs to be checked. If the system is used for months or years then particular encoding specifications may become standard and well known, in which case a given specification may already have been checked. It needs to be bijective, unambiguous and complete in the sense of mapping every one of the 2^128 binary strings to a different name. But in principle, to verify a particular key you only need to check that the encoding specification encodes exactly one of the 2^128 possible binary strings to the target name. Then you follow the process in Figure 1, starting with the public key and the encoding specification and finishing with a derived name made of pseudowords. The encoding at the "encode()" step is defined by the encoding specification, which has itself been hashed into the fingerprint. If this derived name matches the public name, then the system considers the key verified.
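The verification procedure can be sketched as follows, again assuming SHA-256 for both hashes and truncation to 128 bits; the `encode` callable stands in for the already-checked encoding specification.

```python
import hashlib

def verify_name(public_key: bytes, associated_info: bytes, nonce: int,
                encoding_spec: bytes, encode, public_name: str) -> bool:
    """Recompute the Figure 1 pipeline and compare with the public name.

    `encode` is a callable standing in for the (already checked) encoding
    specification; the hash choices here are assumptions for illustration.
    """
    h_key = hashlib.sha256(public_key + associated_info +
                           nonce.to_bytes(8, "big")).digest()
    h_spec = hashlib.sha256(encoding_spec).digest()
    fp = hashlib.sha256(h_key + h_spec).digest()[:16]
    return encode(fp) == public_name
```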

The attacker can try to substitute their public key but in doing that they will change the binary key fingerprint. So they could try to change the encoding specification to make their changed binary key fingerprint still map to the public name. But changing the encoding specification will also change its hash which will change the binary key fingerprint. The circularity of this with the one-wayness of a cryptographic hash will prevent finding a collision by manipulating the encoding specification. The fundamental reason that the attacker cannot manipulate the encoding scheme is that anyone generating a name has to commit to a particular encoding scheme before they know the key fingerprint.

The creator of the public cryptographic key and public name must also start with the public key and derive the name from it, without much choice about what that name will be. However, some limited choice could be provided as follows. The creator could keep incrementing the nonce shown in Figure 1 and hashing again until a preferred name was found. For example, in the initial experiments each of the four words represents about 32 bits, meaning there are about 4.3 billion different possible pseudowords. Even on a home computer, it may be tractable to search a space of 4.3 billion to find a name with, for example, a particular first word. The first word in the example is shown as “clarion” to illustrate that it could have been chosen in this way. (It wasn't though). See the notes in the next section for some reservations and further ideas on this.

The format of the encoding specification from a theoretical perspective does not matter. It could be a natural language description or a computer programme to do the encoding or perhaps a purpose-defined specification language. From a practical perspective it is an area for further work, discussed below.

The system is designed to demonstrate that the creator of the name wanted it to be associated with a particular public key. It says nothing about whether the owner of the key wants to be associated with the name. There is nothing to stop someone creating a name for someone else's public key, or even for something that is not a public key at all. This might lead to confusion, and the way to avoid it would be to require a signature, made with the key, saying that it is associated with the name. This signature cannot be included in the information that is hashed by hash1() because the name is not known until after the hashing; it has to be made after the name is generated.

Are 128 bits enough?

The system is not predicated on using only 128 bits to define the public name, but the fewer bits, the more human-friendly the word sequence will be.

The aim of the attacker is to find a combination of public key, its related information and an encoding specification that generates the name of a genuine user, thus allowing one of the attacker's public keys to be falsely verified against someone else's name.

One way of doing this is to repeatedly generate names until there is a collision with a genuine name.

A feel for the probabilities

To get a feel for the probabilities involved, let's consider the following scenario.

Let's suppose that everyone in the world has a personal global name, so there are around 2^33 names (about 8.6 billion).

Let's suppose that a powerful attacker could generate names at the same rate as the whole world's bitcoin miners can generate SHA256 hashes. In May 2021 the bitcoin hash rate was around 149 x 10^18 hashes/second.

How many collisions with genuine names would we expect an attacker working at this rate to find in a year? The number of names the attacker would generate in a year would be:

149 x 10^18 x 60 x 60 x 24 x 7 x 52

~ 2^92

With the 128-bit hash, the total namespace is at least 2^128. In fact the global namespace would likely be a few bits more than this, because of the possibility of multiple encoding specifications. For example, a Korean encoding - presumably using a different alphabet - would produce a set of names having no overlap at all with an English encoding.

But assuming a namespace of size 2^128 for now, for each name the attacker generates, there is a 2^33/2^128 chance of a collision with a real name. The number of collisions expected in a year is therefore:

2^92 x 2^33/2^128 = 1/8

So amongst the world population there would be about one collision every eight years.
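The arithmetic can be checked directly:

```python
rate = 149 * 10**18                       # hashes/second, May 2021 figure
per_year = rate * 60 * 60 * 24 * 7 * 52   # 52 weeks of continuous hashing
assert 2**91 < per_year < 2**92           # about 4.69e27, just under 2^92

# ~2^33 real names, a namespace of 2^128, ~2^92 attempts per year:
expected_collisions = (2**92) * (2**33) / (2**128)
assert expected_collisions == 0.125       # one collision every eight years
```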

Discussed below is a possible way of improving this, even without increasing the key fingerprint size.

Popular names

Another possible weakness is related to the suggestion above that the creator of a public name keeps incrementing the nonce and regenerating the name in order to find a name with (for example) a preferred first word.

If most names are generated like that, it seems to have the potential to reduce the size of the namespace actually used.

For example, suppose that everyone uses the simple encoding in the example above, creating names consisting of four pseudowords of about 32 bits each. Further suppose that most names are generated with a preferred first word and that there are relatively few first words that are popular, so that of the 2^32 possible first words only 2^8 are commonly used. That would mean that the theoretical namespace of 2^128 would be reduced to a pool of only 2^104 names from which most names are actually taken.

At first sight that reduction in the namespace size seems like a reduction in the search space for an attacker. But in fact the attacker cannot go backwards from a name but must start with the public key and encoding scheme and go forward.

The attacker has no knowledge of the name before it has been generated, so the reduced name space does not reduce the attacker's hashing work at all because there is no way to confine name generation to the reduced space.

However there is a way in which this might help an attacker. It might reduce the storage requirements for an attacker who generates a large number of names and stores them, hoping at some point to get a match on any public name when there are a large number of public names in use.

The attacker would generate names and store only the ones with popular first words, for example.

This needs further study, but in principle removing any advantage to the attacker means an encoding specification that matches the actual information content (entropy) of the popular first words to the number of bits they consume from the 128-bit fingerprint, so that popular names are clustered no more densely in fingerprint space than any other names.

So in this artificial example, if there were only 256 popular first words which accounted for most of the global names, then the coding of each popular first word should ideally only use about 8 bits from the fingerprint, not 32 bits.

To illustrate a simplistic encoding approach for names associated with people, an encoding could specify five words for the public name, with (say) 16 bits for the first and last words and three 32-bit words in between them.

The first and last words could be specified with a list of 65,536 common first and last names respectively (last names can be taken from an old telephone directory). Then at an expected cost of 2^32 iterations the creator of a new name can search for their personal name. Mine could be something like the following.

Stephen Foupouren Tottymouge Scourivynd Hewitt

A birthday attack?

An apparent weakness would be a "birthday attack" where an attacker generates and temporarily stores 2^64 or so names, aiming to generate the same name for two different keys. Such a collision is likely with 2^64 attempts in a space of size 2^128 because the attacker would have generated 2^64 x (2^64 - 1)/2 different (unordered) pairs of names.
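The square-root scaling behind this is easy to demonstrate on a toy hash. This sketch truncates SHA-256 to 20 bits, where a collision is expected after roughly 2^10 attempts, mirroring the roughly 2^64 attempts needed against a 128-bit fingerprint.

```python
import hashlib

def find_birthday_collision(bits: int = 20):
    """Toy birthday attack on a `bits`-bit truncated hash.

    With a b-bit space a collision is expected after about 2^(b/2)
    random inputs - the same square-root scaling that makes 2^64
    attempts sufficient against the 128-bit fingerprint.
    """
    seen = {}
    i = 0
    while True:
        digest = hashlib.sha256(i.to_bytes(8, "big")).digest()
        tag = int.from_bytes(digest, "big") >> (256 - bits)  # truncate
        if tag in seen:
            return seen[tag], i, tag   # two inputs, one truncated hash
        seen[tag] = i
        i += 1
```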

In this way the attacker could create two different public keys, both bound to the same name. It is not obvious that this represents a direct threat to the integrity of the other people's names and keys but it seems possible it could be used to perpetrate some kind of fraud. For example the attacker, using the public name, could appear to have signed something, perhaps getting something from a victim in return, but later repudiating the signature by showing that it was not made with their public key but with a different key which maps to the same name. And the dishonest attacker could disclaim all knowledge of the other key. It would allow a (somewhat) plausible denial of what appeared to be an authentic signature.

Alternatively it conceivably might be used to accuse someone of leaking secret information if they can be induced to encrypt something under a second public key with the same name. Or it could be used to provide a cover story for how information was leaked. There are probably lots of other tricks that can be played with it.

It would certainly be desirable if these things could not happen, but it is not clear to me this is a fatal flaw. It might be argued that preventing this is outside the scope of what this system is designed to do or that the system is quite correctly authenticating both keys. The successful verification asserts that the creator of the name wanted it to be associated with a particular key. And in this case the assertion is correct for both keys, since they were in fact generated by the same person and the name collision is not by chance. Further work on this might be clarifying exactly what guarantees the system provides.

The importance of the trick also depends on whether people are going to believe that by some unknown process or sheer bad luck someone has somehow managed to duplicate a name with a different public key, even if it would take the entire world's bitcoin miners eight years against 8 billion targets to do such a thing once.

This also requires further study.

Further Work

More study is needed to understand exactly the security strengths and limitations and there is a lot more work to produce a practical implementation.

A major open question is the encoding specification. An obvious alternative to the proposal above (an arbitrary encoding specification that is verified and whose hash is included in the key fingerprint) would be to specify a single fixed encoding that is always used.

A compromise between the two would also be possible: a choice of encoding specification is allowed, but it is not arbitrary; it is constrained within certain limits.

There are advantages and disadvantages to these approaches. Evaluating these advantages may require further investigation.

A major advantage of allowing different encoding specifications is that they can evolve and improve over time without a new, incompatible version of the whole system having to be announced. Also, people speaking different natural languages are likely to want different encoding specifications that reflect the different alphabets and word morphologies of their languages.

Another advantage of allowing choice of encoding specifications is that the total namespace will then likely be larger than the 2^128 of the key fingerprint. This is because it is unlikely that two different encoding specifications would generate the same set of possible names. There may be overlap between their sets of output strings but it is unlikely they will be identical sets. In fact, it would be pointless to have a second encoding scheme that duplicated the same set of names.

Another advantage of allowing different encoding specifications is that as the number of public names grows over time and the frequency distribution of actual first words becomes known, the writer of a new encoding specification could take account of this and assign less entropy to a common first word, so that anyone choosing it would have to have longer or more words after it than someone choosing a less common first word.

The encoding specification in any case seems like a potential weak point that needs careful consideration in a practical system. The threat is an attack that subverts the encoding software for a particular user or for a particular public name so that the attacker's key fingerprint appears to map to the target public name.

A disadvantage of allowing choice of encoding specifications is that the encoding specification, in whatever form it exists, is likely to be large. For example, consider the proposal that the first and last words could be commonly used first and last personal names respectively. This would require the encoding specification to contain all those names.

If it becomes common for a public name to have its own individual encoding specification, then this increases the amount of data that must be fetched to verify a public key, since the specification is needed.

However, the hope would be that a limited number of encoding specifications would become popular. These would become well known and well distributed, so that once given the hash the user would likely already have the specification.

There is also the idea of parameterised encoding specifications. An encoding specification could accept parameters to change its behaviour. Conceptually this would be the same as a different encoding specification, but for transport, storage and possibly verification it might have practical advantages. For example, given an encoding specification that produces names consisting of four words like the example given above, the specification could be parameterised to produce either names all in lower case, names with every word capitalised, or names with just the first word capitalised. (In my opinion other options should not be allowed, because they would not be very human-friendly). Just to be clear, in such a system the parameters would have to be included in the cryptographic hash of the encoding specification which is hashed into the key fingerprint.

Clarion Tottymouge Scourivynd Flemiongunge
Clarion tottymouge scourivynd flemiongunge
clarion tottymouge scourivynd flemiongunge
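A sketch of such a parameterised specification, with the capitalisation parameter folded into the hash that is committed to the fingerprint. The separator and parameter name are invented for illustration:

```python
import hashlib

def apply_case(words: list[str], style: str) -> str:
    """Apply one of the three permitted capitalisation parameters."""
    if style == "lower":
        return " ".join(words)
    if style == "first":
        return " ".join([words[0].capitalize()] + words[1:])
    if style == "all":
        return " ".join(w.capitalize() for w in words)
    raise ValueError("unsupported capitalisation parameter")

def spec_hash(spec_body: bytes, style: str) -> bytes:
    """Hash the specification body together with its parameters, so a
    different parameter value yields a different spec hash and hence a
    different fingerprint."""
    return hashlib.sha256(spec_body + b"|case=" + style.encode()).digest()
```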

Verifying the encoding specification is potentially complicated. It is important to reject a specification that would allow the attacker to trivially map their key fingerprint to the target public name. The attacker could achieve this if they wrote a (non-bijective) custom specification for a particular target name. Such a specification would map many of the 2^128 possible key fingerprint values to the target name instead of just one, so making it easier to find a match.

The specification needs to be checked for this, but how easy is the check going to be? Ideally the specification needs to be written in a language that allows automated checking. What language? And it is clearly not tractable to perform a check by trying all 2^128 possible fingerprint values. Is it possible to develop techniques that can examine an encoding and prove that only one value maps to the particular target name? This is what matters in the case of a particular target name. But it might be better to develop a checker that shows that the encoding is bijective - i.e. that it works correctly for all names. Then an encoding specification can be publicly checked by multiple independent trusted people with its hash published too and would perhaps not need further checking each time it was used.

It seems that some forms of encoding specification will be easier to check than others. For example a simplistic one that split the binary string into individual bytes and produced a pseudoword for each byte would be easy to check. Once you establish that it has 256 different words and that it maps each byte value to a different word then you are satisfied.
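For that simplistic per-byte encoding the check really is this short:

```python
def check_byte_encoding(words: list[str]) -> bool:
    """Verify the simplistic per-byte encoding is bijective: exactly
    256 words, all distinct, so each byte value maps to a unique word
    and the mapping can be inverted."""
    return len(words) == 256 and len(set(words)) == 256
```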

But how to check an encoding specification that uses a more complicated system such as arithmetic encoding and which does not have obvious repeated blocks? Should the system constrain the encoding specification to prevent this, solely with the aim of making it easier to check? If there are going to be constraints on what kind of encoding is allowed, what do they need to be?

It might be useful to allow a variable number of bits in the key fingerprint. Increasing the number of bits comes at the cost of a longer name, which is less human-friendly, as discussed. But a variable number could accommodate varying security requirements or preferences of different users.

For example, the system could allow an encoding specification to encode either from a 128-bit fingerprint or from a 160-bit fingerprint.

The question then is whether it needs to be apparent from the public name whether it is a 160-bit name or a 128-bit name. The distinction could be made apparent with a rule that says something like: a 128-bit name has four words and a 160-bit name has five words.
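A rule of that kind (hypothetical, as is the word-count convention) is trivially checkable from the name alone, independently of any encoding specification:

```python
def fingerprint_bits(name: str) -> int:
    """Hypothetical rule: four words mean a 128-bit name, five words a
    160-bit name; anything else is rejected."""
    n = len(name.split())
    if n == 4:
        return 128
    if n == 5:
        return 160
    raise ValueError("name length does not match a permitted fingerprint size")
```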

If the distinction is not apparent from the name, then there is the possibility of an attacker creating an encoding specification that maps a 128-bit fingerprint to a chosen set of names which includes existing 160-bit names. The attacker can then search for a match in a space of size 2^128 instead of 2^160.

If the attacker could then (somehow) find a match for the name using a 128-bit fingerprint, then a victim trying to authenticate a key against a 160-bit name will be given an encoding specification that works on a 128-bit fingerprint and be none the wiser.

Further security analysis would be sensible.

For example if the attacker was able to create collisions in hash1() they could perhaps supply it with an encoding specification which contained some kind of collision sequence which would enable the specification to be changed without changing the hash. Then of course it would be trivial for them to write whatever name they want by changing the encoding specification. In any case this hash probably should have an output size of 256 bits or more. There is no disadvantage to using a wide hash here as it does not affect the size of the public name.

A way to strengthen meaningful names without increasing the hash size

If the information supplied with the key and included in the hash contains fields that are meaningful to people then this will make the attacker's job considerably harder, and possibly completely infeasible.

For example, there could be a free-form text field called something like "additional information". If someone was creating a name for a business, they might write in the first business address and year of opening, products sold and so on. It would be difficult for an attacker generating 2^92 names a year to fill these fields meaningfully. And if the attacker did put information like this into the fields it would limit which real names the generated name could plausibly be matched against. So from around 8 billion it might drop to a handful.

Anyone verifying a key can read the additional information field and decide to place more trust in the key when they see information that they know is associated with the name, or to lose trust because it contains inconsistent information that makes no sense, or to remain sceptical because the field is empty.

In general the more verifiable assertions the creator can insert into the additional information fields, the more trust someone can attribute to the key, and the less likely it is to be a collision from a 2^92-per-year hashing operation.

But there might be something more formal. If the creator of a name intends to find preferred last and first words by incrementing the nonce as suggested above, then they can commit to this before doing it through "first name" and "last name" fields. For the example above, we could have the following fields in the information supplied with the key:

Additional info: Creator of; First name: Stephen; Last name: Hewitt;

This information would be hashed into the fingerprint for the following global name:

Stephen Foupouren Tottymouge Scourivynd Hewitt

The name fields are special fields which people can instantly understand. But the verifying software could understand some of them too. If they are filled in, then they mean that the first and last words in the generated name must match. If they do not match, verification of the name fails.

If the attacker fills these in, then the attack hashing has to be repeated until the generated name matches. Until it does, the generated name is invalid and it will be rejected by the verification algorithm. And every valid name that the attacker generates with the name fields completed is restricted to a much smaller potential target set.

If the attacker leaves them empty, then anyone receiving the attacker's information to verify a key could be suspicious. Why did someone creating what looks like a personal name not fill in the name fields?

But this idea for checking could perhaps be extended. There might be a way for the original creator of the name to specify that certain information fields were filled. That would be realised by making the global name itself somehow encode the fact (as in the case of 128-bit versus 160-bit names in the discussion above). If the verification software saw that fields specified as present by the name were in fact empty, then verification would fail. This information would need to be read from the name independently of the encoding specification, since the encoding specification can be supplied by the attacker. Preferably it should be obvious to a human too.

As a simplistic example, a rule could be: if the first and last words are capitalised, then they must match the corresponding “first name” and “last name” fields.

This is an area for further investigation, both in its effectiveness and how it would be implemented.


External links