### Introducing SAGE: Symmetric Asynchronous Generative Encryption

*by Jean-Philippe Beaudet*

## Introduction

*(New to encryption? A glossary follows this article.)*

Classical encryption is threatened by quantum computing, and the race is on to discover the next best thing to preserve our privacy. One such solution is Symmetric Asynchronous Generative Encryption (SAGE). This article explains the concepts and applications of the SAGE protocol. I hope, by the end of it, you’ll have a fair idea of the strengths and particularities of generative encryption based on random number generation, and why it can considerably mitigate the quantum computing threat.

The research has progressed from conceptual pseudocode to a working proof of concept (POC) in Python, which is currently being ported to C++ as a beta.

## Summary

SAGE is a pure encryption technique based on a cipher. It is not an authentication scheme, although it can play that role in some cases. The protocol was conceived and researched around two main ideas: mutability and the removal of the mathematical challenge component. The underlying assumption is that this forces a malicious actor to rely on a single option: dictionary attacks (brute force).

Dictionary attacks use the known valid format of the secret component (generally the encryption key or passwords) and try various possibilities until guessing the correct one. In certain circumstances, this process can be facilitated with pattern recognition and/or a machine learning algorithm prioritizing certain possibilities.

The mutability of secrets (encryption keys) is meant to limit the window of opportunity for dictionary attacks by continually changing the state of a key in a process referred to as encryption key mutation.

Furthermore, great care has been taken to ensure that the cipher relies on multilayer masking techniques that defeat pattern recognition of similar symbols. This means that a repeated character is not encrypted the same way each time, which considerably limits the potential for pattern recognition.
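To make the masking property concrete, here is a toy sketch in Python. It is not the SAGE cipher: the article does not describe the actual masking layers, so this stand-in uses a single layer in which each symbol is shifted by an amount derived from the key and the symbol's position. The alphabet, the function names, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib

# Hypothetical toy alphabet; SAGE's real symbol set is not given in the article.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def _shift(key: bytes, position: int) -> int:
    """Derive a position-dependent shift from the key (illustrative only)."""
    digest = hashlib.sha256(key + position.to_bytes(4, "big")).digest()
    return int.from_bytes(digest, "big") % len(ALPHABET)


def mask(text: str, key: bytes) -> str:
    """Shift every symbol by an amount that depends on its position."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + _shift(key, i)) % len(ALPHABET)]
        for i, ch in enumerate(text)
    )


def unmask(text: str, key: bytes) -> str:
    """Reverse mask() by subtracting the same position-dependent shifts."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - _shift(key, i)) % len(ALPHABET)]
        for i, ch in enumerate(text)
    )


masked = mask("aaaaaaaaaaaaaaaa", b"demo key")
print(masked)                      # the repeated "a" no longer shows up as one repeated symbol
print(unmask(masked, b"demo key"))
```

Because the shift changes with the position, a run of identical characters does not produce a run of identical output symbols, which is the property the paragraph above describes.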

Finally, SAGE uses long keys (2048 symbols) built from symbols rather than binary digits (bits), together with a mutually known reference derived from the initial encryption key exchange, referred to as a codex. The probability of guessing both a key and a codex is astronomically low. The likelihood of guessing a valid key state at any given time is on the order of 1 in 90 ** 2048. The keys mutate every time they are used to either encrypt or decrypt a message. This could happen multiple times per second when streaming packets, as in chat systems and WebSocket connections in general.
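As a rough way to put the 90 ** 2048 figure in perspective, the short calculation below converts it to an equivalent number of bits. It assumes, as the figure implies, an alphabet of 90 symbols and a key length of 2048 symbols; the constant names are mine.

```python
import math

ALPHABET_SIZE = 90   # assumed from the article's "90 ** 2048" figure
KEY_LENGTH = 2048    # symbols per key, as stated in the article

# Python integers are arbitrary precision, so the exact count can be computed.
key_states = ALPHABET_SIZE ** KEY_LENGTH

# Equivalent strength in bits: log2(90 ** 2048) = 2048 * log2(90)
equivalent_bits = KEY_LENGTH * math.log2(ALPHABET_SIZE)

print(f"possible key states need {key_states.bit_length()} bits to write down")
print(f"equivalent to about {equivalent_bits:.0f} bits of key material")
```

This only counts the possible key states. It says nothing about how hard the cipher is to attack by other means.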

In SAGE, complexity is determined by the base number of bits used for codex generation, which can be 8, 16, 32, or 64, and by the number of encryption layers, which can be between 2 and 256. The minimum configuration (an 8-bit codex with two layers) has 2 ** 256 total possible combinations, roughly the same complexity as AES-256. The recommended minimum is a 16-bit codex with two layers, which raises the complexity to 2 ** 65536. To recover and encrypt messages, one must know the valid encryption key state and the codex at the same time. Codexes are regenerated on a schedule chosen at setup; it is recommended to change them at least once a month. After the initial setup, this happens automatically between users at each cycle through the exchange of new genesis keys.
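The figures quoted here and in the glossary (2 ** 256, 2 ** 65536, and 256 ** (2 ** 64)) are all consistent with a complexity of `layers ** (2 ** codex_bits)`. That formula is inferred from those three examples rather than stated by the article, so treat the helper below as a sketch under that assumption. It works in log2 so that even the largest setting fits in a float.

```python
import math


def codex_complexity_bits(codex_bits: int, layers: int) -> float:
    """Return log2 of layers ** (2 ** codex_bits), an inferred formula."""
    if codex_bits not in (8, 16, 32, 64):
        raise ValueError("codex_bits must be 8, 16, 32, or 64")
    if not 2 <= layers <= 256:
        raise ValueError("layers must be between 2 and 256")
    # log2(layers ** (2 ** bits)) == (2 ** bits) * log2(layers)
    return (2 ** codex_bits) * math.log2(layers)


for bits, layers in [(8, 2), (16, 2), (64, 256)]:
    exponent = codex_complexity_bits(bits, layers)
    print(f"{bits}-bit codex, {layers} layers: 2 ** {exponent:.0f}")
```

The first two settings reproduce the 2 ** 256 and 2 ** 65536 figures from the paragraph above.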

## Mechanics

Alice wants to communicate with Bob. She generates her genesis and initiator half-keys and sends Bob a communication channel request containing both. Bob receives the request and generates his own genesis and initiator half-keys, which he sends to Alice. Once Alice receives Bob’s half-keys, they both generate a common codex from the combination of the genesis and initiator keys. The communication channel can now send and receive messages. Alice creates a message and encodes it; her key then permutes, and she sends the message. Bob receives the message and decodes it; his key then permutes in step with Alice’s latest key permutation. This continues as messages are sent back and forth, with the keys permuting on each use. Only the parties involved in the message chain receive the updated permutations. Successful decryption can be verified every time by returning a checksum of the unencrypted data, without revealing anything about the data itself.
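The flow above can be modelled with a small Python sketch. It only mirrors the sequence of steps (exchange half-keys, derive a shared codex, encode, mutate, decode, mutate, confirm with a checksum). The primitives are stand-ins chosen for brevity: SHA-256 for codex derivation and key mutation, a SHAKE-256 keystream with XOR for encoding, and an HMAC keyed with the codex for the checksum. None of these are taken from SAGE, the names `new_half_keys`, `derive_codex`, and `Endpoint` are hypothetical, and the sketch does not address how half-keys are protected in transit.

```python
import hashlib
import hmac
import secrets


def new_half_keys() -> tuple[bytes, bytes]:
    """Return a (genesis, initiator) half-key pair; 32 random bytes each is an assumption."""
    return secrets.token_bytes(32), secrets.token_bytes(32)


def derive_codex(own: tuple[bytes, bytes], peer: tuple[bytes, bytes]) -> bytes:
    """Combine all four half-keys into a shared value (SHA-256 is a stand-in)."""
    # Sorting makes the result independent of which side calls the function.
    return hashlib.sha256(b"".join(sorted(own + peer))).digest()


class Endpoint:
    """One side of a channel: the shared codex plus the current key state."""

    def __init__(self, codex: bytes) -> None:
        self.codex = codex
        self.key = hashlib.sha256(b"initial key" + codex).digest()

    def _apply(self, data: bytes) -> bytes:
        # XOR with a key-derived stream, then mutate the key state.
        stream = hashlib.shake_256(self.key).digest(len(data))
        self.key = hashlib.sha256(self.key + self.codex).digest()
        return bytes(d ^ s for d, s in zip(data, stream))

    def encode(self, message: bytes) -> bytes:
        return self._apply(message)

    def decode(self, ciphertext: bytes) -> bytes:
        return self._apply(ciphertext)

    def checksum(self, message: bytes) -> str:
        """Keyed digest the receiver can return to confirm decryption."""
        return hmac.new(self.codex, message, hashlib.sha256).hexdigest()


alice_keys, bob_keys = new_half_keys(), new_half_keys()
alice = Endpoint(derive_codex(alice_keys, bob_keys))
bob = Endpoint(derive_codex(bob_keys, alice_keys))

sent = b"Hello Bob, this is Alice."
received = bob.decode(alice.encode(sent))
print(received == sent)
print(hmac.compare_digest(bob.checksum(received), alice.checksum(sent)))
print(alice.key == bob.key)
```

All three lines print `True` as long as each message is decoded in the order it was encoded. If a message is lost or reordered, the two key states drift apart, which is why a real protocol needs a resynchronisation mechanism that this sketch leaves out.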

## Use Cases

– Encryption is used in most secure messaging systems, such as WhatsApp, Telegram, and Dust;

– Encryption is used in internet communication and by certificate authorities (CAs), for example in HTTPS (SSL), to validate the domain a user is browsing;

– Encryption is used in the salting of secret values in databases, such as passwords or financial information;

– Encryption is used to secure emails in products such as ProtonMail and MulaMail;

– Encryption is used wherever privacy must be maintained;

– Encryption can be used to sign a document digitally.

## Glossary

**SAGE:** Symmetric Asynchronous Generative Encryption.

**Cipher:** Reference table used in accordance with a secret (encryption key) to encrypt data.

**Encryption keys:** An encryption key is a secret used to create a cipher from which data can be encrypted and decrypted. In the case of asymmetric encryption, the keys will be referred to as key pairs, a public and a private key. The public key enables you to encrypt a message only the private key owner can read. This enables you to prove ownership and is often referred to as authentication-based encryption. In the case of generative encryption, the keys need to be synchronized between users for them to communicate as the keys will change their valid state each time they are used.

**Codex:** The codex is a common reference table created between two or more end-users to communicate using SAGE. Each codex is unique, and the collision possibility depends on the base encoding and the number of layers. The number of possible codexes ranges from 2 ** (2 ** 8) up to 256 ** (2 ** 64).

**Collision:** The possibility of the repetition of a unique value such as an encryption key and, in the case of SAGE, the codex.

**Key mutability:** The mutability of keys refers to the fact the symbols comprising the encryption keys are completely changed (mutate) each time the key is used.

**Natural Language Processing:** Natural Language Processing (NLP) is a technique in the field of machine learning which studies the recognition and understanding of natural language and language patterns. This technique is widely used in technology such as Alexa, Siri, and other similar applications. Algorithms also use it to aggregate data about the user based on user content and texts.

**Dictionary attack (brute force):** A dictionary attack is a technique that repeatedly guesses by trial and error to obtain a secret, drawing on the known valid possibilities, such as words or binary values.

**Pattern recognition:** Pattern recognition is the process of recognizing patterns by using a machine learning algorithm. Pattern recognition can be defined as the classification of data based on knowledge already gained or on statistical information extracted from patterns and/or their representation.

**Genesis Keys:** A genesis key is a secret that is created and shared by two or more end-users to generate a new communication channel (codex).

**Encryption complexity:** The complexity is the total number of possible combinations of a secret (encryption keys, passwords, PINs).

*This piece was originally published on Medium on 16 January 2020.*