Information security is a constant battle between the growing computing power of bad actors and the improving encryption efforts of businesses and other entities to prevent and mitigate attacks. In the current encryption environment, based on NIST guidelines, the “mechanics” of encryption are centered on algorithmically generated “keys” that are used to encrypt data, decrypt data, or both. These keys, and the robust management of them, are theoretically strong enough to prevent data and systems from being mathematically compromised, but in reality there continue to be weaknesses in every system, as evidenced by constant reports of hacks, ransomware attacks, and thefts of commercial and personal data.
To understand the process of using keys for information security, and thereby how they can best be managed to minimize weaknesses, it is important to know the differences between data keys, secondary keys, and archive keys. These are the keys that information security managers must generate, track, maintain, and control effectively in an ever more complex encryption environment.
A GROSS (BUT USEFUL) OVERSIMPLIFICATION
While the mathematical and algorithmic processes used in today’s most advanced information security methods are extremely complex, the idea of a simple physical lock and key serves to help understand these building blocks of encryption. Imagine a low-tech physical world of any era, where a room is set aside for storage of information or objects. Only certain individuals should have access to this room and its contents, and to prevent access by anyone else, a lock is placed on the door. Only with the correct physical key can someone open and view the contents of the room, add to those contents, and perhaps even remove some or all of them.
THE DATA KEY
This key that directly, physically opens the lock is, in the very simplest sense, a data key or Data Encryption Key (DEK). It allows anyone who possesses it to – in a single step – turn the lock mechanism, access the contents of the storage room, and use those contents however they please.
Again, this is a decidedly gross oversimplification, and one significant aspect of encryption that it does not communicate is that when an algorithmic data key is generated, it also creates the lock itself, because the locking method is the encryption of the data itself. In our low-tech physical world, the data key would do the impossible: the door would always be open, but nothing in the room would be recognizable, readable, useful, or removable.
Still, our simple concept of lock, key, and storage room illustrates the most important aspect of a data key: if a bad actor finds or steals the data key for encrypted data, they are only one simple step away from viewing and controlling that data.
THE SECONDARY KEY
How can such “easy” security breaches be prevented, then? Again, in a low-tech physical world, one method would be to place the data key in a lockbox and hide it, thus making possession of the secondary key to the lockbox the first step to accessing our storage room. Now, rather than simply finding or stealing the data key, a bad actor would instead need to find or steal the secondary key, then have knowledge of the location of the lockbox, then access the lockbox, before they would be able to take the final step of unlocking our storage room.
In encryption parlance, this is a form of “envelope encryption”. The data key that provides access to encrypted data is itself encrypted. Its readable text format is made unreadable by a secondary key encryption algorithm, and that secondary key, or Key Encryption Key (KEK), is placed in a separate, secure location – physical, virtual, or both. With the advent of cloud-based Key Management Systems (KMS), secondary keys are often held in a secure location in the cloud, while the now-encrypted data key may be stored right alongside the encrypted data itself.
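The envelope structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: `toy_encrypt` and `toy_decrypt` are hypothetical stand-ins built from the standard library so that the example runs anywhere, and they are not secure. A real system would use a vetted authenticated cipher, and the secondary key would live in a KMS rather than in a local variable.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key and nonce (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Hypothetical stand-in for a real cipher: XOR with a keyed stream. NOT secure."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Reverse toy_encrypt: split off the nonce and XOR with the same stream."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def envelope_encrypt(kek: bytes, plaintext: bytes) -> dict:
    """Encrypt data with a fresh data key, then wrap that data key with the KEK."""
    dek = secrets.token_bytes(32)  # the data key (DEK)
    return {
        "ciphertext": toy_encrypt(dek, plaintext),  # data locked by the DEK
        "wrapped_dek": toy_encrypt(kek, dek),       # DEK locked by the KEK
    }


def envelope_decrypt(kek: bytes, record: dict) -> bytes:
    """Unwrap the data key with the KEK, then decrypt the data with it."""
    dek = toy_decrypt(kek, record["wrapped_dek"])
    return toy_decrypt(dek, record["ciphertext"])


kek = secrets.token_bytes(32)  # assumption: in practice this is held in a KMS
record = envelope_encrypt(kek, b"storage room contents")
recovered = envelope_decrypt(kek, record)
```

The record holds the encrypted data and the wrapped data key side by side, yet neither is usable without the secondary key, which is the lockbox idea in code form.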
THE KEY LIFECYCLE, AND THE ARCHIVE KEY
Any given cryptographic key is always at risk of being compromised – it is just a matter of time before the key is accidentally made available or a bad actor uses any of a variety of hacking methods to obtain it. Information security managers must therefore assume that every key they generate can be relied upon to provide effective security for only a limited period, and they must anticipate the eventual suspension or deletion of those keys. This period is called the “cryptoperiod”, and its duration depends on several factors, most notably the sensitivity of the data being secured. Symmetric keys (keys that initially both encrypt and decrypt the same data) in particular need to be monitored because they tend to be much longer-lived than asymmetric keys. The amount of data a given symmetric data key encrypts will vary, of course, but eventually, whether because it reaches a storage limit, a time limit, or some other threshold, the data key will cease to be available for encryption.
At this point it is no longer called a data key, and instead is referred to as an archive key, indicating that it is a later-cycle key that can only decrypt its associated dataset. Note that this is simply a change of function, and the key’s nomenclature is altered to reflect that more limited function, as well as indicate its stage in the key lifecycle. It is also important to recognize that this change in function and nomenclature doesn’t negate the potential need for continued encryption by a secondary key. The associated dataset, considered sensitive enough to encrypt before, is unlikely to have suddenly become less valuable.
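The change from data key to archive key can be pictured as a simple rule about what a key is allowed to do at a given moment. The following is a rough sketch only: the `ManagedKey` class, its field names, and the one-year cryptoperiod are assumptions made for illustration, not part of any particular key management product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ManagedKey:
    key_id: str
    created: datetime
    cryptoperiod: timedelta  # how long the key may be used to encrypt

    def role(self, now: datetime) -> str:
        """'data key' inside the cryptoperiod, 'archive key' afterwards."""
        if now < self.created + self.cryptoperiod:
            return "data key"
        return "archive key"

    def can_encrypt(self, now: datetime) -> bool:
        """Only a key still in its data-key role may encrypt new data."""
        return self.role(now) == "data key"

    def can_decrypt(self, now: datetime) -> bool:
        """Both roles can decrypt; only the ability to encrypt is withdrawn."""
        return True


# Hypothetical key with a one-year cryptoperiod.
key = ManagedKey("dek-001", datetime(2024, 1, 1), timedelta(days=365))
early = datetime(2024, 6, 1)  # inside the cryptoperiod
late = datetime(2025, 6, 1)   # after the cryptoperiod has ended
```

Here `key.can_encrypt(early)` is true and `key.can_encrypt(late)` is false, while `key.can_decrypt(late)` remains true: the key material is the same, and only its permitted function has narrowed.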
Over time, however, the need to access this particular dataset will wane. Now that it is no longer being expanded, newer information encrypted by newly created keys will be more relevant and more frequently accessed, until ultimately the original dataset is so rarely used that it becomes irrelevant, and its key has been in existence for so long that it is likely to be lost track of, forgotten, or compromised. This is why key deletion is generally a scheduled item in a robust key management system, and why a key’s lifecycle is planned from its creation to its expected deletion date.
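Scheduled deletion can be as simple as recording an expected deletion date when each key is created and periodically checking which dates have arrived. The sketch below assumes a hypothetical in-memory inventory with made-up key ids and dates; a real key management system would keep this in durable, access-controlled storage.

```python
from datetime import date

# Hypothetical inventory: key id -> deletion date assigned at creation time.
deletion_schedule = {
    "dek-001": date(2024, 3, 1),
    "dek-002": date(2025, 9, 1),
    "dek-003": date(2024, 12, 15),
}


def due_for_deletion(schedule: dict, today: date) -> list:
    """Return ids of keys whose scheduled deletion date has arrived, oldest first."""
    due = [(when, key_id) for key_id, when in schedule.items() if when <= today]
    return [key_id for when, key_id in sorted(due)]


due = due_for_deletion(deletion_schedule, date(2025, 1, 1))
```

With these example dates, `due` contains `dek-001` and `dek-003`, while `dek-002` stays in service until its own date arrives.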
WHY DOES ALL THIS MATTER?
It matters because, first, it is crucial to understand these keys and their lifecycle in order to build a secure system based on sensitivity and time risks. But it is also important because it helps explain how information security managers find themselves trapped in a key management system that is out of control. In our low-tech, physical world, it was easy to understand and follow which key did what and when. But when you have encryption needs that require hundreds, thousands, or millions of keys, each with its own overlapping lifecycle, as well as associated secondary keys that may be connected with groups of data keys or archive keys, a system can quickly become unwieldy in terms of the time and expense needed to track and understand the relationships within it. It also makes it hard to end any given key lifecycle with confidence, because that key may have untracked relationships to still “living” keys and data. Or worse, the execution of an automatic deletion may break untracked relationships between still-living keys and data, resulting in lost access and lost time.
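One way to picture the tracking problem is as a lookup that must be answered before any secondary key is retired: which data keys or archive keys does it still protect, and is any of them still needed? The sketch below uses a hypothetical in-memory registry with invented key ids; it only illustrates the check and makes no claim about how any specific system stores these relationships.

```python
# Hypothetical registry: which secondary key (KEK) wraps which data or archive key.
wrapped_by = {
    "dek-001": "kek-A",
    "dek-002": "kek-A",
    "dek-003": "kek-B",
}

# Hypothetical set of keys whose datasets are still in use.
still_needed = {"dek-002"}


def dependents(kek_id: str) -> list:
    """List the keys that can only be unwrapped with the given secondary key."""
    return sorted(k for k, parent in wrapped_by.items() if parent == kek_id)


def safe_to_delete(kek_id: str) -> bool:
    """A secondary key is safe to delete only if no dependent key is still needed."""
    return not any(k in still_needed for k in dependents(kek_id))
```

In this example `safe_to_delete("kek-A")` is false because `dek-002` still depends on it, while `safe_to_delete("kek-B")` is true. If a relationship is missing from the registry, the check silently gives the wrong answer, which is exactly the risk of untracked relationships described above.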