Your team has spent months developing, tuning, and perfecting a complex deep neural network to classify important financial, medical, or government data. The application has been containerized and packaged, and is finally ready to deploy as a public service on the cloud, but one thing stands in the way: how do you get assurance that your trade-secret code, as well as the highly sensitive data it handles, can be kept confidential and private? In addition, how can you provide workload and data governance in highly regulated industries?
The problems of privacy, confidentiality, and governance are at the center of what IBM Research is tackling. To improve these aspects of container and cloud technology, we have recently contributed two open-source projects: Encrypted Container Images, which ensures that container images remain encrypted and private, and Trusted Service Identity, which ensures that data processed by these images remains secure.
We will first go through how these technologies enable confidentiality of code and data, and then show how they can provide governance.
Protecting the confidentiality of workloads and secrets
To provide high assurance, we want to ensure that both code and data are kept confidential. Encrypted Container Images and Trusted Service Identity provide these protections, respectively.
Encrypted Container Images protects the confidentiality of the workload (the code) by extending the OCI (Open Container Initiative) container image specification with +encrypted media types. This allows developers to encrypt container images so that they can be decrypted only by authorized parties (developers, clusters, machines, etc.), which ensures that the workload stays encrypted from build time to run time. Without the appropriate key, the content of the image remains confidential even if the registry is compromised.
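As a rough illustration of the +encrypted media types, the sketch below rewrites the layer entries of an image manifest so that each layer advertises an encrypted variant of its media type. It is a toy model only: the helper names are hypothetical, the digest is a placeholder, and no layer bytes are actually encrypted or keys wrapped, which the real tooling would have to do.

```python
# Toy illustration: only the media-type bookkeeping is modeled here.
# Real tooling would also encrypt the layer contents and wrap the layer key
# for each authorized party.
ENCRYPTED_SUFFIX = "+encrypted"


def to_encrypted_media_type(media_type):
    """Return the '+encrypted' variant of a layer media type (idempotent)."""
    if media_type.endswith(ENCRYPTED_SUFFIX):
        return media_type
    return media_type + ENCRYPTED_SUFFIX


def is_encrypted(media_type):
    """Report whether a layer media type marks the layer as encrypted."""
    return media_type.endswith(ENCRYPTED_SUFFIX)


def mark_layers_encrypted(manifest):
    """Return a copy of the manifest whose layers carry encrypted media types."""
    layers = [
        dict(layer, mediaType=to_encrypted_media_type(layer["mediaType"]))
        for layer in manifest["layers"]
    ]
    return dict(manifest, layers=layers)


# Hypothetical manifest fragment; the digest is a placeholder, not a real value.
manifest = {
    "schemaVersion": 2,
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "digest": "sha256:<layer-digest>",
            "size": 1024,
        }
    ],
}

encrypted_manifest = mark_layers_encrypted(manifest)
print(encrypted_manifest["layers"][0]["mediaType"])
```

A runtime that sees an encrypted media type knows it needs a decryption key before it can unpack the layer; a registry or runtime without the key only ever handles ciphertext.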
Ensuring that the workload is encrypted is only one side of the story. Every workload needs access to data sources or other services, and to get it, the workload must authenticate itself using a password, API key, or certificate. Today, this is typically done through Kubernetes secrets. The problem with Kubernetes secrets is that once they are stored, they are also available to administrators, cloud operators, or anyone else with access to the namespace. These administrators might be third-party employees who only manage the resources, and they might not be certified to access the data or hold the required security clearance.
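To make the exposure concrete: the values in the data field of a Kubernetes Secret are base64-encoded, which is an encoding rather than encryption. The minimal sketch below assumes a hypothetical secret name and value, and shows that anyone who can read the stored object can recover the credential without any key.

```python
import base64

# Hypothetical contents of a Secret's `data` field: values are base64-encoded.
secret_data = {
    "api-key": base64.b64encode(b"s3cr3t-api-key").decode("ascii"),
}


def read_secret(data, name):
    """Recover the plain-text value of a secret entry; no key is required."""
    return base64.b64decode(data[name]).decode("utf-8")


recovered = read_secret(secret_data, "api-key")
print(recovered)
```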
Trusted Service Identity protects access to sensitive data by ensuring that only attested services are able to obtain credentials. This is done through a workload identity, composed of run-time measurements such as the image, cluster name, and data center location, which identifies the application. These measurements are securely signed by a service running on every hosting node, using a chain of trust installed during the secure bootstrapping of the environment.
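The idea of a signed workload identity can be sketched as follows. This is a simplified model under stated assumptions, not the project's implementation: the function names, the token layout, and the measurement values are hypothetical, and a shared-key HMAC stands in for the chain-of-trust signing described above, which in practice would presumably rely on asymmetric keys.

```python
import base64
import hashlib
import hmac
import json


def sign_measurements(measurements, node_key):
    """Serialize run-time measurements and attach an HMAC-SHA256 tag.

    HMAC with a shared key is a stand-in for the node's signing service.
    """
    payload = json.dumps(measurements, sort_keys=True, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(node_key, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode("ascii")
        + "."
        + base64.urlsafe_b64encode(tag).decode("ascii")
    )


def verify_identity(token, node_key):
    """Return the measurements if the tag is valid; raise ValueError otherwise."""
    payload_b64, tag_b64 = token.split(".")
    payload = base64.urlsafe_b64decode(payload_b64)
    tag = base64.urlsafe_b64decode(tag_b64)
    expected = hmac.new(node_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch: measurements were not signed by this node")
    return json.loads(payload)


# Hypothetical key and measurements, for illustration only.
node_key = b"demo-node-key"
measurements = {
    "image": "example/classifier:1.0",
    "cluster": "hipaaCluster",
    "datacenter": "DAL05",
}

token = sign_measurements(measurements, node_key)
print(verify_identity(token, node_key))
```

A secrets service that trusts the node's signing key can verify the token and then decide, based on the verified measurements, whether to hand out a credential.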
Extending Governance: Geofencing execution and Location++ based data policy
When it comes to highly regulated industries, confidentiality of workloads and data is just the tip of the iceberg. Often, more fine-grained controls need to be put in place, and the technologies described above can provide them.
- The first type of control is geofencing execution. How do we ensure that a workload runs only in particular clusters or regions? How do we enforce export control or digital rights management? These are natural uses of Encrypted Container Images. By providing the appropriate decryption keys only through attestation, authorization, and key management, we can create a trust binding between certain clusters or workers and workloads. This provides assurance of knowing WHERE workloads are running.
- Another control is the ability to express, enforce, and audit data access based on the location (or other properties) of the workloads, for example to comply with GDPR. Trusted Service Identity provides just that. By exposing the properties of workloads in the definition of policies governing secrets, it makes it possible to create policies such as “Only workload X (with this image), running in datacenter DAL05, in kube cluster hipaaCluster, can access medical records in this Cloudant database”. This method of governance offers very fine-grained controls that can apply to a wide spectrum of use cases. In addition, the entire audit trail is retained, tracking every interaction: which process received which secret, and when.

These are just a couple of examples of how these technologies can be used to enhance governance in the container cloud ecosystem.
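The location-based policy quoted above can be sketched as a simple attribute match with an audit record for every request. This is a minimal model, assuming the workload's measurements have already been verified; the class, function, and field names, as well as the image and secret names, are hypothetical and do not reflect the project's actual policy format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SecretPolicy:
    """Who may read a secret, expressed in terms of workload properties."""
    secret_name: str
    image: str
    datacenter: str
    cluster: str


def is_allowed(policy, identity):
    """Allow access only if every governed property matches the policy."""
    return (
        identity.get("image") == policy.image
        and identity.get("datacenter") == policy.datacenter
        and identity.get("cluster") == policy.cluster
    )


def request_secret(policy, identity, audit_log):
    """Evaluate the policy and record the decision, whether granted or denied."""
    allowed = is_allowed(policy, identity)
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "secret": policy.secret_name,
        "identity": dict(identity),
        "allowed": allowed,
    })
    return allowed


# Hypothetical policy mirroring the example in the text.
policy = SecretPolicy(
    secret_name="cloudant-medical-records",
    image="example/classifier:1.0",
    datacenter="DAL05",
    cluster="hipaaCluster",
)
audit_log = []

in_region = {"image": "example/classifier:1.0", "datacenter": "DAL05", "cluster": "hipaaCluster"}
elsewhere = dict(in_region, datacenter="LON02")  # same image, different location

print(request_secret(policy, in_region, audit_log))
print(request_secret(policy, elsewhere, audit_log))
print(len(audit_log))
```

Recording denied requests alongside granted ones is a deliberate choice in this sketch, since an audit trail is most useful when it shows every attempt and not only the successful ones.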
Find out more
We have only scratched the surface of these technologies in this article. To find out more, explore the Encrypted Container Images and Trusted Service Identity open-source projects.