Camille Morhardt 00:32
Hi, I’m Camille Morhardt, host of InTechnology podcast. Today I have with me Felix Schuster joining us from Germany. He’s CEO of Edgeless Systems, and we’re going to talk about encrypted runtime, protecting workloads in the cloud, confidential computing. Welcome to the show, Felix.
Felix Schuster 00:50
Hello, Camille, thanks for having me.
Camille Morhardt 00:53
First, just can you describe what it is that your company does?
Felix Schuster 00:57
Yeah, so we are a company that’s building infrastructure software for confidential computing. And confidential computing is a great technology that allows you to keep your workloads encrypted and attested throughout the lifecycle. I used to work for Microsoft–Microsoft Research, to be precise. That was in 2015. And back then confidential computing meant Intel SGX, or Software Guard Extensions. And we started using the technology to build verifiable and always-encrypted data processing frameworks for Azure, so that people could run their workloads on Azure without having to trust the Azure infrastructure. And I continued doing that at Microsoft till 2019. And then I quit, moved back to Germany and started this company together with my friend, Thomas. And we essentially continued building infrastructure software for confidential computing. But we started doing this in the open, as open source and cloud agnostic.
Camille Morhardt 02:05
It sounds a little bit odd that you’re working within Microsoft to create like an environment that is separate from the Microsoft infrastructure–that sounds sort of like why is that happening? Maybe you could just give us like a quick framework to kind of look at the problem. And then we can sort of dive in.
Felix Schuster 02:22
Yeah, of course. So the cloud is essentially someone else’s computer. Right? That is a famous saying. And that’s okay for many workloads. But when we’re thinking about highly sensitive workloads, or regulated workloads–like in health care, and defense and finance–then it’s oftentimes not okay to run these workloads on someone else’s computer. If you’re a bank, and you’re moving a certain workload to the cloud, you’re essentially giving up control of this workload. And without confidential computing, you have to fully trust the infrastructure provider–you have to trust Azure, or Google or AWS, or whoever runs your cloud infrastructure. And it may be okay from a practical point of view, or it may be okay from your risk assessment’s point of view, or maybe not; but it may not be okay from your customer’s point of view, it may not be okay from your regulator’s point of view. So that’s a reason why a lot of workloads have remained off the cloud. And that’s also a reason why the big cloud providers are investing in this technology.
So, with confidential computing, you can change this. Now you can create workloads or environments where you don’t have to trust the infrastructure anymore. And that is achieved through different means, different features that are implemented in Intel CPUs, in AMD CPUs, and also in the future in ARM CPUs. And these features are, in a nutshell, strong isolation. So, the CPU will take care that your workload is protected against other parts of the system.
The second important feature is runtime encryption. So, the CPU will ensure that your workload remains encrypted in main memory throughout the processing. So, from the outside, it looks as if the CPU is working on encrypted data. Still, the CPU is processing the data in plain text in its internal registers. But from the outside, everything looks like it was always encrypted. So sometimes people like to say that with confidential computing we now get runtime encryption. We had encryption at rest, we had encryption in transit, and now we have runtime encryption, which kind of closes the circle. Right? We can now keep data encrypted throughout.
Camille Morhardt 04:45
Okay, throughout all of the different states that it exists in. What are the ways that an attack would occur if you don’t have encrypted runtime? Like what is a way that some malware or somebody would try to access, or maybe have access to, the data while it’s sitting within a cloud environment?
Felix Schuster 05:06
Yeah, so that’s a fundamental question, a very important one to answer. Confidential computing does not protect against the attacker walking in through your front door, right? Say you have an application, let’s say a banking application, and it’s vulnerable in its login form. It could, for example, be something like an SQL injection, which is an infamous vulnerability. Confidential computing does not protect against this kind of vulnerability. And even if you have a perfect confidential computing envelope around your application, attackers may still walk in through this front door vulnerability and steal all of your data. So that is not what the tech protects against. Instead, it’s all about protecting against threats coming from the infrastructure, like threats that are outside of your control if you’re running a workload on someone else’s computer, for example, the cloud.
Camille Morhardt 06:01
Yeah, so what would happen, like give us an example of a way in?
Felix Schuster 06:05
It could be a malicious cloud admin. They could be an employee, or maybe a contractor working for the infrastructure provider, who could try to access your data at runtime.
Camille Morhardt 06:16
And they would do that through like the use of software or—
Felix Schuster 06:19
Yeah. Maybe they have admin access to your worker nodes. Maybe you don’t even know that there is a worker node that’s running your database, and that admin could have access and then copy your data at runtime from memory. That’s one attack. And then there’s another attack: that could be, for example, a data center employee who just, well, maybe cleans the data center, and they could, at night, open the server rack, take your RAM, and go off with that. And then there could, of course, be foreign legislation, just lawfully walking into the data center and accessing your data in ways that do not comply with your local laws, which is also a concern.
So in essence, it’s all about unauthorized people having access to the infrastructure and misusing their access capabilities. And of course, it could also be hackers. Hackers can, of course, also hack into cloud infrastructures and exploit vulnerabilities that maybe go from one tenant to another tenant, and then copy data.
Camille Morhardt 07:21
Because multiple different users or tenants are sitting on the same infrastructure for scale reasons.
Felix Schuster 07:26
Exactly. So they’re sitting on the same server, and if there’s a vulnerability in the software that runs on the server, that orchestrates the different workloads of the different tenants, then one tenant can access another tenant’s workload. It has happened in the past that one tenant could access another tenant’s workload and then, yeah, copy that data.
Camille Morhardt 07:45
So let me ask you, you’re building a Kubernetes distribution for confidential computing at Edgeless. Why are you doing that? And why is that hard?
Felix Schuster 07:55
Yeah. So, when we started with Edgeless, we were building developer tools for Intel SGX. And what we learned back then is that the message-market fit for confidential computing is very good. So, if you talk to a CISO at a bank, at a public sector institution, at a defense company, or maybe even like a retail company, essentially all of them love the idea of keeping their workloads encrypted and isolated at runtime–be it on the cloud or on-prem. However, we also learned that no one actually wants to change workloads to get better security. So, what we realized, and what we learned from the customers, is that in a perfect world, they would be able to just lift and shift the existing workloads from a non-confidential world to a confidential world and enjoy the benefits of strong isolation and runtime encryption.
And of course, if you’re looking at enterprise application landscapes, a lot of these workloads now run on Kubernetes in a microservice style; it’s the default way to deploy workloads, manage workloads, and scale workloads. And I think most applications that are currently being developed are developed for Kubernetes. And this is why we focused on providing companies a highly compatible and familiar environment to which they can lift and shift their applications without having to change the code. And that’s the basic idea. Then we set out and built it.
Camille Morhardt 09:31
And was it difficult to build?
Felix Schuster 09:32
Frankly, yes. The reason is, the basic building blocks that confidential computing gives us are confidential VMs. And I believe you have discussed that with your guests in previous shows. So there is a technology called Intel TDX, there is a technology called AMD SEV, and these are the latest technologies in confidential computing. And they give us virtual machines that have confidential computing features–runtime encryption, strong isolation and remote attestation. And we use these as the basic foundation and we make sure that everything that comprises the Kubernetes cluster runs inside these confidential VMs; we then make sure that all of the software that runs inside these confidential VMs is attested and then verified using the remote attestation feature of confidential computing. And with this, we make sure that only good confidential VMs running precisely the right software can be part of our Kubernetes cluster. And once we have verified the integrity of such a node, we then distribute keys to the nodes, and thus ensure that they can talk securely over end-to-end encrypted connections.
And we also give them keys that allow them to write directly in encrypted form to cloud storage. And if you put all of this together, what we get is a Kubernetes cluster where all the data is always encrypted, even when nodes talk to each other over the network, even when the nodes write to cloud storage. The user, which is in our case the DevOps admin, doesn’t even see the encryption; it is always encrypted. And therefore from the outside, no data can ever be accessed, but from the inside it just looks and feels like normal Kubernetes. And that was, yeah, that was quite difficult. And we’re quite happy that we got to that point.
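As a rough illustration of the flow Felix describes here (verify a node's attestation first, and only then hand it a key), the following is a minimal Python sketch. It is not Constellation's actual code or API: `AttestationReport`, `admit_node`, `EXPECTED_MEASUREMENT` and the `hardware_ok` flag are hypothetical names, and the flag merely stands in for checking the hardware-signed report that Intel TDX or AMD SEV would produce.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Optional

# Hypothetical reference value: the hash of the node image we expect to be running.
EXPECTED_MEASUREMENT = hashlib.sha256(b"trusted-node-image").hexdigest()


@dataclass(frozen=True)
class AttestationReport:
    node_id: str
    measurement: str   # hex hash of the software the confidential VM booted
    hardware_ok: bool  # assumption: stands in for verifying the CPU vendor's signature


def admit_node(report: AttestationReport, cluster_secret: bytes) -> Optional[bytes]:
    """Return a per-node key only if the attestation checks pass, else None."""
    if not report.hardware_ok:
        return None
    # Constant-time comparison of the reported and expected measurements.
    if not hmac.compare_digest(report.measurement, EXPECTED_MEASUREMENT):
        return None
    # Derive a key bound to this node from the cluster-wide secret.
    return hmac.new(cluster_secret, report.node_id.encode(), hashlib.sha256).digest()


cluster_secret = secrets.token_bytes(32)
good = AttestationReport("node-1", EXPECTED_MEASUREMENT, True)
tampered = AttestationReport("node-2", hashlib.sha256(b"modified-image").hexdigest(), True)

print(admit_node(good, cluster_secret) is not None)   # node is admitted and gets a key
print(admit_node(tampered, cluster_secret) is None)   # modified software gets no key
```

The point of the sketch is the ordering: a node that cannot prove it runs the expected software never receives key material, so it can neither join the encrypted network nor read encrypted storage.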
Camille Morhardt 11:26
Are you finding that the customers that are adopting this are more like institutions or enterprises or organizations, like you mentioned, in certain sectors that have a lot of regulation, who are like developing their own apps or who have brought apps in from the outside–some kind of enterprise app management, whatever it be? Or are you finding that the customers you’re most working with are like the enterprise application vendors that are then fanning that out to the enterprises that are interested in the environment?
Felix Schuster 12:00
I’d say both. There are certainly SaaS vendors that sell to enterprise customers that use our products. On the other hand, the classic use case is a company wanting to move on-prem workloads to the public cloud. For example, we were working with a large hospital. And so far, they have had all workloads on-prem. They ran into a problem that if they want to scale now, because data sets get larger, more applications, they need to build out on-prem, right? This means buying new hardware, maybe building a new building, stuff like that; if they now can move to the cloud, they can be way more flexible and cost efficient. And with confidential computing and our Kubernetes they can now take their on-prem deployments and move them to the cloud. So that’s the biggest use case we currently see.
Camille Morhardt 12:51
Why are you doing this in an open source framework?
Felix Schuster 12:55
Of course, open source has many pros and cons; it can be good for go-to-market, things like that. There can be a long discussion about why and how startups do open source. But there’s one specific thing to open source when it comes to confidential computing, and that is remote attestation. Remote attestation allows a remote party to verify that there is indeed a good Intel CPU or good AMD CPU that is running a certain workload within a confidential computing environment. And that feature is super important. And without that feature, sort of the whole idea of confidential computing collapses, because if you can’t verify that there is your code running on a good confidential computing CPU, the infrastructure could just pretend that there is a confidential workload. So you crucially need remote attestation.
And now, if you think about remote attestation, a key part is that you can verify that a certain piece of software is running and not another piece, not a modified version of your software, but precisely the software you want is run there. And if you run third party software, like our software, you kind of need to establish trust in that software. The CPU will give you a remote attestation certificate from which you can infer, “okay, here’s software that has the following hash, it’s running on the CPU.” And this hash will be vastly more meaningful to you if it corresponds to some open source version of software that you can go and check out and maybe compile yourself and see that “okay, yeah, this builds to precisely this binary that has this hash. And this hash is reflected in my remote attestation statement that I’m getting from the hardware.” And therefore, to have this nice trail of verifiability for software, it is very beneficial to have open source in confidential computing.
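The check Felix walks through (build the open source code yourself, hash the result, and compare it with the hash in the attestation statement) can be sketched in a few lines of Python. This is only an illustration under assumptions: `build_hash` and `matches_attestation` are hypothetical helper names, SHA-256 is assumed as the hash, and a real attestation statement is a signed structure rather than a bare hex string.

```python
import hashlib
import hmac


def build_hash(binary: bytes) -> str:
    """Hash of a binary you built yourself from the published source."""
    return hashlib.sha256(binary).hexdigest()


def matches_attestation(local_build: bytes, attested_hash: str) -> bool:
    """True if the hash reported by the hardware equals the hash of your own build."""
    return hmac.compare_digest(build_hash(local_build), attested_hash)


# Stand-ins for a reproducible build and for the hash taken from an attestation statement.
my_build = b"bytes of the binary built from source"
attested = hashlib.sha256(b"bytes of the binary built from source").hexdigest()

print(matches_attestation(my_build, attested))            # same binary
print(matches_attestation(my_build + b"\x00", attested))  # any modification changes the hash
```

This only works when builds are reproducible, that is, when compiling the same source always yields a byte-identical binary; otherwise a matching hash could never be recomputed independently.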
Camille Morhardt 14:54
This seems like a topic that comes up more and more sort of the traceability and transparency of software kind of all the way back to the origin.
Felix Schuster 15:02
Yes, absolutely. Yeah. So supply chain security, software transparency is a big topic. And Mark Russinovich, your previous guest, also spoke about that, and I agree with all of what he said. Essentially, we cannot expect people to verify every single line of a certain software, but what we can give them with confidential computing is the assurance that a certain built version of a certain software was running or is running at this point in time. And we can then combine that also with auditable logs. And this is also what we do at Edgeless Systems. So what we do is, whenever we release a new version of our software, we publish the hashes of our reproducible builds; we publish them on a global public ledger, or log. And that ledger is called Sigstore. And that is run by the Linux Foundation and others. Companies can log publicly the hashes of the software, and can this way commit to versions of the software. Like, we can commit to saying, “Okay, this is now the newest version of the Edgeless Systems Constellation Kubernetes, it has these hashes. And when you’re running our Constellation Kubernetes in an infrastructure of your choice, you can verify, okay, this is precisely the newest build, there are no changes, I’m getting the software that everyone else is getting.” And you may go and check out the source code later. And you may rely on others to check out the source code; you may go to the source code two years later, but you have this visibility of everything that is running. And you can go from source code to what was running at what point, whenever you like.
Camille Morhardt 16:52
So part of the concern that this is addressing is I migrate an application, if I’m an enterprise, I migrate it to the cloud for scale–in the case you gave, a hospital–and I just, you know, I don’t want to go expand my server area. And I don’t, frankly, want to manage it either. But I’ve had to do that for compliance reasons. But now I’ve got an option to go sort of be way more cost efficient, whatnot, get it into the cloud, but still maintain control over that data. But my concern would then be, or one of my concerns would then be, that the application that I’m running over there has been tampered with in some way. And then potentially that application could extract data or something. And so that’s what you’re verifying here. Just going back and making sure the app is showing up exactly as you would expect it to show up.
Felix Schuster 17:40
Camille Morhardt 17:41
So all of this seems actually fairly complicated. And you yourself said it’s hard. And I’m curious, because you know, you’re a small company, yet you’re running on like hyperscalers, right? Azure. And you’re working with other Fortune 100s, right?–including Intel and other providers of CPUs that are providing the hardware basis for this kind of verification. And I’m just wondering, is it early days? And so we can still have very small companies in the mix with very large companies doing this? Is that going to change? Or, because there’s an open source capability, will we continue to see smaller companies emerge and grow and develop innovations in the space? What’s your take on that?
Felix Schuster 18:29
Yeah, first, you’re right, it’s hard. But we also work very hard to make it easy to use. So in the end, you can set this up in minutes on the cloud of your choice, and the only thing you have to do is download the command line interface tool. And this will do everything for you–talking to the cloud, provisioning the virtual machines, doing the remote attestation–and in the end, it will just output one succinct fingerprint or attestation statement for your cluster. And from that you can verify everything. That’s the first part. The other part, yes, small company, we are roughly 20 people; we have a truly great team and I have been in this space for like ten years now. And we have lots of experience in the team. I think we are still at a stage of the market where a small team can have an impact. And maybe we are also, in general, in a world where small teams can have impact.
If you look at companies that have recently had great successes in AI, for example, I think these are oftentimes small teams. And I think we have such a great open source landscape now and such great tools, and maybe even AI-fueled tools, that allow small teams to move very fast and put together existing things, and then add some magic sauce on top of that. And I think that is what we are doing. Like, we’re not implementing a whole Kubernetes; we’re not implementing a complete new remote attestation infrastructure. We’re taking bits and pieces that are out there and have great engineers put them together and put in the missing pieces to have a new and very innovative product.
Camille Morhardt 20:12
So essentially doing what an enterprise might have to spin up a team to do on its own to basically, you know, be comfortable moving into the cloud environment. And you’re saying, okay, we’re just doing that, and then at multiple different places, as a service.
Felix Schuster 20:30
Big companies are working on similar offerings. Our perspective here is that we want to be the ones that are cross cloud and cloud agnostic and also work on the smaller clouds. Like, it’s not like there’s only hyperscalers. But there are lots of regional cloud providers–especially in Europe–and I think those people typically don’t have the resources to build something like this. And we are happy to bring this great technology, confidential computing, at scale to also the smaller clouds.
Camille Morhardt 21:00
Very interesting. So what is like one of the most argued topics in this space right now?
Felix Schuster 21:07
So I think the hottest topic is certainly confidential AI. Something that has really held back the market so far has been the lack of accelerators. So, so far, you couldn’t just extend your trust boundary, your secure execution environment, you couldn’t extend that to accelerators; you were bound to the CPU. And that is, of course, a problem in a world where there are ever more AI workloads, right. Everyone wants to run things on accelerators. And now Nvidia recently launched the H100 series of GPUs and they added confidential computing features to that. So you can now set up a secure channel from your Intel Xeon-powered confidential VM to an Nvidia H100 and share data securely between the two without the infrastructure being able to access that. And that opens up quite a few new possibilities. You can have LLMs run isolated from the infrastructure, with both the models themselves and the input data protected against the infrastructure. And I think that’s opened up quite a few super exciting opportunities. Think about enterprise-ready AI services, stuff like that. Yeah. So that is one of the most discussed topics at this point. And everyone’s super excited.
Camille Morhardt 22:29
LLM being Large Language Model. So what kinds of LLM or other like AI workloads and enterprises do you anticipate seeing roll out first, given this new architectural capability?
Felix Schuster 22:43
Yeah, to be honest, this is still unfolding. So there are no H100s with these features in the market yet. Currently, we and others are just experimenting with some early pilots and proofs of concept. I anticipate this being available in the market early next year. The workloads that we will see, that’s yet to be seen. I believe it will be the typical industries that we currently see are interested in confidential computing–being healthcare, public sector, defense, finance, essentially everyone who is currently refraining from using AI services online and is trying to build out AI capabilities on-prem. Of course, the problem with building out these capabilities on-prem is that it’s very difficult to get hold of accelerators at this point; there’s a long time you have to wait if you want to buy the latest Nvidia hardware. And if you only need this for certain workloads, maybe every now and then, it may not be very economic to buy this hardware on-prem. So we believe that confidential AI will make it way more achievable and accessible for companies that have sensitive workloads to run them, because they don’t have to build everything out on-prem.
Camille Morhardt 23:56
And how far along in maturity or adoption would you say that the infrastructure providers are at this point? And maybe you would divide that into categories, I don’t know, of, you know, hyperscalers versus, you were talking about slightly smaller cloud service providers or regional cloud service providers?
Felix Schuster 24:13
Yeah, so the big three, they have most of the infrastructure out there that you would want; like, they have confidential VMs, they have Intel SGX secure enclaves. And also the, I’d say maybe the second-tier cloud providers, IBM, Oracle, they also have some solid offerings by now. They don’t have everything that the hyperscalers have, but they have something. If you’re looking at the even smaller cloud providers, they typically don’t have anything at this point in terms of hardware, but also in terms of software offerings. We’re working together with a smaller European cloud provider, the cloud division of the Lidl retailer. And yeah, we’re quite happy to work with these people and enable a non-hyperscaler to have competitive offerings.
Camille Morhardt 25:02
Interesting. Well, Felix Schuster, CEO of Edgeless Systems and brave entrepreneur, really interesting talking with you. Thank you for taking the time.
Felix Schuster 25:12
Thank you, Camille.