Camille Morhardt 00:28
Welcome to the InTechnology podcast. I’m your host, Camille Morhardt, and I have with me today Chris Kelly. He is Vice President of Client Architecture at Intel, and also General Manager of Software Definition and Strategy for platforms. Welcome to the podcast, Chris.
Chris Kelly 00:45
Hi, Camille. Thanks for having me.
Camille Morhardt 00:47
So we’re just going to dive right into a fairly meaty topic: we’re going to talk about how endpoints and software are affected by artificial intelligence evolving toward distributed computing. We’ll walk through it in stages here. I wonder if you could kick us off by providing some context on this evolution of artificial intelligence into distributed computing models?
Chris Kelly 01:11
Yeah, it’s a really interesting topic. The AI world is kind of undergoing a supernova, we like to say; it’s almost like we’re being thrown to the edge of the universe at light speed trying to keep up with advances in AI. But the AI revolution, which is upon us, is happening on clients now as well. And as AI models continue to get larger and continue to be refined, we don’t believe they’re going to end up living on the cloud forever. There are a bunch of different reasons for that, right? As AI models on the cloud continue to get more sophisticated, it’s going to be expensive to transfer that amount of data between the cloud, the edge, and an end user’s endpoint. And a lot of end users are going to need very specific, tailored responses, or copilot-trained or inference models that have the specific responses or content they care about, versus trying to use an AI model or get a chatbot response from the cloud.
If you’re an enterprise, for example, you want to be able to use an AI model to specifically help engineers design a chip. Or if you’re a retailer, you want responses as a retail group that are specific to your own products; you can’t really put all that stuff in the cloud and have it be both cheap and private. So privacy and security are an important aspect of AI on the client. The only way to really make that work is, instead of AI models residing in only one place in this sort of continuum of cloud, edge, and endpoint, to have a lot of copilots and AI models reside on the endpoint.
Typically, I think what will happen is you’ll have large models that get trained on data center and edge products, and then inferencing will happen on endpoints. We’re building capabilities into PCs that will allow that to begin to occur with our next-generation product, and we’re working closely with Microsoft to open that up. But the fun part will be a hybrid AI model, where you’ve actually got this one model, or a set of sub-models, spread across the continuum of data center, edge, and endpoint, and ensuring that all that stuff works together is a really interesting, hard distributed computing problem.
Camille Morhardt 03:18
I assume software is going to solve it. How is that evolving? Or what kind of different models are we starting to see emerge to help orchestrate that sort of hybrid AI as you describe it?
Chris Kelly 03:27
Yeah, and we’ve played around with a couple of different techniques; I think there will be multiple techniques that get used. Typically in tech we’ll experiment a bunch, and then some winners will probably emerge, some standards will emerge. At Intel, we played around with the notion of extending Kubernetes, which is sort of the existing open-source-ish container technology in data centers. If you’ve got a Linux-based client, or a device that can talk Linux, like using the WSL2 interface with Windows, it’s relatively easy to do that. And that’s one mechanism: if you want to put a model inside a Kubernetes container, and you think you can have it perform reasonably, you can extend that to an endpoint. There are some issues with that, though, some security issues with it on a laptop or a desktop, and a Kubernetes container has a tendency to be a pretty large image. And PCs and endpoints generally are concurrent devices; we don’t just do one thing, we do lots of different things. So when you layer a Kubernetes container on top of a client, other stuff stops working very well. So that’s one technique that may be used.
The other technique I think is really interesting, which is emerging in the ecosystem and will potentially provide some additional ubiquity, is using web workloads and web techniques to move workloads to where the user data is. The key thing is not having to move data, right? You don’t want to have to move gobs of data; you want to move models and applications, if you will, to where the data exists.
Camille Morhardt 04:50
Is that because of latency? Or is it because of transmission cost? Or is it because of privacy? Or is it all three? What are the —
Chris Kelly 04:56
Excellent. All three, right, all three. It ends up being a much larger number of data bits you’d have to move; if you’re moving larger and larger data, or even the source data for a model, it just gets prohibitively expensive. And if you’re moving data around on a network, the hypothesis is that latency will forever get smaller, but as you approach the speed of light, there’s going to be a limit on how fast you can get things from one place to another. Moving the application to the data avoids some of that problem.
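To make that concrete, here is a rough back-of-the-envelope sketch; the data sizes, bandwidth, and distance below are illustrative assumptions, not figures from the conversation:

```python
# Back-of-the-envelope: cost of moving data vs. moving a model.
# All figures are illustrative assumptions, not measurements.

def transfer_seconds(size_gb: float, bandwidth_gbps: float) -> float:
    """Time to move `size_gb` gigabytes over a `bandwidth_gbps` gigabit/s link."""
    return size_gb * 8 / bandwidth_gbps

dataset_gb = 500.0   # assumed size of an enterprise's source data
model_gb = 5.0       # assumed size of a distilled, deployable model
link_gbps = 1.0      # assumed 1 Gb/s connection

print(f"move the data:  {transfer_seconds(dataset_gb, link_gbps):.0f} s")   # 4000 s
print(f"move the model: {transfer_seconds(model_gb, link_gbps):.0f} s")     # 40 s

# Latency also has a hard physical floor: even at the speed of light,
# an assumed 2,000 km round trip to a cloud region takes over 13 ms.
SPEED_OF_LIGHT_KM_S = 299_792
round_trip_ms = 2 * 2_000 / SPEED_OF_LIGHT_KM_S * 1000
print(f"speed-of-light round trip: {round_trip_ms:.1f} ms")
```

Under these assumed numbers, shipping the model is two orders of magnitude cheaper than shipping the data, and no amount of network engineering removes the speed-of-light floor on a cloud round trip.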
But I think the most interesting one for us at Intel is the notion of privacy, not just for regulatory reasons, but also sort of "right thing to do" reasons. If a particular end user has preferences, or you’re inferencing on an individual end user’s preferences to improve the efficacy of an AI chatbot and give better responses, you don’t necessarily want that in the cloud. And if you’re an enterprise, you definitely don’t want your private, confidential, proprietary information jumbled into the giant chatbot-in-the-sky kind of thing. And you know, Microsoft and OpenAI know this. This is why the creation of specific copilots, for example, for enterprises becomes an important aspect. So all three, Camille, are important to get done.
Camille Morhardt 06:07
So what is the status of Moore’s Law today?
Chris Kelly 06:11
Yeah, it’s alive. But it’s morphed a little bit.
Camille Morhardt 06:15
What is it? Remind us what Moore’s Law is.
Chris Kelly 06:17
Yeah, so essentially doubling the density of transistors in any given unit area every roughly two years. Some generations it’s two; some generations it’s a little longer than two. There have been a couple of generations where we’ve been able to improve even on that. But you know, the goal of all of us in the semiconductor industry is to keep the physical aspects of Moore’s Law alive. And we can continue to do that; the advanced EUV techniques that are being used are pretty amazing. I mean, almost bordering on science fiction stuff.
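The doubling cadence described here compounds quickly; a minimal sketch of the arithmetic (an illustration of the stated law, not any particular roadmap):

```python
# Moore's Law as described above: transistor density roughly doubles
# every two years. Illustrative arithmetic only.

def relative_density(years: float, doubling_period: float = 2.0) -> float:
    """Transistor density after `years`, relative to today (1.0)."""
    return 2.0 ** (years / doubling_period)

# Five doublings in a decade: 32x the transistors in the same area.
print(relative_density(10))   # -> 32.0
```

Run the same function out twenty years and the idealized factor is 1024x, which is why even small slips in the doubling period matter so much over a product generation.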
Now, there’s a corollary, which we have to acknowledge, which is that it is getting more expensive to produce a wafer in order to continue down the path of physical Moore’s Law reductions, right? And there’s enough demand, and there are enough requirements in the industry, to continue reducing that cost. But every time you have to jump over a big technology leap, it stands to reason that things get more expensive, certainly for a time. And so for advanced-node manufacturing, it’s more expensive to make a wafer. How do we compensate for that? We compensate for that change by using disaggregated technologies, chiplet technologies on package, to mix and match processes and suit those processes to taste for the products we need to make. Think about a microprocessor, an Intel product: it’s not just CPU cores. We have a bunch of IPs on that particular product: graphics engines, NPUs, a bunch of IO technologies, media accelerators, etc. Not each of those IPs needs an advanced node at the same pace that, say, a CPU core does, or a GPU may need to.
And so to balance cost, R&D investment, and reuse opportunities, we use a disaggregated architecture to put the right IP on the right transistors. And as costs morph and increase that way, we can also mitigate having to move everything to an advanced node where you may not necessarily need the advancement.
It’s funny, we don’t often talk about packaging technologies, or packaging and test technologies, as part of Moore’s Law, but in my view, they absolutely are. There have been as many or more advances in advanced packaging: stacked memories, direct die-attach technologies, IO that allows dies to sit right next to each other on a single package. And that whole combination, that whole sort of cocktail, makes it such that you can continue to see the advances in Moore’s Law generally.
Camille Morhardt 08:39
At a high level, what are some of those sci-fi evolutions in actual chip production or transistor size reduction?
Chris Kelly 08:47
Yeah. If you think about the history of how we’ve made transistors, everything started out sort of on a single plane. And the high-level changes that have happened over the course of, what, the last 10-15 years have really been twofold. One is looking at different, call them exotic, or just other elements of the periodic table, making chemistry changes to base transistors and to transistor and metal construction. That’s number one. The other is taking that essentially planar transistor construction technology and making it 3D. So we started using fins and FinFETs, and now you’re starting to see the evolution of transistors where, instead of gates and gate technology being only in one plane, or one and a half planes, you’re starting to see a full 3D gate technology where we can get more contact between the gate and the channel, and densify transistors.
So it’s sort of an all-in thing to me: chemistry, physical construction, the metal stack and metallization. Deposition and metals also have to keep up with that; you have to interconnect all the transistors to be able to get signals in and out of them. You know, I don’t work in that part of the company, I work in the design part of the company, and I’m always in awe of some of the stuff our colleagues over in the TMG world come up with, and what we’re able to do to advance the battlefront of technology. It’s pretty amazing.
Camille Morhardt 10:04
That is pretty spectacular. So what other things should people be thinking about or aware of when we think of client computing and moving forward?
Chris Kelly 10:13
I think it’s a really exciting time to be working on PCs. You know, people have declared the death of the PC many times, and I’m sorry to say they’re wrong. The PC has never been a more vital or important device in people’s lives, not just through the sort of explosion of video conferencing technologies when we all couldn’t come into an office. The PC and the PC platform is where you go to do your most important, valued work, the work you’re going to be judged on, right? So if you’re in an enterprise environment, and you want to do something and produce content that you’re going to be judged on the outcome of, you’re probably not doing that on anything other than your PC platform. A PC is the ultimate Darwinian device. It’s the device in the endpoint stack that evolves the fastest and is capable of handling and managing through evolutions and changes, and we still believe that to be true.
Of course, the number one big change you’ll see in PCs over the next three or four years we’ve talked about briefly already, which is the advent of AI capabilities and the AI PC era sort of beginning. If we’re honest, you’ve been able to run AI content, actually AI models, on PCs for a while. There are instructions that we built into our CPUs that handle sort of matrix calculations. As the AI supernova continues apace, we need a lot more capability to be able to run copilot models and helper models, transformers. We’re not even in the middle innings of AI evolution on PCs; we’re just beginning the game, if you will. And so that’ll be the big change that happens over the next two or three years, Camille: adding AI capability, adding end user and developer capabilities to develop on them.
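The matrix calculations mentioned here are the core of inference: a neural-network layer is essentially a matrix multiply followed by a nonlinearity. A minimal pure-Python sketch of that operation, for illustration only (the toy input and weights are invented; the real accelerated instruction paths are outside its scope):

```python
# A neural-network layer boils down to a matrix multiply plus a
# nonlinearity; CPU matrix instructions accelerate exactly this kind
# of inner loop. Pure-Python sketch for illustration only.

def matmul(a, b):
    """Multiply an m x k matrix by a k x n matrix (lists of lists)."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

def relu(mat):
    """Elementwise ReLU, the nonlinearity applied after the multiply."""
    return [[max(0, x) for x in row] for row in mat]

# One toy "layer": a 1x3 input activation times a 3x2 weight matrix.
x = [[1, -2, 3]]
w = [[1, -1],
     [2,  0],
     [0,  3]]
print(relu(matmul(x, w)))   # -> [[0, 8]]
```

A real model stacks thousands of much larger layers like this, which is why dedicated matrix hardware, rather than a Python loop, is what makes local inference practical.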
But the thing about AI is that AI is an enhancement, or horizontal augmentation, to all the things you do with your PC today anyway. So, like, you’ll have AI-enhanced video effects to make people like me look way better than we do when we talk about stuff like this. Whether it’s presentations or Word documents or code, the work that you do and the workflow that you do will end up being enhanced by AI functions. So AI is not so much an application class on a PC as it is augmentation to workflow, and that will be able to spawn either new end-user applications or new ISV value as they add AI to their existing workflows.
So the thing that gets me the most excited about AI and PCs is to see what our ISVs are going to do with it: once the capability exists, and you can count on having the ability to run a model on a PC pretty regularly, what are our ISVs going to do with the capability that we provide?
Camille Morhardt 12:54
Being independent software companies?
Chris Kelly 12:57
Yeah. ISVs, Independent Software Vendors. Sorry for using the jargon, but yeah, that’s right. Adobe, Autodesk, the web and web workload folks that are making apps on the web using web APIs. All of those applications will be enhanced by AI capabilities locally on the client. And that’s the exciting thing.
Camille Morhardt 12:53
Very cool. Well, Chris Kelly, VP of Client Architecture, and also GM of Platform Software Strategy and Development, it’s been great to have you on the show. Thanks for joining us.
Chris Kelly 13:26
Thanks for having me, Camille. Good to see you. Be well, take care.
Camille Morhardt 13:29