Matt Adiletta 0:12
We see an emergence now of a need for a private network that manages the AI's data along with your normal data center network.
Tom Garrison 0:28
Hi, and welcome to the InTechnology podcast. I’m your host, Tom Garrison. And with me, as always, is my co-host, Camille Morhardt. And today we have a very special guest, Matt Adiletta. Matt is a Senior Fellow at Intel focused on the Data Center of the Future. He has experience in embedded software, data analytics, silicon microarchitecture, system design, networking, and ASICs. And he is also the acting CTO for the Data Center and AI group.
So welcome to the podcast, Matt.
Matt Adiletta 1:02
Thanks, Tom.
Tom Garrison 1:04
We worked together many, many years ago, so it’s great to see you. And we’ve got a great topic today, and that is the data center of the future. Why don’t we just start with a basic, high-level description: what is the data center of the future? What problems is it trying to solve that are different from the data centers we have today?
Matt Adiletta 1:25
Projecting what’s going to happen in the future is really a tough thing. But we can see what’s happening, what the trends are, and how we’re going to address them. What I see in the data center of the future is scale; the scale of data centers has just been tremendous. And as more applications and more work move to the cloud, that scale is increasing. But it’s interesting: if you talk to the CSPs, the key thing that is first and foremost to them is security. People will go to the cloud and to these data centers of the future only if they really feel that their data, their application, their presence is secure.
The second thing that I see about the data center of the future is that it has to have consistency; it has to operate the exact same way every day, every moment that you go to it. So it has to have a consistency of behavior and response to you. And then you have to have this belief that, in addition to being secure, you’re private, so people can’t see what you’re doing. So there’s this inherent privacy, then availability or reliability, and then performance.
And so now, if you look at it from that forward-looking view: how do we do this at scale? And what do we have to do from an Intel perspective, since that’s the perspective I’m looking at it from? What does that data center look like? What the data center of the future is really trying to do is take a systems view: how do I provide consistency across all those attributes I described earlier?
Tom Garrison 2:55
So it’s thinking beyond just the compute part of a server, which is obviously important, and thinking about networking, storage, the entire system at a high level, and figuring out how to maximize efficiency out of that.
Matt Adiletta 3:15
Exactly. There are multiple views you have to take of the data center of the future. The first view is, as I was describing, that of a customer going into the cloud as an example of that data center. And the second view is, if you own the cloud and you’re looking at vendors to put into it, what are those value vectors? Their big value vector is, “Okay, I gotta be secure, I gotta be private, I gotta have consistency, I gotta have performance. But my metric of goodness will be how much it costs me.” So I need all of those attributes provided at a best-in-class cost. And that ends up being something that references the TCO, the total cost of ownership, which includes acquisition costs as well as operational costs. And Tom, what you just said before about the storage, the networking, the compute, the accelerators: that all comes as part of that data center of the future.
And what’s really kind of interesting is that if you look at the evolution of data centers, there was a point in time when data centers had an Ethernet network for computing and talking amongst themselves, and a Fibre Channel or storage area network that would operate for storage. Over time, those networks converged because of operational costs and expenses and ways to optimize that. And what we see happening again, with things like large language models or ChatGPT as an example, is that you need to take a bunch of accelerators and interconnect them in a very highly connected fashion in order to perform the function. So where we had converged from multiple networks and different management networks to a single network within the data center, we see an emergence now of a need for a private network that manages the AI's data along with your normal data center network.
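To make that “highly connected accelerators” point concrete, here is a minimal sketch of ring all-reduce, the kind of collective operation these private accelerator networks typically exist to carry. It simulates the accelerators and the ring in plain Python; the function and data are illustrative, not any particular interconnect’s API.

```python
# Toy simulation of ring all-reduce across N "accelerators".
# Each rank starts with its own gradient vector; after the two phases,
# every rank holds the elementwise sum of all vectors.

def ring_all_reduce(grads):
    n = len(grads)                       # number of accelerators in the ring
    size = len(grads[0])
    assert size % n == 0, "vector must split evenly into n chunks"
    step = size // n                     # elements per chunk
    chunks = [list(g) for g in grads]    # mutable copy per rank

    # Phase 1: reduce-scatter. In step i, rank r sends chunk (r - i) % n to
    # its right-hand neighbor, which accumulates it. Payloads are snapshotted
    # first so all "sends" in a step happen simultaneously, as on hardware.
    for i in range(n - 1):
        sends = []
        for r in range(n):
            c = (r - i) % n
            sends.append((r, c, chunks[r][c * step:(c + 1) * step]))
        for r, c, payload in sends:
            dst = (r + 1) % n
            for j, v in enumerate(payload):
                chunks[dst][c * step + j] += v

    # Phase 2: all-gather. Rank r now owns the fully summed chunk (r + 1) % n
    # and circulates it around the ring so everyone ends with the full sum.
    for i in range(n - 1):
        sends = []
        for r in range(n):
            c = (r + 1 - i) % n
            sends.append((r, c, chunks[r][c * step:(c + 1) * step]))
        for r, c, payload in sends:
            dst = (r + 1) % n
            chunks[dst][c * step:(c + 1) * step] = payload
    return chunks

grads = [[1, 2, 3, 4, 5, 6],
         [10, 20, 30, 40, 50, 60],
         [100, 200, 300, 400, 500, 600]]
print(ring_all_reduce(grads)[0])   # [111, 222, 333, 444, 555, 666] on every rank
```

The reason this pattern wants its own network: each of the 2*(N-1) steps pushes 1/N of the data across every link at the same time, so the accelerator fabric needs high, uniform, neighbor-to-neighbor bandwidth rather than the general-purpose topology of the main data center network.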
But what we don’t want to do in this data center of the future is have that same problem we had before, where we had two different networks. So you want consistency, a security model, a privacy model, and a data center operational model that provide this differentiated capability, but without doing it with different things.
Camille Morhardt 5:22
It sounds like AI is driving a lot of these newer needs or newer requirements in the data center. I’m wondering, conversely, are there some kinds of architectures or things that are now possible or in development in the data center that are going to change what’s feasible from an AI perspective?
Matt Adiletta 5:40
Yeah, I have to say, Camille, AI is obviously just exploding, right? And what are the limits to the AI? What we see is that these AI models are growing in size and memory footprint. And there are a couple of really interesting things here–not to get too down in the weeds–we continue to show the ability to add more cores, integrate, shrink dies, and follow Moore’s Law density. DRAM and memory are having more of a challenge; they hit the wall a little earlier. But both are hitting a power wall. And that power wall, as you go to these complex large language models, is a substantial problem. So for cloud service providers that are trying to deploy inference and AI capabilities at scale, power is becoming a really big thing. And while everyone’s working towards solar and renewable power supplies, it’s still a key thing just to move the power around and have a reliable source of it. So everyone’s looking at how we deliver these complex algorithms at the most power-efficient points. And I think there is incredible innovation going on now in that space, specifically in how you do management of the problem, as well as memory and what the compute engines look like.
So within the data center of the future, I mentioned earlier the two networks; that little private network over here is actually interconnecting all those accelerators that are performing some of these functions. And whoever does that acceleration in the most power-efficient, consistent, secure, reliable fashion, I think, will be the winner. And that’s where a lot of our focus is now: how do we deliver that capability in a really efficient manner? We have a lot of interesting tools. If you look at the needs of bandwidth and power and cooling and compute, all put into the silicon, this new AI stuff is driving technology development at such a rapid rate. It’s really exciting.
Tom Garrison 7:37
So let’s talk about power a little bit. My understanding, back from my data center days, is that back then we were really focused on getting power into the data center. People that are in the server world probably get this, but for people that aren’t, this will be news. Getting the power into the data center is a huge problem–just getting enough power. Think about these potentially millions of servers all in one place: getting enough electricity in from whatever sources you have is a huge problem. And then power goes through your chips, and as your chips do whatever it is they’re going to do in the data center, they generate heat. Heat is another form of power, and you’ve got to get the heat out of the data center. Otherwise, everything burns up and melts and causes all kinds of issues.
So can you talk a little bit about the technology, in the context of the data center of the future? How do things like cooling technologies–immersion, for example–play a role? And can you describe what that is? Because I don’t think people can grasp what that even looks like.
Matt Adiletta 8:44
If you look at servers, at one point, when I first started working on them, we were at like 45 watts, and then we went to 65 watts, and now we’re talking 450 watts as something that we will be selling now. And if you look to the data center of the future, I see some of these accelerators wanting to be close to 1,500 watts. So you get 1,500 watts in this little area that needs to be cooled.
What we’re looking at now is liquid cooling. And there are two different types of liquid cooling that people are looking at. The first is, as you said, Tom, immersion. Racks today are six-foot-tall racks with 40 to 42 slots for putting servers in. We would take that rack and turn it on its side, so basically it looks like an old freezer chest. What they do is fill it with a mineral oil and cool things by taking what was the top of the rack and just putting the shelves in, in the mineral oil. The mineral oil doesn’t affect the electrical nature of anything, but it has this ability to take the heat, and so you move and pump the fluid. So with immersion, you can deal with that.
Or you go into this crazy stuff called “two phase,” where you’re putting things in a tank and there are these new, amazing engineered fluids for which you can set the boiling point. So, just like before, you have a board with a chip on it, and the fluid is next to it. But you’ve set the boiling point such that the fluid boils at the temperature where you want to keep the chip. And what’s amazing is that when that bubble forms and moves off the surface, it takes the heat with it–it effectively breaks the laminar flow and takes it away. In “two phase,” you take a surface and try to get a boiling enhancement coating on it that produces these little bubbles. And the little bubbles are all bubbling off, and that bubbling takes all the heat away. Then you recover that fluid from the vapor, it comes back in, and you recycle it. So you have this loop: evaporation, or bubble and boil, and then condense back to the fluid.
But the engineered fluids are very expensive. So now the question is, what’s the best way to get this heat out? Right now there’s a big push on cold plates. Servers are typically flat; you put the cold plate on top, and you run a fluid through the cold plate. And you can have either single phase, where you’re running very clean water through it and flowing the cold water fast enough that you take the heat away, or you do a “two phase,” where it boils inside, the vapor escapes, and then you recover it and bring in more fluid. What’s happening is everyone’s moving towards one of those three cooling systems. But this cold plate stuff gets much closer to how the silicon has to operate.
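To put rough numbers on why 1,500 watts pushes you toward liquid, here is a back-of-the-envelope sketch of the coolant flow a single-phase cold plate implies, using the standard relation Q = ṁ·c_p·ΔT. The 10 K temperature rise and the air comparison are illustrative assumptions, not figures from the conversation.

```python
# Back-of-the-envelope: coolant needed to remove 1,500 W with a
# single-phase cold plate, from Q = m_dot * c_p * delta_T.
# All numbers below are illustrative assumptions, not product specs.

HEAT_W = 1500.0      # heat to remove, watts (the accelerator figure above)
CP_WATER = 4186.0    # specific heat of water, J/(kg*K)
DELTA_T = 10.0       # assumed coolant temperature rise through the plate, K

m_dot = HEAT_W / (CP_WATER * DELTA_T)   # required mass flow, kg/s
print(f"water: {m_dot:.3f} kg/s (~{m_dot * 60:.1f} L/min)")
# -> about 0.036 kg/s, roughly 2.2 L/min for one 1,500 W device

# For contrast, moving the same heat with air at the same 10 K rise:
CP_AIR, RHO_AIR = 1005.0, 1.2           # J/(kg*K), kg/m^3
air_m_dot = HEAT_W / (CP_AIR * DELTA_T)
print(f"air:   {air_m_dot / RHO_AIR:.3f} m^3/s")  # ~0.124 m^3/s, hundreds of CFM
```

A couple of liters per minute of water versus hundreds of cubic feet per minute of air, per device, is the gap that immersion, two-phase, and cold plates are all trying to close.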
Camille Morhardt 11:28
We’ve been talking about hardware architecture, and I want to ask about software architecture as well. Could you spend a minute talking to us about some of the emerging trends in helping to deliver software efficiently?
Matt Adiletta 11:40
So in the software space, what’s happening is we are going from programs that were linear and procedural on one machine to, in the data center of the future (which is actually today), an application where part of it is running on this machine but most of it is running on a bevy of other machines, and those are reporting back. Now, when you change your mental model of optimization for a data center of the future from a monolithic program that runs on this machine to a small piece here and lots out there, all reporting back, then you have a different view of what you have to value in the processing here, as well as what you value in the network. That delineation becomes really interesting when you bring it back to the security thing and the privacy thing I was talking about. One of the things we’re trying to do–and CSPs are doing a great job at this right now–is disaggregate the infrastructure of running the cloud from the processor that runs the application. It used to be that you’d have a processor, and then you’d come out over a network to another machine over here. And that network was Ethernet, or is Ethernet, and you come out to that distributed machine over here.
So now the question is, who manages the packetization of data and information from this application to that worker, distributed out here through this network? In order to manage that communication, the infrastructure owner, the CSP, has to manage that network, which means they have to have some control over the network packetization and security and the mechanisms there; which means the infrastructure provider is actually sitting on the processor that you’re going to run your application on. And in order to really deliver that level of security and privacy we were talking about earlier, what we really want to do is take that infrastructure software off of the application processor and move it into an infrastructure processor. Hence, what we’ve introduced in the data center of the future is this IPU, the infrastructure processing unit. What it’s doing is taking the software that the CSP or service provider uses to manage and deliver the data center capability to users, taking it off the application processor, and putting it in a processor in front of it, in between here and that distributed node. And on that other side–that distributed node–there’s another IPU sitting there before you get to the worker that actually delivers that procedural function. So now we have this disaggregation of the infrastructure and the network coming back here, which means software has to change.
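As a very rough illustration of that split, here is a toy sketch in which the application only hands off payloads, while a separate “infrastructure” stage, standing in for the IPU, owns sequencing, packetization, and integrity protection. The queue boundary, the SHA-256 stand-in for real crypto, and all the names are illustrative; this is not the IPU’s actual interface.

```python
# Toy model of the disaggregation described above: the application hands off
# payloads and never touches the network; a separate "infrastructure" stage
# (standing in for the IPU) owns sequencing, framing, and integrity tags.
import hashlib
import queue
import threading

to_infra = queue.Queue()    # the boundary between tenant and infrastructure

def application():
    """Tenant code: knows nothing about packet formats or the CSP's keys."""
    for msg in [b"do step 1", b"do step 2", b"do step 3"]:
        to_infra.put(msg)
    to_infra.put(None)      # signal that the application is done

def infrastructure(wire):
    """CSP code, off the application processor: frames and protects packets."""
    seq = 0
    while (payload := to_infra.get()) is not None:
        header = seq.to_bytes(4, "big")
        tag = hashlib.sha256(header + payload).digest()[:8]  # stand-in for real crypto
        wire.append(header + payload + tag)                  # "transmit"
        seq += 1

wire = []
infra = threading.Thread(target=infrastructure, args=(wire,))
infra.start()
application()
infra.join()
print(f"{len(wire)} packets left via the 'IPU'; the application never framed one")
```

The design point is the boundary itself: everything on the infrastructure side of the queue can be lifted off the application processor onto separate hardware without the tenant code changing.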
Software has got this distributed mechanism to it. And now you have to synchronize, because you could have lots of things going on. You could run an application that says, “I want to do these five things,” and you go out and ask five people; you can’t go on until all five come back to you. So there’s this exchange capability. And if you’re designing the processor over here, you now have a different set of optimizations for that processor than if you had a monolithic program just operating here. Now you have to manage interrupts and control structures and move data remotely and receive data remotely in a much more efficient manner, because now it’s all about data movement and interactions with others.
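That “ask five people, wait for all five” shape is the classic scatter-gather pattern. A minimal sketch, with a hypothetical do_remote_work standing in for an RPC to a distributed worker:

```python
# Minimal scatter-gather sketch: fan a request out to five workers, then
# block until every one reports back. do_remote_work is a hypothetical
# stand-in for a call to a remote worker node.
from concurrent.futures import ThreadPoolExecutor
import random
import time

def do_remote_work(task_id: int) -> str:
    """Pretend remote call; a real system would hit another machine."""
    time.sleep(random.uniform(0.01, 0.1))   # simulated network + compute latency
    return f"result-{task_id}"

with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(do_remote_work, i) for i in range(5)]  # scatter
    results = [f.result() for f in futures]  # gather: blocks until all five return

print(results)   # only now can the application move on
```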
Tom Garrison 15:06
So Matt, obviously there’s a lot there. The amount of change is crazy to actually wrap your head around. I always think it’s good, when we’ve got somebody like you who has their hands on all these details, to just ask: what do you see in the future that really excites you? In the context of things like what artificial intelligence will be able to deliver, or what the data center of the future is going to do, what really excites you about the future?
Matt Adiletta 15:07
It’s often from youth that you get the most learnings. I have a son, and I sat with him the other day. He had paid the $20 for a ChatGPT license, and he was doing some pretty complex programming in the middle of a pretty interesting Linux kernel area of coding. He was having problems with it, and he asked me, “Hey, Dad, can you take a look at this?” We started looking at things, and then he said, “I wonder if ChatGPT-4 could help us with this.” We asked ChatGPT a question, and it asked us a question back, which was really interesting. We asked, “How would you like to see that data?” and it said, “Run this,” and produced some code that we ran on his PC to generate a log file. We generated the log file and copied it back into ChatGPT, because it wanted to see the log file. And then it said, “Here’s the code you should run” and gave us new code. We took the code, put it in, and the bug he was having went away.
We can do things now that only our imagination will keep us from extending to; we’re going to have such expert assistance on things. It’s going to be crazy. And I think figuring out how to leverage and use these tools effectively is going to unlock amazing innovations. I think it could help us on the thermals. I see having this new partner of expertise available as just an incredibly exciting thing, and I see this opportunity for great innovation coming.
Tom Garrison 17:07
That’s great. Well, hey, Matt, thank you so much for spending the time with us today. It’s a great topic, and you’re a great guest, explaining all this stuff at different levels. So thank you, and I look forward to, hopefully, all the things you talked about coming true.
Matt Adiletta 17:24
Thanks, Tom.