[00:00:30] Tom Garrison: Hi and welcome to Inside Podcast. I’m your host, Tom Garrison. And with me as always is my co-host Camille Morhardt, and today we have an amazing guest, Tamar Eilam. She is an IBM Fellow and chief scientist for sustainable computing. She’s been at IBM Research since 2000 and has 20 years of experience in pioneering disruptive technologies. She’s passionate about applying innovation to address the biggest problems humanity faces today. So welcome, Tamar.
[00:01:01] Tamar Eilam: Thank you. Thank you for inviting me. That’s a great opportunity.
[00:01:05] Tom Garrison: Yeah. We’re excited to talk about this today, because we’re going to talk about sustainability and try to really level-set folks on what is sustainability and why does it matter? And those sorts of things. So let’s first though, start off with just the basics, from your point of view, as an expert in the field, what is sustainability and why is it so important when it comes to computing devices?
[00:01:36] Tamar Eilam: Happy to talk about it. This has been my passion since 2019. The reason why I decided to switch from working on microservices and cloud computing to focus on sustainability is because climate change is the number one most important challenge that humanity is facing today. And in order to change the trajectory that we’re on today, the number one thing that humanity can do is to reduce the carbon in the atmosphere and reduce the carbon emission that is the result of human activity.
Sustainability is about mitigation and adaptation to climate change. If we look at the adaptation piece, it’s all about predicting climatic events and knowing how to respond to them and being prepared for climate change. When it comes to mitigation, the number one thing we need to do is really to reduce the carbon in the atmosphere. So the issue that is specific to computing is that really, we are at an inflection point where there are multiple trends that are obvious and that are happening today.
And one is the exponential growth in data and data transfer. Everything is on Zoom and all these video transfer games and so on. So that’s obvious and that, of course, comes with energy use and emissions and so on. Then, there are the new emerging workloads in the cloud that are very energy hungry, such as AI. So if you look just at AI, which is obviously very popular for very good reasons, the energy for training AI jobs doubles every three to four months.
[00:03:26] Tom Garrison: Wow.
[00:03:27] Tamar Eilam: That’s crazy. And look, AI’s an amazing tool and we in IBM are all embracing AI. We’re doing AI, we’re living AI and that’s because AI can help us actually really face all of these challenges, including discovery of material for carbon capture, including analyzing satellite images and predicting climatic events. AI is a great tool. However, with power comes responsibility, as I like to say. So how can we use AI responsibly and how can we really work to make AI more efficient?
That’s the second trend. And then, the third trend has to do with the demise of Dennard scaling, also known as the flattening of Moore’s law. And basically, what that says is that we cannot continue to expect to get the efficiency improvements from general purpose computing chips, like we used to get every two years, you get more energy efficient and more energy efficient and more energy efficient. And that’s because we reached the limits of physics and that’s why there is a move to specialized systems and I will talk about specialized systems and why it’s important later, but because of these three trends, this has caused some to raise the alarm on the increasing energy consumption of computing in general.
In fact, the Semiconductor Research Corporation published a report, the Decadal Plan report, and basically, the bottom line there is that the energy for computing overall is growing at a faster rate than the energy that we’re producing, period. And that’s a problem because obviously we need more computing, but we need more efficient computing.
[00:05:24] Tom Garrison: We’ve been talking about data centers and the same applies also on clients. So the client devices themselves, you have the amount of energy and amount of carbon footprint associated with building the device. And then, you have the amount of energy that it takes to operate the device. And I know one of the things, at least from my dealings in this world, I was really taken aback. For example, on a laptop, over the device’s entire life, most of the carbon footprint is associated with building the device. Only 20% is in use. On the server side, it’s exactly the opposite: 20% is embodied in building the server and 80% is over its operational life, and the interesting thing is that a server’s carbon footprint is about 10X what a client’s is over its life. So if you want to reduce the carbon, you would have a very different set of actions to try to reduce that carbon footprint, depending on the type of device you’re using.
[00:06:34] Tamar Eilam: Exactly, exactly. And I was also astonished to discover this fact, and indeed, you’re introducing the second dimension–the energy for IT, the power usage effectiveness and the source of the energy, carbon emission factor. But another dimension here is the entire lifecycle of that compute device.
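[Editor's illustration, not part of the conversation] The dimensions named here (energy for IT, power usage effectiveness, the carbon emission factor of the energy source, and the embodied footprint over the lifecycle) can be combined in a rough back-of-the-envelope calculation. The sketch below is only an illustration: the function names and all numbers are hypothetical, chosen so that the embodied share comes out at the 20% figure Tom mentions for servers.

```python
def operational_carbon_kg(it_energy_kwh, pue, grid_kg_co2e_per_kwh):
    """Operational emissions = IT energy x PUE x carbon emission factor."""
    return it_energy_kwh * pue * grid_kg_co2e_per_kwh


def embodied_share(embodied_kg, operational_kg):
    """Fraction of the lifecycle footprint that comes from building the device."""
    return embodied_kg / (embodied_kg + operational_kg)


# Hypothetical server: 4,000 kWh of IT energy per year for 4 years,
# a data center PUE of 1.5, and a grid at 0.4 kg CO2e per kWh.
server_operational_kg = operational_carbon_kg(4000 * 4, 1.5, 0.4)

# Hypothetical embodied footprint of 2,400 kg CO2e for manufacturing.
server_embodied_kg = 2400
server_embodied_share = embodied_share(server_embodied_kg, server_operational_kg)

print(f"operational: {server_operational_kg:.0f} kg CO2e")
print(f"embodied share of lifecycle: {server_embodied_share:.0%}")
```

With a split like this, lowering PUE or moving to a cleaner grid attacks the larger term for a server, while for a laptop the larger term is the embodied one, so extending device life matters more.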
[00:06:54] Camille Morhardt: Can you explain what kinds of things, what kinds of practices we’re doing globally that are not helping us over the last 20 years, and then what kinds of things are truly going… Do you think we’re there? You said it’s a tipping point for the climate, but is there a tipping point for us and what we can do with technology at this point, or are we just still marching down this path with the arrows going up and down?
[00:07:20] Tamar Eilam: I’m looking at it more from an outsider, so I can observe trends I think more than someone who is always doing this. And what I’m seeing is that the systems guys, hardware, they always have energy in mind. Always. When you talk with systems guys, with the people that are actually designing chips, they always try to improve the energy efficiency. They always try to have the chips smaller and more energy efficient. Not news for them. They’ve been doing it forever. Software people, people that are actually writing code have absolutely no idea. With the years, we see the introduction of new operating systems and new platforms and cloud computing and serverless and all these trends and some of it is good. So for example, when you look at cloud computing, in general, cloud computing is good for sustainability. Why? Because it’s more efficient. When you have these hyperscalers, they have more efficient hardware, they’re purchasing more renewable energy, but they also have automation.
So because they’re automating and they also have economy of scale. So workloads of different clients can run on the same hardware. If they’re smart about it, they can do very sophisticated multiplexing, so they get more efficiency, more utilization, and the automation, grow and shrink, maybe power down machines. So that’s why cloud is actually very good for sustainability. However, in general, with software, because there is lack of awareness, we’re reaching the point where we have more and more specialized systems that introduce latencies of microseconds, such as different accelerators and non-volatile memory technologies and so on. But because of the levels of indirection that you have in the software stack, we’re not always able to utilize it in the best way, so we need to catch up with software awareness of sustainability.
[00:09:19] Camille Morhardt: Thank you for that specific answer on computing and then, I guess just, do you have a more general answer as well for technology?
[00:09:29] Tamar Eilam: The way I look at it, technology is our only chance to combat climate change. We can’t undo the industrial revolution and we can’t undo the industrial revolution because we have too many people living on earth and we can’t feed all these people. So we can’t reverse the clock. So technology and innovation is our only chance to combat climate change–and science, obviously. So we’re making advances. We need to obviously completely transform the power grid. We need to introduce optimization in the way we’re leveraging renewable energy, which is the big, big, big thing. The thing with renewable energy is that it’s dynamic by its nature. It’s not constant, so it’s not predictable. When you have wind, then you have wind power. When you don’t have wind, you don’t have wind power. If you have clouds, you’re getting less of solar energy. So, how to deal with this unpredictability? We need to have ways to control or manage the load and to be able to plug it into our smart cities in the best way.
And we’re not there yet, but what is going to save us? The only chance we have is to introduce AI, introduce innovation in order to manage this. Now, one place where I think that is the best match for the dynamics of renewable energy is cloud computing. Why? Because of its dynamic nature, workloads can actually move with containers, workloads can actually move to different zones, to different regions, and different tasks can run at different times. So you can actually match the dynamicity of cloud computing with the dynamicity of the grid and thus, maximize the usage of renewable energy. You can’t move a hospital, you can’t take the hospital and move all the patients to a different region where you have more renewable energy at certain times of the day. It’s not even that you have more, it’s that you have more at certain times of the day, but you can do it with cloud workloads and I know many people, including us, are developing the technology to do that. We’re spending more and more energy as humanity. It’s not going down. We need to match it with innovation to be able to address these challenges.
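[Editor's illustration, not part of the conversation] The idea of moving a deferrable cloud workload to the region and time of day where the grid is cleanest can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the technology IBM or anyone else is building: the region names, the forecast values, and the function `pick_greenest_slot` are all hypothetical, and a real system would pull carbon-intensity forecasts from a grid data provider.

```python
def pick_greenest_slot(forecast, deadline_hour):
    """Return the (region, hour) with the lowest forecast carbon intensity
    among slots that start no later than deadline_hour.

    forecast maps (region, hour) -> grams of CO2e per kWh (hypothetical data).
    """
    candidates = {slot: g for slot, g in forecast.items() if slot[1] <= deadline_hour}
    if not candidates:
        raise ValueError("no slot satisfies the deadline")
    return min(candidates, key=candidates.get)


# Hypothetical forecast: region-a has strong solar output at midday.
forecast = {
    ("region-a", 2): 420,
    ("region-a", 13): 180,
    ("region-b", 2): 310,
    ("region-b", 13): 250,
}

flexible_job = pick_greenest_slot(forecast, deadline_hour=23)  # can wait all day
urgent_job = pick_greenest_slot(forecast, deadline_hour=6)     # must start by 06:00

print(flexible_job, urgent_job)
```

The flexible job lands on the midday slot in the sunnier region, while the urgent job settles for the cleanest region available before its deadline. That flexibility is what a hospital does not have.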
[00:11:48] Tom Garrison: So Tamar, you talked a bit about the hardware elements and I think most people, when they think about sustainability, they think about things like, I don’t know, recycling and whatever. It’s physical. What I don’t think people necessarily internalize as much in the technology world is the importance of software and how you can have the exact same device, physical device, running certain code and having a vastly different carbon footprint than the exact same machine running different code. And so, can you speak a little bit about where we are from an industry standpoint on sustainable software, sustainable code that runs on these devices?
[00:12:39] Tamar Eilam: I think we’re really just scratching the surface. We’re really just starting. From a high, high, high level, you need to divide two different things. There is the actual code that implements the algorithm. Obviously, if you have an algorithm that is more efficient, because you use more efficient way of solving a problem, then it’s going to consume less energy. Then, there is how you deploy the software. So, if you have… Most of the applications today are distributed. Where you put the data and where do you put the compute? You want to put them in the same location. So this is called co-placement of that data and compute, so you save on the communication overhead here. And then, there is the aspect of management of the code. So everything around the management is so important and I don’t think that people realize it, but if you take a platform such as Kubernetes, there is a placement algorithm there.
So, code is packaged as containers. Containers are lightweight, that’s a good thing. But then, as you get more and more pods, there is a component there that places the codes on the nodes. How you place the codes on the node will have an effect on efficiency, how you determine the size of the container because the users don’t know, so they come up with the size for the container, it’s too big. So vertical scaling is what we call it. Dynamically changing the size of the container will have an effect on efficiency. So as I said, it’s not only how you write your code, it’s how you deploy a distributed application. And then, it’s how you manage the entire thing.
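[Editor's illustration, not part of the conversation] The effect of placement and container sizing on efficiency can be shown with a toy example. The sketch below uses first-fit decreasing bin packing, a textbook heuristic chosen here purely for illustration; it is not the Kubernetes scheduler's actual algorithm, and the pod names, CPU requests, and node capacity are hypothetical.

```python
def first_fit_decreasing(pod_cpu_requests, node_capacity):
    """Place pods on nodes, largest request first, opening a new node only
    when no existing node has room. Returns (placement, nodes_used)."""
    free = []        # remaining CPU on each node currently in use
    placement = {}   # pod name -> node index
    ordered = sorted(pod_cpu_requests.items(), key=lambda kv: kv[1], reverse=True)
    for pod, cpu in ordered:
        if cpu > node_capacity:
            raise ValueError(f"{pod} requests more CPU than a node provides")
        for i, remaining in enumerate(free):
            if cpu <= remaining:
                free[i] -= cpu
                placement[pod] = i
                break
        else:
            # No existing node fits this pod, so bring up another one.
            free.append(node_capacity - cpu)
            placement[pod] = len(free) - 1
    return placement, len(free)


NODE_CPUS = 4  # hypothetical node size, in CPU cores

# Requests as users guessed them (too big) versus after vertical right-sizing.
oversized = {"a": 3, "b": 3, "c": 3, "d": 3}
rightsized = {"a": 2, "b": 1, "c": 2, "d": 1}

_, nodes_oversized = first_fit_decreasing(oversized, NODE_CPUS)
_, nodes_rightsized = first_fit_decreasing(rightsized, NODE_CPUS)

print(nodes_oversized, nodes_rightsized)
```

In this toy case the oversized requests keep four nodes powered on, while the right-sized ones fit on two, which is the kind of saving that placement and vertical scaling are after.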
[00:14:19] Camille Morhardt: What kind of efficiency improvement are we talking about here if you adjust your code?
[00:14:25] Tamar Eilam: In terms of the efficiencies, look, so if we’re talking about any code, without talking about specific workloads, it’s hard to give you an answer; but we’re talking about between zero to 20%. That’s the ballpark.
[00:14:42] Camille Morhardt: Well, which is huge if you’re talking about a server at scale; if you’re talking about a server farm, that would be huge.
[00:14:49] Tamar Eilam: Yes.
[00:14:50] Camille Morhardt: Right?
[00:14:51] Tamar Eilam: Yeah. But that’s what we’re talking about. Then, it comes to specialized computing where you have AI, you have a special AI chip. So this is the entire area of taking software and systems and co-designing them together. If you know something about the workload, if you know something about AI, you can design an AI chip that will perform much better for AI workload. So it cannot address all the workloads, it can address AI workloads. I think it’s fascinating and I think this is the future.
When you talk about specialized systems or software and system co-design, look at the characteristics of the workload that you have. Can you tolerate failures? Can you tolerate some inaccuracies? Matrix multiplication, it’s a very specific thing which is used everywhere. Can we optimize for just matrix multiplication?
[00:15:44] Tom Garrison: So Tamar, where do you think folks, the people listening to this podcast today, where can they get smarter on what they should do? Where they can either… Actions they can take or people to listen to. Where should you send them?
[00:16:06] Tamar Eilam: Good question. So we have a lot of material on our IBM website if they want to get educated about what is analog AI, which is the future, and what we’re doing with digital AI, which is the reduced precision chip. We are trying to start a work group under CNCF for sustainability, so this would be a place to watch. Obviously, there is the general area of sustainability, and the GHG Protocol defines how things are measured and so on. So there is a ton of material about just the basic stuff of how do we measure? How do we quantify? What is the life cycle carbon footprint of products? And then, extrapolate from that to machines or software. So, that is very well documented.
[00:16:55] Camille Morhardt: Can I ask you, you mentioned a little bit ago in this conversation that in 2019 you listened to, I think it was a lecture, and it changed your life and you moved from cloud computing architect to sustainability innovator. What did you hear, what resonated with you that changed everything for you?
[00:17:14] Tamar Eilam: A presentation that Steve Easterbrook, a professor from the University of Toronto, gave at the ICSE conference, which is the International Conference on Software Engineering, in Montreal, and it was a keynote about climate. He’s a technologist that made the shift exactly like me to just work on climate. What he said is that the UN report from 2018, that was very grave, was actually really underestimating the problem that we’re facing. And the reason why it’s underestimating the problem is because it didn’t take into account the interrelationship between multiple different systems, such as the permafrost melting in the North Pole, resulting in the release of methane to the atmosphere, which is accelerating the global warming even more. Exactly as he said, he said the research is going to come out. I followed it after I listened to this presentation and indeed, research started coming out saying the 2018 very grave report is underestimating the problem that we’re facing.
And basically what he said, and the takeaway from this talk, was three different things. One, talk about it. Talk about climate everywhere, which is what we’re doing now. Two is be political, and what I mean by be political is take any opportunity to vote, any opportunity that you have. And three, make it your job if you can. As I said, it caught me and I couldn’t stop thinking about it. I started asking everyone in IBM, “Okay, what are we doing about climate? What are we doing about climate?” I got pulled into multiple client conversations and then in January 2020, IBM Research decided to start a new program, which is called “The Future of Climate,” and I immediately joined. And then, COVID hit and we all went down and started working together on future of climate. And that’s where I am, until today.
[00:19:30] Tom Garrison: That’s great. So Tamar, this has been a great conversation, I think, to get people starting to think, even if they haven’t before that, about the intricacies around sustainability and how it relates to computing. Thank you for coming in and we look forward to all of the exciting discoveries that are going to be in this area moving forward.
[00:20:03] Tamar Eilam: Thank you.