[00:00:36] Camille Morhardt: Hi, and welcome to this episode of What That Means. We’re going to talk about artificial intelligence and helping to make it trustworthy or securing it or more private and actually going all the way down into the hardware layer to do that. I have three professors with me to have that conversation. I have Farinaz Koushanfar from University of California, San Diego. N Asokan from University of Waterloo and Ahmad-Reza Sadeghi from University of Darmstadt. Welcome today.
Can somebody kind of give an overview of the ways that artificial intelligence is adapting to improve security in general?
[00:01:19] Farinaz Koushanfar: What we generally see now is that we are at the brink of the next industrial revolution, which is rightfully called the automation revolution. And this is enabled by the power of artificial intelligence. This revolution in automation, enabled by AI, is of course also happening in the area of security and privacy. So we are using algorithms that are smarter, more intelligent, to automate security and privacy processes.
As we are using artificial intelligence as, really, an engine for automation, there's also another side to it. Is your engine reliable? Is it secure? Can somebody take out this engine and try to do malicious things to it? Or, when you are training your engine to do these automated tasks, can they actually influence it to make wrong decisions?
On one hand, we are automating many processes, including security and privacy processes. On the other hand, artificial intelligence itself has vulnerabilities that we need to identify. And last but not least, this abundance of data and of models is also exposing a lot of sensitive information about people. So there's also that privacy risk involved.
[00:02:55] Ahmad Sadeghi: Because all these algorithms we are talking about are not really deterministic, they make certain decisions with a certain probability, and that means maybe they are all insecure. My personal view, and my experience from the research we have been doing in my group and with my collaborators, is that yes, they are all insecure, because if you manipulate the data, the output will be manipulated. If you manipulate the model that is built, then everything will be manipulated.
This so-called artificial intelligence is getting integrated, embedded into many decision-making processes. And it doesn't mean at all that those decisions are the right ones. There are lots of security holes in these algorithms because they're not made with security in mind, and they can simply be manipulated. I believe that AI will be a big, big danger, something I call an "AI-demic," like a pandemic. It will bring all of us into a digital crisis, where the decisions that these neural networks, or whatever they are, make are not clear to any human being. We cannot verify them. We cannot really understand them. And those who want to manipulate them can manipulate them. I'm talking about Wall Street; they have been doing manipulation for many years, but now they have more powerful tools.
[00:04:31] N Asokan: So all three of us consider ourselves systems security people. We think about things that can go wrong if there is someone trying to attack a system. And this has been done for more than three decades: people have used artificial intelligence and machine learning for intrusion detection. But what's different about how we approach problems is that we don't just apply machine learning to get a faster, better solution; we also think about what the bad guy would do to undermine our solution. And this is something that machine learning and AI people normally don't do. To give an example, think about a face recognition system. We want a system that will recognize our faces correctly as belonging to ourselves. So an AI expert would say the way to validate the system is to collect 10 images of my face, 10 images of Ahmad's face and of Farinaz's face and so on, and then show that my face will always be recognized as me, Farinaz's face will always be recognized as her, and my face will not be accidentally recognized as Ahmad's face. Thank God.
[00:05:39] Ahmad Sadeghi: My God.
[00:05:40] N Asokan: But if Ahmad is trying to break the system, he's not going to oblige us by just using his own face to pretend to be me; he's going to do something adversarial, because he's a smart guy. He's going to do things like wear glasses, or put on makeup or lipstick or something like that, so that he would look like me. And this is an aspect that hasn't been taken into account in AI-based systems. But like Farinaz said, AI is everywhere, and it is being used not only in security, but in human resources, in jurisprudence, in policing. So making mistakes here, and designing systems that can be easily circumvented, is going to impact us in ways we haven't imagined before. The other angle I wanted to briefly mention is that AI can be used to improve security and privacy, and all of us have done work trying to do that, but it's also a tool for the bad guys. They can use AI to try to make their attacks better.
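Asokan's point about benign validation versus an adversarial user can be sketched in a few lines. This is a toy nearest-template "face matcher" on random feature vectors; the names, embeddings, and the 0.9 disguise factor are all illustrative, not any real recognition system:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "face embeddings": one enrolled template per person (purely synthetic).
enrolled = {name: rng.normal(size=8) for name in ["asokan", "ahmad", "farinaz"]}

def identify(probe):
    # Nearest enrolled template wins.
    return min(enrolled, key=lambda n: np.linalg.norm(probe - enrolled[n]))

# Benign validation, the "AI expert" test: every genuine probe
# (template plus small noise) is matched correctly.
for name, tmpl in enrolled.items():
    assert identify(tmpl + 0.01 * rng.normal(size=8)) == name

# Adversarial use: the attacker presents features moved most of the way
# toward the victim's template (glasses, makeup...) instead of his own face.
attacker, victim = enrolled["ahmad"], enrolled["asokan"]
disguised = attacker + 0.9 * (victim - attacker)
print(identify(disguised))  # "asokan": impersonation succeeds
```

The benign test passes with 100% accuracy, which is exactly why it says nothing about what a motivated adversary can do to the same system.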
[00:06:44] Ahmad Sadeghi: One thing I wanted to add: as security researchers, we also recognize that standard security mechanisms and methodologies do not always apply to AI in general, or to machine learning and other kinds of learning, like deep learning. They do not apply in a straightforward manner, because we need to get into the algorithmic machinery of these neural networks to understand them. Like the human brain, they may have many layers, and we sometimes cannot really understand the information that goes from one layer to another layer, and again to another layer. We cannot immediately say, okay, we have standard solutions for that in the security community that we can immediately apply. There is a huge number of ways to do it, and some of them are unknown to us. So there is a lot of room for research, also from a security researcher's point of view, to apply to AI systems in general.
[00:07:54] Camille Morhardt: Do we have to actually be able to explain the AI in order to be able to protect it, or are we able to somehow come up with protection mechanisms still without understanding truly how it’s working underneath?
[00:08:08] Ahmad Sadeghi: Some part of our research concerns finding new protection methods for these algorithms that are very general. On the other hand, the explainability aspect is very important, and it's also a hot topic of research in machine learning. But if you look at the literature and at the results, the results are very, very limited, for limited systems, not for complex systems. Think about a financial system. Think about a social network. For example, Germany is very much industrialized; we have lots of factories, and the so-called small factories play an important role. But more than 90% of the CEOs of small, midsize and big companies in Germany are extremely concerned about explainability. They want to know: what happens if I cannot explain what is happening? How can I have assurance about the trustworthiness of these algorithms? Currently, I don't see how you can handle very complex systems and verify what they are doing and why. Think about lots of legal processes, think about discrimination, think about all these things that can happen, and that will happen.
[00:09:30] Camille Morhardt: So Farinaz, would you be able to help us understand some of the security perspectives? I know you work with helping to secure at the hardware level. I’ll say decentralized artificial intelligence and algorithms. Maybe you can explain kind of what those are and how you’re looking at that.
[00:09:47] Farinaz Koushanfar: Sure. You could think about securing a system, including systems that have AI in them, like securing your house. If you want to secure your house so that nobody comes in, what do you do? You close all the windows and all the doors and try to shut everybody out, but that's not practical. The truth is, in securing computer systems, the story is exactly the same. The vulnerabilities are really at the interfaces. When you interface to others, on a network or to the memory, or you try to get things in and out of your system, that's where most vulnerabilities reside, and where people can attack your system and try to get secrets out through the interfaces. And hardware is typically one level below those interfaces. So a prominent use of hardware is to store those secrets in a way that they cannot be reached from the interfaces.
Ahmad is one of the advisors to the architects of the trusted platforms for Intel, and those are being widely used. The core of it is that hardware provides a kind of secure enclave where things cannot really be attacked from outside. But hand in hand with that, with my hat as a computer engineer, I really think hardware also has a role in making things much faster, much more efficient. Right now, to make artificial intelligence more efficient, there are a lot of accelerators out there, and a lot of these accelerators are not very robust or reliable. When you make your AI models more robust, be they distributed AI models or a single model, can you simultaneously make sure that this robustness doesn't add to the overhead of your system, that you can achieve this robustness in real time, and that you have an end-to-end system that performs really, really well?
[00:11:59] Camille Morhardt: So at the most basic level, you're saying that encryption, or similar hardware-level protection, or software- and algorithm-level protection, can tend to slow a system down (not always, but often), and so part of hardware's role is to improve the performance once you've added that protection.
[00:12:21] Farinaz Koushanfar: Yes. And security and robustness here, in terms of AI systems, definitely go far beyond encryption. For example, there are adversarial attacks on AI systems, where people provide instances that look legitimate but actually fool the model into making a wrong decision.
[00:12:42] Camille Morhardt: Can you tell us more about those? Give an example of how that works.
[00:12:46] Farinaz Koushanfar: Sure. AI systems work by gathering a lot of data and extracting statistics from it. The space of what we're trying to learn is really huge, and AI models are trying to build a lower-dimensional representation of a very high-dimensional space. When they try to do that, if they don't have enough data in all corners of this multidimensional space, or if there is noise added to the data, then these models don't have very well-defined boundaries that are always correct. And attackers use that to construct samples that have a little bit of structured noise in them.
So the input looks really legitimate. For example, with an image: you can see a picture of a cat that looks exactly like a cat, but it has a little bit of noise in it, which is not detectable by the human eye. It has been shown that if this noise is constructed carefully, right on the boundary of the decision making, the AI model could classify this as a dog.
[00:13:58] Ahmad Sadeghi: A horse.
[00:13:59] Farinaz Koushanfar: Yeah, a horse. And now, that's a nefarious attack. Just imagine projects in the auto industry where these algorithms are making real-time decisions as your car is driving. That's no longer just a cat-versus-dog-versus-horse mistake; it's a dynamic attack on a cyber-physical system like a car, and it can have really nefarious consequences. There's a lot of beautiful theory about adversarial attacks, and a lot of nice algorithms that try to avoid them, but one aspect that we've uniquely introduced, and have been working on for about five years now, is how you make these solutions integrated from the system level all the way down to hardware, so that you can actually detect attacks in real time without impacting the performance of the AI system, which by itself is quite intense on the computational resources we have, which is why a lot of companies so far have been focusing on building accelerators.
And now just imagine you need to use these accelerators in your automotive system, but they are not very robust. Then, if you try to introduce robustness, you lose the real-time aspect of detecting things.
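The structured-noise attack Farinaz describes can be sketched with a toy linear "cat vs. dog" classifier and an FGSM-style sign perturbation. Everything here is synthetic (the weights stand in for a trained network); it only illustrates how a tiny, bounded per-pixel change can cross a decision boundary:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "image" classifier: score > 0 -> "dog", score <= 0 -> "cat".
w = rng.normal(size=100)           # stands in for a trained model's weights
x = -0.1 * w / np.linalg.norm(w)   # an input the model calls "cat" (score = -0.1*||w||)

def predict(v):
    return "dog" if w @ v > 0 else "cat"

# Structured noise: step each "pixel" a tiny amount in the direction that
# raises the "dog" score (the sign of the gradient, here just sign(w)).
eps = 0.05                          # small enough to be visually imperceptible
x_adv = x + eps * np.sign(w)

print(predict(x))               # "cat"
print(predict(x_adv))           # "dog": the same-looking input flips class
print(np.abs(x_adv - x).max())  # no pixel changed by more than eps
```

The noise is bounded by `eps` everywhere, yet because it is aligned with the model's weights rather than random, it moves the input straight across the boundary; random noise of the same magnitude almost never does.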
[00:15:22] N Asokan: At some level, adversarial examples like this are fundamental. It's not a problem with just AI-based systems, but with any system that tries to build a model that approximates reality. So at some level you can't avoid them, but you can do better by detecting them or compensating for them, by looking at the system as a whole and not just the model.
[00:15:46] Camille Morhardt: Well, I have to say, when you're talking, I'm thinking this is horrifying: this thought of real-time traffic direction, or a functional-safety kind of system that's using automation to at least enhance decisions and actions, or maybe even taking action autonomously. So help me understand: how far are we from having something that we can consider secure, or is this just very, very early?
[00:16:16] Ahmad Sadeghi: We are very far, when it comes to the safety and liability of an industry. Think of industries that are liable for the safety of human beings, for example the automotive industry, the avionics industry, and many others, or critical infrastructures like water companies and power plants. Personally, and from my experience talking to a number of engineers and project leaders in different industries, we are far away from the point where regulators allow you to put a self-driving car on the road that is connected to a cloud and to other vehicles. Until that happens, it's a long, long way to go.
But hey, about 20 or 25 years ago we didn't have a smartphone, and we have it now. And I think there are critical minds among scientists, a number of them, but it is not enough. We should not rush these things, because this is decision making; it is automation that has randomness in it.
[00:17:29] N Asokan: I think we put hundreds of people in tin cans and raise them 10 kilometers in the air, and if that's not scary, I don't know what could be. It must have been really scary 50 years ago. So in that sense, I'm optimistic. I trust human ingenuity. We are still in the early stages of AI-based systems, and eventually I think humanity as a whole will figure out how to do these properly. So things that look scary now would be harnessed and used in the right way, a decade or two down the line.
[00:18:01] Camille Morhardt: Are there specific things that you three are looking at that are not kind of well known in the world when you consider security and AI?
[00:18:14] Farinaz Koushanfar: I was talking about one of them, which is looking into full-stack solutions for accelerated, robust AI, which is something rather unique. Imagine the world of small devices, all working together, learning things, and not just small devices: hospitals are working together, learning things. We need a lot of privacy-preserving algorithms to do this. Now, on top of those privacy-preserving algorithms, on top of all the security requirements that we traditionally have for communication, which are now extending even to computation, imagine that you also have this one added vector of robustness.
We talked about inference-time robustness, which is adversarial learning. There is also the big aspect of data poisoning, where a single person or a single entity tries to poison the training data to subvert the models into doing something malicious for them. And we are moving from inference-based models to causal and predictive models. So this is a problem that is here to stay, and it will just get exacerbated as we go.
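Data poisoning, unlike the inference-time attack, happens before training. A minimal sketch, assuming a toy one-dimensional nearest-centroid classifier (all data and the injected points are made up for illustration):

```python
import numpy as np

# Clean training data: class 0 clusters near 0.0, class 1 near 1.0.
X = np.array([0.0, 0.1, 0.2, 0.8, 0.9, 1.0])
y = np.array([0, 0, 0, 1, 1, 1])

def centroid_classifier(X, y):
    # "Training" is just computing one centroid per class.
    c0, c1 = X[y == 0].mean(), X[y == 1].mean()
    return lambda v: int(abs(v - c1) < abs(v - c0))

clean = centroid_classifier(X, y)
print(clean(0.45))  # 0: closer to class 0's centroid (0.1 vs 0.9)

# Poisoning: the attacker injects a few points labeled 1 but placed low,
# dragging class 1's centroid from 0.9 down to 0.6.
Xp = np.concatenate([X, [0.3, 0.3, 0.3]])
yp = np.concatenate([y, [1, 1, 1]])

poisoned = centroid_classifier(Xp, yp)
print(poisoned(0.45))  # 1: the same input now flips class
```

Three mislabeled points out of nine are enough to move the decision boundary; at inference time the poisoned model looks perfectly normal on inputs far from the boundary, which is what makes this attack hard to notice.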
[00:19:26] N Asokan: Even though a lot of the publicity and blitz is on "Look, Ma, what our neural networks are capable of doing!" They are doing fantastic things, but understanding how they work is what is going to pave the way for interpretability, explainability and so on. I think perhaps not enough people are working on that, because the applications, and showing dramatic improvements, are so much sexier than trying to understand how neural networks work. But that's an important part that is evolving now; many people are starting to think about the basics of what's under the hood and trying to understand it.
[00:20:05] Ahmad Sadeghi: Asokan mentioned the fairness aspect. Suppose we have an algorithm, and we push it through another algorithm that makes it anti-poison, that mitigates the poisoning. Assume we have that; unfairness is like poisoning an algorithm. So the question is, can I have a filter that I push any algorithm through before I sell it to you? For example, recruiting: somebody like Amazon or anybody else comes to you and says, buy this, this is my algorithm, it's good for recruiting. If 10,000 people apply for a job, it can decide for you, at least in the first phase, before a human in HR looks at it. And then I say, but how do I know that it's fair? So you have to provide data, and I have to check it. All these kinds of things are very messy and not efficient.
So how about I have a filter, and I push your algorithm through my filter, and I add just enough noise that it is not bad for accuracy, but it obfuscates certain aspects of the algorithm that may be, so to say, unfair. This is just an idea, and we are starting very small. There is a big research community on fairness, especially around defining what is fair and what is not fair; it's very complex. And that fascinates me, because it involves a number of disciplines that you need to work with, which makes it more challenging. So this is the known-unknown that I personally am very interested in looking into.
[00:21:46] Camille Morhardt: My God, we've covered a lot today: fairness, privacy, security; decentralized, centralized, and federated learning. We've looked at full stack and the interest in that, and at explainability and its relevance. I've really appreciated the conversation. Thank you, Asokan, Ahmad and Farinaz for joining me today, and I hope to have more conversations about AI and security in the future.
[00:21:10] N Asokan: Thank you Camille.
[00:21:12] Farinaz Koushanfar: Thank you very much Camille for reaching out.
[00:21:14] Ahmad Sadeghi: Thank you very much for the invitation.