Camille: [00:00:36] Hi and welcome to What That Means: Deep Learning and Artificial Intelligence Ethics. Today, we're going to dive into deep learning and also ethics in artificial intelligence. We have with us Ria Cheruvu, who is the Lead Architect for AI Ethics at Intel's Internet of Things Group. She has a bachelor's from Harvard–a degree she began at age 11 and completed at age 13–and a master's degree in data science, also from Harvard University. Her research at Intel is on AI security and ethics, uncertain AI, robotics, deep learning for the internet of things, and computational models of intelligence. She has two patents pending on AI IP protection and other patents on AI security and AI ethics undergoing the filing process right now.
Ria, it is wonderful to have you on the show. Thank you so much for joining.
Ria C: [00:01:35] Thank you so much for having me.
Camille: [00:01:36] The first thing I’m going to ask you to do is to attempt to define deep learning in under three minutes.
Ria C: [00:01:42] Okay, perfect. So deep learning is a subset of the artificial intelligence field that involves very large artificial neural networks. An interesting way of thinking about deep learning is as a kind of parallel to human intelligence: the idea is that using these artificial neural networks–which are modeled after the biological neural networks in the human and animal brain–we can construct large algorithms or structures that can then perform high-level functions. For example, detecting cats versus dogs–that's kind of the simple application–moving on to more complex applications: self-driving cars, predictions for retail, smart edge devices and other types of technologies.
So, some of the pitfalls of deep learning are that these networks don't really closely mimic the human brain yet, due to algorithmic limitations and complications that are still being worked out. But to summarize: deep learning is our almost-best attempt at mimicking human intelligence using algorithms and computational models.
Camille: [00:02:45] How would you describe it relative to artificial intelligence?
Ria C: [00:02:50] So deep learning is a subset of artificial intelligence, whereas the field of AI can be considered an intersection between neuroscience, philosophy and deep learning. There's also this confusion as to what's the relationship between DL and machine learning. Machine learning is the idea that you have a bunch of other techniques outside of neural networks that you can also apply for inferencing and predictions.
Camille: [00:03:12] Okay. So there’s multiple different ways to use artificial intelligence and one form of deployment is deep learning, which uses the neural networks. And then there’s other forms that are out there.
Ria C: [00:03:24] Absolutely. And within the deep learning field itself, we have a bunch of different types of algorithms–reinforcement learning, supervised learning, unsupervised learning, active learning, and the list goes on and on. But the primary focus in the deep learning field is on supervised learning, which can be considered sophisticated pattern recognition.
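The "supervised learning as sophisticated pattern recognition" idea can be sketched with a tiny classifier. This is an illustrative toy, not anything from the episode: a logistic regression learning to separate two labeled clusters of points, assuming NumPy is available.

```python
import numpy as np

# Labeled training data: class 0 centered at (-2, -2), class 1 at (+2, +2).
rng = np.random.default_rng(0)
X0 = rng.normal(loc=-2.0, scale=1.0, size=(50, 2))
X1 = rng.normal(loc=+2.0, scale=1.0, size=(50, 2))
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(50), np.ones(50)])

# Append a bias column and fit with plain gradient descent on the log-loss.
Xb = np.hstack([X, np.ones((100, 1))])
w = np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-Xb @ w))   # predicted probability of class 1
    w -= 0.1 * Xb.T @ (p - y) / len(y)  # gradient step

preds = (1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5).astype(float)
accuracy = (preds == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The "supervision" is exactly the labeled pairs (X, y): the algorithm is told the right answer for every example and learns the input-output mapping.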
Camille: [00:03:43] Well, that sounds like a lot of good stuff to talk about. So let’s dive a little deeper.
The first thing that you kind of brought up are the sub-applications or applications within deep learning–supervised and unsupervised, and active learning, I guess, would be another one. So could you talk about what those are and some of the benefits and downsides of them?
Ria C: [00:04:10] Sure thing. So the awesome thing with supervised learning is that it helps us achieve results on a lot of different types of tasks. Diving into that further, as I mentioned, all the way from detecting cats and dogs, which is very basic, simple object recognition, to being implemented in self-driving cars, supervised learning is enabling pattern recognition across different fields.
There's also this idea of transfer learning, and different types of algorithms, like convolutional neural networks, and other advancements in the field that are allowing us to perform tasks such as object detection and object recognition on different datasets and apply them to particular tasks.
The downside with this, of course, is that we have to define all of our tasks when thinking about pattern recognition. And sometimes we would like to break that boundary and consider, "okay, can I just do this exploratory analysis of my data, where I'm just looking at the data, trying to figure out patterns, without really having a defined input-output mapping," which is required for supervised learning. This exploratory analysis of data is known as unsupervised learning. And although it's not really considered to be a mainstream part of deep learning, there are some awesome applications of deep unsupervised learning, like generative adversarial networks, which allow you to, for example—
Camille: [00:05:27] OK, hold on. I want to come back to generative adversarial networks, but I just want to make sure I understand. With supervised learning you're showing the machine an image, for example, of a cat and a streetlight. And the machine says, "okay, I think this one is a streetlight." And you say, "uh, no, that's a cat actually." And then it learns.
So we have these labelers and data scientists that need to sit and supervise, essentially, the computer as it learns. When it gets to the point where it's 99.99-whatever percent likely to recognize a cat, we're like, "okay, this one's pretty good. Let's move on to the next thing." Now you're talking about something else. If no human is there to correct the machine once it's made a decision, is that the land of unsupervised learning?
Ria C: Absolutely.
Camille: Okay. So say more about that, how that works.
Ria C: [00:06:20] Yes, definitely. Although I think it's interesting to think about unsupervised learning as not requiring a human–as we dive into the field of deep learning, these types of broad statements kind of start to both validate themselves and also become a disadvantage. There's a lot of different research in this space attempting to overcome these challenges, and these broad statements might not apply at that point.
But the basic idea with unsupervised learning is, for example, as I mentioned, generative adversarial networks, which can try to generate text from images, or generate images themselves. And that's the idea behind deep fakes: you can take an existing video and all of that training data, and then you can start to generate your own images using this type of algorithm.
We also have 3D object reconstruction or construction from a single image and other really cool applications of unsupervised deep learning. The issue here–and the reason why I keep mentioning that these broad statements might not apply–is that we do want humans in the loop as part of this process. For example, in the case of deep fakes, we have issues with AI safety and ethics. Who is going to control the creation of this algorithm?
Where are we going to publish its outputs? How do we let the public know that this was generated by an AI algorithm and not by a human? So lots of different questions come in as part of the development and publication process.
Camille: [00:07:49] In order for a machine to be able to generate an image doesn’t it have to already use supervised learning to get there?
Ria C: [00:07:58] They are very similar paradigms. So yes, it can be considered that it is kind of using that type of mapping function. But the interesting thing here is that instead of just trying to predict an output, it's trying to do whatever it wants–it's creating an output, essentially. The distinction is very subtle in the sense that it is still kind of trying to learn a mapping, and of course this differs based on the type of algorithm–for example, trying to recognize facial expressions and then generate its own type of image. The idea that the human is not really involved as part of the mapping applies for both supervised and unsupervised learning, but the difference is between trying to predict an output versus creating an output, if that makes sense.
Camille: [00:08:40] Okay. So unsupervised is where we get all of the generative art, generative music, deep fakes, this type of thing?
Ria C: [00:08:49] Absolutely. And of course, the unfortunate thing is this is about to change. I recently read an article that GPT-3, which is a very large language model that is, I think, considerably in the supervised learning domain, is starting to perform some of the tasks that generative adversarial networks were performing–such as, I believe, being able to recognize or generate text based on an image and those types of functionalities.
Camille: [00:09:17] Well, why are these considered adversarial? I've heard the term adversarial network. So what is adversarial about it?
Ria C: [00:09:24] So the idea is that you have a generative model and a discriminator model that are essentially trying to compete against each other in order to generate high-quality outputs. There are lots of different problems associated with training GANs, and different algorithms and techniques have been proposed to help overcome those challenges. But this entire field is different from the field of adversarial examples, which is a sub-field of deep learning and cyber security covering ways to attack deep learning algorithms.
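The generator-versus-discriminator game Ria describes can be sketched in a few lines of NumPy. This is an editorial toy illustration (not from the episode): the generator maps noise z to a·z + b and tries to imitate "real" samples from a Gaussian N(4, 1.5), while a logistic discriminator tries to tell real from fake.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(0)
a, b = 1.0, 0.0        # generator parameters: fake = a*z + b
w, c = 0.0, 0.0        # discriminator parameters: D(x) = sigmoid(w*x + c)
lr, batch = 0.05, 64

for _ in range(2000):
    real = rng.normal(4.0, 1.5, batch)
    z = rng.normal(0.0, 1.0, batch)
    fake = a * z + b

    # Discriminator: gradient ascent on log D(real) + log(1 - D(fake)).
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w += lr * ((1 - d_real) * real - d_fake * fake).mean()
    c += lr * ((1 - d_real) - d_fake).mean()

    # Generator: gradient ascent on log D(fake) (non-saturating loss),
    # i.e. it tries to make its fakes fool the discriminator.
    d_fake = sigmoid(w * fake + c)
    grad_out = (1 - d_fake) * w      # d log D / d fake
    a += lr * (grad_out * z).mean()
    b += lr * grad_out.mean()

samples = a * rng.normal(0.0, 1.0, 1000) + b
print(f"generated mean ≈ {samples.mean():.2f} (target 4.0)")
```

The "adversarial" part is visible in the two opposing updates: the discriminator's step pushes D(fake) down, while the generator's step pushes it back up.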
Camille: [00:10:00] Okay, so you, you broached the cyber security topic. So let’s spend a few moments on that. Um, how is deep learning being used in cyber security? And then maybe we’ll explore how cyber security is being used for deep learning.
Ria C: [00:10:16] The idea with deep learning for cyber security is that these are sophisticated pattern recognition algorithms that are very much data hungry–and the reason why I didn't mention this in the introduction, of course, is that we're now starting to build algorithms that don't require as much data, that can work with small data rather than big data and still form interesting extrapolations and find interesting patterns. But nevertheless, in the cyber security domain especially, more data is very beneficial for tasks such as malware detection or being able to predict user behavior and anomalies, et cetera.
So in these types of situations, there are two different approaches that can leverage deep learning: one is detection and the other is response. The first, I think, is more popular. It's the idea that you're trying to observe, for example, a time series of data, and you're trying to see, okay, where can I find anomalies in patterns? For example, in user behavior: is a user downloading a malicious packet, or some packet that looks suspicious that I could classify as malicious? Do I see that their behavior is not similar to other typical users' behaviors? Is that person a malicious agent who's trying to hack into the system?
And then the second prong or pillar of this approach is the idea of response–where you're trying to autonomously correct or mitigate some of the harm that can be caused by a malicious agent's or actor's actions.
Camille: [00:11:44] So Ria, can you explain in a little bit more depth some of the types of attacks that we're seeing using deep learning on cyber security?
Ria C: [00:11:54] Absolutely. So the first way to think about this is the idea that you have an application where machine learning is being presented to the user as a service. The first type of situation you might have is a white-box model, where you have access to your entire model, and perhaps different parts of your API, or you could also download that model as well.
The black-box situation is where you don't have access to download the model, and the only thing that you can do to attack this API is to query the system. So it's interesting to think about this black-box scenario, because if you can just download the model in the white-box situation, you can think of a lot of different attacks. But for this black-box use case, it's strange: if you just query the system and look at its outputs–for example, let's say the machine learning model's predictions–how can I just steal the model IP all of a sudden?
The answer is a class of attacks called model extraction. And it's a solution kind of from a hacker's perspective, because the idea here is that just by querying the system and getting the outputs of the AI model and its confidence scores, you can start to reconstruct that model or even learn more about it. When performing these types of attacks, the attackers might have some information other than just querying the system. For example, they might know the model class: maybe they know, "okay, it's a type of convolutional neural network–maybe it's an AlexNet, for example," because that's written in the documentation for the system's page.
So with this information, what they can do is start to extract the details of the AI model. And one of the attacks in this space is, given these details, to start reconstructing their own model on the side, to get it to replicate the behavior of the model that's presented in the API.
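The query-and-reconstruct loop described above can be sketched in miniature. This is a hypothetical toy (not from the episode): the "victim" behind the API is a simple linear scoring model, so an exact surrogate can be recovered by least squares; real extraction attacks target neural networks and only approximate them from confidence scores, but the attacker's workflow is the same.

```python
import numpy as np

rng = np.random.default_rng(0)
secret_w = np.array([2.0, -1.0, 0.5])   # weights hidden behind the API
secret_b = 0.3

def query_api(x):
    """The only access the attacker has: send an input, receive a score."""
    return x @ secret_w + secret_b

# Attacker step 1: probe the black box with random inputs, record answers.
probes = rng.normal(size=(200, 3))
answers = query_api(probes)

# Attacker step 2: fit a surrogate model to the (input, output) pairs.
design = np.hstack([probes, np.ones((200, 1))])   # append a bias column
stolen, *_ = np.linalg.lstsq(design, answers, rcond=None)

print("stolen weights:", stolen[:3], "bias:", stolen[3])
```

Defenses against this include rate-limiting queries, rounding or withholding confidence scores, and watermarking models so a stolen copy can be identified.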
Camille: [00:13:44] So these are, I guess, adversaries or bad actors trying to sort out–are they trying to sort out what the cyber security defenses are of an organization? Or are they trying to sort of break the IP of the artificial intelligence or deep learning models that an organization is using?
Ria C: [00:14:03] I think it mostly falls on the latter, but the former is also definitely possible. And I would actually argue the former is important–we should encourage that type of red teaming approach, whether it be automated or not. This idea that you have red teaming and penetration testing being performed for AI models, so that attackers don't come in, try to exploit your system, figure out your defenses and then start exploiting those as well.
So if we can anticipate those problems beforehand using techniques like red teaming or penetration testing, then I think that would be a great step forward.
Camille: [00:14:46] Okay. So you talked about red-team penetration testing. What are some other kinds of defenses that are out there? You previously mentioned differential privacy and homomorphic encryption.
Ria C: [00:14:58] So the idea of differential privacy is adding noise to the raw data–so that, for example, a data subject will not be affected by their entry in a particular data set. And again, there are a bunch of different differential privacy schemes and algorithms, so this is kind of a broad, sweeping statement that might be invalidated in the future–maybe a new technique comes into place and there's a better way to add noise.
For example, if you consider adding differential privacy techniques to a machine learning model, do you want to add your noise to the raw data? To the weights of your model? There are a bunch of different considerations here.
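The "add noise to the data" idea can be made concrete with the classic Laplace mechanism, sketched below as an editorial illustration (not from the episode). A counting query has sensitivity 1 (one person can change the count by at most 1), so adding Laplace noise with scale sensitivity/ε gives ε-differential privacy for a single release.

```python
import numpy as np

rng = np.random.default_rng(0)

def private_count(true_count, epsilon, sensitivity=1.0):
    """Release a count with Laplace noise scaled to sensitivity/epsilon."""
    return true_count + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# A single release is noisy...
print(private_count(100, epsilon=0.5))

# ...but unbiased: averaging many *independent* releases recovers the count.
releases = [private_count(100, epsilon=0.5) for _ in range(10_000)]
print(f"mean of releases ≈ {np.mean(releases):.2f}")
```

The averaging step is shown only to demonstrate that the noise is unbiased; in a real deployment, repeated releases of the same statistic spend privacy budget under composition, which is exactly why the noise alone isn't the whole story.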
Camille: [00:15:33] Fundamentally, if I give you my personal information, the idea is that removing any one individual from the data set isn't going to affect the outcome, such that you could determine who that individual was?
Ria C: [00:15:45] Absolutely. That's the guarantee that should be provided. And on this topic, I would like to introduce kind of a new technique or paradigm that's relatively hidden–I'm not sure if it will be gaining more popularity soon–but it's this idea of model unlearning.
So in differential privacy, the contribution of a data point is still non-zero. If we take this a step forward: what if a user wants to have their data deleted–which kind of goes along with the GDPR right to be forgotten? The idea is, a user does want their entire data to be deleted from a particular system. How do we make sure that the contribution of that data point is zero?
So there is this new paradigm called unlearning, which is tackling this type of situation, and it proposes an intuitive and interesting solution in order to be able to achieve that. I see it as actually very similar to the approach that's being taken for federated learning, but it might be a parallel or maybe even a competitor to differential privacy solutions.
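One family of unlearning approaches (shard-based schemes, in the spirit of proposals like SISA) can be sketched as follows. This is an editorial illustration under simplifying assumptions, not the specific paradigm discussed in the episode: split the data into shards, train one cheap sub-model per shard, and average their predictions. Honoring a deletion request then only requires retraining the one shard that held the user's data, and the result is identical to never having seen that data point.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=90)

NUM_SHARDS = 3
shards = [(X[i::NUM_SHARDS].copy(), y[i::NUM_SHARDS].copy())
          for i in range(NUM_SHARDS)]

def fit(Xs, ys):
    """Closed-form least-squares sub-model for one shard."""
    w, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return w

models = [fit(Xs, ys) for Xs, ys in shards]

def predict(x):
    # Ensemble prediction: average over the per-shard sub-models.
    return np.mean([x @ w for w in models])

# A user in shard 0 invokes their right to be forgotten: drop their row
# and retrain only shard 0 -- shards 1 and 2 are untouched.
Xs, ys = shards[0]
shards[0] = (np.delete(Xs, 5, axis=0), np.delete(ys, 5))
models[0] = fit(*shards[0])

print("prediction after unlearning:", predict(np.array([1.0, 1.0])))
```

After the retrain, the deleted point's contribution to the ensemble is exactly zero, which is the guarantee differential privacy alone doesn't give.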
Camille: [00:16:48] I think unlearning is very interesting. So Ria, could you give us another example or a description of a defense? Maybe federated learning?
Ria C: [00:16:56] Absolutely. So federated learning is this idea where you have local computation that is then being aggregated on a central server. One approach for using federated learning for machine learning is to locally compute your gradients and then send them to a server that's going to be aggregating them.
Cryptography actually prevents the server from accessing individual information or summaries of the data that's being sent from these local nodes. So it's a very interesting approach. It can be paired with differential privacy and homomorphic encryption to strengthen the scheme. There are different types of federated learning, like horizontal and vertical schemes.
There is also this interesting advantage of using deep learning with federated learning–more of a feature than an advantage. The idea is that you can split a neural network model–its different layers–into private and shared modules, and then handle the federated learning scheme accordingly.
And there was this interesting paper on using a split LSTM–in other words, a natural language processing neural network model–where they were able to split some of the layers depending on their position in the network, and then either send them to the server or locally compute the gradients for those layers, depending on whether or not they were considered important and needed to be secured. So that's a very high-level description of their approach.
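The basic "compute locally, aggregate centrally" loop Ria describes can be sketched with federated averaging of gradients. This is an editorial toy (not the split-LSTM paper she mentions), assuming a simple linear model so convergence is easy to see: each client computes a gradient on its own private shard, and only gradients, never raw data, travel to the server.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0])

# Three clients, each holding a private data shard the server never sees.
clients = []
for _ in range(3):
    Xc = rng.normal(size=(40, 2))
    clients.append((Xc, Xc @ true_w))

def local_gradient(w, Xc, yc):
    """Gradient of the mean squared error on one client's local data."""
    return 2.0 * Xc.T @ (Xc @ w - yc) / len(yc)

w = np.zeros(2)
for _round in range(200):
    # Each round: clients compute gradients locally...
    grads = [local_gradient(w, Xc, yc) for Xc, yc in clients]
    # ...and the server only sees and averages the gradients.
    w -= 0.1 * np.mean(grads, axis=0)

print("global model weights:", w)
```

In a real deployment, the gradient exchange would additionally be protected with secure aggregation or homomorphic encryption, and possibly noised for differential privacy, matching the layered defenses discussed above.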
Camille: [00:18:19] When I think of what you just described, I'm thinking about–maybe this works with mobile phones or cell phones, where we have our raw audio file or actual conversation. We don't want anybody to record that, ideally. But we do want them to optimize the learning models–noise reduction in the background, or compensating for some sort of background noise or wind or hum or anything like that.
And so we'd like to be able to update the model while retaining our own sensitive information and conversation content. Is that kind of what this is used for?
Ria C: [00:19:00] Exactly. And the idea of adapting this to the IoT environment, I think, is key. So for example, mobile phones, or devices in hospitals–and I really like the retail application as well. In the introduction I mentioned predictions in retail, which is kind of vague, but the idea is that you're able to do marketing based on certain data that you're recognizing from your environment. You do want those predictions to be fair, and you also maybe want to retain some sensitive data. Let's say a user is inputting something relating to their health, like their health records, at a particular kiosk, and it's giving them a prediction–"okay, this is the recommended medication to take," for example–kind of like a walk-in health care type of situation.
So the idea here is that we want to make sure that, "okay, this data is being encrypted. It's sensitive. It's not being sent to a server. But I'm still providing the user with the insights that they want to know, or tailoring the product to their needs."
Camille: [00:20:03] Well, and it’s definitely highlighting the importance of securing the model and securing the data that’s inputted into the model. That makes a lot of sense.
Ria C: [00:20:12] And I think that's a great point. The whole idea with AI security is that we want an end-to-end security solution–whether that be secure enclaves partnered with a federated learning system, making sure your data is differentially private, or using homomorphic encryption so that you're performing these operations on encrypted data without having to reveal a lot of details of the pipeline. All of these techniques could potentially interact with each other, and all are important for securing the entire AI pipeline.
Camille: [00:20:43] One other question I want to ask you about is when we think about kind of the future and, uh, artificial intelligence or deep learning, um, what are some of the areas that you think we need to be watching out for?
Ria C: [00:21:00] So I think AI ethics is definitely one of them–the idea that you can adapt security for ethics. There's kind of a blurred interconnection between the two, because if your AI model is secure and it's making sure that sensitive data isn't revealed through attacks, then you're also ensuring privacy for users. But in certain cases, there's actually a contention between the two: if I have a system that is interpretable and explainable according to AI safety principles, that also allows attackers to learn more information about the AI pipeline or model in order to exploit it.
So in certain cases, these vulnerabilities are coming out of the contention between AI ethics and security. And of course, as a society, we might value ethics more than security, because security is kind of a subset of ethics, I think. It's interesting to consider it as such from a philosophical perspective, because I would want to make sure I'm safe and secure as kind of a moral right.
Camille: [00:22:02] But ethics is subjective, isn't it? Whereas security maybe is less so. Or would you say that's not true?
Ria C: [00:22:09] It could be true, definitely. But when you start to consider the connection, I feel like there are a lot more questions that are raised there. For example, if you are securing users' data, maybe some users provide consent to have their data sent to a server for insights to be aggregated and performed. Other users might not want to be a part of that at all. And, as I mentioned before with the GDPR right-to-be-forgotten type of framework, maybe they want to be removed from the system altogether.
So the security considerations might be subjective for particular users, and we might want to design our system in a way that reflects this. If we just ensure security and make sure we don't aggregate individual insights from any edge devices, then our deep learning models may lack the data to efficiently tailor products to consumers' needs. So that's just one consideration.
Camille: [00:22:59] So Ria, one other question that might be of interest to people listening: I heard a rumor that you're 16. Is that true?
Ria C: [00:23:10] Yes, I am. So I joined Intel as an intern about two years ago and just joined as a full-time employee this year. I kind of got involved in this AI security effort based on previous research with open source communities and during my master's. And I'm thrilled and ecstatic to actually be able to work on this type of topic at Intel and to help push forward this interesting work at the intersection between ethics and security.
Camille: [00:23:37] Did you go to college when you were like 14?
Ria C: [00:23:40] I graduated with my bachelor's when I was about 13 or 14, and then I did my master's degree in data science from Harvard University and graduated this year.
Camille: [00:23:50] What was the most fun or sort of delightful thing about that? Because I imagine you were a bit unique. What was very cool about it that you might not have anticipated?
Ria C: [00:24:03] I think it was the breadth of knowledge that was available to me and the guidance of my professors, fellow students and, of course, my mom and my dad as well. When I first started out at 11, beginning my bachelor's degree, I had this idea that I wanted to pursue neural cryptography–a field at the intersection of neuroscience and cryptography that I didn't get the chance to discuss here, but it's this idea of applying neural networks to cryptography. It's a solution that's kind of being discouraged, and I think there isn't a lot of evidence to support that it's good enough compared to differential privacy or other cryptography techniques like homomorphic encryption, but there's definitely some value in it.
So as I started to explore the field of AI further and find all of these new solutions and new topics, the entire interdisciplinary nature of AI always leaves me in awe. And I think that's what I learned as part of both my bachelor's degree in computer science and my master's degree in data science: it's an amazing field at the intersection of neuroscience, philosophy, and deep learning. So there's a lot to learn.
Camille: [00:25:13] Do you live in Boston? or did your parents go to college with you? How does that work when you’re 11?
Ria C: [00:25:19] We did live in Boston for a particular amount of time. And there was some flexibility offered through the course curriculum to attend some courses entirely online, and others where you could attend a certain portion of the course online and then visit the campus for the remaining portion.
So my mom was with me every step of the way, and I think she's kind of my empowerment, because she's the one who motivates me and drives me. She planned out the entire degree–all of the courses and everything. It was very clear planning as to, you know, how will learning go? What's the best environment to learn in? How do I absorb knowledge? So all of that came together, and now I'm here.
Camille: [00:26:08] So are you going to get a PhD?
Ria C: [00:26:10] Uh, the plan is to pursue one soon. Um, but at the moment I’m kind of very much happy working as part of research groups here at Intel. But, um, in the future, definitely part of the plan.
Camille: [00:26:25] Well, it was really a pleasure talking with you. You're one of the most articulate technologists I've ever met, regardless of age–it doesn't even seem to be present when speaking with you, one way or the other. I really had a wonderful time talking with you, and I really appreciate your definitions and your consideration and thoughtfulness toward artificial intelligence and security and the implications they have for all of us. So thank you for your time today.
Ria C: [00:26:52] Thank you so much Camille. It was awesome speaking to you today.