[00:00:31] Camille Morhardt: Hi, and welcome to today’s episode of Cyber Security Inside Live from the Green Room. This is Intel’s Innovation Summit, and I’ve got with me Ria Cheruvu, who was just on stage with Pat Gelsinger, Intel’s CEO. She graduated from Harvard University at 14 years old, joined Intel, and a year or two after that went back and got her master’s. She is now 18 and also Intel’s AI Ethics Lead Architect.
I’m really interested in talking with her about all things ethics as they relate to artificial intelligence. So let’s listen closely and find out what insight she has to give us. Welcome to the show, Ria.
[00:01:11] Ria Cheruvu: Thank you, Camille.
[00:01:12] Camille Morhardt: So what I’ve heard is over time there’s been concerns around privacy and surveillance, if you want to put those in a category together, concerns around bias, even unintended bias leading to discriminatory policies or frameworks or actions taken. This is all within the context of artificial intelligence and what people worry about. And then inexplicability. So if we call it unexplainable output from artificial intelligence. And I think that people are kind of familiar with those at a high level. Are those still the major concerns or have we moved on and there’s new worries?
[00:01:51] Ria Cheruvu: Definitely still the same concerns that are applicable on a wide variety of levels. And the ways in which these emerge are starting to become a lot clearer. For example, with privacy and surveillance, this started to come up when we were thinking about situations like, okay, surveillance in malls or in airports. But now we’re starting to see that there might be some more downstream implications that we had previously not thought of that are opening up interesting new use cases for AI systems as well as considerations. So a great example of this is policing workers. When you’re using or applying these algorithms that are detecting hard hats or other types of objects, can they unnecessarily, or in an unanticipated way, be used to police workers’ breaks, because you’re able to figure out when a worker is in a particular location and when they’re not? So these types of downstream implications, where you now start to have multiple AI principles and AI ethics principles intersecting with each other, that’s the current focus as I see it today, Camille.
[00:02:47] Camille Morhardt: So it’s not so much that there’s new categories that we’re worried about, it’s that we hadn’t even thought of things that we could track now. Even if we’re not attempting to track them, it’s like, “Oh, well we now have.” Because when you talk about privacy and then you think about precision medicine and you think about even down to gene editing and now you’re talking about a corporation maybe even having access to your own personal genome, that’s pretty detailed.
[00:03:14] Ria Cheruvu: Exactly. And it is a little bit of both. So there are also some new categories that are emerging, like sustainability for AI systems and energy efficiency. And a lot of the conversations there are about fresh new tooling ecosystems, new discoveries, and the nascent nature of all of the different directions that we can take, as well as the regulations and restrictions that we want to apply. But yes, it does boil down to these main categories and then all of the use cases that we can start to see this applied in.
For example, just taking transparency in the healthcare domain as well, where we start to get into contentions with security, which we’ve always known, right? Transparency and security are always in contention with each other. But the way that we navigate that now in relation to user experience, making sure users are comfortable with the way that their data is being used and they know what’s going on, while we’re also able to differentiate between malicious users or attackers of the system. All of these very complex questions are really coming under the AI ethics domain and are starting to be addressed for different use cases.
[00:04:14] Camille Morhardt: I want to talk about some other kind of new fun, troubling, it depends on your opinion or perspective, use cases that we’re seeing. One thing is we have, I guess we would call it generative AI where we’re creating artwork or music or all kinds of other things. Tell us the difference between that and deep fake. Is it one and the same? Or is it how you’re applying it?
[00:04:37] Ria Cheruvu: Not the same. So there can be different methodologies that are applied for them. The concept generally behind these can be the same, but I would definitely put them under different categories because the models composing them are different. So with generative AI and with different types of networks that are used to generate images, it is definitely the same fundamentals that we’re starting to see as part of text generation and image generation models like DALL-E and CLIP that are out there. It has kind of spun off into its own area, with its own motivations that we’re starting to see with these models as well. But the fundamentals are shared between different disciplines and are continuing to grow.
[00:05:14] Camille Morhardt: But can you say more about generative AI? What is it generating?
[00:05:19] Ria Cheruvu: So it can generate text, images, and audio in certain cases. It is essentially able to digest a large database of different types of samples of the data modality of your choice, and then is able to perform essentially a reconstruction of that and then output, or spit out in a sense, a reconstructed version of that data point that matches the type of the input. And again, the methodologies differ based on the exact model that you’re using. The overall fundamentals of this reconstruction element generally stay the same. But the more that we change it, and in most cases the bigger we make the models and the more iterations we train them for, the better results we get.
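A minimal sketch of the kind of text generation described here, assuming the open-source Hugging Face transformers library and the public gpt2 checkpoint (neither is named in the episode; the prompt is invented):

```python
# Toy illustration of generative AI for text: a pretrained language model
# produces plausible continuations of a prompt, token by token.
# Assumes the `transformers` package and the public `gpt2` checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "The ethics of artificial intelligence"
outputs = generator(
    prompt,
    max_new_tokens=40,
    do_sample=True,          # sample, so the two outputs differ
    num_return_sequences=2,
)

for i, out in enumerate(outputs):
    print(f"Sample {i}: {out['generated_text']}")
```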
[00:05:58] Camille Morhardt: So in that context, AI can generate say news articles that could be posted that are made up by artificial intelligence?
[00:06:08] Ria Cheruvu: Yes. And a colleague and I were thinking about this a couple of years ago: what are the implications of an AI model generating news content, especially when human users and readers can’t really distinguish between what an AI model is writing and what a human is writing? There is definitely the opportunity for misinformation to be propagated. I know that there is a recent rise in, for example, students using AI models to generate their essays as well, which is starting to bring into play questions like, is that a good thing, using an AI system to handle a lot of the grunt work that you would need to compile all of these references? Or is it taking away from the learning process as well? But overall, there’s a lot of concern around the indistinguishability of the text that’s generated, or the images, but also a lot of insight into the potential of these systems. So it’s a very interesting balance that we need to tread here between the two.
[00:06:59] Camille Morhardt: Okay. And then also I’ve heard more recently that I know that AI can generate paintings of let’s say long since deceased artists. So we’re seeing what would van Gogh be painting now. And it’s sort of fun and interesting to look and see how good it is, but that this can cause kind of a different sort of question when it’s generating artwork from living artists who are also generating artwork at the same time. How do we handle that kind of thing?
[00:07:27] Ria Cheruvu: Yeah, I think it’s tied to things like licenses, copyrights, and how we really navigate all of that. We had a couple of very interesting backstage conversations at Innovation on this as well. So Camille, today, the way that we are looking at it from an AI perspective is when you’re distributing these models, there are certain obligations that you may need to follow when it comes to transparency and documentation-related items. I think the legal perspective on that is still nascent. It does need to be integrated fairly quickly. It is amazing to be able to understand the failures and the weaknesses of the model, the provenance of the training data, and related elements, but we do also need to have some guidelines around what data the AI model can draw from.
It’s not always limited to that, but I think that’s a really good first step. If we’re able to understand where the data the AI model is being trained on is coming from, we’re able to constrain that database to images that we know the developers of the system can use responsibly to create a model, and track that over time. But that is contingent on not releasing models that are trained on these huge corpora of data sets that are all mangled together, where we’re not really able to track where they came from and how we provide that credit back to the original source.
[00:08:42] Camille Morhardt: Or, I mean, you’re talking about responsible AI, but that doesn’t mean that everybody operating and using AI feels that way about it.
[00:08:50] Ria Cheruvu: Exactly.
[00:08:51] Camille Morhardt: And so this is the kind of thing we’re going to be encountering. Can you talk about deep fake? What is that and how do we distinguish it, or can we?
[00:08:59] Ria Cheruvu: Deep fakes are a pretty challenging topic. I feel like it’s lost a little bit of its hype right now, which is kind of strange. The stable diffusion and generative AI part has taken the focus, I think, as part of the media. But deep fakes are always a constant concern, because you have AI systems that are able to generate images, video, and audio that are pretty much indistinguishable from what you would expect to see in the real world, narrated by a human or by nature. The interesting thing with deep fakes is the number of different use cases that they could be used for. I feel like the technology overlaps with the elements that we might start to see with, for example, image reconstruction, where you’re actually bringing old black-and-white images to life, you’re able to colorize them and then create 3D reconstructions; there’s a lot of interesting potential there.
And for me, I feel like deep fakes are kind of the opposite side of that, where there may be some potentially good use cases in education, but we’re seeing a lot of the opportunities to promote misinformation with this technology. How do we control it? Are we creating AI models to be able to detect deep fakes that are produced by other AI models? Is it some sort of a competition in order to figure out who is generating what, and then be able to thwart that immediately? Are there ways that we can use traditional computer science and cryptography and hashing in order to figure that out? I think those are conversations that are still ongoing. Again, the focus I feel has waned a little bit, but it’s going to rise back up very quickly as we see more efficient creation of deep fakes and, hopefully, more efficient creation of detectors and thwarters in that space.
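On the “traditional computer science and cryptography and hashing” idea, here is a minimal sketch, not from the episode, of one piece of that: record a cryptographic digest of a media file at capture time and later check whether a circulating copy is bit-identical. The registry and helper names are hypothetical, and this verifies provenance rather than detecting deep fakes directly.

```python
# Toy content-provenance check: a SHA-256 digest recorded at capture time
# lets you verify later whether a circulating file matches the original.
# (Any edit, benign or malicious, changes the hash.)
import hashlib
from pathlib import Path

def digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical registry populated when the original media was captured.
registry: dict[str, str] = {}

def register(path: str) -> None:
    registry[Path(path).name] = digest(path)

def verify(path: str) -> bool:
    """True if the file matches the digest recorded for its name."""
    return registry.get(Path(path).name) == digest(path)
```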
[00:10:38] Camille Morhardt: The other phrase you brought up is stable diffusion. Can you describe what that is?
[00:10:42] Ria Cheruvu: Absolutely. Stable diffusion is one of these image generation models that’s very big out there. It’s getting integrated into a bunch of different applications. And I believe it is what is powering a lot of the current websites and services that are out there. I’m still getting a grasp of the technology and reading through the articles. I mean, can you believe it? It’s popped up within two months all over the place. Now we see words like “stable diffusion,” or we use it as part of our applications to generate images and things like that. So I’m definitely learning a lot more about the technology in relation to generative AI. But the fundamentals are again around reconstruction and being able to get to an image that you’d previously not seen, really pulling together elements from different workflows.
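For readers who want to see how Stable Diffusion is typically invoked, a hedged sketch using the open-source diffusers library; the checkpoint name and prompt are assumptions for illustration, not anything mentioned in the conversation.

```python
# Minimal text-to-image sketch with the `diffusers` library.
# Assumes the package is installed, the public checkpoint can be downloaded,
# and a CUDA GPU is available (drop the .to("cuda") and float16 for CPU).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```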
[00:11:26] Camille Morhardt: Do certain kinds of models pose greater concern on an ethics front? I mean, artificial intelligence is so incredibly broad. It’s like, okay, we have deep learning, we have neuromorphic computing, we have federated learning, we have so much different stuff within it. So do we worry more about one versus another?
[00:11:45] Ria Cheruvu: It’s a great question. And from all of the colleagues that I’ve spoken to, I think the consensus is yes, we do want to have hierarchies of prioritization, levels we use to decide which AI models need more stringent ethical AI guardrails compared to others. And that is really based off of risks and harms, in terms of analyzing the ethical implications of the system on society. Then again, in and of themselves, those methodologies and the definitions and frameworks that we use to figure that out, that’s still under debate. If you use one metric or definition in order to, for example, identify fairness or bias of a system, and you’re optimizing for that fairness metric, you could accidentally exacerbate another. So you start to see a lot of different metrics that you need to look at, some of which may not be relevant at all, and you have to tailor it accordingly.
But putting aside those problems, yes, there is definitely a prioritization level or a risk level. I think personally the European Commission’s proposal on AI does a great job of doing the categorization. There’s a lot more work and refinement to be done in terms of what is general purpose AI and what isn’t. But having that delineation based on the use case of AI systems and how it builds up over time, that’s definitely very useful. For example, AI being used for determining access to employment or to education definitely has very, very big ethical implications and probably should be constrained very much. Whereas if we see the use case of AI in games or for Instagram filters, you probably don’t need that much of a constraint. For AI in healthcare, we can start to think about the different obligations that we might need for chatbots or similar types of use cases. And definitely they have their own risks and harms associated with them. We want to treat them differently depending on the types of implications and harms that they can bring up.
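A small numeric sketch, not from the episode, of the metric conflict described above: the same set of predictions can satisfy one fairness definition (demographic parity) while violating another (equal opportunity), so optimizing for one does not guarantee the other. The data are made up.

```python
# Toy fairness-metric comparison on made-up predictions for two groups.
import numpy as np

# Hypothetical labels (1 = favorable outcome) and model predictions.
y_true_a = np.array([1, 1, 0, 0, 0, 1, 0, 0, 1, 1])   # group A
y_pred_a = np.array([1, 1, 0, 1, 0, 1, 0, 0, 1, 0])
y_true_b = np.array([1, 0, 0, 0, 1, 0, 0, 0, 1, 0])   # group B
y_pred_b = np.array([1, 0, 1, 0, 0, 0, 1, 0, 1, 1])

def positive_rate(y_pred):
    return y_pred.mean()

def true_positive_rate(y_true, y_pred):
    return y_pred[y_true == 1].mean()

# Demographic parity gap: difference in rates of favorable predictions.
dp_gap = abs(positive_rate(y_pred_a) - positive_rate(y_pred_b))
# Equal opportunity gap: difference in true-positive rates.
eo_gap = abs(true_positive_rate(y_true_a, y_pred_a)
             - true_positive_rate(y_true_b, y_pred_b))

print(f"demographic parity gap: {dp_gap:.2f}")   # 0.00 here
print(f"equal opportunity gap:  {eo_gap:.2f}")   # nonzero here
```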
[00:13:36] Camille Morhardt: What is the thing that you would worry about the most in this space?
[00:13:41] Ria Cheruvu: I think it may be not necessarily conflicting definitions and frameworks, but the applicability of a lot of these methodologies that may be a concern today, Camille. And I say this from the perspective that we have a lot of amazing teams across industry, academia, governments, and so many different organizations that are now recognizing the problem. We understand that something needs to be done, and we are trying to handle it or attack it, essentially, from different phases of the AI life cycle. There are a lot of tools out there. I think the objective right now is to evaluate what exactly each of us is adding to that space, so that we’re able to come up with some sort of a holistic solution. And I see that done as part of the traditional computer science disciplines, and I think that is something that could definitely be reflected as part of AI ethics.
And again, it’s a very nascent space. It’s going to take us some time to get to the place where we kind of currently are with AI. We have a sense of the different overarching disciplines within AI. Reinforcement learning, supervised learning, unsupervised, right? We’re able to categorize it fairly nicely. For responsible AI and AI ethics, we’re just getting there. We know there’s a lot that needs to be done. We have these principles like transparency, privacy, energy efficiency that we know something needs to be done about. But as the tooling ecosystem continues to grow, I think these conflicts and these overlaps are definitely going to iron themselves out. And yeah, I think that’s the goal that all of us are working to get to.
[00:15:06] Camille Morhardt: I want to pick on one thing you said, because I had a question about it and you just kind of articulated it perfectly. You said energy efficiency as a responsibility, which I think alludes to greater concerns for the planet, or potentially sustainability, or what people might call climate awareness. How broad do ethics extend when we’re talking about artificial intelligence? I mean, this is a compute tool. Are we talking social movements? Are we talking environmental situations, like critical infrastructure? How broad can we expect a technology to cover ethics?
[00:15:49] Ria Cheruvu: It’s a great question. I think there are two questions hidden in that initial prompt. So for the first one, in terms of how far energy efficiency extends and the overall societal context around this, and also kind of getting into the second question around how far ethics really goes into technology, my personal opinion is it reaches very far, because when we are mentioning ethics, it is a very loaded and important term. I think many of us in the AI ethics space, and myself included, use that word in order to signify that there are implications for the greater context beyond just the technical mechanisms and the infrastructure we’re creating towards explainability and privacy.
For me, it represents that unification of the siloed elements, where we are actually taking into account the bigger picture around the societal debates that are surrounding the technology. For example, pulling on energy efficiency: definitely climate awareness is a key part of this. And to provide some context, there are two main categories of AI and sustainability. There’s sustainability for AI systems, where we’re looking at optimizing AI models so they don’t have such a large carbon footprint and consume less compute, et cetera. There’s also AI for sustainability, where we’re seeing the rise of numerous different types of use cases for AI systems to help improve the climate, whether that’s detection of trends and patterns at a larger scale, or even, as simply as at a local level, helping users better understand their footprint and what they can do to optimize or refine that.
So within this, I believe that that social aspect is very critical. It’s why we need so many perspectives at the table. I often say it’s like, I don’t know, maybe 10 or more different disciplines and domains that are kind of integrating into this. At least that’s what I’ve learned from all of the colleagues that are working in this space. But again, that bigger picture always needs to, in my opinion, be there even though we may in a siloed way work on technical mechanisms that enable each of the elements that are composing ethical AI.
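On the “sustainability for AI” category mentioned above, a minimal sketch of estimating a workload’s footprint, assuming the open-source codecarbon package (not something discussed in the episode); the workload itself is just a stand-in.

```python
# Sketch: estimate the carbon footprint of a (stand-in) training workload.
# Assumes the `codecarbon` package; the "work" below is a placeholder.
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="toy-training-run")
tracker.start()
try:
    total = sum(i * i for i in range(10_000_000))  # placeholder workload
finally:
    emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent

print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
```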
[00:17:50] Camille Morhardt: With AI now tackling, like you say, greater social movements, is there a risk that there’s kind of this consensus among, I’ll just say the tech community, of what is social good or what is the right thing to do for the climate, and that does not represent the opinions of people who maybe disagree, or who don’t have anything to do with technology and so aren’t even following?
[00:18:15] Ria Cheruvu: It’s a beautiful question, because in a sense the answers to that can be somewhat controversial. But in my opinion, from what I’ve seen, to be completely honest, when an AI engineer is starting to become a philosopher or starting to get out of their domain, there are a lot of problems that can happen. It is a wonderful thing, because a lot of the engineers who did the pathfinding and created the technology had a vision for where it should go. I’m sure you may have seen there’s a lot of debate around founders of certain AI disciplines and the way that they wanted to see the technology progress. There’s sometimes disapproval or disagreement with where it should have gone and where it is now; that speaks to the space of AI overall, with the whole artificial general intelligence route. That’s completely one way. And then there’s also this route of deep learning and just building better models that are faster and more performant, sometimes large, sometimes small, and going that way.
But throughout these different disciplines, there is in a sense a mixture of what you had shared. So the first part is that we don’t have a lot of team members who believe that those disciplines are important, but that is changing with responsible AI. Because as soon as you mention ethics, that societal concept automatically comes in. And I’ve had many colleagues who will immediately bring up trolley-problem types of situations where, you know, you have autonomous vehicles and you want to decide what they’re doing, or AI in the workforce. There are very specific disciplines that come to mind for different teams of individuals as soon as you mention responsible AI. So I think that even the mention of it and the discussions are changing somewhat the idea that technology is the only thing that really matters and that we don’t need to think about the broader picture.
But when it comes to engineers or even specific teams, and I’m also speaking of legal and philosophy, if they don’t have integrated disciplines or perspectives, there is definitely an amount of siloing. We don’t get the type of alignment that we need. And we definitely see this as part of the regulatory conversations as well. If you have more representation of stakeholders on these committees who belong to one group over the other, you’re going to start to see a couple of trends popping up. For example, if you have a lot of technical community members, you might find definitions that may be hard to follow for lay people or for regulators. And if you have more of the regulatory folks involved in making standards, or just guidelines and documents, you have things that don’t have a lot of practical applicability back to AI models.
For example, some of my colleagues have commented on this, saying, “Oh, there’s a regulation out there that says ‘This is what needs to be done at this point in time’.” But from the technical side, it really can’t be done. You can’t just make that statement. We don’t even have the testing infrastructure to validate that the statement is true if somebody makes it. We definitely need the different teams looking at it. So it’s a balance, with no one-size-fits-all solution.
[00:21:04] Camille Morhardt: What are we doing about data collection and the massive asymmetry of information? A lot of times people become concerned that the data being collected is consolidated and held within some subset of corporations, not necessarily even governments, and that we don’t yet have regulations or laws or policies in effect, at least not global ones. So how should we look at that?
[00:21:33] Ria Cheruvu: I like the word that you used as well, in terms of the asymmetric nature of the problem itself. So as part of this, and it’s very interesting, I think, to start this off by saying there are specific instances where, even if let’s say you did have access to unbiased data or you had access to the type of data you would need, you can also start to see label bias directly during the annotation stage itself, when you’re cleaning your data, when you’re removing outliers. And if something doesn’t conform to the data scientist’s worldview and they take it out, or maybe it doesn’t even work as part of their problem statement or analysis, it can have a lot of impact on other populations as well. Whether you’re cleaning out missing data or, again, stripping outliers from the data, all of these data science operations can contribute to label bias and many other types of biases as well.
But I think overall, when it comes to the data collection and annotation stage, it is a key problem in a number of different ways. For example, to date, all of the problems related to data sourcing have had solutions like, “Okay, can we use data augmentation? Do we use the same data points that we have and then flip them around, turn them around, resize them, rotate them? Can we do something to change them up and then add them to our data set?” I came across a very interesting case study, or use case, on a textual data set: being able to replace words that are gender-specific with something that’s gender-neutral. Although it seems like a very simple example, this tool was looking at doing it automatically. It was able to create that nice data set and give you, if not a very detailed visualization, at least some sort of report of what’s going on and what changes were being made to the text.
I think that’s just such an interesting example of how we can start to implement these types of concepts at lower levels that data scientists and AI engineers can use hands-on. If you don’t have a lot of great data out there and you are forced to use data augmentation, maybe there are methodologies that you can incorporate at these levels that can help boost, or at least enable, some sense of data quality and ethical AI, and improve the diversity and representation of your data sets.
So this is a poor example in the sense that, “Is that it? That’s all we’re going to do? Just a quick fix to data augmentation? How do you know what impact it has on the system?” But at each of these levels we definitely need to have solutions. My argument is maybe we start small and then we build up from there. And when it comes to data sets being constrained to certain organizations, that’s a very big problem, because reproducibility has already been a severe problem in the data science space. Folks publish papers and models, and sometimes they don’t share their data sets. I’m guilty of doing the same, because maybe we don’t have time, or we don’t want to go through a review process that’s very lengthy because we want to get our results out there before someone else does. There are a lot of factors in there that can contribute to that. We definitely need to see more data-sharing efforts.
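The gendered-to-neutral rewriting tool recalled a moment ago isn’t named in the episode, so here is a deliberately simple, hypothetical sketch of the idea: swap gender-specific terms for neutral ones and report what changed. The word list is illustrative only.

```python
# Toy gender-neutral text rewriting with a change report.
# A real tool needs far more care (context, names, grammar) than
# simple substitution can provide.
import re
from collections import Counter

NEUTRAL = {
    "chairman": "chairperson",
    "policeman": "police officer",
    "stewardess": "flight attendant",
    "mankind": "humankind",
}

def neutralize(text: str) -> tuple[str, Counter]:
    changes: Counter = Counter()
    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = NEUTRAL[word.lower()]
        changes[f"{word} -> {replacement}"] += 1
        return replacement
    pattern = re.compile(r"\b(" + "|".join(NEUTRAL) + r")\b", re.IGNORECASE)
    return pattern.sub(swap, text), changes

new_text, report = neutralize("The chairman thanked the policeman for serving mankind.")
print(new_text)
print(dict(report))  # simple "what changed" report
```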
[00:24:22] Camille Morhardt: Can you just explain what data augmentation is? I’m getting that it means you don’t have enough of one kind of representative within your data set, so you’re maybe sort of multiplying or extending that to assume there’s more. You’re making some kind of an assumption that may or may not be true.
[00:24:40] Ria Cheruvu: Right. It’s a great summary, but the first part of that premise is a little bit different. So with data augmentation, rather than looking at… Well, we are looking at representation, but not from the ethical AI perspective. So for example, if you have your very typical CIFAR or MNIST data set, you want your model to perform better on 4s or 9s, or you just want to boost the size of the data set by a hundred or a hundred thousand more samples or something like that. That’s when you would start to apply these rotations, flips, and resizing, basically to boost the size of your data set. In terms of data set augmentation for ethical AI, though, that’s the nascent space.
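A minimal sketch of the kind of augmentation described here, assuming torch and torchvision; the transform choices and parameters are illustrative, not from the episode.

```python
# Classic data augmentation: random flips, rotations, and crops applied
# on the fly to boost the effective size of an image data set.
# Assumes `torch` and `torchvision`; MNIST is downloaded on first use.
from torchvision import datasets, transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),               # small random rotations
    transforms.RandomResizedCrop(28, scale=(0.8, 1.0)),  # random crop + resize
    transforms.RandomHorizontalFlip(p=0.5),              # note: flips can hurt digit labels
    transforms.ToTensor(),
])

train_set = datasets.MNIST(root="data", train=True, download=True,
                           transform=augment)
image, label = train_set[0]     # a fresh random augmentation on each access
print(image.shape, label)       # torch.Size([1, 28, 28]) and the digit label
```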
[00:25:17] Camille Morhardt: Okay. So what are the things that we’re worried about with data augmentation with regard to ethics?
[00:25:23] Ria Cheruvu: So when it comes to data augmentation and ethics, there is one key problem that pops up all the time, and this is the same problem that’s exacerbated with synthetic data. If you are generating additional data points for your data set, first of all, do they actually make sense? We do this all the time because we want to boost the amount of data, because larger data equals better models, most of the time. It’s starting to become the case that you can’t always say that, but for now the equation is still pretty much that better or larger data equals better models. So when we are trying to boost the size of your data set, and potentially in some cases even the quality, is it actually representative or reflective of the real-world data when you’re creating these samples? With synthetic data, and data augmentation comes under synthetic data generation in many cases, does it make sense? Are these the types of inputs and trends that you would really see as part of the real world?
That, I think, Camille, is the main question that comes into play with data augmentation. It starts very simple, in terms of what is the real-world applicability of the data, but then it starts to evolve into something that’s a lot harder to tackle, which is when you are dealing with potentially sensitive data. Let’s say you’re generating data related to race or gender, or even proxy variables like handwriting or something similar; it starts to evolve into something a lot harder to take on.
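On the “does it actually match the real world” question, a hedged sketch of one simple check: compare a feature’s distribution in the synthetic sample against the real sample with a two-sample Kolmogorov-Smirnov test. The data here are made up, and a single univariate test is only a starting point.

```python
# Toy representativeness check: does a synthetic feature distribution
# plausibly match the real one? (One feature, one test; a real audit would
# look at joint distributions, subgroups, and downstream metrics.)
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
real_ages = rng.normal(loc=45, scale=12, size=2000)        # "real" data
synthetic_ages = rng.normal(loc=38, scale=12, size=2000)   # shifted generator

stat, p_value = ks_2samp(real_ages, synthetic_ages)
print(f"KS statistic={stat:.3f}, p-value={p_value:.3g}")
if p_value < 0.01:
    print("Synthetic distribution differs noticeably from the real one.")
```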
[00:26:47] Camille Morhardt: So would we do something like disclose the percentage of the data that was synthetic or augmented so that we could take some assessment of the risk of the assumption there?
[00:26:57] Ria Cheruvu: Exactly. That is what we want to be able to do. And with that, we want to be able to track, and also validate, the provenance of the image data. And as we’re having this conversation, we can start to think that in addition to just having the percentage of the synthetic data in the training data set, or even in the evaluation data set, wouldn’t it be cool to have a little bit more information about how the AI algorithm-
[00:27:20] Camille Morhardt: Which sets of data? Which sets did you augment? (laughs) Yeah.
[00:27:23] Ria Cheruvu: Exactly. Right. And then we run into security problems, which is, okay, if there’s a malicious actor who has a lot more information about the data samples that the model was trained on, because you’re giving them information about the synthetic data, what couldn’t they do with that information? They can engineer inputs that match that, or they could try to play around with those inputs and see what happens. It’s a lot to take on, because when we, for example, present these types of fairness and bias measures, or even just basic representativeness, we’re not really even talking about sensitive demographic data here. We’re just talking about, does it actually match the real world? And then we immediately have our security teams that come up and raise these challenges as well.
So it’s a lot to take on for one person or one team, that’s for sure. That’s why we definitely need multiple teams that are challenging us when we say, “Oh, we have problems with our data set, let’s use data augmentation. Okay, we have problems with data augmentation, let’s use synthetic data. We have problems with ethical AI, let’s change our gender-specific terms to gender-neutral ones. We have problems with security, let’s put access control mechanisms in place,” something like that. That type of workflow needs to be established; we definitely need to do that.
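A minimal, hypothetical sketch of the disclosure Camille and Ria are converging on: a small provenance record shipped with the data set that states how much of it is synthetic or augmented and where each slice came from. All field names and numbers are invented for illustration.

```python
# Hypothetical "datasheet" fragment recording provenance and the share of
# synthetic/augmented samples, so downstream users can assess the risk.
import json

datasheet = {
    "dataset": "example-vision-set",            # invented name
    "total_samples": 120_000,
    "slices": [
        {"source": "field-collection-2021", "samples": 80_000, "synthetic": False},
        {"source": "rotation/flip augmentation", "samples": 25_000, "synthetic": True},
        {"source": "GAN-generated faces (vendor X)", "samples": 15_000, "synthetic": True},
    ],
}

synthetic = sum(s["samples"] for s in datasheet["slices"] if s["synthetic"])
datasheet["synthetic_fraction"] = round(synthetic / datasheet["total_samples"], 3)

print(json.dumps(datasheet, indent=2))   # here, a third of the data is synthetic
```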
[00:28:31] Camille Morhardt: It almost goes back to the unintended consequences, like where we were on unintended biases. Now we’re sort of talking about another unintended scenario. While we’re trying to fix one problem, we may be creating another.
[00:28:43] Ria Cheruvu: Exactly. Right. The weirdest part about all of this, that I’ve seen so many different folks in the AI and AI ethics space raise, is that AI systems are fairly… It’s wrong to say that they’re dumb, in the sense that if you put an AI model in an Internet of Things environment, you’re empowering it, because it’s able to consume all of the data that you’re getting from sensing and it is able to actuate in so many different ways. But AI systems fail on ridiculously strange real-world input. My team has personally experimented with this so many times. We’ll give it an image of a cat wrapped in a blanket and it’ll start to classify it as a kite or a burrito. And this is one of the state-of-the-art deep learning models that has seen so many cats and dogs and blankets. It’s trained on some of the largest databases, ImageNet and others, with millions of images, but it just can’t figure out these types of problems.
And we see the same thing with autonomous cars as well, which is rapidly changing. There’s always been, up until last year or even continuing to now, this question of, “Okay, we have a lot of self-driving cars out there, but how will they behave at very busy intersections?” But it’s changing. We’re seeing solutions pop up to that a lot. But AI models are still fairly strange. They’re weird. And we are already starting to see a lot of these capabilities pop up. So as we refine AI models’ outputs, because we do want to get AI models to a point where they are able to contribute to society with better performance, we’re going to start to see a lot of these issues increase. It’s a strange kind of correlation, but it’s true, because the better that AI models get, sometimes the worse the issues are. So that’s something we need to handle.
[00:30:19] Camille Morhardt: Because robots are now joining us in our homes and on the street, I’d like to get your perspective on continual learning and the ethics around that.
[00:30:31] Ria Cheruvu: I haven’t been asked this before. So continual learning is very interesting because it offers a lot of challenges, I believe. This is also applicable to offline training, and to a couple of very early-stage conversations that I’ve been having with some of our Intel robotics teams as well as externally, which is: when you’re developing your model, you have access to a plethora of data, you have access to your ground truth labels. That’s the main part of it, because that’s how you’re able to monitor the performance of your robot, your device, your AI model. When it comes to deployment, though, you don’t have access to the ground truth, in the sense that you need to have this offline human evaluation team, or maybe they’re online depending on the speed of the review. They’re actually reviewing the AI model’s outputs. They’re checking what’s going on. Maybe they’re saying, “Okay, you did wrong here, you’re doing right here,” rejecting or accepting the outputs of the model, and they’re providing that feedback back into the system.
There are different models for doing it that way. That’s the offline training way, but it applies to online training as well, or online learning and continual learning, where you need to have, at some point in time, ground truth labels in order to assess the performance of the algorithms. There’s something else that’s added to this, which is qualitative behavior analysis, which is a term I’m just mentioning here right now. It’s not an actual term. But just take the example of a robot in a restaurant: if it’s navigating around and it’s bumping into a lot of tables as it’s serving different folks, that’s a problem. We’re able to actually see that. Maybe the quantitative metrics that we’ve used are not capturing that. The performance of the model may be superb, but something’s going wrong. Is it accuracy drift, or is the sensor damaged? What’s going on with the model?
So these two kind of quantitative and qualitative measures of evaluation are very interesting to consider. Ethical AI, we are hoping to actually bring from qualitative to quantitative. The reason why is the following. If for example, and there’s a great example of this with soap dispensers, I’m going to just take a tangent. But soap dispensers, computer vision based, no AI included. If they are dispensing soap for folks with a lighter skin tone compared to folks with a darker skin tone, there is an immediate problem that we start to see here. We’re seeing the same type of technology and biases reflected in AI models, for example, and this is just a generic example, that might be incorporated into robots and other applications as well.
So you need to be able to perform that type of testing. And in the case of continual learning, you want some of this to be done in real time, because you can’t always tell the model, “All right, pause. I want to do this test.” You do want to do that, because at some point in time you want to have a very thorough examination of what the model is doing and give it that performance feedback. But you can’t do it in real time in continual learning all the time. So you want some sort of a quantitative metric that you can anchor this on. And putting AI ethics into quantitative metrics is very challenging. That’s why we boil it down to the individual elements and start from there.
[00:33:22] Camille Morhardt: What are the soap dispensers doing?
[00:33:25] Ria Cheruvu: Essentially the soap dispensers are just dispensing soap, but what they’re supposed to do is, if you’ve got a hand right underneath them, recognize that and dispense the soap. But in the case of lighter-skinned folks, they do it perfectly. It’s all fine. In the case of darker-skinned folks, it completely ignores them.
[00:33:41] Camille Morhardt: It’s not seeing the hand or not recognizing that there’s a hand there?
[00:33:45] Ria Cheruvu: And a little bit more insight into this, at least from my current understanding: essentially, because of the lighting, they were not able to detect darker-skinned folks’ hands. It wasn’t dispensing soap no matter how much you waved your hand around the dispenser. So it’s a key problem, and it definitely should be identified as part of your pre-design and development phases, not during deployment after you’ve released your product and it’s working in the real world. But I feel like that’s a good example for AI models as well. You don’t really anticipate these problems. But if you actually think about it, about benchmarking and evaluating on different groups of populations, then maybe you can catch some of these issues early on. But in a case like continual learning, you need to be able to do some of that in real time.
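The “benchmark on different groups of populations” point can be made concrete with a small sketch, not tied to any real product: compute the same metric separately per group and flag large gaps that the aggregate number hides. The outcomes below are invented.

```python
# Toy disaggregated evaluation: the aggregate detection rate hides a large
# gap between groups, which is exactly the soap-dispenser-style failure.
import numpy as np

# Hypothetical results: 1 = "hand detected", grouped by skin-tone bucket.
results = {
    "lighter": np.array([1, 1, 1, 1, 1, 1, 1, 0, 1, 1]),
    "darker":  np.array([1, 0, 0, 1, 0, 0, 1, 0, 0, 0]),
}

overall = np.concatenate(list(results.values())).mean()
print(f"overall detection rate: {overall:.2f}")

for group, outcomes in results.items():
    rate = outcomes.mean()
    flag = "  <-- investigate" if rate < 0.8 * overall else ""
    print(f"{group:>8}: {rate:.2f}{flag}")
```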
[00:34:26] Camille Morhardt: Continual learning. Maybe just give a very quick definition of what it is now that we’ve had the conversation about the implications.
[00:34:33] Ria Cheruvu: Yeah, sure. So for me, the way I see it very simply is it is what would happen if you could keep training for… Maybe not forever, for a given duration of time where the model is constantly consuming this data and it is able to predict and it’s able to learn on the fly.
[00:34:49] Camille Morhardt: Or often outside of a factory or in your home or outside of where all of the data scientists are sitting.
[00:34:56] Ria Cheruvu: Exactly. Right. It’s a very interesting use case. It’s very nice to be able to learn on the fly and be able to react and acquire newer skills. But then again, another problem that starts to pop up, which also happens with offline training, is catastrophic forgetting, which is where your model forgets information it was previously trained on in order to remember all of the new info that you’re giving it.
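A toy illustration of catastrophic forgetting, assuming scikit-learn (version 1.1+ for the "log_loss" name): train an incremental classifier on digits 0-4, then keep training on only digits 5-9, and watch accuracy on the first task drop. The effect here is qualitative, not a rigorous continual-learning benchmark.

```python
# Toy catastrophic forgetting demo with an incrementally trained classifier.
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier

X, y = load_digits(return_X_y=True)
task_a = y < 5                     # "old" task: digits 0-4
task_b = ~task_a                   # "new" task: digits 5-9

clf = SGDClassifier(loss="log_loss", random_state=0)
clf.partial_fit(X[task_a], y[task_a], classes=list(range(10)))
print("accuracy on 0-4 after task A:", clf.score(X[task_a], y[task_a]))

# Keep learning, but only ever see the new digits.
for _ in range(20):
    clf.partial_fit(X[task_b], y[task_b])

print("accuracy on 0-4 after task B:", clf.score(X[task_a], y[task_a]))
```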
[00:35:19] Camille Morhardt: I have that problem. I didn’t know that machines shared that problem.
[00:35:23] Ria Cheruvu: Well yeah, now it’s an AI problem. AI does it a little bit worse than us.
[00:35:30] Camille Morhardt: Why does it do that? Is it because it runs out of storage or because it’s seeing a conflict in information coming and doesn’t know how to resolve it?
[00:35:36] Ria Cheruvu: Okay, so that’s a good question. From my current understanding, which is very limited, all I know is that the phenomenon exists. We don’t actually know in many cases why it’s happening. So it isn’t necessarily, at least as far as we know, a conflict of information; it is just simply forgetting what it has previously learned. Now I’m not sure if it is a capacity problem or something like that. Actually, right after this conversation I’m going to go and check if they have mentioned that in the paper, but I’m pretty sure that we don’t have that actual reason narrowed down. So we know it happens, we don’t know why it happens. And that is the big mystery with AI.
[00:36:12] Camille Morhardt: Are we ever going to see ethical implications flip on their head and we as humans are going to have to worry about how we’re treating machines?
[00:36:21] Ria Cheruvu: Yes.
[00:36:23] Camille Morhardt: Okay. Say more.
[00:36:23] Ria Cheruvu: Absolutely.
[00:36:25] Camille Morhardt: I was not expecting you to say yes to that.
[00:36:37] Ria Cheruvu: Oh definitely, because… And I did end up writing a kind of short, opinionated thesis paper on this and I got a couple of very interesting feedback points on that as well, which was-
[00:36:38] Camille Morhardt: Oh, I’m going to read that now.
[00:36:39] Ria Cheruvu: Oh, thank you.
[00:36:40] Camille Morhardt: I should have already read it, but…
[00:36:42] Ria Cheruvu: I’ll provide the summary here because it’s just so interesting. So there was an initial research experiment, I forget when, but it was a couple of years ago, where there were these researchers who created a robot in a mall that was supposed to do some navigation. And they encountered what I like to phrase as an unanticipated concern, or an unconventional concern, where they had a lot of kids that were trying to kick the robot, punch it, move it around, shift it. Some of them were curious, some of them were violent, so they had to deal with this newer problem.
So what the researchers did, they built an attack estimation, I believe if I recall this correctly, an attack estimation algorithm and also a trajectory planning algorithm so that they were able to avoid kids that were coming their way and trying to attack them and either maneuver to the parents who were taller so they were judging based on height or maneuver somewhere else.
So it’s interesting because it raises a lot of questions around how do humans act with AI systems? One of my peers in this space was also telling me based on a recent study and a survey that they were working on that a lot of users of AI technology, we do want to design our actions or the way that we act in a way that helps AI systems. Again, it depends on the types of communities that you’re serving as well, but we’re willing to change somewhat in simple ways the way that we help AI systems. And let me provide an example of that.
For example, if we have an AI system that is helping us in a healthcare setting to report that we’ve taken our medications, or to do some sort of processing there, if it’s just as simple as angling your medication bottle a certain way or doing something, maybe turning on the light, I think we’re willing to do that to help the AI algorithm detect it better. When it comes to that, we start to get some expectations as users. It’d be nice to have a little manual that tells us what this AI model is doing and where it typically fails. So that’s one thing that we might want to see some information on. And again, we may also want to figure out, “Okay, how do we actually use the system? Where is my data going with the system?” We have a set of requirements and expectations; we want to know what exactly this model is doing.
Same thing with an interactive kiosk. If it’s taking my voice commands and it’s using that information, it would be great to know where exactly that is going to be used. Is it going to be used against me? We need to answer these types of questions, and preferably in real time, rather than a user having to navigate to a website. But there are also security considerations there as well, because, again, if you are exposing that type of information about the internals of the system and when it fails to a user you don’t know, is it possible to trick that system, construct adversarial inputs, or in general hack the system so that it gives you outputs that aren’t really true?
The example that I provide in my recent paper is one, not of a malicious actor, but of a healthcare patient who really wants their AI system to log to their healthcare provider, as well as to their family, that they’ve taken their medications, but they really haven’t. So they are trying to essentially hack the system in order to get the outcome that they want. And it happens in the healthcare space. So AI can be an enabler of that. So that’s why I think we definitely need to think about the way that humans interact with AI systems and the ethical implications of how we treat AI systems. Maybe not because AI has consciousness and emotions, which is a debate in and of itself, but because of the effect that we have on AI systems; AI systems are amplifiers. So in general, we are impacting the people around us and the people who are trusting AI systems by being untrustworthy towards AI.
[00:40:12] Camille Morhardt: Right. Well, you brought it up so now I do want to ask you your opinion about machine consciousness because it’s such a debate right now.
[00:40:22] Ria Cheruvu: I’ve been thinking about it a lot for the past few years, with conversations with so many different people. I believe the consensus is we are not close to machine consciousness when it comes to a timeline. If we were to get to that point, it’s probably only going to happen in the sense that you can get some sort of a simulation or a model of it. And I know quite a few people in the philosophy space who sometimes argue that we can’t even get to that type of approximation, which is a good point, and I definitely take their word because they’re the experts in terms of what consciousness really is. But personally, I believe that we can get to some sort of simulation or modeling.
And the reason why I say this is if we’re able to create maps of different human emotions or morality, which is work in progress, different ways that humans react in different situations, we can get to simulations of emotion. We see it a little bit today in chatbots where actually if you look at the technical advancements in chatbot technology, they’re really targeted around how do you best respond to phrases like, “Hey, how was your day?” rather than sounding robotic in a sense, actually having that type of nice little interaction. So I believe we’re going to get to simulations of it. We’re not going to get to possibly the consciousness or maybe the thought uploading, but you never know. So I’m totally open to the prospect of things changing over time.
But when it comes to extended consciousness as a whole, I do definitely see AI as an enabler or an amplifier, or some sort of a process that you can use, from a philosophical standpoint, to kind of extend your mind. And this actually takes me back to a great point that Pat made during the Innovation 2022 keynote, which was that everything is a computer. That is very interesting to think about, because in a sense everything around you is a device that you can use and that computes, or something like that. So for me, I see AI as kind of an enabler and amplifier of that, and it goes back to the original research on the extended mind and other work from the philosophy space. So we’ll see. For now, short answer, probably not around machine consciousness, but I’m open to the possibilities and I’m not biased towards it, so we’ll see what happens.
[00:42:31] Camille Morhardt: How do you see it as an extension of consciousness?
[00:42:36] Ria Cheruvu: Based on the definitions of this, I mean, the way that we currently interact with AI systems is, “Can you get this task done for me?” or, “Can you listen to my command?” And I’m thinking of voice recognition or assistants. I’ve got my phone right next to me; if I’m telling Siri, “Send out this email” or something like that, that’s the extent to which we have it today. But as we start to see more human-centered AI systems, like robots and others that are interacting more closely, for example, robots that are interacting with the elderly, we’re starting to see a lot more emotions and connections that are being created there. And eventually I feel like we may start to see AI agents that are really improving productivity by being this type of second backup that you can rely on.
We’re seeing it slowly, not there yet. But for example, with generative AI, this content-creating machine that generates really weird images and text, that’s something that you could use as part of your creative process, right? That kind of backup thought generator or something like that. So I feel like, as part of that, it’s really adding to the thoughts and the persona that we currently are. Maybe it’s, for example, personalizing our feed of the clothes that we’re buying and the food that we’re eating. So it is adding to who we are as a whole, at individual, group, and societal levels. So from that perspective, yes, I believe it can kind of extend our consciousness. Again, it depends on the definition.
[00:43:49] Camille Morhardt: I was talking with Ashwin Ram at Google. He was talking about filter bubbles and exploring versus exploiting, and basically setting up algorithms so that you can say, for any given algorithm, let’s say it’s written for displaying the news versus it’s written for pure entertainment value, discovering new music or something, within the application they can set different algorithm parameters. So if it’s the news, I’m going to give you 80% exploration, we’re going to show you things you might not have heard of, you might not believe this way. I’m making up the direction here. And 20% we’re going to exploit what we already know about you, or what the algorithm already knows about you based on all the different information that it collects. It knows probably your political philosophy and your stance on a variety of topics, and it’s kind of feeding you more of what you already believe. I’m really interested, when you’re talking about this kind of extension of consciousness, how that plays into it. Is AI going to feed us more and more and make us into caricatures of ourselves, or actually extend us?
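Camille’s 80/20 split is essentially an explore-versus-exploit parameter. A minimal, hypothetical sketch of that dial, with made-up interest pools and no connection to any real recommender:

```python
# Toy explore/exploit recommender: with probability `explore`, serve something
# outside the user's known interests; otherwise exploit the learned profile.
import random

random.seed(42)

known_interests = ["chip design", "AI ethics", "cybersecurity"]
everything_else = ["marine biology", "opera", "urban farming", "archaeology"]

def recommend(explore: float) -> str:
    if random.random() < explore:
        return random.choice(everything_else)   # explore: broaden the feed
    return random.choice(known_interests)       # exploit: reinforce the profile

news_feed = [recommend(explore=0.8) for _ in range(10)]       # news-style setting
entertainment = [recommend(explore=0.2) for _ in range(10)]   # entertainment-style
print("news:", news_feed)
print("entertainment:", entertainment)
```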
[00:45:06] Ria Cheruvu: I, actually, as part of my master’s research project, did do a little bit of research into confirmation bias with Google. We were looking into ways that we can thwart that, because currently and exactly as you mentioned, we are seeing, for example, AI systems that are reinforcing the recommendations that you’re picking. They serve to you what they think your choices should be. There’s also an added element to this when it comes to content recommendation where if you have different groups that the algorithm thinks you’re similar to, you only see things that those groups are looking at, especially true for political opinions, et cetera. So the approach that we had taken from a technical side in that project was remove that entirely, start to create clusters that are only based off of the topic. So remove all of the cool personalization related elements and basically get to a point where we’re able to cluster based off of opinion.
So in the case where we see very highly opinionated articles, where essentially the arguments have the same type of thesis, we would group those together instead. So we’re not really caring about your personal attributes or your interests, but if there’s a particular opinion that you want to see, you can see that. And other research that we were drawing on was able to provide this type of spectrum of different opinions. So you see your opinion on one side, but right alongside that you’re also seeing the other opinions as well. There’s a subtle decision to be made here, though. Are you allowed to click in and only look at your opinion, which is also reinforcing bias? Or does it essentially just pop up no matter what you do, so you see other opinions as well and are kind of being forced to get out of that filter bubble? Or does it make sense not to have the technology at all, right?
But when it comes to extended consciousness, I feel like that may happen in the sense that maybe we have AI algorithms and automated systems that are in a sense forcing different individuals to get out of their filter bubbles. I know it sounds kind of horrendous when I put it as forcing, but if we do need to get to a certain level of common or shared understanding without having technology amplify the filter bubbles, we do need to essentially pull ourselves outside of that. And it’s going to be very hard for us to do that ourselves. So maybe it’s in our control, maybe it’s out of our control, but we definitely need that type of push as well. Otherwise, we’ll be completely immersed in our own opinion. We’re just fully enabled and amplified by technology to just stick to what we want rather than looking at other opinions.
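A hedged sketch of the clustering idea from the master’s project described above: drop user features entirely and cluster article text alone, here with TF-IDF and k-means from scikit-learn on a few made-up snippets. This illustrates the general approach, not the project’s actual code.

```python
# Sketch: cluster articles purely by their text (no user attributes),
# so a reader can be shown neighboring clusters rather than a filter bubble.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

articles = [
    "The new transit plan will cut commute times, supporters argue.",
    "Critics say the transit plan is too expensive and poorly routed.",
    "A heat wave is expected to strain the regional power grid this week.",
    "Grid operators urge conservation as temperatures climb.",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(articles)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for label, text in zip(labels, articles):
    print(label, text[:60])
```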
[00:47:31] Camille Morhardt: Certainly in academics, I would argue, and more broadly also, I think that humans tend to really silo different disciplines. So here’s just a really big, super broad example. You might have people working on energy efficiency in a server farm, basically trying to cool down the servers, and they’re trying to figure out the best way to do that. And then in a completely unrelated scenario that might be a block away, there are people trying to figure out how to use the most efficient heating to heat a home or something like that. And never the twain shall meet, despite the fact that you could take the energy from the servers and sort of just move it a block over and heat that area. This is really just a broad example, but these things exist everywhere. Is AI in a position to help us see across these boundaries that we don’t typically think across, or provide ideas for us that we wouldn’t even know to ask for?
[00:48:24] Ria Cheruvu: Interestingly, Camille, probably not with the current set of technologies that we have a lot of focus on right now. I say that because what we’re talking about right now is mapping between different disciplines, or being able to discover unknown ideas or sources. So a lot of that we can think of in terms of the frames of current technology. And again, I am going back to the generative AI space, where we are starting to discover new creations and concoctions by AI systems that we didn’t even know existed. But a lot of that is going back to traditional mapping and navigation of different disciplines. So I believe that it is an emphasis on supervised, unsupervised, and maybe other disciplines.
But definitely AI will be able to help us out with making these interesting new connections, whether it’s at the idea generation stage, or at the facilitation stage when we’re creating prototypes and we have AI systems helping us out there, or maybe even during the production stages of the technology or the concepts that we want to enable and scale, for monitoring or real-time threat detection for that type of solution. AI can help us at different folds of the process. I believe that, depending on the capabilities we have today and the directions that research is going, some of these tasks AI can do better than others. Maybe, for example, it’s better at monitoring and anomaly detection than it is at idea generation. But I think in the end we will get to a point where AI is able to contribute very strongly to all of these elements and phases.
[00:49:55] Camille Morhardt: So you’re 18, so this is relatively young in the scheme of humans. The other day the Intel CEO said you’d be CEO in 30 years. So I guess in that kind of a time horizon, if you can think forward, what do you think we’ll look back on from this time and say, “Oh my God, we were such babies in this space. We stepped left and we should have been thinking about right” as humanity dealing with artificial intelligence and its rise?
[00:50:28] Ria Cheruvu: From what I see today with AI and technology, we’ve tried our best and ventured into so many different domains. So I know that when we get to the future, we’re going to converge to something that’s amazing. We’re going to get to technology that we probably had not even thought about before. But all of these directions have definitely led up to this. All of the thinking, the failures, they are building up to this. Now, there is an element in here that this type of optimistic point of view doesn’t apply to, and that is ethical problems. So for example, if we’re seeing the introduction of facial recognition into challenging environments or use cases, in the sense that it just doesn’t sit right morally to see certain applications introduced, to many of us, not just at an individual level but actually at a shared level, that’s something we do want to be able to prevent ahead of time.
And again, to be completely transparent, one of the key debates here is around whether we ban facial recognition altogether or keep it. There are folks on many sides of the argument as well. We all see the potential for it, but we all see the capabilities for destruction as well. So I think having conversations about that early on, and definitely acting based on those conversations, that is what we definitely need to do. I don’t see it as something we’ll regret in the future, because again, we have multiple teams and different folks that are working on it, talking about it, and raising it. But it is something that we do need to get alarmed about and get alerted to. And I think all of us are, again, working towards that type of outcome.
[00:51:54] Camille Morhardt: Well, Ria, thank you so much. It’s Ria Cheruvu who is AI Ethics Lead Architect at Intel, and just very recently on stage with Intel CEO Pat Gelsinger talking about Gaudi and other AI chips.
[00:52:10] Ria Cheruvu: I did want to say it was an honor to share the stage with Pat and to demonstrate some of the wonderful technologies out there, Geti, Gaudi, and our Intel Scotland team’s amazing work to get the optical innovations out there, and a ton of other work with OpenVINO and with game development as well. So the challenges that we face today, we’re going to continue to face those from an AI perspective, but AI quality and a lot of these other key capabilities and performance-related items that we’re looking at when it comes to technology, we’re solving them. So I’m very, very excited for the future and what we’re going to do.
[00:52:44] Camille Morhardt: Thanks again, Ria. Appreciate it.
[00:53:46] Ria Cheruvu: Thank you, Camille. It was wonderful speaking with you.