[00:00:28] Camille Morhardt: Welcome to InTechnology, and I’ve got with me today Neil Fluester from Crestron. He’s also host of Crest TV so we might talk just a little bit about video podcasting, but Crestron actually is an interesting company in that it helps make meetings seem more realistic. That would be my words, not theirs. It basically uses video cameras within a conference room to make it feel like you’re having either just a natural conversation with people who are remote or, in my opinion, like there’s actually a film director in the room, panning out, zooming in, somebody walks in, it has you look over there, it knows who’s talking, so it’s very fluid and flowy.
We’re going to talk about all of the things that go along with that, how that happens and some of the technology behind it, and also maybe some concerns that it could bring up and pros and cons moving forward. Welcome to the show, Neil.
[00:01:23] Neil Fluester: Thanks, Camille. Great to be here.
[00:01:25] Camille Morhardt: So, can you just tell us some of the things that go on behind the scenes in these cameras to help track people?
[00:01:34] Neil Fluester: Yeah, absolutely. Obviously, when you go into a meeting room, you’re going in for your sales meeting, your marketing meeting or training session or whatever. You’re not going to be a videographer there, panning and tilting and zooming the camera around to make sure the shot’s right. You want to concentrate on the sales training, so we’ve got intelligence in the cameras that can actually make sure that people are seen in the right way. The biggest term at the moment is equity, meeting equity: everybody wants to make sure that they look as good in the office as they do at home, and to try and present that equal view of everybody in the room. There’s a lot of technology we put in the camera to do some production rules, as you talked about, the idea of having a video director going, “Right, switch to camera one, switch to camera two, right. Pan that, tilt that, zoom that.” But all that’s happening in AI and machine learning and in algorithms rather than a physical human on a video mixer punching camera one, camera two.
[00:02:29] Camille Morhardt: So what is it learning when you say it’s doing it in AI? What is the camera actually learning once it’s in the conference room?
[00:02:35] Neil Fluester: Yeah, so there are certain rules that we give the algorithm to work with. When it comes to video, if anyone’s watching this on video rather than listening to it, the way in which you set your camera up is what’s called the rule of thirds. So you want your eye line to be in the top third of the image. If you watch any TV program, any video production that’s done professionally, they’ll use the rule of thirds, so we want to make sure that the people in the room are presented using the rule of thirds. We then look for the T of the eyes, the nose, and the mouth to say, “Ah, it’s a person,” but it could be a picture on the wall. There are certain places in the world where you have the picture of the president or the dictator on the back wall, so you don’t want the camera to track into that.
Then we add in additional components like bringing in sound. For example, we use what’s called “sound source localization” to make sure that we can tie in that person with some sound. But also, on the flip side of that, putting the person to the sound, because if someone’s coughing or leaning back on a chair or creaking or punching away on a keyboard or whatever, you don’t want the camera to suddenly track them. These rules are there and we teach the system to say, “Okay, go find the people, go frame them correctly, and then make sure they’re a real person.”
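As a rough illustration of the two rules Neil describes, here is a minimal Python sketch. It is not Crestron's implementation; the `Face` fields, the 10-degree tolerance, the 25% face-to-crop ratio, and both function names are assumptions made up for this example. It assumes a face detector and a sound source localizer already exist upstream and simply shows how their outputs might be combined: a sound only moves the camera if it lines up with a detected face, and a face (such as a portrait on the wall) is only framed if sound comes from its direction. The crop is then placed so the eye line sits one third of the way down.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Face:
    """Hypothetical output of an upstream face detector."""
    cx: float       # horizontal center of the face, in pixels
    eye_y: float    # vertical position of the eye line, in pixels
    height: float   # face height, in pixels
    bearing: float  # horizontal angle from the camera axis, in degrees


def pick_speaker(faces: List[Face], sound_bearing: float,
                 tolerance_deg: float = 10.0) -> Optional[Face]:
    """Return the face closest to the sound direction, or None.

    A cough or keyboard noise away from any face yields None, and a
    silent portrait is never selected because no sound points at it.
    """
    near = [f for f in faces if abs(f.bearing - sound_bearing) <= tolerance_deg]
    if not near:
        return None
    return min(near, key=lambda f: abs(f.bearing - sound_bearing))


def frame_rule_of_thirds(face: Face, frame_w: float, frame_h: float,
                         face_fraction: float = 0.25,
                         aspect: float = 16 / 9) -> Tuple[float, float, float, float]:
    """Return a crop (left, top, width, height) with the eye line in the top third."""
    # Size the crop so the face fills roughly `face_fraction` of its height.
    crop_h = min(face.height / face_fraction, frame_h)
    crop_w = min(crop_h * aspect, frame_w)
    crop_h = crop_w / aspect  # restore the aspect ratio if the width was clamped
    # Put the eye line one third of the way down and center horizontally.
    top = face.eye_y - crop_h / 3
    left = face.cx - crop_w / 2
    # Keep the crop inside the sensor frame.
    top = max(0.0, min(top, frame_h - crop_h))
    left = max(0.0, min(left, frame_w - crop_w))
    return left, top, crop_w, crop_h
```

In use, a caller would run `pick_speaker` on every localized sound and only call `frame_rule_of_thirds` when it returns a face; when it returns `None` the camera would hold its current shot.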
[00:03:47] Camille Morhardt: So going even a little bit more into that, you were talking previously about avatars.
[00:03:53] Neil Fluester: Mm. The thing I think is great about avatars is that video, for some people, can be a little bit threatening or a little bit concerning, so being able to create these avatars is awesome now, with the software that’s out there to create these virtual representations of oneself in meetings. Again, it really does lower the barrier for a lot of people. With COVID and lockdown, there was a massive increase in the use and adoption of video, not just in conferencing but in communication, whether we’re doing webinars, training, podcasts, YouTube, it just exploded. For us that do it on a regular basis, we kind of feel comfortable in front of the camera. We look at the camera rather than looking away and looking like we’ve got a lazy eye.
But for some people, again, the marketing person, the sales person, the finance person, they might not be so comfortable to look at a camera and to, again, be worried or slightly self-conscious so the fact that we can present them as an avatar and give them that kind of freedom and flexibility to not be nervous and not be concerned about what they look like in front of a camera.
[00:05:01] Camille Morhardt: So how about if I’m in a meeting with an avatar, how do I know that I’m actually talking with the right person? Is there some kind of verification mechanism? There could be impersonation or malware, or it’s a synthetic human actually speaking with me. When I’m talking to AI, how do I know?
[00:05:18] Neil Fluester: How do you know it’s me?
[00:05:19] Camille Morhardt: How do I know it’s video, for that matter?
[00:05:23] Neil Fluester: Someone else could be me. Yeah, exactly, I could be anyone. “I’m Dave Smith, I’ve just come in to impersonate Neil today, he’s a bit busy.” The interesting thing around that, and again, we talked about this, is around a lot of the technology now. Microsoft actually announced it in the latest Teams release that they’re now doing facial recognition and name placing in calls. So when you are looking at a meeting room, look down that long boardroom, bowling-alley-style table that all the companies generally tend to have. But being able to see all those faces down the room, but not just the faces or avatars, but to actually have a name tag on each one of them. Because what the Teams or Zoom or whatever platform meeting will do is have the name of that room, meeting room one. It won’t have Dave, Bob, Neil, Camille as the different people down the room, and that’s one of the things that now we’re starting to move into.
It opens a whole can of worms around privacy and, for us in Europe, around GDPR and things like that. But the idea now that we can have a room that’s full of people but able to individually identify those people so that you can actually then address people personally. So when the person three down on the right says, “I have a question. What did we think about if we did this for our Q3?” We can say, “That’s a really good idea, Dave, let’s make sure we put that onto the record. We’ll make a note of that and it can be tracked.” That’s the interesting dynamic that’s really coming into the meeting room now is that identification facially. Again, even if you’ve got an avatar that’s being presented, hopefully the facial recognition is good enough that we can put the right name against the right face and we won’t get people impostering.
[00:07:00] Camille Morhardt: So does that back people into a corner if they’re not interested in having facial recognition, or do they use some sort of an avatar that’s not actually their own self and then that’s what the system learns?
[00:07:13] Neil Fluester: Wasn’t there that funny case in North America of the cat? “I’m not a cat.” Where the-
[00:07:18] Camille Morhardt: Oh yeah, that’s very funny, that’s very funny.
[00:07:20] Neil Fluester: … where the kids are dipping, playing on. I think we could get all sorts of cats, and tigers, and dogs, and whatever joining meetings. We don’t live to work, we work to live, and work’s supposed to be fun most of the time so, again, there’s certain times when meetings can be fun. Maybe not court cases when you’re pretending to be a cat, but I can see that in certain circumstances and certain environments it could be quite good fun to actually portray yourself as something else. Whatever mood you’re feeling today, you could then portray yourself as that avatar.
[00:07:50] Camille Morhardt: But then could the AI actually recognize that it’s you? Would it pivot to voice then to make a recognition or what about the opt-out kind of—
[00:07:58] Neil Fluester: Well, yeah. Again, as I mentioned a minute ago, I think there’s some real challenges in certain countries around privacy rules. There are certain countries where you can’t be defined and there are certain rules, labor rules, and also union rules about being identified as being present in a meeting or how much you are inputting into a meeting. Again, the technology is there today, it’s actually been there for a long time. I remember demonstrating a solution probably five or six years ago that automatically could identify you and put a lower third with your name down the bottom. It would scan your active directory photo and put your name down the bottom. It is great and everyone was like, “Oh, wow, we really want this.” But, then again, there are certain jurisdictions, certain places where you’re like, “I don’t want that.” So the ability to, again, yes absolutely, opt in and opt out.
I guess it’s the same with recording. I’m watching and there’s a flashing red recording thing at the top of this call that we’re doing. There has to be that point where you are made aware that you are being recorded, so you can feel safe in what you’re saying and obviously be a little bit more cautious. I think the same is true for a live meeting. You want to be able to say, “Okay, we’re going to turn this on, we’re going to use this, so we all need to opt in or opt out of that.”
[00:09:13] Camille Morhardt: So what about once the software can identify a person, then you also mentioned this capability to filter by individual input. What you just mentioned is, “Well, I could actually tell you how much Camille versus Neil spoke during this meeting we know.”
[00:09:28] Neil Fluester: Oh, hopefully it’s 50/50.
[00:09:31] Camille Morhardt: Besides that, if I am not interested in listening to the hour long recording of the meeting that I missed, I could filter on just what my boss said. That’s where I’m interested in tuning in. And so, questions then come up for me about what does that mean then for, let’s say in this case the boss, who maybe is comfortable being recorded on an hour long meeting with our old standards of it would take somebody a lot of time to then go through that meeting and record only what that boss was saying versus… A bit out of context, right?
[00:10:14] Neil Fluester: You have to listen to it, it’s like a tape. You play it from the beginning, and it’s not non-linear so that you can kind of jump. And that’s where we’re getting to a non-linear form. Some smart speakers have been around for a year or two that have kind of talked about doing this stuff. And again, the major collaboration platforms are now looking at this, but actually some of the video cameras, microphones and speakers are having this transcription capability in there. So the idea that you can actually transcribe this meeting, speech to text has got a lot, lot better than it used to be. I remember back in the day, Dragon Dictate, I think it was called, back in the sort of nineties on Windows 3.1, and you’d sit there and talk to it and it was about sort of 60 or 70% accurate, and you’d have to go and change it.
It’s pretty darn good now, so the idea is that you could actually have your collaboration meeting, your sales meeting, marketing meeting, whatever it might be, have that whole thing transcribed. But moreover, the more powerful thing, as you mentioned, is that I can now non-linearly go and delve into specific parts of those meetings. So I can create a word cloud and say, okay, when was the term X, Y, Z used? And I can then jump to those points, or when did that person speak? That’s the person, that’s the chunk of that hour meeting. I just want to listen to the five minutes that was on the Q4 numbers or the FY24 plan in that whole all-hands meeting. The rest of it, top and tail, is kind of irrelevant to me. That’s the power that this idea of non-linear transcription, searching of a meeting, can then bring to office users to make them more efficient. Layer on top of that a level of AI, and then, wow, we’re then really going to start to accelerate this stuff forward.
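The non-linear search described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Segment` structure, the function names, and the sample `MEETING` data are all hypothetical, and the speaker labels and timestamps are assumed to come from an upstream transcription and speaker-identification step that is not shown. Given such segments, filtering by speaker, jumping to where a term was used, and totalling how long each person spoke are all simple queries.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """One transcribed stretch of speech (hypothetical structure)."""
    start: float   # seconds from the start of the recording
    end: float
    speaker: str
    text: str


def by_speaker(segments: List[Segment], speaker: str) -> List[Segment]:
    """Only the parts where the given person spoke, e.g. just the boss."""
    return [s for s in segments if s.speaker == speaker]


def mentions(segments: List[Segment], term: str) -> List[Segment]:
    """Segments whose text contains the term, ignoring case."""
    needle = term.lower()
    return [s for s in segments if needle in s.text.lower()]


def talk_time(segments: List[Segment]) -> Dict[str, float]:
    """Total seconds spoken per person."""
    totals: Dict[str, float] = {}
    for s in segments:
        totals[s.speaker] = totals.get(s.speaker, 0.0) + (s.end - s.start)
    return totals


# Made-up sample data, for illustration only.
MEETING = [
    Segment(0.0, 40.0, "Camille", "Welcome everyone, let's start with the agenda."),
    Segment(40.0, 95.0, "Neil", "Here is the roadmap update."),
    Segment(95.0, 155.0, "Camille", "Now the Q4 numbers."),
    Segment(155.0, 200.0, "Neil", "And the FY24 plan follows from the Q4 numbers."),
]

if __name__ == "__main__":
    # Jump points for a term: the start times a player would seek to.
    print([s.start for s in mentions(MEETING, "Q4")])
    print(talk_time(MEETING))
```

A player built on top of this would seek to each returned `start` time instead of playing the recording from the beginning. The same per-speaker totals are what raise the privacy and labor-rule questions discussed earlier in the conversation, so opt-in would need to be handled before any of this runs.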
[00:11:46] Camille Morhardt: It sounds like there’s quite a bit of convergence in the direction that this is heading. Is there any divergence? Are we seeing anything where one company’s going off to the left, one company’s going to the right, totally different opinions on how to move forward?
[00:12:00] Neil Fluester: There are these general trends, and again, there are several major players in the collaboration space. I think everybody knows them, but they generally tend to be following the curve, following the puck when it comes to this kind of technology. Again, we had things like, okay, background blur was the first thing. Then it was the kind of auditorium views, these different layouts. Now, as I say, the avatars and then the transcription. I think one thing that’s interesting, though: when we talk about video conferencing, it conjures up this idea of like, I’ve got meeting room A talking to meeting room B, and they’re having a sales meeting, a marketing meeting, or an all-hands meeting, whatever it might be. But there are some specific applications that video conferencing could be very powerful for: video interviewing, video consultation for medical, webinars, podcasts, training recordings now, and content creation where you can use this equipment.
And I think there are certain companies in the collaboration space that are having some quite specific application-based focus. So they’re creating not just, you can do a video call from A to B or A to B to C to D with people working from home or in an office. But also you can create this as a video studio. You can create this as a video interviewing platform. And again, all these different vertical applications, which can be quite niche, but where you need some specific requirements around the technology or the platform to be able to deliver that. And again, I guess the divergence would be, as I say, those vertical-specific applications that some of the collaboration companies focus on. But again, I don’t want to get in trouble if I talk about which one does which. They’re all great. Switzerland.
[00:13:39] Camille Morhardt: Yeah, no, let’s not play favorites. But you are seeing there may be a customization for a medical video recording versus a different kind for whatever reason.
[00:13:48] Neil Fluester: Absolutely. And I think video consultation was a huge thing through COVID, unfortunately, the technology wasn’t quite there. But now the adoption, from a user point of view and capabilities from a platform point of view, and then from a network point of view as well. I think it really is at a kind of tipping point where this stuff can be way more practical to be able to be deployed and used and people are more comfortable doing it.
I think the other interesting one that I saw recently is around things like contact center. I mean, contact center is a huge growth area, but the idea that you used to ring up a contact center and say, “I’m having a problem with my X, it’s this bit here and this bit’s broken, and what do I do with my router?” And they’d say, “Well, have you tried unplugging this and plugging it back in again?” If you could get your mobile phone or cell phone and go, “It looks like this,” and show them a video. You ring up your medical practitioner and you’re trying to describe, “Yeah, I’ve got this rash on my arm. It’s a slightly reddy-pinky kind of color.” If you could go, “It looks like this.” To have that visual piece and bringing, again, video into those specific applications or for a contact center, I think is going to be hugely powerful moving forward.
[00:15:00] Camille Morhardt: Advancing video to the point where we’re in the room essentially with a person, is there any downside? Are people talking about any kind of downside, or is it just pure, “Yay, I don’t have to sit in traffic anymore, ever again”?
[00:15:13] Neil Fluester: I mean, I’ve been in the video conferencing business for, oh goodness, about 20 odd years now. There’s always been this, “Oh, video conferencing can replace travel, can reduce all your costs. You’re going to save all this carbon, save all these airfares by doing it.” But we, as human beings, we need connections. We need those physical connections. And I think it’s always, always, and again, I’ll get in trouble from my company for saying this, but the face-to-face meeting is still, as a human, what we crave for, we crave that human interaction. But I think there is now the ability for video to supplement that or enhance that. Yes, the practicalities of commuting to an office, and companies are now having to earn the commute of their employees by making the office a more useful, interesting place for them to go to, because they’ve got these great home environments, home studios set up, great cameras, great audio, lighting and all the rest of the great stuff. So why do I need to come into the meeting for the face-to-face?
So I think there’s always going to be that concept but video is never ever going to replace it. Even if we get to putting VR goggles on and having virtual meetings around a table, I still want to go and meet people and go to lunch, have a beer, have a joke, meet around the coffee machine, and have that real human interaction. But when I then don’t travel to North America or to Europe or wherever to see these people, these connections that I have, video can be a great enhancement of that to then keep that conversation going, to give us our actions and follow-ups in between those face-to-face interactions.
[00:16:49] Camille Morhardt: So, Neil, you do Crest TV and just tell us a little bit about your perspective on how to do video podcasts or whatever we call them, I don’t know, video interviews? Video TV, that’s redundant.
[00:17:02] Neil Fluester: Again, I started doing weekly live video podcasts as lockdown started because it was really hard to reach our customers at the company I was working for. We used to go to the nearest Marriott and we’d have a partner event or a customer event, and they’d all come in and we’d talk to them and present to them with PowerPoint. Then if everyone’s locked down and can’t travel and can’t move, how do we reach them with our message and our content? So again, creating that video piece was something I then sort of looked into.
For me, I made a lot of YouTube videos around our products, and I like to be more real than produced, and I like to be more natural than scripted. I didn’t quite know what you’re going to ask me today. We’ve gone off at all sorts of crazy tangents, and when I have my guests on, we invite them in and they say, “Oh, what are you going to ask me? Can you give me a kind of bulleted list or a bunch of questions?” And it’s like, “We’re just going to have a chat and I’m going to tell you that I’m going to press record. I won’t surprise you. We’re not going to say anything bad about anybody. We’re going to make everybody look good and promote everybody.”
It’s that thank you economy. It’s the idea to kind of provide free information and free knowledge and learning, and by proxy and by thanks, you’ll then maybe buy some products. But it’s not for us to sit there and go, this product’s amazing. This is brilliant. Look at our new shiny X, Y, and Z. I’d love to bring in the thought leaders, the factual people, the experts in certain fields to say, okay, A, because I love finding out this information. So it’s more for me, really. But then if other people want to come along and learn stuff as well, then that’s a great thing as well. So yeah, we do it every Thursday, every week on YouTube and on our website. And yeah, it’s just a conversation and a chat about, could be anything generally, unified communications, AV and the industry that we’re in, but we’ve had some interesting ones over the years.
[00:18:56] Camille Morhardt: Yeah, and you’ll grab some people off a show floor or something and just kind of talk with people at different companies about the latest things that they’re doing and working on, what is the newest thing that’s out there in this field?
[00:19:08] Neil Fluester: In most cases, it’s our partners that we’ll go and promote and go and talk about, because again, that’s part of the partnership that we build with them. It’s the one plus one equals two or three or four, whatever you want to say about the partnerships, but the “better together” theory that us and them can create this capability or solution to solve customers’ problems is really great. And again, I love technology. I’m a geek. I’m a nerd. I love all the latest kit and toys and stuff. So again, it is me going around like a magpie going, “Ooh, shiny.” Looking at stuff.
[00:19:38] Camille Morhardt: I like that. Neil Fluester from Crestron and also host of Crest TV. Thanks so much for joining us today.
[00:19:45] Neil Fluester: Thanks, Camille, for having me.