InTechnology Podcast

The Future of Enterprise AI with SambaNova: Bigger Models, Smaller Hardware Footprint (211)

In this episode of InTechnology, Camille digs into SambaNova’s enterprise-grade AI solutions with co-host Stephanie Cope, Portfolio Development Manager at Intel Capital, and guest Rodrigo Liang, Co-Founder and CEO at SambaNova Systems. The conversation covers SambaNova’s full-stack AI solutions, scaling AI at the enterprise level, and the future expansion of AI models into more languages.

Try out Samba-1 here.

Learn more about Intel Capital here.

To find the transcription of this podcast, scroll to the bottom of the page.

To find more episodes of InTechnology, visit our homepage. To read more about cybersecurity, sustainability, and technology topics, visit our blog.

The views and opinions expressed are those of the guests and author and do not necessarily reflect the official policy or position of Intel Corporation.

Follow our host Camille @morhardt.

Learn more about Intel Cybersecurity and the Intel Compute Life Cycle (CLA).

SambaNova’s Full-Stack AI Solutions

Rodrigo talks Stephanie and Camille through the founding and mission of SambaNova. He founded the company in 2017 with two Stanford University professors, focusing on the entire stack: building everything from models down to chips to give enterprises a better computing platform for training ever-larger AI models on their private data and running inference at scale, which the company has now achieved. Rodrigo emphasizes that it’s thanks to many investments, including from Intel Capital, that SambaNova has been able to build its chips, servers, compiler stack, models, and the trillion-parameter LLM known as Samba-1. The company can now offer significantly lower cost and higher performance at the chip level, and it democratizes AI by letting customers run large models on fewer chips with easier deployment. Stephanie echoes Intel Capital’s reason for investing in SambaNova, noting how impressed the team was with the company’s holistic approach: building a custom, purpose-built AI accelerator while also offering it as a service to enterprise customers.

Taking a closer look at the Samba-1 model, Rodrigo details how it keeps up with new generations of models and how it composes other open-source models. SambaNova is able to keep up because it ships a rack of hardware and software that includes the Samba-1 model. While generative AI models can cost hundreds of millions of dollars to train, SambaNova leans into the open-source community, building Samba-1 from the best open-source models to give customers a more cost-efficient solution. Samba-1 is currently a composition of over 90 experts, or pre-trained open-source models, which leads to more efficient operation, better response times, and faster model improvement.

Scaling AI at the Enterprise Level

Next, the trio dives into implementing and scaling AI at the enterprise level with SambaNova. Rodrigo shares that the company’s focus is Global 2000 companies, or any company with private data that drives its business and could be trained into a gen AI model. While an enterprise could invest in the hardware, time, and people to train models itself, Samba-1 provides the same level of results and can be up and running in only a few days. A common challenge Rodrigo says enterprises face in the AI landscape right now is wanting to take advantage of open-source models by training their private data into them while still retaining ownership of that data. That’s where Samba-1 comes in: the models are based on open source but come pre-trained and pre-optimized, and customers can then fine-tune them on their private data and own the result in perpetuity.

What makes SambaNova’s solutions so scalable, says Rodrigo, is the composition of experts in Samba-1, which allows a bigger model to run on a smaller hardware footprint. SambaNova has created a hardware platform that hosts hundreds of models at the same time in the same memory footprint on the same device, swapping any one of them onto the compute substrate in a millisecond. In other words, it does what hundreds of GPUs could do with far fewer chips, while serving hundreds of users instead of one user at a time. Another benefit of Samba-1 is that it is rolled into a customer’s secure environment, enabling a ChatGPT-like gen AI experience that is much faster and privately run.

AI Models and the Power of Language

Finally, Rodrigo touches on the future of gen AI and LLMs with regard to language. While many LLMs are trained primarily on English today, he underscores that AI will not be a technology only for English speakers, pointing to SambaNova’s work on Hungarian, Japanese, and Thai for its international customers. Rodrigo believes the future of AI and commerce will be powered by language. To achieve widespread adoption of gen AI, he outlines how models need to be segmented and regionalized for different languages and cultural customs.

Stephanie Cope, Portfolio Development Manager at Intel Capital


Stephanie has been with Intel Capital since 2022, where she now helps accelerate product development and market adoption of Intel Capital’s portfolio companies like SambaNova. She previously led Strategic Tech Innovation in the Health and Life Sciences division at Intel. Stephanie was also a Yield Engineering Manager and Yield Engineer at Intel, as well as a Regional Manager and Application Physicist at Wyatt Technology and a Research Associate at Arizona State University. She has a Ph.D. in Physics from Arizona State University.

Rodrigo Liang, Co-Founder & CEO at SambaNova Systems


Rodrigo has been Co-Founder and CEO at SambaNova since 2017. Prior to founding SambaNova, he served as Senior Vice President at Oracle, where he led teams in designing microprocessors and ASICs for the Sun product line, and as Vice President at Sun Microsystems. He has a Master’s in Electrical Engineering from Stanford University.


(00:00) Announcer: You’re listening to InTechnology podcast. This episode is one in a series with innovative companies that are part of the Intel Capital portfolio. In these conversations we’ll explore the key areas of technology Intel Capital invests in to shape the future of compute.

(00:24) Camille Morhardt: Welcome to the InTechnology podcast Intel Capital Series. I’m your host, Camille Morhardt, and today I will be co-hosting a conversation with an AI-focused Intel Capital portfolio company alongside Stephanie Cope. Stephanie is Portfolio Development Manager at Intel Capital, which she joined in 2022 from Intel’s Health and Life Sciences division, where she was driving strategic innovation. She is now responsible for accelerating product development and market adoption of Intel Capital portfolio companies. Welcome to the podcast, Stephanie. I look forward to co-hosting with you.

(01:00) Stephanie Cope: Thanks, Camille. Thank you so much for having Intel Capital on the InTechnology podcast. I’m honored to be here and even more honored to introduce Rodrigo Liang, CEO and co-founder of SambaNova Systems. Rodrigo has been a luminary in high-performance computing for nearly 30 years. He spent the majority of his career building full stack hardware and software offerings, specifically for the enterprise. And before joining SambaNova and starting SambaNova, Rodrigo was leading the teams responsible for designing microprocessors and ASICs over at Oracle for their Sun product line, which at the time was one of the largest engineering organizations in the industry.

At that time, with Rodrigo at the helm, he was continuously breaking world records for performance while at the same time building enterprise-grade applications. For those of you in the HPC world, you can appreciate what a unique combination that is. Today, he’s breaking similar performance barriers at SambaNova while making the latest advances in GenAI accessible to the enterprise.

So with that, Rodrigo, maybe I can hand it over to you to share the story of SambaNova in your own words. Maybe you can take us back to 2017 and share why you founded SambaNova.

(02:15) Rodrigo Liang: Well, Camille, Stephanie, thanks for having me. It’s great to be here. As you mentioned, we’re a grounds-up AI company. We’re thinking about how to build a full stack to help enterprises transition from this world that’s pre-AI to post. So back in 2017, I started this company with two Stanford professors, really thinking about the entire stack. How do we build from models all the way down to chips so that you can get a significantly better computing platform that allows you to train, allows you to inference, allows you to do everything you need to do at the enterprise level as you run these models bigger and bigger and bigger and you want to train them with your private data and you want to inference them at scale? So that’s really what we’re focused on at SambaNova. We’re out there building these platforms for enterprises to start on their journey with GenAI.

(03:10) Camille Morhardt: You’ve raised over $1.13 billion in capital, just to get a sense of how many people think this is a good idea.

(03:21) Rodrigo Liang: It’s a capital-intensive endeavor. We build chips. We build the servers. We build the compiler stack. We build the models. We build the trillion-parameter LLM that companies can take and own, called Samba-1. Taking on the full stack is an extremely ambitious endeavor. Yet, we’re out there shipping. We’re on three continents. We’re on a fourth-generation chip. We’ve got multiple versions of the software stack that have been released over the years. We’ve got Samba-1, which we announced early this year and is shipping today. We are the best-funded chip startup in the world, and Intel Capital came in very early. We’re really proud of the collection of folks that we brought together. If you’re taking on an endeavor like this, you’ve got to come in with deep pockets and continue to invest and continue to innovate, and we’re excited to be in the game.

(04:08) Camille Morhardt: Well, here’s a question. Why in the world, you’re taking on…? On the one hand, it’s great. You’re putting together this comprehensive solution all the way from chip design–where you can optimize at every layer– through plug and play at companies, walking companies through it. On the other hand, you’re now taking on essentially any kind of competitor any way across that stack. So why is it working, I guess? What are you doing right? Why are people starting to see that shift right now?

(04:39) Rodrigo Liang: Well, I think if you look at the demand that exists on the market, there’s just an incredible demand for AI computing, and so you have other hyperscalers also starting to offer their own platforms. Of course, Intel and AMD have their platforms. So multiple people are trying to get into that computing space.

When we started SambaNova back in 2017, we started by thinking that, certainly, the computing platform is important. You’ve got to find a way to significantly reduce the cost and improve the performance of these devices so you can change the trajectory we’re on. If you look at the Nvidia chips, they’re pretty aggressive about driving new chips out, and they’re driving performance and driving the size of these chips, but ultimately the cost is also going up, and the supply chain and availability are hard. So, a lot of people are starting to realize they don’t want to be dependent on a single supplier for those reasons.

But more than just the chip itself, I think what’s becoming a realization, which we saw fairly early on, is finding the talent to train the models on these chips is the next big challenge. If you’re a hyperscaler, if you’re a Google or an Amazon or a Meta, maybe Apple, you can hire thousands of machine learning PhDs to help you train these models. But if you’re not one of those–and that’s 99% of the companies in the world–well, what do you do with these chips? Even if you fight hard and you find these chips, then what do you do with them? How do you train them? How do you inference them? How do you get your data piped into these models so you can create some differentiated value? Those are really hard.

So, we realized that unless we unlock that, GenAI will be a technology that’s only accessible to the top handful of companies. That’s really what SambaNova has been focused on: how do I democratize this, not just by making chips significantly better so you can run these models with fewer chips at high performance, but also by making them easier to deploy?

(06:50) Stephanie Cope: Rodrigo, if you don’t mind, I’d love to interrupt you for a second because I think you’re really getting at the heart of why Intel Capital invested in SambaNova. There were two core components of our investment thesis: the market at the time and, of course, this differentiated company that you’ve built. You know, there’s a huge compute deficit in AI, and if you look back a decade-plus ago to the deep learning era, there was an exponentially growing gap between our appetite for AI compute and what Moore’s Law could deliver. At the time, I think most of the market was addressing this problem through purpose-built processors. There’s one big challenge in that, and that is the programmability of these custom-built ASICs.

So at the time, Intel Capital had a specific investment thesis around how to make AI more accessible to the practitioner. And SambaNova was directly addressing that with their holistic approach, both being able to build a custom purpose-built AI accelerator and also at the same time offering it as a service to their end customers and users. And so we led their Series B back in 2019.  Anthony Lin, who is head of Intel Capital and corporate VP at Intel, joined the board, and ever since then it’s been a great partnership between Intel Capital and SambaNova. And I think we share that unified vision.

But going back to SambaNova, Rodrigo, perhaps you can give us an update on Samba-1–what you’re bringing to market with this new product.

(08:22) Camille Morhardt: Then can you also, as part of that, explain how in the world you keep up with the pace of— you know, there’s a new generation all the time from multiple different companies. How do you keep up with that?

(08:34) Rodrigo Liang: Yeah, yeah, that’s the crazy thing. New models are showing up every month. Sometimes it feels even faster than that, and it’s changing really fast. Just when you thought you had your Llama 2 figured out, Mistral shows up. Just when you’re starting to get your Mistral up to date, Llama 3 shows up. That’s one of the real advantages of Samba-1.

And so, if you look at what SambaNova sells, we sell a rack with our hardware, our chips, systems, and software, all the way up to a trillion-parameter model that looks like a GPT-4 model but is already optimized and embedded into the hardware. So our bookends are chips that compete with Nvidia on one end and a trillion-parameter model on the other end that you can chat to, call through an API, and train your private data into.

When you’re talking about Samba-1, that’s on the software stack side. Look at what people are doing out there: Sam Altman talked about GPT-4 costing a hundred million dollars to train. You see Meta, even the Llama 3 models cost them $100 million to train. So how many companies can go and train their own models? It’s just very, very difficult. So what we decided here is, look, we believe this is a Linux moment for AI, and you’re going to lean hard into the open-source community. Open source is innovating extremely quickly. It’s not just the small suppliers and the small innovators; Meta has open-sourced Llama 3. Even OpenAI is open-sourcing GPT-3.5. Significant players are open-sourcing models that are becoming part of the open-source community and part of the workflow.

So with Samba-1, instead of pre-training our own model and having to keep up with everybody, what we decided to do is ask: how do I create a modular platform that allows me to integrate the best open-source models and allow them to interact together?

(10:25) Stephanie Cope:  Could you elaborate a bit on this interaction, because I think you’re really getting at the heart of why SambaNova is really a game-changing technology.

(10:34) Rodrigo Liang: So if you look at Llama 3, it’s a great individual model. The Llama 70B is a really good model, but it can really only be good at certain things. Then as you go broader and broader in tasks, it doesn’t perform as well at everything.

But if I can give you 20 copies of Llama 3 all fine-tuned for different tasks, suddenly it’s actually an incredibly good collection of Llama 3 models that you can then use as a single composite. So we call Samba-1 a composition of experts. We’re able to take Llama 3, Llama 2, Mistral, Falcon, we can take Bloom, we can take a variety of open-source models… In fact, we just go out to Hugging Face, and we find the best checkpoints for various different tasks, and then we bring and integrate them into Samba-1 as a collection. Today, Samba-1 is already at over 90 experts that are all concurrently running as a single model. You can just chat to it, you can prompt it, you can API to it, and have the experience of a very broad model, but it’s all leveraging open-source innovation.

So, it allows us to be significantly more efficient in operating that model, because we can activate a sliver of the model instead of the entire monolithic model. Even if you just prompt a hello–if you just say “hello”–in these large monolithic models, all 1.8 trillion parameters have to get activated. If we’re a composite, we bring in only a portion of the model. So, you’re significantly more efficient, you’re much faster in your response time, and ultimately we can also improve the model significantly faster, because we can incrementally add more experts, incrementally add the latest innovation that just showed up this week, without having to incur the cost of retraining the model, which most of the monolithic players have to do.
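
To picture the composition-of-experts idea Rodrigo describes, here is a minimal, hypothetical Python sketch: a classifier routes each prompt to one fine-tuned expert instead of activating a monolithic model. The expert names, routing rule, and generate functions are illustrative stand-ins, not SambaNova’s implementation.

```python
# Hypothetical sketch of a "composition of experts": route each prompt to one
# fine-tuned expert instead of activating a single monolithic model.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Expert:
    name: str
    generate: Callable[[str], str]  # stand-in for a fine-tuned open-source checkpoint

def route(prompt: str, experts: Dict[str, Expert], classify: Callable[[str], str]) -> str:
    """Pick one expert for the prompt; only that expert's parameters would run."""
    task = classify(prompt)                        # e.g. "sql", "legal", "hungarian"
    expert = experts.get(task, experts["general"])
    return expert.generate(prompt)

# Toy usage with placeholder experts and a trivial keyword "router".
experts = {
    "general": Expert("llama3-general", lambda p: f"[general] {p}"),
    "sql":     Expert("llama3-text-to-sql", lambda p: f"[sql] {p}"),
}
print(route("Turn this question into SQL: top 10 SKUs by revenue", experts,
            classify=lambda p: "sql" if "SQL" in p else "general"))
```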

(12:24) Stephanie Cope: Rodrigo, could you describe for us your typical customers, the typical industries that you’re going after, and what the main use cases are that they’re driving and trying to bring into production?

(12:36) Rodrigo Liang: We’re focused on the Global 2000, so any company that has private data they use and leverage to drive their business is going to want to train that data into an AI model. We’ve already seen the power and capabilities of what these models do, and they’ll become everyday tools for every person in the company. So, if you’re already using your data to drive your business so you can provide better products, better services, better understanding of your market, and compete better in your space, then you want to figure out a way to train that data into a model that everybody can use. For that, you have two choices. You can do DIY, go buy the GPUs, hire a couple hundred machine learning people, and train it yourself, or Samba-1 comes in and you’re up and running within days. So that’s really what we do.

Then you can do things as simple as this: I have a team of salespeople out there who don’t understand the 200,000 SKUs our company offers and all the detailed specs. I can just ask Samba-1, and it will find the exact part my customer wants, making a better connection between the customer’s need and the offerings we have. It could be something as simple as the customer call center: a customer has a particular issue, and instead of routing them through four or five or six different people and still having half the calls never resolved, maybe just from a chat interface you get to the exact issue and the exact thing the customer needs. A lot of these use cases ultimately get you a much higher-accuracy response at much lower latency, connecting you to the end result your business wants at an order-of-magnitude faster pace and with a much higher conversion rate.

We do think that in this next phase, especially as you train these large, large models and deploy them in production, you do need an alternative to Nvidia, because you’ve got to pre-train, you’ve got to fine-tune, and you’ve got to inference. Today, we inference Samba-1 at 1,000 tokens per second. In comparison, GPT-4o on Nvidia chips is inferencing at 50 tokens per second. So you have to be able to train, you have to be able to inference, and you have to inference really, really fast. Why? Because the faster you go, the fewer chips you need to serve a population. If you’re going slower, then in order to maintain throughput you have to replicate many, many servers, and that’s where the cost starts to explode. We run it really, really fast, and we run it all multi-tenant, which allows us to host many, many models concurrently.
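
A rough back-of-the-envelope calculation makes the throughput point concrete. It uses the 1,000 and 50 tokens-per-second figures quoted above; the user count, request rate, and response length are assumed numbers chosen purely for illustration.

```python
# Back-of-the-envelope sizing using the tokens-per-second figures quoted above.
# The user count, request rate, and response length are assumptions for illustration.
import math

def devices_needed(users: int, tokens_per_response: int,
                   requests_per_user_per_minute: float,
                   tokens_per_second_per_device: float) -> int:
    demand = users * tokens_per_response * requests_per_user_per_minute / 60  # tokens/sec
    return math.ceil(demand / tokens_per_second_per_device)

# 10,000 users, 500-token responses, 2 requests per user per minute (assumed):
print(devices_needed(10_000, 500, 2, 1_000))  # 167 devices at 1,000 tokens/sec
print(devices_needed(10_000, 500, 2, 50))     # 3,334 devices at 50 tokens/sec
```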

So, in part we view ourselves as an alternative to Nvidia from that computing perspective. The other part is this: OpenAI has done a tremendous job of giving everyone, consumers and enterprises, a great option for how to use AI without having to pre-train it yourself. Now, there are a couple of challenges with OpenAI, especially in the enterprise. One is that most enterprises ultimately want to train their private data into the model. If you’re a bank, a government, or a healthcare company, you have restrictions, either due to regulation or just because it’s your IP; you don’t want to disclose that data into somebody else’s model.

That’s one thing we’re finding the world wants: to train their private data into a model. But once they’ve done that, they want to retain full ownership of it. That’s one thing we serve with Samba-1. Because we’re based on open source, we give you these pre-trained, pre-optimized models, already running at 1,000 tokens per second, and you can then fine-tune them on your private data and own them in perpetuity. Customers are looking for data privacy and data security, they’re looking for customization, and ultimately they’re looking for more choice: more choice in how to operate their AI cheaper, faster, and better. Having more options for what hardware they use ultimately drives better efficiency and better innovation.
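
The ownership pattern described here (take an open-source checkpoint, fine-tune it on private data inside your own environment, and keep the resulting weights) can be sketched generically. The example below uses the Hugging Face Transformers library rather than SambaNova’s tooling, and the model name and dataset path are placeholders.

```python
# Generic sketch of fine-tuning an open-source checkpoint on private data you keep.
# Uses Hugging Face Transformers as an example, NOT SambaNova's stack; the base
# model and "private_docs.jsonl" are placeholders.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

base = "meta-llama/Meta-Llama-3-8B"   # any open checkpoint you are licensed to use
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Your own data, one {"text": ...} record per line.
data = load_dataset("json", data_files="private_docs.jsonl")["train"]
data = data.map(lambda r: tokenizer(r["text"], truncation=True, max_length=1024))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="my-private-expert",
                           num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("my-private-expert")   # the fine-tuned weights stay inside your environment
```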

(16:39) Camille Morhardt: Could you expand just a little bit on bigger model, smaller footprint when you’re talking about hardware?

(16:45) Rodrigo Liang: Yeah, so the construct of this is ultimately around building full-stack technology: you have to understand what models you’re trying to run, and then you build the chips that empower that. If you look at any given model you’re inferencing today, the utilization on the GPUs is extremely low. Why? Because when you and I are talking to ChatGPT, we’ll ask a prompt, “Give me three recommendations for a restaurant in Napa.” Boom, three, four seconds, it’ll give you the recommendation. Well, we read it. We’re chatting with our friends. We’re looking at the nice weather outside. That whole time that we’re not interacting with ChatGPT, those GPUs are live and waiting for our next prompt. Why? Because we’re an impatient culture. As soon as I type the next thing, I don’t have the patience to wait 30 seconds for it to come back. I want an instant response.

That’s what you’re seeing today: the GPU utilization is actually very low because it’s sitting there as a single tenant waiting for that user to come back and interact with it, even though it was able to respond to the first prompt in three seconds. So, in production, what you’re starting to see is that’s just not scalable. This idea that every user gets dedicated hardware and that hardware is sitting there waiting for you is just not a scalable model.

What you really want to do is what we’ve already done with Intel hardware using VMware and virtualization over the last 20 years: thousands of users on a virtualized platform, doing what they want to do on shared hardware–you can just spin up a VM, you’re able to use the same hardware, you’re getting the economies of scale, and then you see the cost of the infrastructure collapse.

So that’s what we did with SambaNova. We started thinking about, well, really what you want is what we call composition of experts. We actually created a hardware platform that allows me to host 500 models concurrently in the same memory footprint of the same hardware device, and we’re able to swap in and out that model onto the substrate in a millisecond. So, I can call this model, the next user can call a different model, and it comes into the hardware sequentially or in some cases concurrently, and you can swap in and out in a millisecond and produce a thousand tokens per second.

That’s really our construct: we created a hardware platform with a certain LLM architecture in mind, which is now reflected in Samba-1. Now I can put hundreds of open-source checkpoints on Samba-1. I can have hundreds if not thousands of users come in concurrently. They can all pick different models they want to use. It gets loaded into hardware. Then as soon as you’re done, we swap it out, and the next user comes in and can use a different model, inference on that model or fine-tune that model, and it shares the same hardware. This is where you take what otherwise would’ve cost, say, 200 GPUs and collapse it to eight, because I can multi-tenant and host hundreds of users instead of having to do a single user at a time.
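
A toy Python sketch of the multi-tenant swapping Rodrigo describes: many checkpoints share one device, and a simple scheduler loads whichever model each incoming request needs. This is a conceptual illustration only; the class, the queue, and the swap logic are stand-ins, not SambaNova’s runtime or scheduler.

```python
# Conceptual sketch of multi-tenant model swapping: many checkpoints share one
# device, and whichever model the next request needs is swapped onto it.
# Illustrative only, not SambaNova's runtime or scheduler.
from collections import deque

class SharedDevice:
    def __init__(self, checkpoints: dict):
        self.checkpoints = checkpoints  # name -> weights held in fast memory
        self.loaded = None              # model currently mapped onto the compute substrate

    def swap_in(self, name: str):
        if self.loaded != name:         # a millisecond-scale swap in the system described above
            self.loaded = name

    def serve(self, user: str, model_name: str, prompt: str) -> str:
        self.swap_in(model_name)
        return f"{user}: '{prompt}' answered by {model_name}"

device = SharedDevice({"llama3-legal": ..., "mistral-sql": ..., "llama3-hu": ...})
requests = deque([
    ("alice", "llama3-legal", "Compare contracts A and B"),
    ("bob",   "mistral-sql",  "Top 10 SKUs by revenue"),
    ("carol", "llama3-hu",    "Summarize this Hungarian filing"),
])
while requests:                          # users share the same hardware instead of one GPU each
    print(device.serve(*requests.popleft()))
```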

(19:49) Camille Morhardt: Makes sense. I know that your customers can work from either public cloud or on-prem if they want to keep the servers on-prem. Can you give us an idea of an application or use case that an enterprise can do right now, maybe with SambaNova or just with LLMs or GenAI in general, that you think is timely and works?

(20:14) Rodrigo Liang: Use case number one for enterprises, every enterprise we talk to, is all this data you’ve accumulated over decades that you don’t know what it says. ChatGPT has given everybody a sense of how you unlock the world’s data so you can interrogate it for whatever you want and produce very pertinent generated content based on public data. Now, how do I do the same for private data? How do I generate contracts in the way that Intel generates contracts, for example? How do I create offer letters in the way that we generate offer letters? How do I create customized product specs in the way that we always generate product specs? How do I generate FAQs just by understanding the most popular things that come in on the chat interface and producing intelligent responses to those FAQs?

So, think about all these generative things that we do by hand today, with humans trying to read all this technical documentation and translate it; those can be generated in three seconds by GenAI, just by understanding the information you already have–all the internal documents, PDFs, Docs, chats, Slack channels–and saying, “Give me the 10 steps for how I operate Samba-1” or “how to install X, Y, Z.” Then you can look at it and evaluate it against what you would’ve generated yourself.

We have this playground called fast.snova.ai, a free, open environment where your listeners can go and just try it. It’s a subset of Samba-1. We’ve got about 16 to 20 experts on there, various multilingual ones. You’ve got Llama 3. We’ve got Mistral in there. You can ask something like, “Give me the top four key differences between the US and French constitutions.” This is not something that has already been generated by somebody sitting there for you to go fetch. Now, I do that comparison because that’s a very common thing that people do internally with their private data: “Give me the top four differences between the contract I signed with Customer A and the contract I signed with Customer B.” “Give me the top five issues I might have with the latest GDPR rules that came down.” “What are the top differences between, I don’t know, the export control laws in Thailand and the export control laws in the Philippines?” That’s something that would take somebody some work. It takes three seconds to generate that through Samba-1. So you can ask that question.

Now again, these are very similar types of use cases, where it allows you, in a very short amount of time, to get to the highlights of what you need to know. And so, ultimately, what we say is, “Look, the productivity of every knowledge worker in companies will jump 10x over the next 10 years.” Why? Because you can see this. What would’ve taken you two days, maybe a week, to research, you can get in three seconds. And just by deploying Samba-1, we can start reading all your private data and creating that knowledge base across your entire business–for your HR, for your finance, for your legal, for your manufacturing, for your supply chain. Every single one gets a separate expert that has its own secure access control. Then you can make that available to all your people.

So that’s really the way we think AI is going to work: the use case is to unlock the knowledge you already have and make it accessible to everybody, because most people have already tried OpenAI and ChatGPT and they know what it can do. Now make it pertinent to my work environment by giving it the data I care about.

(23:47) Stephanie Cope: Rodrigo, what’s the level of expertise of your end users? Is this going to take an entire data science team to enable? Do they just need to point to their data?

(23:56) Rodrigo Liang: We created Samba-1 with the idea that enterprises are going to be in different stages of their journey. We certainly have some very, very sophisticated customers. They’ve been training their own model, and they just want to ingest it into Samba-1. Today, we do this in Japan with one of our customers who has the most sophisticated Japanese model and wants to offer it to all their customers, but in concert with the 90-plus experts we have on Samba-1. Samba-1 also comes with all the other languages, plus it does all these other tasks, text-to-SQL, plotting charts, and so on.

One of the amazing things: we have an expert that can go into a logarithmic table and look for points on a log scale. If you say, “Hey, I’m looking for a device that leaks less than 1.6 milliwatts under 87 degrees Celsius,” it goes into a product sheet, looks at a logarithmic table, and finds those points and curves. So there are experts like that.

But then you have very sophisticated ones where you say, “Hey, I have this model, I want to include it,” and there’s a very straightforward path for you to ingest that model into Samba-1. Then the 92 experts become 93 or 95 or however many you ingested. So for sophisticated customers like that, who are already training and using models, we just become an expansion.

Then we have lots of customers that, frankly, are just using ChatGPT–un-customized–but they don’t want their prompts visible to somebody else. When you type a prompt into ChatGPT, it’s visible to the world. Stephanie, you’re in a space where you’re interested in companies; if you’re asking a lot of questions about a particular company you might be interested in, that’s visible to the vendor, and people don’t like that. So, when we roll Samba-1 with our hardware into your secure environment, nobody sees that because it’s inside your own firewalls. You can prompt those things even on a base model from Samba-1 that you haven’t fine-tuned. Within the first day, you can chat to it and have a ChatGPT-like experience, but it’s running 10 times faster and it’s running privately.

(26:02) Camille Morhardt: Hey, Rodrigo, do you think there’s going to be any sort of path in the future to vertically pre-trained or task-specific models? Like, “Hey, we’re going to have the very most perfect customer service call,” and then you go tweak it for your individual company and answers. Or, “We’re going to have something pre-trained for, I don’t know, imaging in the medical space,” and then a bunch of different companies are going to adopt it and put their specific data in it. Are we going in that direction?

(26:32) Rodrigo Liang: Yeah, for sure. You’re seeing this already. Look, AI is not going to be an English-speaking-only technology. If you look at where things started, obviously with ChatGPT and most of the LLMs, the American English is extremely good when you chat to it. Gradually we’ve added other languages. But by the time you get to what we call “low-resource languages”–languages where there’s not enough labeled data for us to train those models properly–the accuracy drops significantly. Today, you’re already seeing companies like SambaNova produce very well-fine-tuned experts for languages like Hungarian. We powered the Hungarian language for the Ministry of Hungary and the largest bank in Hungary, called OTP Bank. Or Japanese and Thai, so we work with companies in Thailand that produce the best Thai experts. So, it starts with language. My belief is that in the world of AI, if you control language, you control commerce. If you’re not talking to the model properly, it’s garbage in/garbage out.

So, we have to have a linguistics model, and today we’ve proven that you can achieve higher accuracy for a language by having an expert model versus being part of a large monolithic model. As soon as you do that, you can start forking every language. You can start saying, “Let me have a legal expert.” Because today, if you talk to ChatGPT, the legal expertise is American legal. If you want to start going into the laws of every country, English is not enough. Even within English, do you want British English or Australian English? So you start getting very domain-specific, either regionally or vertically. If you think about the word “quarter,” a quarter in sports is very different from a quarter in finance or a quarter in cooking. So as you think about the linguistics by domain, the accuracy will matter depending on context.

We’re starting to see it, and you’re starting to see other companies producing LLMs that are very vertically oriented: I’ve got an expert on finance, an expert on legal, an expert on healthcare or even subsections of healthcare. So, you’re going to continue to see that segmentation, and then that segmentation is going to go even further, per language and per region.

There’s an example I like to do. I go to certain parts of the world and I’ll ask ChatGPT to “give me the recipe for how to host a proper, formal dinner.” But if you are in Asia, for example, that’s not at all the way they would actually host a formal dinner. I did this in France, and even in France, they were quite appalled. So, what we want to do is regionalize this. Every region has its customs. Every region has its own way of doing things. So, you need these models to reflect that, because regionally you want to make sure you promote and propagate the things you’re doing and your people, and make sure those are protected and carried forward, instead of everything aligning to a single generic American model.

(29:40) Camille Morhardt: Super fascinating. Hey, Stephanie, I have a question for you. How do you go about helping drive market adoption with such a new product that’s already expanding so quickly and changing almost every day it seems?

(29:55) Stephanie Cope: Yeah. As part of portfolio development, our goal is for Intel Capital to be an innovation partner to the Global 2000, so I spend a lot of time talking to end practitioners and the innovation arms of large enterprises, understanding their pain points with GenAI adoption, with infrastructure, and so on. For every one investment we make, we’re probably looking at upwards of a thousand companies. There’s a ton of noise in this industry, so many players, and the market is changing so rapidly, so we let enterprises leverage the due diligence we do in making these investments to cut through the noise and find the needle in the haystack that can really move the needle for them.

And in talking to those end enterprise customers, we hear the same pain points across the board. One is the time to get their GenAI models into production. The second is cost; building out the infrastructure is cost-prohibitive for many of these players. The third is the skillset Rodrigo already spoke to–being able to build out a team if you go the DIY route. SambaNova addresses all of those pain points directly, and so we really are trying to be an innovation partner and a bridge to help bring SambaNova to market.

(31:11) Camille Morhardt: Well, thank you both so much for your time today. It’s been a fascinating conversation about the future of AI and how enterprises can adopt it quickly, scale it, and actually put it into use. I also thought it was very interesting to hear this notion of bigger model/smaller hardware footprint and how those things pull together, and even this evolution from assessing all of the written knowledge sitting within your organization, having access to it and interpretation of it, toward the ability to take action, or for the model to generate charts instead of just text summaries, or even suggest, “Well, if you’re looking at this, you might consider also looking at this.” Really, really interesting. Thank you both for your time.

(32:04) Stephanie Cope: Thanks, Camille. Appreciate it.

(32:06) Rodrigo Liang: Yeah, thanks for having us. Look, AI is a race, and enterprises are trying to figure out how it’s going to impact their industry. So, we’re super excited to be here and help. We do think that the ones that figure out how to leverage the technology, drive their business and drive growth, those are going to be the winners in their sectors for many, many years to come. So, I’m super excited about the way that technology is unfolding and really great to have the partnership with Intel and Intel Capital.
