Announcer 00:00
You’re listening to InTechnology podcast. This episode is one in a series with innovative companies that are part of the Intel Capital portfolio. In these conversations, we’ll explore the key areas of technology Intel Capital invests in to shape the future of compute.
Camille Morhardt 00:24
Hi, I’m Camille Morhardt. This episode is part of the InTechnology Intel Capital series, and today I will be co-hosting a conversation with Nick Washburn, who’s a Senior Managing Director at Intel Capital. Nick is a voting member of Intel Capital’s Investment Committee and co-manages the Cloud Investment Domain; that focus includes early-stage investments in Cloud-native infrastructure, developer tools, AI, ML, and data platforms. He is also a director on the board of the National Venture Capital Association and is a member of the Kauffman Fellows.
Welcome to the podcast, Nick. Who have you brought to the conversation today?
Nick Washburn 01:04
Thanks for having me, Camille. I’ve brought a good friend and close partner to us here at Intel Capital, whom I’ve had the privilege of working with for a number of years: Kurt Mackey, who is the co-founder and CEO of Fly.io. Interestingly, Kurt is based in New Orleans by way of Chicago. He was in Chicago when we first invested and is now in New Orleans. So, I have one person I know in New Orleans, which is always exciting. So, I will turn it over to you, Kurt. Why don’t you start off with: what is Fly.io?
Kurt Mackey 01:33
Sure. By the way, it’s always good to know one person in New Orleans, just as an excuse to come to this particular city–it’s like knowing a person with a boat, right? Fly.io is a developer-focused public cloud. The general theory is that in ten years, the next version of AWS or GCP–the kind of big public clouds–is gonna be something that starts with developers. It’s gonna be kind of hyper-focused on DevOps for a lot of reasons I’m sure we’ll get into, but primarily because it makes sense for devs to be able to ship their own work faster and automate away a lot of the cruft that enterprises accumulate over time.
I think the interesting thing about us is we didn’t quite start as this. We started as a way to run apps close to users–kind of an edge compute platform. And over time, as we worked with customers, we just sort of discovered that we were actually building a public cloud. Obviously, compute should run close to users. That’s not a novel feature; it’s kind of a baseline requirement of whatever the next type of cloud has to be.
Camille Morhardt 02:27
So, what does it mean to be oriented towards developers?
Kurt Mackey 02:33
There are probably two ways to tackle that question. I think the simplest is that there’s a class of developers–and I was one of these–what I call impatient, pragmatic developers, who just want to ship things and effectively make money off of them. They don’t necessarily have the patience for all of the ceremony that goes into the infrastructure and the bureaucracy that’s sprung up around some of these things.
The people I love to talk to most about that are PHP developers. The whole world is talking about VCs putting a ton of money into the JavaScript ecosystem–all these frameworks and things–and PHP developers are just off making money somewhere; they’re just not interested in the hype, necessarily. The tool is a way to enable their apps and businesses, if that makes sense. And so I think that a developer-focused public cloud is actually perfect for the PHP devs. It’s not the most glamorous thing for JavaScript thought leaders, necessarily, because it’s kind of boring in a lot of different ways. But what it reduces to is devs being able to incrementally ship impactful work quickly, with a minimum of ceremony. They don’t have to go ask permission for certain things; they don’t need to go talk to another team to get the most bog-standard of applications up and running. And they don’t need to go talk to people to evolve from that first, lovable, simple thing to a very complicated, scalable app down the road. It’s just taking away a lot of the conversations you’d otherwise have to have, effectively.
Nick Washburn 03:53
Kurt, when the public cloud was born, there were a couple of shifts that happened. One was obviously migrating capex costs for a company to operating costs–meaning I don’t have to run my own data centers. But there was also a simplicity story at the origins of the cloud as well. You didn’t need to think about a lot of things; you could kind of just ship an application on some primitives of compute, storage, and networking and get it up and running. Since that time, things have obviously changed to where we are now, where it’s incredibly complicated to use a cloud environment.
What was the driver there? Why did it start simple? Why is it complex now? And then why is Fly focused back on simplicity to start, even though certainly things can scale?
Kurt Mackey 04:34
I think if you go back to the origins of AWS, what AWS grew into was, I think, a reaction to data centers. Data centers are a known quantity: we know how to build and operate data centers as a company. And if your company happened to come across a person who knew how to build and operate a data center, they would not necessarily have seen the benefit of the cloud. What happened, I think, is AWS made it so the companies that were really bad at operating data centers just didn’t need to worry about it. It replaced expertise in a team you’d hire locally with tooling that consistently did what the vast majority of enterprises needed–tooling that let them get the things they’d otherwise have to procure from Dell and Intel over some procurement cycle that they’d then have to manage. It just put a lot of that behind APIs.
I think the trick here is APIs instead of people, and I think that’s the whole story of the cloud. But they also grew up in this world where they were trying to replace a local data center in Chicago, or someone buying co-lo in Atlanta, with a cloud. And so they ended up building these clouds in those cities, and you’d basically end up moving next door–like you’d move off of the data center you leased from Equinix or Digital Realty Trust. What they did was get rid of the people element that used to exist. I’m old enough to be able to build this company because I have racked and stacked servers. But if you go talk to startup founders, 99 out of 100 have never even seen a physical server in their life at this point. And so that’s kind of the impact Amazon had on the world.
Nick Washburn 05:58
I think the interesting thing, Kurt, is you’re still racking and stacking servers, because Fly runs its own metal.
Kurt Mackey 06:04
Yes. It’s actually a superpower for us, because we can build data centers and networks, and not many people can do this. So we can actually tackle this from first principles in ways that are fun.
Nick Washburn 06:14
Now, with an on-ramp to shipping apps quickly comes being somewhat opinionated, and having an abstraction that is high enough to enable someone to create a Docker image, use it from the CLI, and have it kind of geographically run. What are the trade-offs of that?
Kurt Mackey 06:33
Um, that’s a good question. I think it’s the trade-off of all abstractions. Most developers can run fly launch and get their app working on the cloud; they don’t necessarily understand enough of what’s happening to always make good decisions about how to build their application. One of the things we’ve been wrestling with recently is that disks fail. If you store data on a disk, with enough customers, someone’s losing their data every day. This is true for us; this is true for AWS. Developers who aren’t as experienced with infrastructure just don’t intuitively understand this. And what happens with the abstractions we’ve given them is that it’s so easy to get set up on infrastructure that you may never have to learn these things until they become a problem down the road. Whereas if you were doing it on hard mode–if you were going in and racking and stacking servers–you’d end up developing the expertise that I think goes into building a good back-end application.
I think that’s probably the problem with any kind of technology: it automates a lot of what used to be hard, but there are always leaks in the abstraction. And even with automation, I think the people who use it best are the ones who most deeply understand what’s happening under the covers. But again, we get back to the PHP devs–that’s not them, and they don’t care. And so, for the most part, the trade-off is finding the right UX that matches what they expect to have happen. What we shouldn’t be doing is giving PHP devs a single volume that can fail and lose their data; what we should be doing is giving PHP devs a way to launch their app with the right kind of layout for their particular application. And fortunately, almost every full-stack app has similar requirements: they need to run app servers, they need a database that doesn’t go away, they need some kind of way to send email, they need backups of their database. There’s just a lot that’s very standard across literally every project on the planet. So I call it a tractable trade-off, but to me the big one is that you can be very naive about what’s happening under the covers and still make progress. And this is actually great, right up until something goes wrong.
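[Editor's note] Kurt's "with enough customers, someone's losing their data every day" follows from simple arithmetic. A quick back-of-the-envelope sketch, where the fleet size and annualized failure rate are illustrative assumptions, not Fly.io's real numbers:

```python
# Illustrative only: how a modest per-disk failure rate becomes a
# daily event at fleet scale. Both constants below are assumptions.

FLEET_VOLUMES = 200_000      # hypothetical number of single-disk volumes
ANNUAL_FAILURE_RATE = 0.01   # ~1% annualized failure rate, a common drive ballpark

# Expected failures per day = fleet size x per-disk daily failure probability.
expected_failures_per_day = FLEET_VOLUMES * ANNUAL_FAILURE_RATE / 365

print(f"Expected volume failures per day: {expected_failures_per_day:.1f}")
# With these assumptions, roughly 5.5 volumes fail somewhere every day.
```

Even at a 1% annual rate, a large enough fleet guarantees failures daily, which is why the platform's job is to hand out redundant layouts rather than single volumes.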
Nick Washburn 08:26
How do you think about the scaling needs? When you look at the Fly platform, there’s a very quick time to value for a Rails developer, a PHP developer, Elixir–pick your framework–they can get a full-stack app running very quickly in a distributed way.
Kurt Mackey 08:55
Yep.
Nick Washburn 08:45
But as an application gets hit harder–more scale, more intensive reads and writes–the magic of Heroku back in the day was that for Rails and Postgres, I could get going very quickly. But it had that technical graduation problem, largely because of the abstraction layer. How have you thought about this as you’ve architected Fly?
Kurt Mackey 09:04
I think the really interesting thing about watching apps scale is that they tend to scale in complexity faster than they scale in absolute footprint, if that makes sense. The applications you’re building will get complicated long before you have a ton of traffic, because you want to solve some new and novel problem–and integrating an LLM is a great example of this.
Since the beginning, we’ve actually thought really hard about the Day 1 experience for devs, which is getting the app up and running, and then the Day 100 experience for devs, which is actually being able to deploy more complicated stuff on top of it. And we’ve defined our ideal customers at this point as people who appreciate the value of a PaaS–a Platform as a Service–on Day 1, but want to go beyond that very quickly. And so we kind of started at the bottom–some of this gets into how we think the next public cloud is gonna look–but we started at the bottom. The hypothesis is basically: if you give people powerful enough primitives to launch compute where they need it, even all the way down to the granular level of “run one process on very little RAM in this city until it dies, and then run it somewhere else,” it’s actually pretty easy to build a platform as a service on top of it. So what we ended up doing is: our product is actually the lower-level API for running a VM, to get compute and storage and RAM where you need it–compute including GPUs at this point. But the actual onboarding, self-service model is the orchestration bit that makes it easy to launch an application. We built infrastructure that we manage to run the VM API, and then the orchestration all happens client-side–our CLI actually has all of the magic for orchestrating a PaaS for people.
So when you run fly launch, it does all the smart stuff in the client to detect that it’s a JavaScript app, or a Next.js app, or a Ruby app. It configures the app the way it needs, goes to our build servers and gets a Docker image built, and then actually manages rolling that out onto compute across the world. We’ve tried to simplify what we’re actually running, because complicated orchestration stuff is, for some devs, really hard to run and really brittle, and they tend to outgrow it very quickly. And then we’ve done a lot of work to make that first-time experience kind of open source and available to users. So we actually have users go look at our open-source orchestration bits, and they’ll either build their own on top of it, or they’ll customize ours, or they’ll even submit pull requests to our CLI at this point.
And so the general theory here has always been: we want to make some happy-path stuff very magical, but we expect people to eject from that and just use the lower-level primitives. Which leads me to one of the big philosophies in our worldview that makes this make sense to us: I expect the next big public cloud to actually be kind of a rebel alliance of maybe fifty companies. If you go look at AWS’s portfolio of products, there are fifty good ones and like two hundred not-awesome ones that are just there because they check some box that some person needs. So my guess is that the next public cloud is going to have a company focused on each of those top fifty problems, building exactly what a developer needs from it. In our case, we tend to pitch this as building something that allows the developer to ship new kinds of apps that they were not previously able to ship.
And so, part of why we focus so hard on the compute primitive is we think our place in the world there is as the compute provider for devs. We expect that devs who run apps will use us for compute, use something like Supabase for Postgres, and use something like Tigris for object storage. But we have a very narrow focus there, and a lot of the rest of the work we’ve done is just to kind of support the funnel of people coming in.
Nick Washburn 12:32
What do those integrations look like? If I want a delightful developer experience, naturally that would extend to my object storage provider, if that’s Tigris. Is Tigris running on Fly itself and in the same network, so the co-location effects make it very fast?
Kurt Mackey 18:25
Upstash runs Redis on our platform, Supabase does Postgres on our platform, Tigris does object storage on our platform; they all run co-located with the rest of our services. So you get sub-one-millisecond latency to run database queries; you actually have sub-millisecond latency to talk to your object storage–that’s hard to do even on AWS at this point. There are a couple of places where I think working closely with partners makes sense. One is that the underlying infrastructure should behave the way a monopoly public cloud would today, which means low latency between components for the most part. The other is that the UX should actually make sense. For us, we love the Day 1 Heroku experience, which is: launch an app, obviously. But our launch process actually goes and gets you Supabase, will go and get you Redis, and will actually go and get you object storage, and offers these things up to you the same way it would if you were doing this on something like AWS. And that’s the close partnership with the extension providers.
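[Editor's note] The reason sub-millisecond database latency matters so much is that per-query latency compounds: a single request handler often issues many queries in sequence. A rough illustration, with an assumed query count and assumed latencies (not measured Fly.io or AWS figures):

```python
# Illustrative only: how per-query latency multiplies across a request.
# All three constants are assumptions for the sake of the arithmetic.

SEQUENTIAL_QUERIES = 30    # a plausible full-stack page render (assumed)
CO_LOCATED_MS = 0.5        # sub-millisecond, same-network database hop
FARTHER_HOP_MS = 5.0       # a plausible cross-zone or cross-service hop

co_located_total_ms = SEQUENTIAL_QUERIES * CO_LOCATED_MS   # 15.0 ms
farther_total_ms = SEQUENTIAL_QUERIES * FARTHER_HOP_MS     # 150.0 ms

print(f"co-located: {co_located_total_ms} ms, farther away: {farther_total_ms} ms")
```

With these numbers, the same page render spends 15 ms waiting on a co-located database versus 150 ms on one a few network hops away, which is the gap between "feels instantaneous" and "feels slow."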
Camille Morhardt 13:41
How are you seeing–or are you seeing–the increase in machine learning and AI make changes to how you offer data centers?
Kurt Mackey 13:51
That’s been an interesting one, because when we started, we were all about the latency–and still are. To me, if you’re running an app and it takes more than 40 milliseconds to get information back from the thing, it’s too slow, because to human perception, anything much over 100 milliseconds stops feeling instantaneous. And I am very impatient and want instantaneous everything. I think two things happened concurrently. One is that edge computing became a thing people care about. Our customers–like the 250,000 people on our platform–run their apps close to the users just by default, because that’s what we do when you deploy the thing. But that actually necessitates a different kind of infrastructure than GPU stuff. And we actually saw this a little bit: the progression went, I think, CDNs, then us, then GPU stuff. CDNs had very lightweight power requirements; they wanted to run in 100 cities. Devs on us have heavier compute requirements; they kind of want full CPUs and a lot more RAM. So Cloudflare’s infrastructure is hard to shoehorn into that, because they just don’t have that kind of footprint; we have fewer regions that are a lot fatter. And then GPUs are an order of magnitude worse–they just need so much power. And the cost of anything with a GPU is so high that power costs about the same as the GPU itself does. If you spend $10k on a GPU, you’re probably going to spend $10,000 on the power for it over the next three years.
And so, what we actually found as we shipped GPUs is we’re even collapsing it down–we’ll have GPUs in like six regions. Part of this is because putting them where the power is cheap is the right choice. But the other part is that the models just take so long to do anything. If your compute is going to spend 32 seconds giving you a response, you don’t particularly care about those last two milliseconds anymore; it just becomes less of a pressing issue. So from an actual infrastructure perspective, we went out to the edge and are collapsing back to, literally, cities with hydropower–that’s kind of where it’s going to end up, just because we’re all spending too much money on converting stuff into heat at this point.
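[Editor's note] Kurt's "$10k of power for a $10k GPU over three years" claim checks out as an order of magnitude. A sketch with illustrative assumptions–the "all-in" draw here folds the host server and cooling overhead (PUE) into one number, and the power price is an assumed all-in datacenter rate, not a quoted figure:

```python
# Illustrative only: 3-year power cost of running one GPU server flat out.
# Both constants are assumptions, not Fly.io or vendor numbers.

ALL_IN_DRAW_KW = 2.0    # GPU + host server + cooling overhead (assumed)
PRICE_PER_KWH = 0.19    # assumed all-in datacenter power price, USD/kWh
HOURS_3_YEARS = 3 * 365 * 24

power_cost_usd = ALL_IN_DRAW_KW * HOURS_3_YEARS * PRICE_PER_KWH

print(f"3-year power cost: ${power_cost_usd:,.0f}")  # roughly $10k
```

Under these assumptions the three-year power bill lands within a few percent of $10,000, which is why siting GPUs near cheap (e.g., hydro) power dominates the edge-latency argument.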
The other interesting thing about GPUs, I think, is that they’ve changed how devs attack building and shipping applications. I don’t think that, up until generative AI, this stuff was obviously valuable to people building apps. I think we’re to the point now where most devs are aware of what LLMs can do and can imagine ways of integrating them into their applications. To me, it seems almost like a generational change, the same way the relational database enabled a whole kind of app: what we’re gonna have now is relational databases letting you build those apps, and then LLMs letting you build even more different kinds of apps. It’s that level of power, I think.
And one of the interesting things about the world we’re in right now is that nobody knows the whole way you deploy and manage models; we haven’t figured this out yet. Our bet on some of this is that you deploy and manage ML code the same way as the rest of your code. That’s actually turned out to be a pretty good pitch, because the way we think about the world is: we build for developers, and we sell to developers, but we also sell to whoever the developer reports to. Being able to build and ship an ML-backed application with the same ease as a full-stack application has been a pretty good pitch to the bosses of engineers at this point.
Camille Morhardt 16:41
Are you seeing, or envisioning, enterprises basically adopting LLMs or gen AI in a more customized way? So it’s essentially a smaller model, versus trying to incorporate all the information on the internet? Are you seeing a smaller, more customized, more distributed kind of model? Do you think that’s the future there?
Kurt Mackey 17:01
We’re actually seeing some pretty clever stuff. And I have a lot of hot takes about where inference is gonna happen in the future, just because of the constraints on power and the expense of all of this. But we’re seeing customers do pretty clever things, where they actually have a really, really small model that works fine on CPU, and then they have a fatter model that’s GPU-based. What they’ll do is try the CPU model first to see if they’re happy with the results for their users, and then flip over to the more expensive one if necessary. So we’re seeing a lot of what I call model optimization. And it’s not necessarily small models–it might be kind of an onion of small to larger, fatter models–but it’s definitely where a lot of this is going. Realistically, it’s because Nvidia has such pricing power over the market right now. I mean, I would like to buy a gaming graphics card for $1,500, run my ML on it, and sell that to people, but Nvidia’s like, “no, you’ve got to spend $20k on something that’s 12% better, instead.”
And so I think, between the expense of actual electricity and the pricing power of Nvidia, we’ve seen a lot of pressure on developers to reason about their infrastructure in a way they know can be efficient down the road. And I actually think the next thing that’s going to happen is we’re going to see a lot of device-local inference. What you’re going to see is people building models that they can actually ship to the client and run first on a really fast, local, probably under-loaded, low-power CPU that I’m not paying for; then they’re gonna do something like what we’re seeing with the small CPU models on us; and then fall back to these big, fat models over time. I think we’re gonna see those three layers of what I call model intensity, maybe–it’s like size, but there’s more to it than that.
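[Editor's note] The tiered "onion" Kurt describes can be sketched as a simple cascade: try the cheap model first, and only fall back to the expensive one when the cheap answer isn't good enough. The models and the confidence check below are stand-ins; real code would call actual CPU- and GPU-backed models and a real quality metric:

```python
# A minimal sketch of a model cascade, assuming each model returns an
# (answer, confidence) pair. Everything here is hypothetical stand-in code.

from typing import Tuple

def small_cpu_model(prompt: str) -> Tuple[str, float]:
    # Cheap and fast; pretend it is only confident on short prompts.
    return ("short answer", 0.9 if len(prompt) < 20 else 0.3)

def large_gpu_model(prompt: str) -> Tuple[str, float]:
    # Expensive fallback; assume it is always confident enough.
    return ("detailed answer", 0.95)

def cascade(prompt: str, threshold: float = 0.8) -> str:
    answer, confidence = small_cpu_model(prompt)
    if confidence >= threshold:
        return answer                       # happy path: never touch a GPU
    answer, _ = large_gpu_model(prompt)     # fall back to the fat model
    return answer

print(cascade("hi"))                                   # short answer
print(cascade("a long, complicated prompt about ML"))  # detailed answer
```

The device-local layer Kurt predicts would just be a third tier in front of these two, running on the user's own hardware before any server is involved.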
Camille Morhardt 18:35
So hybrid–you know, all over the map, pretty much anything, depending on what you can get away with. I think you’re saying you’re really tailoring the compute to the application or the use case, and that can depend on cost of energy, cost of hardware, latency requirements–all kinds of things, yeah.
Kurt Mackey 18:54
I think for this particular case, it’s probably the cost of energy that’s the biggest driver. If I’m an app developer and I want to have a good, solid business, I don’t want to be paying for electricity for my users’ ML stuff. I’m happy to just kill the battery life on their phone in exchange for not paying for electricity. There’s a really, really strong incentive to stay out of GPUs right now.
Camille Morhardt 19:15
I wonder if we could pivot just a little bit and talk about security and/or privacy. However you want to tackle that–whether it’s your recommendation to app developers for what they should be thinking about right now, or what you think public clouds should be thinking about from a cybersecurity perspective.
Kurt Mackey 19:34
One of our founders, Thomas Ptacek, is the big security person, and it’s funny, because he wrote the security landing page for us. He’s like, “this is the cloud you would build if you just let security nerds make all the decisions,” which is actually relatively accurate. So the first thing is, we’re fundamentally architected in a way that can be very secure, depending on how you deploy your apps. We’ve made a lot of choices that actually limit the UX in some ways–choices about everything from how we store sensitive database secrets or sensitive credentials on your behalf, to how we do virtualization and how much we isolate people’s processes–that I think give us a surprisingly good security posture for a bottoms-up developer startup.
I think typically, when people build stuff for devs, they don’t necessarily ignore that, but they aren’t aware of what really good security looks like under the covers, because they’re so focused on the UX. And there’s so much tension between UX and security. So it’s kind of cool to be in that spot.
One of the fun ways this has manifested is we’re finding all these customers who want to run ML, but either because of regulations or because of their own paranoia, they cannot allow their training sets to be reused. And what we’re finding is that because we have GPUs in such a secure environment, they actually trust our infrastructure for this–they trust that they can do ML stuff, that they can ship their models on us and know that no other user is going to get access to them. I think that’s the biggest, scariest thing for devs right now.
Camille Morhardt 20:59
So Nick, a question for you, because I think it’s fairly rare people get to talk with somebody from Intel Capital about what you think about in making an investment decision. I’m just curious: what is top of mind for you? Or what is something that we might not think of that you focus on?
Nick Washburn 21:18
Yeah, I mean, I can contextualize it in the origins of Fly. You know, I think we tend to be pretty thesis-driven, as opposed to opportunistic. I first read about Fly in, I think, a Hacker News post in early 2020, when Fly was starting to transition from very much “run workloads close to end users” to more of a public cloud. And it just resonated–one, because there was a lot of developer love; but two, we shared a belief that what we call a modern distributed cloud stack was emerging, which had a couple of core components to it. A delightful developer experience, meaning I don’t need to go see a menu of 700 services to stitch together, and have a platform team, to be able to ship an application; I can code something I know, in the language I love, making no changes to it, and it magically runs in many different regions. And that was kind of what we read users were doing at that time.
And the second thing was, we did believe that you needed to run your own metal to be able to do this. There are PaaSes and versions of IaaS out there that are abstractions on infrastructure provided by other public clouds. But to really be able to offer low-level primitives, you need to have your own metal. And why do I think you need low-level primitives? To solve the technical graduation problem Kurt was talking about: a Day 1 experience, but also a Year 2 where you’re still running the mission-critical workloads you’ve built on it.
When we look at developer tooling and infrastructure software, the aperture through which you look at it is a little different, because the companies all look the same–often very early revenue, a lot of usage. If it’s core infrastructure being adopted by a DevOps team, that’s a very enterprise/sales lens we’re looking through; versus when I looked at something like Fly early on, it was almost like looking at a consumer investment. We’re looking at usage, product love, how many applications the same user is putting on the platform, what that journey is for an individual developer. And, you know, when we got to know Kurt, and got to know Michael, Thomas, and Jerome, the other founders, and what their worldview was, it was very aligned with ours. And so we jumped on it.
And it’s been a privilege to work with them since and see the journey in usage from when we first invested to where they are now–the amount of users and applications and databases on top of it is pretty, pretty amazing.
Kurt Mackey 23:49
I want to add on to what Nick said about the graduation problem and running on your own metal. I think one thing that’s easy to not notice is that the big clouds exist to eat software margins. They’re heavily incentivized, and very good at, extracting all the margin they can from the dollar your end user is spending. And I think part of the graduation problem is that for people on Heroku, it got very expensive very fast, because Heroku could not actually be price-competitive with, or even in the same ballpark as, AWS for things. So we ended up doing the physical hardware both because we had to in order to ship the features we wanted, but also because there’s no way to build an enduring business that’s threatening to AWS on top of AWS, purely for economic reasons. Everyone thought I was crazy when I said that in 2020. I think Nick might have been the first person to say, “yeah, that sounds right,” like nine months later, and then the later-round investors were like, “yeah, obviously that’s true. Obviously, that makes total sense.” So we’ve finally managed to convince people of that.
Camille Morhardt 24:45
Kurt Mackey, co-founder and CEO of Fly.io, thank you so much for your time today. Very interesting conversation.
Kurt Mackey 24:52
Thank you.
Camille Morhardt 24:53
And Nick Washburn, Senior Managing Director of Intel Capital. Thank you for co-hosting with me. It was a lot of fun.
Nick Washburn 24:59
Of course, it was a pleasure. Thanks for having me.
The views and opinions expressed are those of the guests and author and do not necessarily reflect the official policy or position of Intel Corporation.