
DevOps Chats: Database for Cloud-Native Apps with MemSQL

Data and database technologies are experiencing disruption. The volume of data collected is exploding, creating new potential for disruptive services and new companies. Data is distributed across locations in the cloud and the data center. Also, cloud technologies bring a plethora of database types, distribution options, analytics capabilities and AI/ML processing. Any company not leveraging data to the benefit of its customers and business is just asking to be the next victim of disruption.

Nikita Shamgunov, MemSQL co-CEO/co-founder, joins DevOps Chat to discuss how these disruptive factors changed how containerized and cloud-native applications approach data collection and storage. Speed, high volume data collection, distributed design and scale are all challenges Nikita sees as vital for cloud-native applications of today and the future.

As usual, the streaming audio is immediately below, followed by the transcript of our conversation.

Transcript

Mitch Ashley: Hi, everyone, this is Mitch Ashley with DevOps.com, and you’re listening to another DevOps Chat podcast. Today, I’m joined by Nikita Shamgunov, co-CEO and co-founder of MemSQL. The topic today is really looking at the state of database and the market and technology as we think about database now. Nikita, welcome to DevOps Chat.

Nikita Shamgunov: Thank you, thank you—excited to be here.

Ashley: It’s a pleasure to have you on. Thanks for joining us. Would you start by introducing yourself, tell us a little bit about what you do and a little bit about MemSQL?

Shamgunov: Absolutely. My name is Nikita Shamgunov, I’m the co-CEO and co-founder of a database company, MemSQL. Prior to MemSQL, I worked at Facebook, and prior to that at Microsoft on the SQL Server kernel. So, I’m a kernel engineer by training. And before that, I graduated from a Ph.D. program in St. Petersburg, Russia.

Ashley: Excellent, excellent. Well, let’s jump right into it. We were talking a little bit before we started the podcast recording. You know, databases have been around for a long time, and you know, the technologies created decades ago are still in production in many areas, but there’s a lot of changes that are happening, too. Why don’t you talk a little bit about some of the market forces that are happening to change the database environment?

Shamgunov: Absolutely. So, there’s several market forces and industry forces that are, I would say, piercing through the market and technology. And the first one is that the data volumes are growing, right?

Ashley: Mm-hmm.

Shamgunov: And a lot of companies—and you don’t have to go far for examples, you know, companies like Uber and Netflix and Amazon and Google—derive most of their value from data, right? Somebody in the ’80s would describe any of those companies as a glorified database application—but, of course, they are a lot more than just that.

And with the data volumes and variety of that data growing, what separates winners from losers is how well people can capture and act on data.

Ashley: Mm-hmm.

Shamgunov: Right? A great example is, you know, when you’re hailing an Uber, you know, you pop it in your phone and then you know exactly where the cars are, how fast they’re gonna arrive, how long it’s gonna take you to the destination.

And that’s really, really powerful. We’re all used to this right now, but 10 years ago, that was not a reality and not a possibility. So, that is force number one, right? People see which companies are advancing in the market and they wanna take on the same opportunity in their space, in their category.

They also are afraid of being disrupted by the next Uber, the next Facebook, the next Amazon—particularly Amazon.

Ashley: Exactly.

Shamgunov: So, the second market force that is going through the market is architectural. Because the data volumes are growing and Moore’s Law is not working anymore, we cannot just rely on better, cheaper hardware; we have to build distributed systems.

And, in the database world, distributed systems, for the most part, have been exotic until quite recently. We’re starting to see distributed database systems, and the reality is, it’s just very hard to build a distributed system-of-record transactional system and make it production ready, because the amount of technology that goes in there is enormous, and the other database products that are successful in the market have been products for 30-plus years. You know, thinking about Oracle, thinking about SQL Server.

Ashley: Mm-hmm.

Shamgunov: And when this massive architectural change comes in where it’s a business reality that you need to re-architect those systems, then it becomes very, very hard for the vendors to actually deliver those. So, that’s the second market force. Transition [Crosstalk]

Ashley: Let me ask you a little bit about that, Nikita, if I could, about the—

Shamgunov: —towards distributed systems—go ahead.

Ashley: The distributed architectures, do you see that happening largely because of the displacement between on-prem data centers and private clouds and public clouds? Is it more pushing content to the edge and doing caching? Is it pushing content toward a geographical location? What’s forcing the distribution of databases?

Shamgunov: This is a wonderful question, first and foremost, because I was about to say that the third mega market force that is happening right now is the move to the cloud.

Ashley: Mm-hmm.

Shamgunov: Right, and cloud, if you draw a parallel with electricity, right, the majority of the world consumes electricity in the cloud, you know? You just plug your stuff into a socket and here comes your electricity.

Ashley: Mm-hmm.

Shamgunov: On premises, generators still exist at important places—manufacturing, health care, hospitals—but it’s a tiny percentage of the global electricity consumption.

Similarly, it just makes too much sense to consume IT in the cloud, and within IT, you know, there’s applications, analytics, databases, machine learning—you know, the whole spectrum of services that a cloud provider typically offers now. Now, if we buy into this worldview—and certainly, we see what happened with electricity, and this is the same thing that’s happening in IT—then you can split IT into different workloads. And once you zoom in on the database workloads, you will realize that, yes, you wanna take workloads that currently run in data centers and shift them into the cloud.

Cloud is different, you know? The most important thing in the cloud is that software just runs as a service. But also, it creates an opportunity to rebuild every single piece of IT for the cloud. And that’s already happening. In our space, it’s already happened in data warehousing, but it also happened with identity because of Okta. It happened with cloud storage, with services like S3 on Amazon or Dropbox as a higher level service. So, basically, as infrastructure IT is shifting towards the cloud, we are rebuilding it for the cloud. And now, it just so happens that, through that rebuild, you want to cater to the next generation workloads. That’s where distributed systems come in. And also, you’re architecting for the cloud because the cost equation in the cloud is different.

So, being scalable and elastic allows you to provide a better cost equation, as opposed to legacy architectures. So, if you take an Oracle and—yeah, you can run it in the cloud, in a VM, no problem. Well, actually, it’s gonna be way too slow and way too expensive as opposed to the system that’s architected specifically for the cloud that scales both for cost, right, so you can only—you will only consume as many resources as you need from the storage and compute standpoint. Storage, compute, network bandwidth—whatever you need to run the workload. But it also is gonna be a lot more efficient, because running a database in a VM, everybody will tell you this is not such a great idea, but if it’s a database service architected for the cloud, then you can build around the limitations and for all the services and resources that are available in the cloud.

Ashley: What I’d like to ask you about that is, it seems like there’s a parallel between lift and shift for applications and lift and shift for database. It isn’t architected for the cloud if you’re just taking the application into a cloud environment, but you’ve gotta re-architect, rebuild, take advantage of things like database as a service, for example. Is that correct?

Shamgunov: Absolutely. Alright, and then in the application space, what’s different in the cloud is that certain applications can be more popular than others. And if you just put those apps in VMs as you had them on premises, then, very likely, you both overprovision and underprovision resources.

For less popular applications you are wasting the whole VM to run an app, and for a very popular application you don’t have enough network bandwidth or CPU to run this app at scale. And that’s why Kubernetes really changed the game, and if you build against Kubernetes, you can freely scale stateless applications. And then, if you run it in the cloud, of course, the database that you use to power those applications, you consume as a service.
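[Editor's sketch] The scaling behavior described here can be illustrated with the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric ÷ target metric). The Python below is a simplified illustration only. The function name, the default bounds and the sample numbers are assumptions for this sketch, and the real autoscaler also applies a tolerance band and stabilization windows that are left out here.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Simplified HPA-style calculation (hypothetical helper).

    current_metric and target_metric are in the same unit, for
    example average CPU utilization in percent per pod.
    """
    # Core ratio: scale the replica count by how far the observed
    # metric is from the target, rounding up.
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds, as an autoscaler would.
    return max(min_replicas, min(max_replicas, raw))


# A popular app running hot on 4 pods (90% CPU vs. a 60% target).
busy = desired_replicas(4, 90.0, 60.0)
# A quiet app idling on 4 pods (15% CPU vs. a 60% target).
quiet = desired_replicas(4, 15.0, 60.0)
print(busy, quiet)
```

The same rule grows the popular app and shrinks the quiet one, which is the contrast with one fixed VM per application.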

Ashley: Say a little bit more about that, because we’ve talked on other webinars and podcasts about the statefulness of data versus the more dynamic nature of building software and rebuilding software that may not match up with the state of the database. How do you handle that in a Kubernetes environment?

Shamgunov: So, there are two pieces to this. The first one is, you run your application in the Kubernetes environment and you consume data through a database as a service. So, that is a blueprint of a cloud architecture today. Let’s say you run your Kubernetes on Amazon, your application is written in whatever language, but let’s say Node.js, and you fire up DynamoDB, using an Amazon example, as a service, you connect to that database from the application and there it is, there’s your app.

Now, in the world of relational databases—which, by the way, is a much larger slice of the market than object databases—

Ashley: Sure, yeah.

Shamgunov: And for a very good reason. You can run it in a similar fashion. You know, in this case, you can launch MemSQL as a service and, you know, run your application inside a Kubernetes cluster.

The other thing that you can do is, you can bring MemSQL with you into the Kubernetes environment. And that gives you control and cloud portability. Now, not only do you deploy your application into Kubernetes and the application scales depending on the demands and the popularity of the app, but also the database that stores the data and powers the app can be deployed in the same Kubernetes cluster, and it will do the same thing. It will scale or contract, depending on the needs of your application.

So, what it gives you, it gives you cloud portability. So, you can pack your bags and go from AWS to Google, for example, and the other thing it allows you to do is to take it and bring it down, right, from cloud to on premises. And today, the reason to do so is usually twofold. It’s either security and compliance or it’s cost, right? If the workloads are pretty static and you cannot drive a good amount of efficiency through the elasticity of both your application and data infrastructure, then on premises may still be cheaper, and then that creates an incentive for you to bring it down on the ground.

Ashley: How important is it to have kind of a single tool to do ETL versus separating those as you move not only within one cloud but across clouds?

Shamgunov: Well, I think, at the end of the day, we are talking about lock in, and we’re talking about developer productivity. Those are completely different. Now, lock in is something that a lot of people in our space experience with Oracle as a vendor.

Ashley: Sure.

Shamgunov: Where they go through what they describe as extortion cycles. Every time an Oracle renewal comes in, then you end up negotiating with Oracle, and if you wanna reduce the amount of the database spend, it becomes a very difficult conversation. You know, Oracle runs an audit, which they have the right to, so they inspect and find all the applications that consume Oracle. If you reduce the amount of consumption, Oracle jacks up the maintenance fee or jacks up the next license fee.

So, it’s really, really hard to get out of spending a lot of money on this vendor. We hear that time and time again in major, major enterprises. I had a conversation with the CIO of a major bank, and the Oracle spend there is $1 billion over several years.

Ashley: Not surprising, yeah.

Shamgunov: And this is absolutely insane, if you think about it, right? The reason for that is that Oracle has been a great partner for delivering their services, you know, cost aside. You know, banking runs—in their bank, it’s a consumer bank, and it runs on Oracle.

But at the same time, competitors catch up to the functionality, right? MemSQL comes with very strong system of record capabilities and, to this day, that has been the stronghold of Oracle, and you know, with a much more attractive price, with a much more flexible deployment model and the ability to run in the cloud as a database as a service where we guarantee SLAs and uptime. And with an integration with AI and ML, we see that, slowly but surely, that equation is shifting. And, you know, people seriously ask questions, “Why would we spend so much money on this technology and can we not do this? We would save hundreds and hundreds of millions of dollars and drive shareholder value, here. In addition to that, we’ll have our development teams move a lot faster,” right?

So, back to your question about across clouds and developer productivity, right? The first one is lock in, and I gave you an example of an Oracle lock in. But now, the CIOs don’t want to fall into the same trap by getting married to a single cloud provider. And, you know, thank God, we have multiple cloud providers available on the market, of which three in the United States are very, very strong, and you know, for the most part, similar in capabilities—GCP, Azure, and AWS.

So, it’s very typical for a key enterprise to choose two or three cloud vendors. So, that prevents the lock in. However, in order to truly avoid lock in, you’ve gotta choose service providers that allow you to run their software the same way on all the clouds.

Ashley: Exactly.

Shamgunov: And ideally, they deliver their services as a PaaS service, you know, like a database as a service, and that service is available on every cloud. And if that’s the case, then, you know, you take a dependency on a vendor, but you are not taking a dependency on the mega vendor, which is one of the clouds. So, that is something that is top of mind for every CIO that I know.

Ashley: You make a good point about Oracle, kinda going back to the lock in of the 2000s and maybe ’90s. You know, for VLDB, that was really the best game in town, arguably. You could also say Microsoft, maybe.

Shamgunov: Mm-hmm.

Ashley: But that isn’t true today. You’ve got many options when you go to cloud providers and folks like yourselves, I would imagine, live in all of those cloud environments or multiple of them. So, you can also serve customers that, of course, are gonna have a multi-cloud strategy.

Shamgunov: Absolutely, you know, I couldn’t have said it better.

Ashley: Okay, great. [Laughter] Lock that one in—great. What’s the number one challenge as folks move out of—let’s pick Oracle again—move out of that traditional VLDB environment and move into the cloud. What’s the paradigm shift that you have to make to really think about how you can leverage the flexibility of the cloud?

Shamgunov: There is a three step strategy if you wanna get out of Oracle. The biggest stickiness of any database technology is all the applications that are built against the database and they’re using and sharing that data across the applications. And then different database technologies can either be app compatible—you know, an example of that is MySQL and AWS Aurora, right?

Ashley: Mm-hmm.

Shamgunov: Aurora is a new database, but the compute piece of it is actually MySQL code, so they’re app compatible. If an app works against MySQL, it’s gonna work against Aurora.

So, that’s why, in the low end of the market, in which both MySQL and Aurora play, this can be a viable strategy. And it’s a very good one, right, because it’s so easy to just point your app at a different database. And it’s very hard for a different database product to achieve application level compatibility between the two databases.
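[Editor's sketch] The "point your app at a different database" idea can be shown as a configuration change only. In the Python below, the class name, hostnames, user and database name are all hypothetical, and the `mysql://` URL form is just a common convention for a MySQL-protocol endpoint. It assumes the target is wire-compatible with MySQL, as described above, so nothing but the host differs between the two configurations.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DbConfig:
    """Connection settings for a MySQL-protocol database (illustrative)."""
    host: str
    port: int = 3306          # default MySQL port
    user: str = "app"         # hypothetical application user
    database: str = "orders"  # hypothetical schema name

    def url(self) -> str:
        # Build a connection URL in the common mysql:// style.
        return f"mysql://{self.user}@{self.host}:{self.port}/{self.database}"


# The existing deployment talks to a self-managed MySQL server.
mysql_cfg = DbConfig(host="mysql.internal.example")

# Moving to an app-compatible service changes only the endpoint;
# queries, drivers and application code are assumed to stay the same.
aurora_cfg = replace(mysql_cfg, host="aurora-cluster.example")

print(mysql_cfg.url())
print(aurora_cfg.url())
```

A workload-compatible move, by contrast, would also involve changes to the application code itself, which a configuration swap like this cannot capture.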

The second one is workload compatibility. So, all our workloads that are running on Oracle can be moved to another database, assuming there is a good enough reason. You know, assuming the database you’re moving the workload to has all the same set of features and provides strong system of record guarantees and so on and so forth.

So, if you are workload compatible, then you can move from one database into another, but you need to put in some work, right? You need to augment and potentially rewrite certain pieces of the application.

Ashley: Mm-hmm.

Shamgunov: And, in our space, there’s a massive piece of the market that is called tier one workloads. Those tier one workloads typically run on Oracle, and they have either very, very strong system of record requirements or they have performance requirements, or they have availability requirements, or they have very strong concurrency requirements.

And because of our architecture, we are practically the only game in town that can be the destination for moving workloads from Oracle, and especially systems like Oracle—you know, heavy Oracle systems running on enterprise storage or systems like Oracle RAC or Oracle Exadata over to running on MemSQL. And we have countless examples of how we quote-unquote liberated our customers from Oracle.

So—and this is, frankly, our biggest opportunity. So you will have to put in some work, but if the pain that you’re experiencing is so high that you’re willing to put in that work, we are a fantastic destination for moving your Oracle workloads.

Ashley: Well, excellent. You’ve been a great guest, and this has been a fascinating conversation. I wish we had more time to chat. Maybe we can do that on another podcast.

Well, thank you. On behalf of our audience, I’d also like to thank you, Nikita Shamgunov, co-CEO and co-founder of MemSQL, for joining us.

Shamgunov: Yeah, very happy to be here. Thank you so much.

Ashley: You bet. And thank you to our listeners. This is Mitch Ashley with DevOps.com and you’ve listened to another DevOps Chat podcast. Be careful out there.

Mitchell Ashley
