The purpose of cloud performance testing is to identify and eliminate performance bottlenecks that might occur within an application. Performance testing allows DevOps teams to check the speed, scalability and stability of the apps to ensure that everything is working as planned under the designated workload. In this episode of DevOps Unbound, Kristen Webb (Tricentis), Donald Lutz (Taos) and Andreas Grabner (Dynatrace) join hosts Alan Shimel and Mitch Ashley to discuss the key benefits of performance testing in the cloud and what to consider when building a cloud performance testing strategy. Join the experts to find out what you can do to ensure that your apps are running smoothly and remain available during critical times. The video is below followed by a transcript of the conversation.
Alan Shimel: Hey, everyone. It’s Alan Shimel, CEO of Techstrong Group, and you’re watching another Devops Unbound. For those of you who may be new to our audience, Devops Unbound is a biweekly video series sponsored by our very good friends at Tricentis. And we explore a wide range of topics around devops and devops-related technologies, practices, processes and people. As I said, we do this biweekly. It’s every other week, for those who’re a little challenged, and then around once every month we also have a live – what we call roundtable format of Devops Unbound, where we’ll usually have a little bit of a larger panel. And it becomes a really large panel in that we open it up to a live audience, and you and everyone joining are invited to participate in the roundtable by interacting with the panel and our hosts, asking questions, giving us your thoughts, and it really becomes sort of a town-hall kind of meeting. When it goes right, it’s great, with wide levels of participation.
Actually, this is a really good point today, good time to bring this up. Today’s panel version of Devops Unbound is on a holistic approach to performance testing. But the upcoming or the next live roundtable, and I’m gonna have more information on that later in the show, is later this month, on February 24th, which is a Thursday, and it’s gonna feature performance testing in the cloud. I’ll give you the actual title in just a bit. But one of our panel members here, Kristen, is going to be there, as well as some other great guests. You can sign up for that on devops.com. If you go to Upcoming Webinars and check it out, it’s February 24th. And as I said, I’ll bring more information on that in a little bit.
But for now, let’s talk about a holistic approach to performance testing. And in doing so, I think the right place to start is with introductions. I’ve already kinda preannounced Kristen, so I’m gonna ask her to go first, if it’s okay. Hey, Kristen, welcome to Devops Unbound. Please introduce yourself to the audience.
Kristen Webb: Sure. Thanks, Alan. Kristen Webb here. I’m the director of product marketing at Tricentis, currently running the product line around NeoLoad, our performance and load-testing solution. I have about 20-something years in technology. I’ve lost count. I’ve worked for a long time on the product-development side, so I remember back when we were testing large software releases at Macromedia and Adobe and trying to get our products out on time as part of a broader release and making sure there were no watermelons, which is what we used to call the performance bottlenecks at the end of our three-week mean-time-between-failure testing. So performance testing is near and dear to my heart, as is all application development. That’s my background, and I’m excited to be here.
Alan Shimel: Absolutely. Macromedia, that’s a name I haven’t heard in so long. It used to be such a big part of my workplace. And I just totally blanked on it until you mentioned it. But it brought back good memories, so that’s a good thing. Yep, it brought a smile to my face. Next up I wanna introduce you to Andreas. We call him Andy. It’s Grabner, yes, Andreas?
Andy Grabner: It is, but feel free to pronounce it the way you like, as long as you call me Andy. I always make the joke if you call me Andreas, then I typically did something wrong or rude, because my mom calls me Andreas if I did something not appropriate.
Alan Shimel: I get it. Our youngest son is Bradley, and everyone calls him Brad, and you say Bradley, and he kinda – he knows he’s in trouble. But, anyway, Andy, welcome. Introduce yourself to the audience, if you don’t mind.
Andy Grabner: Yeah, thank you. So my first start in performance engineering was, I think, pretty much 20 or 21 years ago maybe now. I was working at Segue Software. I was a performance tester on a performance-testing tool called SilkPerformer, still around. But I remember those days, and it was a large load-testing project, and it was my first kind of customer gig that I did. And I also remember that my performance engagement was scheduled for one week and took exactly ten minutes, because we broke the system with two virtual users, and the virtual users were my left finger and my right finger hitting the Refresh button on two different keyboards. And I really quickly learned that performance testing, while we always think about the magic 10,000 – millions of virtual users, sometimes it’s as simple as just hitting Refresh quite a lot on your website.
Over the years I’ve morphed from a performance engineer. Now I’m working at Dynatrace for the last 14 years, actually, in the observability space, and I’m a self-proclaimed devops activist. I really like the idea behind devops, and I try to use channels like yours and my own channels to really advocate for what devops allows organizations to do in terms of tooling transformation, people transformation, process transformation. And I’m very glad to be here.
Alan Shimel: Excellent. Thank you, and welcome, Andy. Thanks for being here. Our third panel member today is someone I’ve had the pleasure of knowing – I don’t know, Donald. It’s gotta be 20 years, 19 years.
Donald Lutz: It’s 20 years. Yeah, it’s too long.
Alan Shimel: Twenty years. On our original team at StillSecure, what became StillSecure, with Mitchell and me – but he is a developer-testing person extraordinaire. I’ll let him introduce himself. I don’t want to embarrass him, our friend, Donald Lutz. Hey, Donald. Welcome.
Donald Lutz: Welcome. My name’s Donald Lutz. I’ve been doing software architecture for probably about 30 years. I’ve been focused on how to do development right, how to use design patterns correctly. The past ten years I’ve really dropped into the cloud. Right now I’m the senior cloud and software architect at Taos. We are a public cloud company. IBM just acquired us. Sullivan rated us number one for public cloud. We also are on the Gartner Magic Quadrant. So I’m sort of spending a lotta time helping us productize our consulting offering. So the whole idea of testing and how to do performance testing right and chaos engineering, those are really near and dear to my heart. And how do you use the right design patterns in your code to avoid all these performance issues?
Alan Shimel: Well, an ounce of prevention is worth a pound of cure, as they say, right? And we’ll get to it. Our last member, though, is my cohost and friend, Mitchell Ashley. I’m gonna let Mitch introduce himself, though. Mitch, take it.
Mitch Ashley: Oh, good to be here with everybody. Thanks for making me follow some really pretty tough people to follow. Oh, my gosh. And, by the way, Macromedia, Dreamweaver, Authorware, remember all that good stuff? And I’ve known Donald – tack on another ten years. So it’s a small world. No, I’m CTO with Techstrong Group working with Alan, and I also am principal with our analyst firm, Techstrong Research. So, great to be here, and I think we could fill up our whole session with war stories about performance testing, Andy. Maybe we’ll save that for another day, right? But good to have everybody here, and thanks to the audience for joining us.
Alan Shimel: Thank you, Mitchell. Just real quickly, I think back to performance testing at Still Secure, Mitch, if you remember. It broke our budget, ’cause we didn’t have all the virtual – you weren’t able to do it from one machine then. I tried to go out to a place that had enough to do a load of –
Mitch Ashley: You kept Dell busy.
Alan Shimel: – Jesus. Man, was that crazy. Anyway, though, let’s jump in. In my mind, one of the most overused words I hear in tech is the word holistic. So let’s jump on that. When we talk about a holistic approach to performance testing, is it really holistic, or are we talking about, “Hey, what’s a great approach to performance testing?” How should we be doing performance testing?
Kristen Webb: I can take that to start. So, building on what Andy was saying earlier about his two fingers being the load test to start, I think what we mean by holistic – or just a sound approach to performance testing – is starting with the code, starting with the APIs, starting at one user and making sure, before you check in your code or your API or your micro-services: does this code work for one user?
And then as you go through your development process and you start to build the components of your application and you integrate those components, you start to hit those components with maybe 100 or so users. And you’re getting feedback along the way so that by the time you have an application integrated within your whole system that mimics preproduction, by then you’re able to hit the load at hundreds of thousands, potentially. But you already had feedback along the way, feedback loops to say, “What’s performing? What’s not? What do I have to fix along the way so that by the time you get to the really high load, you have fixed a lot of the issues?” And what you’re fixing at the end is, “What do I need to tweak within my system to make sure this application or API is gonna perform?” So that’s one part of holistic, or we need to think of a better word, maybe.
The word I tend to use is standardization. You need to cover all your bases along the pipeline, but you also need to make sure you’re covering all your protocols and all your use cases. So having a standardized approach for everything from API to monolith like we just talked about, but also, what’re all the protocols you’re covering, from HTTP to JavaScript to SAP and Citrix? So having an approach that works for all the protocols and across all those use cases, whether it’s from the protocol perspective or even the browser, ’cause those are two different types of performance tests. And the third thing I think about in terms of standardization is the skill set for the people who are going to be doing the performance testing.
So you wanna be able to not only have your center of excellence be able to performance-test, but as you move to devops, you want to be able to have a more distributed model for who can participate in the performance-testing process and create more of a self-service approach so that a lot of times – if they can access a browser, they can run a performance test. They can share results. And that encompasses having the approaches both for performance testers who want to build within CI pipelines with an as-code approach or a CLI or for the groups that want to use more of a no-code approach with a GUI tool. So those are the three things that I would say mean – what we mean when we think about holistic.
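To make the ramp-up idea Kristen describes concrete, here is a minimal sketch of stepping load from one user up to a hundred and checking a response-time threshold at each step. The target URL, ramp steps, request counts and 500 ms threshold are placeholders for illustration only; a real pipeline would normally drive this from a load-testing tool rather than a hand-rolled script.

```python
"""Minimal sketch of ramping a load test from one user up to a hundred.

The target URL, ramp steps, request counts and 500 ms threshold are
placeholders for illustration only.
"""
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

TARGET = "https://example.com/"      # placeholder endpoint
RAMP_STEPS = [1, 10, 50, 100]        # virtual users per step
REQUESTS_PER_USER = 5
THRESHOLD_MS = 500                   # illustrative per-request budget


def one_request() -> float:
    """Issue a single GET and return elapsed time in milliseconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(TARGET, timeout=10) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000


def run_step(users: int) -> None:
    """Hit the target with `users` concurrent workers and report the p95."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(one_request)
                   for _ in range(users * REQUESTS_PER_USER)]
        timings = [f.result() for f in futures]
    p95 = statistics.quantiles(timings, n=20)[-1]    # 95th percentile
    verdict = "OK" if p95 <= THRESHOLD_MS else "SLOW"
    print(f"{users:>4} users: p95={p95:.0f} ms [{verdict}]")


if __name__ == "__main__":
    for step in RAMP_STEPS:
        run_step(step)
```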
Alan Shimel: Love it. Thoughts on that, team, panel? Agree, disagree? You wanna add?
Andy Grabner: I wanna add something. Again, I’m not a native speaker, so maybe the word holistic means also something different to me. But I wanna add one thing and then add a second thing. The first thing, what you said, Kristen, is I think in the beginning you said developers need to understand actually how their code is doing with 1 user and 5 and 10 and 20 users. I think what we’re talking about here is people understanding where the breaking points are and where the scalability issues are, because I think if you are building new code, however small or big it is, if you put it under different sets of load – I mean, you don’t have to go to the millions of users, but if you look at it at 10, 100 and 1,000, you immediately know where your connectivity issues between your components are, where things may break apart and where they don’t scale, because obviously we can throw a lot of hardware at a problem and somehow get performance, but at a certain cost. So I think it’s very important, as you said, from the beginning on to make developers understand where they may have made some bad architectural decisions that will later on cost them a lot of money in order to scale this. I think that’s one thing.
But for me, holistic also means not only looking at it in development and starting early; for me, holistic means I first need to understand what my real performance goals are. If I’m developing a new feature, a new app, I should sit down with whoever comes up with the idea and try to get their thoughts on: what do you expect from my software? How should it run under a certain load? It’s like our nonfunctional requirements in production. And then really take this and break it down into, what does this mean for me as a developer? To give an example, if we’re building a new mobile app, and let’s say there’s a feature to log in, how long do we expect the login to take? Let’s say a second. Okay, if it is a second under 10,000 users, what does this mean for me as the developer that implements the two or three APIs in the backend?
So I think the term performance budget is something that I also hear and have heard over the years. So if we have performance goals that we agree on with the business, then we can break them down into more technical goals for individual teams. Then we, I think, take a holistic approach, because then everybody can really work on their component and always make sure that their component is working and operating within the context of the performance budget that they have, so that there’s a higher chance that in the end, when all the components are put together in production, they’re actually meeting our business and our performance requirements in production. So I think these are kind of the two things, right, that I wanted to add.
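As a rough illustration of the performance-budget idea Andy describes, the sketch below splits a one-second login goal into per-component budgets and checks measured numbers against each slice. The component names and all numbers are invented for the example, not taken from any real system.

```python
# Sketch of splitting an end-to-end goal into per-component budgets.
# The component names and all numbers are illustrative, not from the episode.

END_TO_END_BUDGET_MS = 1000   # "login should take about a second"

# Hypothetical split of the login flow across backend calls.
budgets_ms = {
    "auth-api": 300,
    "profile-api": 250,
    "session-api": 200,
    "network-and-render": 250,
}
assert sum(budgets_ms.values()) <= END_TO_END_BUDGET_MS

# Each team compares its measured p95 against its slice of the budget.
measured_p95_ms = {"auth-api": 280, "profile-api": 310, "session-api": 150}

for component, budget in budgets_ms.items():
    measured = measured_p95_ms.get(component)
    if measured is None:
        print(f"{component}: no measurement yet")
        continue
    verdict = "within budget" if measured <= budget else "OVER budget"
    print(f"{component}: p95 {measured} ms vs budget {budget} ms -> {verdict}")
```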
Donald Lutz: Yeah, I’d like to add one thing. The thing I think about a lot – ’cause I’ve been building a lotta micro-services over the past ten years – is synchronous versus asynchronous communication and how that plays out, because if you have too many APIs, you have to use things like circuit breakers. There’re a lot of things that can affect that. So if you start looking at things like Kafka: where do I split my workload so it’s asynchronous? How does that work? What patterns do we put in there? That makes a big difference, because most developers are always thinking synchronous, and I find that to be a real difficult issue, because, you know what? We could put this in the background. When you look at Amazon, just ’cause you place an order doesn’t mean that it actually went in and got billed against your card immediately. That whole synchronous-versus-asynchronous concept has to be embedded in the concept of performance.
Andy Grabner: Yeah, true. Yeah, it’s also a nice decoupling of components, right? I mean, that’s the event-driven model, as you call it, right? You put a queue in the middle, but it’s basically an event-driven architecture, and it’s a very important piece. And the event-driven architecture also allows you to better test and also performance-test components in isolation and then also come up with better capacity-planning models, right, because you know how many events fall out of the first component when you put a certain load on it, which tells you how we need to size the queues, or wherever it ends up, and then what we need on the other end to work off the queues. I mean, that’s a very good point.
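Here is a small sketch of the decoupling pattern Donald and Andy describe: the front end publishes an event and returns immediately, while a worker drains the queue at its own pace. The in-process queue stands in for Kafka or another broker, and the order/billing names are made up for illustration.

```python
"""Tiny sketch of the decoupling idea: accept the order now, bill from a
queue later. The in-process queue stands in for Kafka or another broker;
the order and billing names are invented for illustration."""
import queue
import threading
import time

order_events = queue.Queue()


def place_order(order_id: str) -> str:
    """Synchronous front end: publish an event and return immediately."""
    order_events.put({"order_id": order_id, "ts": time.time()})
    return f"order {order_id} accepted"


def billing_worker() -> None:
    """Asynchronous consumer: drains the queue at its own pace."""
    while True:
        event = order_events.get()
        if event is None:            # shutdown sentinel
            break
        time.sleep(0.1)              # pretend to charge the card
        print(f"billed {event['order_id']}")
        order_events.task_done()


worker = threading.Thread(target=billing_worker, daemon=True)
worker.start()

for i in range(3):
    print(place_order(f"A-{i}"))     # the caller never waits on billing

order_events.join()                  # in this demo, wait for billing to catch up
order_events.put(None)               # stop the worker
```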
Mitch Ashley: Maybe that’s why holistic is being overused, Alan. The topic has not only evolved. I think it’s grown, because I remember one of my first experiences working with Donald was on ISS and TP Gateway server, a third-party product, and he was new to me, and he knew how to tune it. We called it tuning then, right, but the load, the capacity, the scalability, and we’ve added onto that customer experience or user experience goals, metrics. How long does it take to log in, that kind of thing, with the experience to get back a response on a mobile device versus on a Web browser, whatever that might be? What’s our API performance goals to partners and third-party services, things like that? So I think that’s made the world a little bit more complex, maybe a lot more complex to deal with. But it also has brought us closer to what the end result is we’re trying to accomplish with whatever system or software is. Deliver this service or make this function happen for whatever it might be. Kinda help us also think across the system horizontally and vertically of what we’re trying to deliver.
Alan Shimel: Yep. I think there’s another element, though, that I’m interested in all your thoughts on. When I think back over my career at how performance testing has changed, what’ve been the big factors? Well, I think as Andy referenced, early on, man, it was hardware-dependent, right, because, I mean, if you wanted to scale performance testing, you had to be able to test it on the hardware it was running on. And that got expensive quickly, and not only expensive for the actual cost of the equipment but from a people perspective. Right? And then the virtualization of performance testing, wow, what a godsend. We can just have Andy here working with, as he said, his left and his right index finger, and that was enough to break his – the app he was testing or the software he was testing at the time.
And then I think with devops and dev testing and so forth, the idea of automating our performance testing, right, being able to shift it left into – further left into our development process, right, making – ’cause it used to be you did all your coding, and that took one month of your three-month cycle, and then two months of that three-month cycle were your testing cycles, right? And, man, by shifting left, I really cut that down a bunch. All of these profound changes, right, in performance testing – as we sit here today, can we say that there really is a best-practices approach to performance testing, holistic or not? And where’s the people part of it, right? There’s always people, process and technology. What’s the people part of that? Kristen, would you mind kicking it off, or –
Kristen Webb: So, I think there are some approaches that are best practices-ish, and one of the first steps is really understanding where your priorities are in your applications and where the risky areas are, because you can’t test everything. So you need to really prioritize what to test first and where the issues are most likely gonna come up. And, two, after you do that, it’s about establishing what your SLOs are, your service-level objectives, and understanding what you need to hit, with what level of virtual users hitting that, within what response times, and what you want that end user experience to be like. And building those SLOs within the pipeline so that when you’re running the test in a continuous manner, you’re able to say, “This didn’t hit the SLO when we ran it against XYZ load,” and you send that back, and you have actions based on that result – to fail the build, for example, or send a ticket to Jira – so that it’s integrated within your already-existing development process. That’s a good way to go about it.
And then also taking in production feedback is a good part of this, too, to make sure that you’re understanding what’s actually going on within production – are my tests mimicking the real-world experience that my users are having, and under what load conditions – and making sure you’re updating your Kubernetes environments to match that level of load. Do you need a new cluster based on the information that you’re getting from either APM tools like Dynatrace or other monitoring tools like Prometheus or Microsoft Monitoring? And bringing that observability data back into the test definitions and updating them continuously to make sure you’re keeping up as things change, because in reality things do change. I mean, we have seasonality, or we have new marketing campaigns, or maybe a business went digital and now they need to keep up with their spikes in load. So you wanna be able to stay ahead of that instead of reacting.
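A minimal sketch of the kind of SLO gate Kristen describes, as a script a pipeline could run after a load test: it reads a results file, compares it against objectives and exits nonzero to fail the build. The results format, the SLO numbers and the Jira step are assumptions for illustration, not a description of any particular product's integration.

```python
"""Sketch of an SLO gate a pipeline could run after a load test: read a
results file, compare against objectives, exit nonzero to fail the build.
The results format, SLO numbers and Jira step are assumptions for
illustration, not any specific product's integration."""
import json
import sys

SLOS = {"p95_ms": 800, "error_rate": 0.01}    # illustrative objectives


def evaluate(results_path: str) -> int:
    with open(results_path) as f:
        results = json.load(f)    # e.g. {"p95_ms": 950, "error_rate": 0.002}

    breaches = [name for name, limit in SLOS.items()
                if results.get(name, float("inf")) > limit]

    if breaches:
        print(f"SLO breach on {breaches}; failing the build")
        # a real pipeline might also open a Jira ticket here via its REST API
        return 1
    print("All SLOs met")
    return 0


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "load_test_results.json"
    sys.exit(evaluate(path))
```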
Alan Shimel: Agreed. Panel, thoughts?
Donald Lutz: The telemetry idea – really, the past two years I’ve been doing a lot of OpenTelemetry so we can know what’s happening at every point. We put it in L. We put it in Datadog. We put it all over the place, but also the whole Kubernetes cluster thing: how do you monitor it correctly? What’s the right thing? There’re a lotta tools that I’ve used, but the whole Kubernetes thing opens up a larger can of worms because of the way the control loop works in Kubernetes. So you have to kinda figure out, how do you wanna deploy them? What do the environments look like? What’s the appropriate load-testing there? That’s why I’ve gotten into a lot of chaos engineering, breaking things in production, seeing what happens. I just turned this service off. It doesn’t talk to Slack anymore. Where are all those messages going? Did we notice this not working? That whole idea I think is really valuable. Surprisingly, it makes developers very neurotic. They don’t like it a lot, because you’re basically saying – it’s sorta like doing improv, because we’re just gonna do it and turn stuff off, and they’re like, “Well, I think it works.” So it’s sort of a very strange response.
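To illustrate the "turn it off and see what happens" idea in the smallest possible form, the sketch below disables a notification dependency and checks that the core flow still completes and buffers the undelivered messages. The service names and fallback behavior are invented; real chaos experiments would inject failure at the infrastructure level rather than flip a flag in code.

```python
"""Smallest possible version of "turn it off and see what happens": the
notification dependency is down, and the check is that the core flow still
completes and buffers undelivered messages. Names and fallback behavior are
invented; real chaos experiments inject failure at the infrastructure level."""

SLACK_UP = False        # the experiment: the notification service is "off"

undelivered: list[str] = []    # where messages land when Slack is unreachable


def notify(message: str) -> bool:
    """Try to notify; degrade gracefully instead of crashing the caller."""
    if not SLACK_UP:
        undelivered.append(message)
        return False
    return True


def process_order(order_id: str) -> str:
    """Core business flow should not depend on the notifier being healthy."""
    notify(f"order {order_id} processed")
    return "ok"


results = [process_order(f"A-{i}") for i in range(5)]
assert all(r == "ok" for r in results), "core flow broke when Slack was off"
print(f"core flow survived; {len(undelivered)} notifications buffered for retry")
```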
Alan Shimel: It is. Andy, you’re at Dynatrace, right? This is kinda right up your alley there of what you guys do.
Andy Grabner: Exactly, yeah. And so, to add to what both of you already said, on a very mature scale, I think of people who just go always straight into production, maybe using canaries or feature flags, and just deploy their new code and then expose it to, let’s say, a small percentage of users. And when I say the very mature, this is what we would obviously love more people to do, but the reality is just a small number of organizations can really do it. But it’s the ideal scenario. You deploy. Then you use your telemetry to figure out: does that code change actually behave better or worse than what’s maybe currently out there?
What I wanna add, I think, also to what Kristen said: I think what we need to do is we need to get closer to the developers from the beginning. I know there are different terms, and I’m just throwing one more term out there, performance-driven development, because I think we’ve taught engineers – and I’m an engineer by trade as well, right? We’ve taught them over the last 10, 20 years to do test-driven development, where you write your unit tests and functional tests to cover functionality. I think we should also educate them to do performance-driven development, which means, as you write your new micro-service that exposes five APIs, you also write your API tests that you can run once or run continuously or with multiple virtual users to actually generate some load. And I think the tools have become much more developer-friendly. Kristen, again, pointing to you, I know you have a great product where you actually allow developers to define their stuff as code, right? And code could be something like JavaScript, whatever it is. But I think as the tools have matured and we are making it easier and more natural for developers to also write these types of tests, I think we will also get more adoption. I think that is very important, because we always talk about shifting left. It’s a great term, but shifting left in the end, if you think about it, if you take everything and shift everything left, we put all the burden on the developers. And I think this is not sustainable. Therefore, we need to make it as easy as possible for developers to do all the things that we ask them to do.
So that means we need to give them tools that are natural to them in their habitat, which means in their IDE, right? They need to be able to check in code, and as part of their commit process, they then will automatically not only get feedback on code coverage and functional coverage but also, “Hey, your code change now has slowed down your performance by 20 percent. Are you aware of this? Or you are consuming 50 percent more memory.” And, by the way – bringing in what you said also, Donald, about learning from production – then you can correlate that with your production data and say, “Hey, your service runs in production with a certain amount of instances, and if you are making this code change, it means you have 50 percent more memory consumption. Are you aware of the additional costs?” So there’s a lot of things I think that we need to do, but it starts with really enabling developers just to do more in the natural kind of environment that they’re in, yeah.
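As one possible shape for the per-commit feedback Andy describes, here is a sketch of a local check that times a handler, compares it against a stored baseline and flags regressions above a threshold. The handler, baseline file and 20 percent threshold are illustrative assumptions, not part of any specific tool.

```python
"""Sketch of a per-commit performance check a developer could run locally or
in a pre-merge hook: time a handler, compare against a stored baseline and
flag regressions above a threshold. The handler, baseline file and the 20
percent threshold are illustrative assumptions, not part of any specific tool."""
import json
import time
from pathlib import Path

BASELINE_FILE = Path("perf_baseline.json")
REGRESSION_THRESHOLD = 0.20      # fail if more than 20% slower than baseline


def handler_under_test() -> int:
    """Stand-in for the real API handler or service call."""
    return sum(i * i for i in range(50_000))


def measure(iterations: int = 50) -> float:
    """Return the average time per call in milliseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        handler_under_test()
    return (time.perf_counter() - start) / iterations * 1000


if __name__ == "__main__":
    current_ms = measure()
    if BASELINE_FILE.exists():
        baseline_ms = json.loads(BASELINE_FILE.read_text())["ms"]
        change = (current_ms - baseline_ms) / baseline_ms
        print(f"baseline {baseline_ms:.2f} ms, current {current_ms:.2f} ms ({change:+.0%})")
        if change > REGRESSION_THRESHOLD:
            raise SystemExit("performance regression above threshold; review before merging")
    else:
        BASELINE_FILE.write_text(json.dumps({"ms": current_ms}))
        print(f"baseline recorded: {current_ms:.2f} ms")
```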
Kristen Webb: I’m gonna build on what you’re saying, Andy, because I wanna make sure I’m clear about what I mean about shifting left, and I do think it is bringing it to the developer, but as a service. So I think of it as building out a self-service rather than putting the onus on the developer to build the performance testing within the pipeline. So if it’s automated in the pipeline that they’re operating within, then it’s really up to the performance engineer and the automation engineer to build out that testing platform for them so that really they’re just kind of functioning, like you said, within their IDEs and within their – whatever they’re using to check in.
And, really, they’re just getting the feedback once they do, so that they know how to act in the moment about how their code’s performing for that one user to start. And then later, as they get further on, they get more information about an increase in load to a few hundred, maybe, once they have, like you said, a micro-service that might expose more APIs. And once you’re distributing your code like that, how is it going to impact other code, and how are other code and endpoints going to impact the new code that you introduce? So it’s really about delivering that, and that’s what I was talking about earlier in terms of creating a self-service approach so that more people can participate in the process, but the performance engineers are still the masterminds behind building that.
Mitch Ashley: I have a question for the panel. I’m curious. How do you think about this aspect of it? It may be for me a better word than holistic is systemic, and I’m thinking about how all parts fit together to contribute to the performance you’re delivering, as well as other things. It’s not just testing our app and our app code. It’s testing the whole stack, vertically and variations of that, different environments that it might run in. And a lot of it’s not our code. Donald was talking about Kubernetes, just as one great example. Is that part of what we mean by holistic as well?
Donald Lutz: I think so. To me, one of the things I think about with the holistic problem is that you get all these CI/CD pipelines built, but you don’t have a GitOps process that keeps the platform aligned. You’re working on that, but fundamentally you don’t have the cluster under control. You’re not doing infrastructure as code, where the cluster is effectively in the state you need. Therefore, when developers deploy code through their pipeline, it runs all the tests, but fundamentally part of the cluster didn’t get updated. So I’ve been thinking a lot lately in my current position, how do we get GitOps deployed across our whole organization, in all elements of our groups, so we’re using it effectively with everything else we do, ’cause that is the one piece that is happening. Argo CD’s a great example. It works really well, but just getting that to happen and realizing you’re building a platform is also a thing that people need to think about.
Alan Shimel: Agreed, agreed. You know what? I just wanted to throw another log on the fire. So, cloud. Game-changer? Makes no difference? Easier? Harder? Donald, you’re kinda the voice of the developer here, as well as IT. What do you think? And cloud, you work with a public cloud provider. What do you –
Donald Lutz: Yeah, I mean, public cloud is what we focused on fundamentally. That’s what we’re doing for all our customers, and cloud is simpler but more complex at the same time, ’cause fundamentally companies are used to having their various people within their data center. And when you move to the cloud, it’s upsetting. They have to figure out how to move all the workloads, but it’s also – one of the things that cloud gives, that I’ve been thinking a lot about, is if you can work with all three public clouds, you start increasing your reliability in everything. But how do you contain the costs so I can run this in GCP, this in Azure, this in AWS? Companies wanna go there, but they’re sort of always going, “How do I do that? Why do we do that? What do all these terms mean?” And how do you sorta get the various executives to understand, ’cause we’re trying to sell them on sorta the advisory services: this is why you should do this. This is how you get your devops working. This is all these things. This is what you should do with OpenShift. It’s a really complicated soup, because they feel like it’s sort of like what happens to a lot of industry. They’ve outsourced it to all these people, and what’re all these people doing? Are they really making a Pinto, or are they making a fancy car? What are they doing?
Kristen Webb: I mean, I’m curious, Donald, how you guys do performance testing for the public cloud. But I will say that cloud is a game-changer in a lotta ways, because it’s not eliminating risk, right? It’s just moving it from one place to another. It’s shifting it from on-prem to the cloud and also introducing new risks – new considerations that didn’t have to be made before. And with auto-scaling systems, sometimes people assume that moving to cloud means that implicitly they don’t have to worry about performance anymore because of auto-scaling.
And the concern there is that when you’re in the cloud, you’re really bumping up against a lot more services. And you can also run into noisy neighbors and a lot of different scenarios that will introduce bottlenecks from distributed systems. I mean, you could have your SaaS application located in France, your hardware located in China and your middleware in a third location. And you need to make sure that a hybrid system like that is gonna perform, and it can’t be assumed that it will, just because it’s the cloud. And you really need to make sure, whether you’re migrating apps or re-platforming ’em, that you have a baseline of performance tests that you know you’re measuring against, to make sure that you don’t regress when you move to the cloud and that you’re able to test those again once you do move there.
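A tiny sketch of the baseline idea Kristen mentions: capture per-transaction percentiles before a migration, rerun the same test afterwards and flag anything that drifts beyond a tolerance. The transaction names, numbers and 10 percent tolerance are made up for illustration.

```python
"""Sketch of comparing a pre-migration baseline against the same test run in
the cloud, so a migration or re-platforming doesn't regress silently. The
transaction names, numbers and 10 percent tolerance are made up."""

TOLERANCE = 0.10   # allowed drift before flagging a regression

baseline_p95_ms = {"login": 420, "search": 610, "checkout": 890}   # on-prem run
cloud_p95_ms = {"login": 445, "search": 830, "checkout": 870}      # post-migration run

for txn, before in baseline_p95_ms.items():
    after = cloud_p95_ms.get(txn)
    if after is None:
        print(f"{txn}: no post-migration measurement")
        continue
    drift = (after - before) / before
    flag = "REGRESSION" if drift > TOLERANCE else "ok"
    print(f"{txn}: {before} ms -> {after} ms ({drift:+.0%}) [{flag}]")
```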
Donald Lutz: And I have another thought I’d like to add to that, ’cause I’ve sorta done it. You run all your services in – pick it: AWS, Azure or GCP. And then there’s the other thing where you say, “Well, I wanna run all my services in Kubernetes in all three public clouds. I don’t really wanna consume the services directly from the provider. There’s some new, cool APIs that’re out there, where I just wanna consume it in Kubernetes.” So you’ve changed the idea that your hybrid is really Kubernetes, wherever it runs. So then you’re not really consuming it in the public cloud; you’re consuming the services through the cluster. So that also changes the whole discussion.
Andy Grabner: Maybe what I wanna quickly add – my first response to your question, Alan, was it doesn’t matter, because it – or it shouldn’t matter, the same with cloud native. When we talk about cloud native, cloud native doesn’t mean your cloud-native apps have to run in the cloud. Cloud native really means how you architect your apps, and they can essentially run anywhere, right? So it’s not where. Cloud native doesn’t define where it runs but how the system runs. And this is why I think, also to Kristen’s point, you don’t get, let’s say, performance for free, or you cannot sign up for performance as a service or for resiliency as a service just by checking a box on your favorite cloud provider, and then all of a sudden you have magic happening. You still need to factor all of these things into the architectural decisions that you make, and you have to do the proper testing in the end. I also think that the cloud was great, because all of a sudden we got a lot of services that we could build upon and really focus on the business value that we could generate, because we didn’t have to invest time in, I don’t know, maintaining a database and certain stuff like this, because somebody took care of it.
So I think overall, the cloud was a great thing. From a testing perspective, to add one more thing – and I think also, Kristen, you said it again – you have to make smart decisions about where your software runs, because you always need to have in mind where your end users are, where your consuming services are, because that also means you need to test from those locations, because you may now have different latency. You may have different costs out of a certain data center, out of a certain provider. I think these are some of the additional considerations, but in general, for performance testing and scalability testing, in theory it shouldn’t matter whether my application that I’ve architected well runs in one cloud, in three clouds or on premises. That’s kind of my answer. It shouldn’t.
Mitch Ashley: Alan, too, I think one of the, I think, real game-changers about the cloud is just access to resources. And in the organization, when you do it internally, the two greatest bottlenecks are acquisition of hardware and firewall rules, to be straight about it. And you can kinda control your own environment, whether it be resources or security, whatever it might be. But that flexibility gives you more – not just more options. You can rethink how you can automate how you set up an environment, “Let’s configure it this way,” your permutations, or, “We’re thinking about changing it in some specific way,” and you can automate how you test that. So it seems that it gives you a lot more flexibility. Sometimes flexibility means complexity, too, but that to me was one of the biggest game-changers of moving to the cloud.
Kristen Webb: Yeah, I mean, with hardware being the most expensive part of performance testing and – like you said, it’s difficult and time-consuming to go through IT to procure resources, to get them provisioned, to make sure you can access them at the right time and there are no conflicts. So what you’re talking about, really, is more the app development, app testing within a cloud CI system, which is a whole other level, in addition to application deployment in the cloud. And that’s really exploding, too, and it does minimize – with NeoLoad SaaS, you can get up and start it and run a performance test within five minutes now, because it’s a fully managed service using AWS and Google, and we do that for you. So it is a game-changer in that way, too.
But also, this is where I do disagree with Andy a little bit. Wherever your app is, it has to perform. It has to run well, but the cloud does introduce more risk, because – well, first of all, it’s shifting the risk from on-prem to cloud, but it’s also – you’re going to hit a lot more services now, particularly where you have a new architecture, where it’s a modernized architecture, and it could be a – what are they calling it, a microlift now? So more services are gonna be affecting you in that architecture. It could be really unrelated, and when you have that kind of a distributed system, you might not know where the bottleneck is coming from.
Alan Shimel: I mean, I think the cloud does introduce more complexity, if you will, but there’s – I mean, look. The advantages of cloud I think far outweigh the disadvantages of cloud, right? I think that there’s chips on both sides of that scale. But you know what? I mentioned in the beginning that in addition to this prerecorded Devops Unbound episode, on February 24th we do have the live roundtable for our audience. So I’d like to invite everyone that’s watching this: if you wanna learn more about what to consider in building a cloud performance-testing strategy, we do have this live roundtable. We have folks from Dynatrace and Tricentis and more joining us on February 24th. The roundtable, funny enough, is called “Strategies for Performance Testing in the Cloud,” and if you go to devops.com under the webinar section, you can sign up for this webinar right there. We’re calling it a webinar. It’s a live roundtable. You will have a chance to have your questions answered, to speak to the panel members directly via chat, and it will help you with your strategy. So don’t miss that, February 24th, just a few weeks from now.
We’re running low on time. I’d like to throw one other kinda question out at the three of you for our audience’s benefit, right? Right, okay, you convinced me. I need a holistic approach to performance testing. Other than listening to the three or the five of us on here today, can you point me to some good places, if I’m in the audience, where I can go find out more about this, where I can get smart quick? Donald, we let Kristen lead off everything, so if it’s okay, I’m gonna ask you to go first on this one.
Donald Lutz: All right. Well, I mean, there’s a couple of books I like that aren’t necessarily about performance testing, but one of them – I always forget the name. It’s over there: Building Microservices by Sam Newman. It’s a really good book; a second edition came out. There’s another one that came out by the guys at – oh, God, I can’t think of where they were, but to me that’s a great reference, because it covers everything from culture to how you build your micro-services, how to performance-test it. I like a lotta written things. There’re various videos that’re out there. There’s a couple from Martin Fowler that talk about this in particular. So those’re kinda the things off the top of my head.
Alan Shimel: So I think Sam Newman, like Martin, was with ThoughtWorks, if I’m not mistaken.
Donald Lutz: Yeah, he’s independent now. He went off on his own, and now he consults.
Alan Shimel: Yeah, yeah, they did leave. I remember. Yep. Andy, how ’bout you?
Andy Grabner: Yeah. So I would say, and this is a little commercial on what we do on the side of Dynatrace, we have a holistic approach to SLOs, to service-level objectives, with our CNCF project Keptn, spelled K-E-P-T-N. It’s a CNCF project, and what we do there, we are providing event-driven orchestration to actually embed performance engineering as part of your delivery process but also as part of your operational processes. And there’s a great integration also with Neotys, with NeoLoad, so that means we actually provide a framework on top of Kubernetes to enable developers to simply check in their code, and then Keptn orchestrates the deployment and the testing – like triggering a NeoLoad test – and then automatically evaluates your SLOs and then pushes it forward all the way into production, but always evaluating SLOs, so always evaluating your performance criteria, your resiliency criteria, your business criteria. So if there’s one thing, check out Keptn, our CNCF project, because I think it will definitely contribute towards having a more holistic approach to performance engineering.
Alan Shimel: Thanks, Andy. Kristen, I saved you for last.
Kristen Webb: Thanks. So a couple things. One is we have a performance advisory council that is run through Tricentis, but really it’s not about product as much as it’s about approaches, and it organizes all the performance experts from around the world, including – Andy, I think you participated in it a lot. And Henrik, who is gonna be joining me on the webinar later in the month, used to run those performance advisory councils, and he’s at Dynatrace now. So he’s a great resource as well.
But we have a site for the performance advisory council, where you can see all the recordings of experts around the world telling their stories about what they found worked for them, what their learnings were and how they integrated with various development processes. In addition to that, on Tricentis.com/neoload there is a ton of information about the problems that NeoLoad is addressing and how we’re addressing ’em, the integrations we have across traditional tech stacks and the devops tech stack, and how we work with Dynatrace to help shift right and bring in that production feedback, along with all the monitoring tools for the observability we have. And we also have a trial that’s available to learn more.
Alan Shimel: I love it. Hey, Kristen, thanks very much. Again, just to reemphasize, Kristen mentioned it, February 24th, “Strategies for Performance Testing in the Cloud” with some really great, smart people, including Kristen. Check it out on devops.com. You can register, and you do need to register to attend that live roundtable, though. Hey, Mitchell, I’m gonna let you take it home.
Mitch Ashley: Well, thanks, Alan. I was just reflecting and really absorbing what our panelists had to say today, and the thing I take away is I’m impressed with the sophistication of how we think about testing – and the sophistication of the people who work in testing now. Long gone are the days when testers were people who didn’t like to code, right, who were developers that didn’t like being developers. We’ve got some extremely brilliant thought leadership happening in testing, whether it be in the products that’re being created or the people that’re trying to solve both complex problems and sometimes fun ones, too. So it was kind of a renaissance-for-testing moment for me. It’s a great time to be a tester. It’s a great time to be a QA person and a software engineer working with test. So it’s a fantastic topic, and I hope people will check out the resources that everybody suggested.
Alan Shimel: You’re right. It is. Is it an old Irish proverb, “May you live in interesting times,” or something like that? It’s great times to be a tester. Anyway, Donald, Andy, Kristen, thank you so much for being on our panel today. Mitchell, as always, thank you. To our audience, thank you. We hope you found this interesting, and if you did, again, I’ll beat the drum one more time: February 24th, “Strategies for Performance Testing in the Cloud.” Sign up for it. Don’t miss it. I think it’s gonna be a great one. But until then, this is Alan Shimel for Devops Unbound. Many thanks to our sponsor, Tricentis. Thanks to all of you for watching in, and we’ll see you soon on another Devops Unbound.