For many coders, the idea of distributed systems development might seem like a dark art. They think of high-performance computing wizards crafting mysterious spells with job schedulers to work magic with hundreds of connected nodes in massively-parallel systems. Or large, distributed content delivery networks that balance resources to deliver content across the world no matter how high the demand. The folks doing those jobs have often worked in specialist distributed computing environments for years. But the complex problems they face at a massive scale aren’t so different from those that you deal with every day. In short, you’re a distributed developer too. It’s just that no one bothered to tell you.
Everything is Distributed
The world is a lot more distributed today than it used to be. Back in the day, code would talk directly to one mainframe or minicomputer, generally on the same LAN. Your PDP-11 and your batch order processing software had an intimate, exclusive kind of relationship.
Computing environments began to change with the advent of open computing in the late eighties. It ushered in the era of multiple computers working together over a network. The concept continued to evolve beyond PCs. Now phones, servers and smart kettles all expect an easy way to talk to each other.
Today, distributed operation is so ubiquitous that people barely see it happening. When you type anything into a browser, it fires off dozens of queries to different backend systems owned by different organizations. These systems generally run on different infrastructure.
Even a small application that makes a database query is a distributed system. The application often talks to a system—or even multiple systems—halfway around the world.
Many apps these days are compositions of multiple API calls to different online services. When a developer gathers weather data from a service provider and feeds it into a cloud-based online geographical information system that decides how much to irrigate crops in Iowa tomorrow, what is that if not a distributed query?
Software is becoming more distributed over time. As software has grown more complex, developers have increasingly wanted to carve it up into smaller pieces. This began in earnest with the software component movement around CORBA in the mid-to-late nineties, and moved on to XML web services in the early noughts and then to RESTful APIs. All of these involved some form of remote call to increasingly granular services.
In the last decade, this atomization of monolithic software has become the norm rather than the exception. Container-based architectures ushered in the use of microservices for better development agility and maintenance. Most developers who started their careers in the last five years or so will have been forced to think about how these services communicate from the outset. Distributed design patterns are a fact of everyday life.
And it’s Still Difficult
While your distributed applications might not be as complex as those that HPC and hyperscale teams run, they still come with challenges that grow as systems become more granular. What happens if an overloaded remote API takes longer than expected to deliver a result, so that responses from different systems arrive in a different sequence than you expected? How should you react if a network or system outage or an expired access key prevents an API from responding at all?
Problems like these arise unexpectedly, meaning that you have to get more creative with code from the outset. Good developers do more than master object-relational mapping and send queries. They program defensively to ensure that their programs retry steps when a failure somewhere else delays that job or stops it altogether.
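To make that defensive retry idea concrete, here is a minimal sketch of retrying with exponential backoff and jitter. It is an illustration under stated assumptions rather than a production recipe: `TransientError`, `call_with_retry` and `make_flaky_operation` are hypothetical names invented for this example, and the flaky operation merely stands in for a remote API that is overloaded for a while.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying (timeouts, overload)."""


def call_with_retry(operation, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Double the wait on each attempt and add random jitter so that
            # many clients do not all retry at the same instant.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))


def make_flaky_operation(failures_before_success):
    """Build a stand-in for a remote API that fails a set number of times."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientError("service overloaded")
        return "ok after %d calls" % state["calls"]

    return operation
```

Calling `call_with_retry(make_flaky_operation(2))` fails twice, waits a little longer each time, and succeeds on the third attempt. Only errors you know to be transient should be retried. An expired access key, for instance, will not fix itself no matter how long you wait.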
There are different design patterns to handle most common distributed programming scenarios today, even though you might not think of them as such. For example, developers have been building load-balancing code into their applications for decades, distributing requests between different servers to maintain performance.
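As a rough illustration of that load-balancing idea, the sketch below rotates requests across a list of servers and skips any that have been marked as down. `RoundRobinBalancer` and the server names are hypothetical, and round robin is only one of several possible strategies. A real balancer would also need health checks and thread safety.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in rotation, skipping ones marked as unhealthy."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._cycle = itertools.cycle(self._servers)
        self._unhealthy = set()

    def mark_down(self, server):
        self._unhealthy.add(server)

    def mark_up(self, server):
        self._unhealthy.discard(server)

    def next_server(self):
        # Look at each server at most once per call, so that we fail fast
        # instead of looping forever when every server is down.
        for _ in range(len(self._servers)):
            candidate = next(self._cycle)
            if candidate not in self._unhealthy:
                return candidate
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    print([balancer.next_server() for _ in range(4)])
```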
Anyone who has tried to maintain transaction integrity when writing to multiple databases will also have dabbled in two-phase commit design, which ensures that all of the databases get updated (or that none of them do).
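The all-or-nothing behavior of two-phase commit can be sketched in a few lines. This is a deliberately simplified, in-memory version: `Participant` and `two_phase_commit` are hypothetical names, and each participant stands in for a database. Real implementations also need durable logs and a plan for what happens if the coordinator itself fails between the two phases.

```python
class Participant:
    """In-memory stand-in for one database taking part in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates a database that votes "no"
        self.committed = {}
        self._staged = None

    def prepare(self, key, value):
        """Phase 1: stage the write and vote yes (True) or no (False)."""
        if not self.can_commit:
            return False
        self._staged = (key, value)
        return True

    def commit(self):
        """Phase 2: make the staged write permanent."""
        key, value = self._staged
        self.committed[key] = value
        self._staged = None

    def rollback(self):
        """Discard the staged write."""
        self._staged = None


def two_phase_commit(participants, key, value):
    """Return True if every participant committed, False if all rolled back."""
    prepared = []
    for participant in participants:
        if participant.prepare(key, value):
            prepared.append(participant)
        else:
            # One "no" vote aborts the whole transaction.
            for other in prepared:
                other.rollback()
            return False
    for participant in participants:
        participant.commit()
    return True
```

The coordinator only tells anyone to commit after every participant has voted yes, which is what keeps the databases from ending up half updated.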
The more experienced you get, the more complex and esoteric these patterns become. At some point, many developers move beyond basic CRUD models in their applications to handle more complexity when reading and writing data. For example, they might read data from a website in one format but collapse multiple records into one when writing data to the website’s back-end store. That means writing code that creates different types of read and write queries. That pattern, called command query responsibility segregation (CQRS), is also a form of distributed programming.
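A small sketch may help show what separating reads from writes looks like in practice. Everything here is hypothetical and in-memory: `AddItem` is a command, `OrderStore` is the write model that collapses repeated line items into one record per SKU, and `OrderSummaryView` is a read model shaped for display. In a real system the two models would often live in different stores and be kept in sync asynchronously.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AddItem:
    """Command: a request to change state."""
    order_id: str
    sku: str
    quantity: int


@dataclass
class OrderStore:
    """Write model: one record per order, with quantities merged by SKU."""
    orders: dict = field(default_factory=dict)


@dataclass
class OrderSummaryView:
    """Read model: a flat view holding the total item count per order."""
    totals: dict = field(default_factory=dict)


class CommandHandler:
    """Handles writes and keeps the read model up to date."""

    def __init__(self, store, view):
        self._store = store
        self._view = view

    def handle(self, command):
        items = self._store.orders.setdefault(command.order_id, {})
        # Collapse repeated line items for the same SKU into one record.
        items[command.sku] = items.get(command.sku, 0) + command.quantity
        # Project the change into the read model.
        self._view.totals[command.order_id] = sum(items.values())


class QueryHandler:
    """Handles reads and never touches the write model."""

    def __init__(self, view):
        self._view = view

    def total_items(self, order_id):
        return self._view.totals.get(order_id, 0)
```

The design choice to notice is that `QueryHandler` has no way to change state and `CommandHandler` returns nothing to display, so each side can be shaped and scaled for its own job.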
Then, there’s the Saga pattern, which people use to maintain application state across long-running processes comprising multiple transactions. This pattern relies on an event bus along with messaging and queuing code to handle services with unpredictable delays in response times. It ensures that the overall process transaction finishes in an acceptable state or rolls back.
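The core of a saga is the pairing of each step with a compensating action that undoes it. The sketch below shows only that part, and it runs everything in one process for brevity. A real saga would drive these steps through an event bus or message queue, as described above. `run_saga`, `SagaFailed`, `demo` and the step names are all hypothetical.

```python
class SagaFailed(Exception):
    """Raised after a step fails and earlier steps have been compensated."""


def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If any action raises, run the compensations for the steps that already
    completed, newest first, and then raise SagaFailed.
    """
    completed = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception as exc:
            for _, undo in reversed(completed):
                undo()
            raise SagaFailed("step %r failed; earlier steps undone" % name) from exc
        completed.append((name, compensation))


def demo():
    """Simulate an order where shipping fails after payment succeeded."""
    log = []

    def fail_shipping():
        raise RuntimeError("carrier unavailable")

    steps = [
        ("reserve_stock", lambda: log.append("reserve_stock"),
         lambda: log.append("release_stock")),
        ("charge_card", lambda: log.append("charge_card"),
         lambda: log.append("refund_card")),
        ("ship_order", fail_shipping,
         lambda: log.append("cancel_shipment")),
    ]
    try:
        run_saga(steps)
    except SagaFailed:
        pass
    return log
```

In `demo()`, the card is refunded and the stock is released once shipping fails, while the failed step's own compensation never runs because that step never completed. Unlike two-phase commit, nothing is held locked while the process runs. Each step commits on its own and is undone later if needed.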
As you venture into more sophisticated patterns that hold transactions together while maintaining performance, you’ll find your code becoming more complex and, therefore, application development becoming more difficult. Such is the life of a distributed developer, even if you didn’t realize that you were one until now.
Gaining this knowledge and experience as a developer is important, but once you have learned these patterns the hard way, you don’t have to keep implementing them by hand. There are technology solutions that manage some of these complex distributed patterns for you. As the volume of plumbing code to support distributed design patterns increases, so does the need to offload it. In an industry increasingly built on abstraction, it’s time to turn distributed systems management into a separate service and unburden your code.
And if you’re looking for an example platform specifically designed to address this issue, check out Temporal, a completely open-source (MIT-licensed) microservice orchestration platform that enables developers to build scalable applications without sacrificing productivity or reliability.