Managing GraphQL at Scale

GraphQL, the query language invented by Facebook, has attracted a lot of interest as a means to conveniently access data programmatically. GraphQL structures internal services with a unified schema for data, empowering product engineers to easily grab what they want from many microservices to populate the user interfaces (UIs) they’re creating. By unifying company data into a single graph, GraphQL combines the benefits of SQL with the ubiquity of API access.

However, GraphQL does introduce some challenges when you’re trying to scale it across a large enterprise and partner ecosystem. For example, since GraphQL channels all requests through a single endpoint, security must carefully segment access to all the underlying infrastructure to avoid data overexposure. Matt DeBergalis, Apollo’s co-founder and CTO, also identified potential issues around changeability and performance when adopting a graph at scale.

As the use of GraphQL within large organizations grows, development teams will have to overcome these obstacles, especially if companies intend to productize public APIs and reuse the same single graph company-wide. I recently met with DeBergalis for an update on managing GraphQL at scale. According to DeBergalis, we’re still in the early days of the graph revolution, and maturing GraphQL will require a more rules-based, declarative architecture, efficiency improvements and the ability to iterate on graph schemas quickly.

State of GraphQL Adoption

So, what is driving the rise of GraphQL? Well, DeBergalis noted three key contributing factors:

  1. Many, many microservices. In recent years, there has been an explosion of microservices. These modular services can be challenging to wrangle and are often exposed inconsistently throughout a software ecosystem.
  2. The rise of omnichannel platforms. Product engineers must now port the same data across disparate devices, whether streaming in a browser or on a Peloton screen.
  3. Users expect a hefty, data-rich experience. Expectations are at an all-time high: an e-commerce shop, for example, is now expected to include product recommendations, auto-fill keyword search, pricing by geography and many more advanced features.

Juggling all these factors creates a “complexity bottleneck,” DeBergalis explained.

GraphQL responds to such complexity, helping unify the backend to enable product developers to fetch precisely what their client applications require. GraphQL adoption took off pretty quickly, particularly among product engineering teams, DeBergalis added. GraphQL plays nicely with familiar languages and frameworks like JavaScript, React and Relay. It’s also a bit different from previous trends, with demand being driven more by the database folks; “query people are really leading the charge,” he said.

With the right technology stack, developers can add a graph layer between the underlying core APIs and the UI they are building, creating a universal API with a self-documenting schema. “For the developer, the amazing thing about GraphQL is throwing away all this boiler code,” DeBergalis said. GraphQL acts as “a foundation for tomorrow” to help deliver future business requirements, he said.
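To make the idea of a graph layer more concrete, here is a minimal Python sketch. It is not Apollo's implementation and uses no GraphQL library. The two services, the field map and the product data are all hypothetical, invented for illustration. The sketch only shows the core idea: one entry point maps each field of a unified type to the service that owns it and returns exactly the fields the client asked for.

```python
from typing import Any, Callable, Dict, List

# Hypothetical backend services; in practice these would be network calls.
def catalog_service(product_id: str) -> Dict[str, Any]:
    return {"name": "Trail Shoe", "description": "Lightweight runner", "sku": "TS-1"}

def pricing_service(product_id: str) -> Dict[str, Any]:
    return {"price": 89.0, "currency": "USD"}

# Assumed mapping from each field of a unified "Product" type to its owning service.
FIELD_SOURCES: Dict[str, Callable[[str], Dict[str, Any]]] = {
    "name": catalog_service,
    "description": catalog_service,
    "price": pricing_service,
    "currency": pricing_service,
}

def resolve_product(product_id: str, requested_fields: List[str]) -> Dict[str, Any]:
    """Return only the requested fields, calling each backing service at most once."""
    fetched: Dict[Callable[[str], Dict[str, Any]], Dict[str, Any]] = {}
    result: Dict[str, Any] = {}
    for field in requested_fields:
        source = FIELD_SOURCES[field]
        if source not in fetched:
            fetched[source] = source(product_id)
        result[field] = fetched[source][field]
    return result

# A client that only needs a name and a price never sees the SKU or description.
product_card = resolve_product("p1", ["name", "price"])
```

A UI that later needs the currency simply adds it to the requested fields; no new endpoint is required, which is the boilerplate saving described above.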

With regard to architectural styles for APIs, 94% still use REST and 31% use GraphQL, according to the 2021 State of API Report from Postman. Though this percentage may seem small now, GraphQL adoption is steadily growing among large companies, such as PayPal, Netflix, Shopify, GitHub and Airbnb.

Three Issues When Scaling GraphQL

Using GraphQL is like using a mobile phone. You wouldn’t want to use a different phone every time you called a different number, would you? You would rather have all your contacts accessible from the same device. Similarly, GraphQL becomes more useful as more functionality is added to it.

“Due to the network effect, a graph becomes more interesting as we publish more into it,” said DeBergalis. “What we have found is that GraphQL is most interesting and valuable at scale.” So, what are some inevitable consequences of GraphQL at scale? DeBergalis outlines three main architectural issues.

Performance

Performance downsides are one disadvantage that developers often cite when considering GraphQL. GraphQL has no built-in caching features, and the query payload can introduce additional latency. Mitigating lower-level issues around throughput and reliability will likely require additional tooling from the community, DeBergalis said. This is partially why the Apollo team has begun to embrace Rust in new attempts to make infrastructure for GraphQL routing hyperefficient.
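Because caching is left to the surrounding tooling, teams often add it themselves. The following Python sketch shows one possible approach, assuming a hypothetical `execute(query, variables)` function that runs a query against the graph. It caches whole responses under a hash of the query text and variables. The whitespace normalization is deliberately naive, and there is no expiry or invalidation, so treat it as an illustration of the idea, not a production design.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Optional

def cache_key(query: str, variables: Optional[Dict[str, Any]] = None) -> str:
    """Build a stable key from the query text and its variables."""
    # Naive normalization: collapses whitespace everywhere, including inside
    # string literals. A real implementation would parse the query instead.
    normalized = " ".join(query.split())
    payload = json.dumps({"q": normalized, "v": variables or {}}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class QueryCache:
    """Wraps a hypothetical execute(query, variables) callable with a response cache."""

    def __init__(self, execute: Callable[[str, Dict[str, Any]], Any]) -> None:
        self._execute = execute
        self._store: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    def run(self, query: str, variables: Optional[Dict[str, Any]] = None) -> Any:
        key = cache_key(query, variables)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._execute(query, variables or {})
        self._store[key] = result
        return result

# Stand-in executor that records how often the backend is actually reached.
backend_calls = []

def fake_execute(query: str, variables: Dict[str, Any]) -> Dict[str, Any]:
    backend_calls.append(query)
    return {"data": {"product": {"name": "Trail Shoe"}}}

cache = QueryCache(fake_execute)
first = cache.run("{ product(id: 1) { name } }")
second = cache.run("{  product(id: 1)   { name } }")  # same query, different spacing
```

Here the second call is served from the cache, so the backend is reached only once.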

Changeability

A second problem enterprises may encounter when scaling GraphQL is embracing agility for such an all-encompassing fabric. “The companies that do GraphQL really well change it all the time—they treat it like a product,” said DeBergalis. This agility is at odds with the traditional REST API approach of building for longevity, which makes that approach extremely averse to breaking change.

The reality is that when a company embraces a single graph, it may be pushing hundreds of schema changes every day. Different internal teams may maintain their own slice of the graph, defining it differently. Plus, maintenance roles may shift from team to team over time, noted DeBergalis. “If you try to waterfall what your inventory model should look like, you won’t get anywhere,” he said.

Balancing change and stability will likely require a two-pronged approach to GraphQL. You want to continually iterate the graph with experimental features but also depend on certain functionality to preserve backward compatibility for client integrations. Satisfying both sides will likely require a change management strategy for iterating quickly, as well as enterprise-grade service level agreements (SLAs) to retain a firm bedrock on which to build more permanent data models.
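One way to support both fast iteration and backward compatibility is to check every proposed schema change automatically before it ships. The Python sketch below is a simplified illustration, not a description of any particular tool. It assumes a schema can be represented as a plain dictionary of type names to field names to field types, and it conservatively treats any removed or retyped field as breaking while treating additions as safe.

```python
from typing import Dict, List

# Assumed simplified representation: type name -> field name -> field type.
Schema = Dict[str, Dict[str, str]]

def diff_schema(old: Schema, new: Schema) -> Dict[str, List[str]]:
    """Classify the changes between two schema versions as breaking or safe."""
    breaking: List[str] = []
    safe: List[str] = []

    # Anything existing clients could already query must still be there, unchanged.
    for type_name, old_fields in old.items():
        new_fields = new.get(type_name)
        if new_fields is None:
            breaking.append(f"type removed: {type_name}")
            continue
        for field, field_type in old_fields.items():
            if field not in new_fields:
                breaking.append(f"field removed: {type_name}.{field}")
            elif new_fields[field] != field_type:
                breaking.append(
                    f"type changed: {type_name}.{field} {field_type} -> {new_fields[field]}"
                )

    # Additions do not affect existing queries.
    for type_name, new_fields in new.items():
        if type_name not in old:
            safe.append(f"type added: {type_name}")
            continue
        for field in new_fields:
            if field not in old[type_name]:
                safe.append(f"field added: {type_name}.{field}")

    return {"breaking": breaking, "safe": safe}

current = {"Product": {"name": "String", "price": "Float"}}
proposed = {
    "Product": {"name": "String", "price": "Money", "rating": "Float"},
    "Review": {"body": "String"},
}
report = diff_schema(current, proposed)
```

A team could let changes with an empty `breaking` list ship immediately and route the rest through a slower review, which is one possible shape for the two-pronged approach described above.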

DeBergalis also described how observability is important for obtaining a clear picture of graph environments. This is one area where GraphQL shines: since clients request precisely what they need, you are never left wondering which fields are fetched but never used in production, as could happen with REST API management. According to DeBergalis, tracking fine-grained GraphQL usage with KPIs can help direct how the data model should evolve over time.
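As a rough illustration of fine-grained usage tracking, the Python sketch below counts how often each field is requested and lists the fields that no client has asked for. The `Type.field` naming, the sample queries and the schema field list are assumptions made for this example; a real system would extract the selections from parsed queries.

```python
from collections import Counter
from typing import Dict, Iterable, List

def record_query(usage: Counter, selections: Dict[str, Iterable[str]]) -> None:
    """Count every Type.field pair that a client query selected."""
    for type_name, fields in selections.items():
        for field in fields:
            usage[f"{type_name}.{field}"] += 1

def never_requested(usage: Counter, schema_fields: Iterable[str]) -> List[str]:
    """Fields that exist in the schema but have not been requested by any client."""
    return [name for name in schema_fields if usage[name] == 0]

usage: Counter = Counter()
record_query(usage, {"Product": ["name", "price"]})
record_query(usage, {"Product": ["name"], "Review": ["body"]})

# Hypothetical list of every field the schema exposes.
schema_fields = ["Product.name", "Product.price", "Product.sku", "Review.body"]
candidates_for_removal = never_requested(usage, schema_fields)
```

Counts like these can feed the KPIs mentioned above: heavily used fields deserve stability guarantees, while fields that are never requested are candidates for deprecation.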

Authorization

Finally, perhaps the most significant concern surrounding GraphQL at scale is apportioning the appropriate access to users and devices. As a single unified endpoint can connect to a vast array of data behind the scenes, large GraphQL implementations will require a strategy for how that one unified graph can be implemented in parts.

Without the proper access control, GraphQL could improperly expose personally identifiable information (PII). For example, a FinTech company may have a rule that requires a customer service representative to only have visibility into a customer’s transaction history if the customer has filed a support ticket in the last few days. Or consider data custody—a business working in the EU must comply with regulations on where data can and can’t live.

Safeguarding GraphQL will require a declarative approach that integrates authorization control and context into GraphQL queries. To implement this, DeBergalis advocated for a command-and-control style. First, you build the schema around the data and problem statement. Next, you integrate rules field-by-field, marking sensitive touchpoints as PII.

With rules baked into the query, you can have a more solid understanding of the surrounding context. “Authorization is actually a lot better at graph level,” said DeBergalis. “It’s the layer that has semantic understanding.”
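The field-by-field approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the mechanism DeBergalis described in detail: the `RequestContext` shape, the `support_rep` role name, the rule table and the three-day window (standing in for "the last few days") are all hypothetical. Rules are declared once per sensitive field and evaluated against the request context whenever that field is selected.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, List, Optional

@dataclass
class RequestContext:
    role: str
    now: datetime
    last_ticket_at: Optional[datetime] = None  # when the customer last filed a ticket

Rule = Callable[[RequestContext], bool]

def support_rep_with_recent_ticket(days: int) -> Rule:
    """Allow support reps only if the customer filed a ticket within `days` days."""
    def check(ctx: RequestContext) -> bool:
        if ctx.role != "support_rep" or ctx.last_ticket_at is None:
            return False
        return ctx.now - ctx.last_ticket_at <= timedelta(days=days)
    return check

# Declarative rule table: sensitive (PII) fields are listed with their rule.
# The 3-day window is an assumed value for illustration.
FIELD_RULES: Dict[str, Rule] = {
    "Customer.transactions": support_rep_with_recent_ticket(days=3),
}

def authorize_fields(
    type_name: str, record: Dict[str, Any], requested: List[str], ctx: RequestContext
) -> Dict[str, Any]:
    """Return requested fields, replacing any the caller may not see with None."""
    visible: Dict[str, Any] = {}
    for field in requested:
        rule = FIELD_RULES.get(f"{type_name}.{field}")
        # Fields without a rule are allowed here; a stricter design could default to deny.
        allowed = rule is None or rule(ctx)
        visible[field] = record[field] if allowed else None
    return visible

customer = {"name": "Ada", "transactions": [12.5, 40.0]}
now = datetime(2024, 1, 10)
recent = RequestContext("support_rep", now, last_ticket_at=datetime(2024, 1, 9))
stale = RequestContext("support_rep", now, last_ticket_at=datetime(2024, 1, 1))

allowed_view = authorize_fields("Customer", customer, ["name", "transactions"], recent)
masked_view = authorize_fields("Customer", customer, ["name", "transactions"], stale)
```

Because the rule is attached to the field and not to an endpoint, it applies no matter which query or client reaches `Customer.transactions`.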

Future of GraphQL Scalability

“Once you have a graph, you want to use it for everything,” said DeBergalis. The heritage of GraphQL is JavaScript, the language of the product world, and the graph approach delivers an excellent experience for product engineers. The sheer usability and productivity benefits of GraphQL point to considerable potential for wider use in the years to come.

As we’ve discussed, scaling GraphQL across an enterprise will likely require performance improvements, a path for schema adaptability, policies to handle authorization for queries and greater accountability for business metrics. While GraphQL certainly has existing momentum among a subgroup of engineers, there are plenty of opportunities to introduce more training and educational curriculum to spread greater awareness.

Furthermore, for those coming from a traditional REST API mindset, GraphQL is a bit of a new paradigm that will require some massaging to expose graph endpoints as a productized API. Doing so will likely involve marking select parts of the larger graph to expose to partners. “Starting with an internal GraphQL use case works really well, but productization around GraphQL is coming,” said DeBergalis.

The value of a unified graph is so strong, and the bottom-up hunger for a graph environment so intense, that increased GraphQL adoption feels like a given. “At this stage, the focus is on expansion,” said DeBergalis. “How do we roll this out to more teams?”

“We know there’s going to be a graph in every company—not a lot of little ones,” said DeBergalis. “Every company’s going to need a graph.” When that day comes, the architectural challenge will be breaking apart the graph into pieces. “That’s the next big frontier.”

Bill Doerrfeld

Bill Doerrfeld is a tech journalist and analyst. His beat is cloud technologies, specifically the web API economy. He began researching APIs as an Associate Editor at ProgrammableWeb, and since 2015 has been the Editor at Nordic APIs, a high impact blog on API strategy for providers. He loves discovering new trends, researching new technology, and writing on topics like DevOps, REST design, GraphQL, SaaS marketing, IoT, AI, and more. He also gets out into the world to speak occasionally.
