How Supergraphs Help Consolidate Data Access

GraphQL, a query language for APIs, has become an increasingly common tool for new web-based development. GraphQL can help streamline data access by returning all the resources a developer requires in a single request. Compared to the traditional REST approach to web-based data access, GraphQL has a more standardized and traversable schema and arguably a better developer experience.

But GraphQL is not just great for improving developer usability for a single data model. If used to aggregate various data sources and APIs into a unified graph, GraphQL could become ‘one schema to rule them all’ for enterprises. Some refer to this architecture as a supergraph.

I recently caught up with Tanmai Gopal, CEO of Hasura, to explore the concept of a supergraph and see how it can help make data more accessible for enterprises. Below, we’ll consider the expansion of the supergraph concept and the benefits of deploying such a strategy.

Understanding the Need for Supergraphs

In large software architectures, product teams need access to various data sources from different domains. “There is a need for integrative experiences, which are becoming more and more important for the end user,” explained Gopal. For instance, an e-commerce application might be composed of product catalog data, login functionality, payment data, shipping information and other necessary information.

However, there are hurdles in the way of pulling in data from these multiple domains. Different domains are often built as microservices, supported by a dedicated team and using a bespoke developer portal and API schema to externalize the services, said Gopal. The friction involved in API integration and the lack of a common registry of services could slow down development.

Microservices that expose discrete endpoints are only part of the problem. The lines between internal and external are also blurring, necessitating a zero-trust approach. And if we begin to consider data as a product, explained Gopal, we need to access multiple domains in a secure, consistent and scalable way.

Benefits of Adopting GraphQL and Supergraph

A supergraph can solve some of these difficulties, explained Gopal. By exposing graphs from different domain services, you can automatically unify data into a supergraph, which could be accessed by various teams. And the great thing about using GraphQL is that the act of documenting is the act of building the API, meaning that the description and schema are more standardized.

By composing disparate services into a centralized registry, you could work across domains in a systematic but federated way, said Gopal. Such a layer could also be a good place to enforce role-based identity or authentication.
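To make the idea of composing domain graphs into a central registry more concrete, here is a minimal sketch in Python. It is not how Hasura or any particular product implements composition. The schema representation (plain dictionaries), the `compose_supergraph` function and the domain names are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical representation: type name -> {field name: field type}.
Schema = Dict[str, Dict[str, str]]


def compose_supergraph(subgraphs: Dict[str, Schema]) -> Tuple[Schema, Dict[str, str]]:
    """Merge per-domain schemas into one schema plus a field-ownership registry."""
    supergraph: Schema = {}
    owners: Dict[str, str] = {}  # "Type.field" -> domain that first declared it
    for domain, schema in subgraphs.items():
        for type_name, fields in schema.items():
            merged = supergraph.setdefault(type_name, {})
            for field, field_type in fields.items():
                key = f"{type_name}.{field}"
                # Reject two domains declaring the same field with different types.
                if field in merged and merged[field] != field_type:
                    raise ValueError(
                        f"{key}: conflicting types from {owners[key]} and {domain}"
                    )
                merged[field] = field_type
                owners.setdefault(key, domain)
    return supergraph, owners


# Illustrative domains loosely based on the e-commerce example.
subgraphs = {
    "catalog": {"Product": {"id": "ID", "name": "String"}},
    "shipping": {"Product": {"id": "ID", "shippingEstimateDays": "Int"}},
    "payments": {"Order": {"id": "ID", "total": "Float"}},
}
supergraph, owners = compose_supergraph(subgraphs)
```

In this sketch, each team keeps ownership of its own fields, while consumers see a single `Product` type that spans the catalog and shipping domains.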

Andrew Carlson, principal field architect at Apollo GraphQL, similarly proposed using GraphQL to centralize data access control. “When used as a layer to aggregate and orchestrate existing APIs,” Carlson said, “it’s an ideal location in our architecture to centralize access control and authorization down to the field level, providing field-level observability into which clients request what data.”
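Field-level authorization and observability of the kind Carlson describes could look roughly like the following sketch. It does not use Apollo's API. The `POLICY` table, the `authorize` function, and the client and role names are hypothetical, chosen only to show one layer both filtering fields by role and counting which client asked for which field.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical policy: role -> set of "Type.field" entries that role may read.
POLICY: Dict[str, Set[str]] = {
    "support": {"Order.id", "Order.status"},
    "finance": {"Order.id", "Order.status", "Order.total"},
}

# Field-level observability: (client, "Type.field") -> number of requests seen.
field_requests: Counter = Counter()


def authorize(client: str, role: str, type_name: str, fields: List[str]) -> List[str]:
    """Return the requested fields the role may read, recording every request."""
    allowed = POLICY.get(role, set())
    granted = []
    for field in fields:
        key = f"{type_name}.{field}"
        field_requests[(client, key)] += 1  # count the request even if denied
        if key in allowed:
            granted.append(field)
    return granted


granted_support = authorize("helpdesk-app", "support", "Order", ["id", "total"])
granted_finance = authorize("billing-app", "finance", "Order", ["id", "total"])
```

Because every request passes through the same function, the denied request for `Order.total` from the support client is still recorded, which is the sort of visibility a centralized layer makes possible.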

Gopal shared a handful of other specific advantages for using GraphQL in this context:

Helps you write less code: Since GraphQL enables you to fetch whatever field you want, aligning on GraphQL as a common interface could reduce the number of requests required to integrate data. Also, certain developer experience perks, such as auto-complete, could enhance schema discoverability, with benefits that compound when sources are intertwined in a combined graph.
Increases agility for new development: The need to quickly understand the overall schema and what it looks like is becoming more relevant, especially for AI applications, said Gopal. There’s a massive agility boost when using a supergraph schema, he added, such as enabling software teams to fetch the context they require to quickly put things into motion.
Improves data and service discoverability: Software teams are facing tooling sprawl, and the number of APIs in use within an organization keeps increasing. A supergraph can enable a traversable data mesh, greatly aiding discoverability and, in turn, the ability to innovate. “The benefit from the ability to understand and use data from different domains is massive,” said Gopal.
Unifies teams from various domains: Another potential benefit of the supergraph concept is improved inter-departmental collaboration. Helping teams work together could enhance reporting and reduce risk. This new shared operating model helps to clean things up when updating code and the schema, said Gopal.
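The first advantage above, fetching only the fields you want across domains in one request, can be sketched as follows. This is a toy resolver, not a real GraphQL engine. The in-memory `CATALOG` and `SHIPPING` stores, the `resolve_product` function and the nested-dictionary selection format are assumptions standing in for domain services and a parsed query.

```python
from typing import Any, Dict

# Hypothetical in-memory stand-ins for two domain services.
CATALOG = {"p1": {"name": "Kettle", "price": 30.0, "description": "1.7 L electric kettle"}}
SHIPPING = {"p1": {"estimateDays": 3, "carrier": "ExampleCarrier"}}


def resolve_product(product_id: str, selection: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the selected fields, pulling from whichever domain owns them."""
    result: Dict[str, Any] = {}
    for field, sub_selection in selection.items():
        if field == "shipping":
            # Nested selection is served by the shipping domain.
            record = SHIPPING[product_id]
            result[field] = {name: record[name] for name in sub_selection}
        else:
            # Scalar fields are served by the catalog domain.
            result[field] = CATALOG[product_id][field]
    return result


# Roughly equivalent to the query: { product(id: "p1") { name shipping { estimateDays } } }
selection = {"name": True, "shipping": {"estimateDays": True}}
product = resolve_product("p1", selection)
```

The caller gets a product name and a shipping estimate in one response without receiving the price, description or carrier, and without knowing that two separate stores were involved.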

Next Step: Protocol Agnosticism

The idea of wrapping disparate data sources and formats in a standard, unified layer is an alluring concept. “The agility impact is massive,” said Gopal. That is especially true, he added, if you can automate the creation of a supergraph around various styles of legacy databases.

However, investing in a supergraph can be time-intensive. The benefits are unlocked down the line, but you will need an internal champion to initially advocate for it and bring it to fruition, he cautioned. Furthermore, enterprises may hesitate to go all-in on GraphQL due to past investments in other technology types.

While GraphQL is becoming more popular, various API styles are still commonly applied and will continue to be supported for some time. For instance, a 2023 study by API management and testing company Postman found that GraphQL use had eclipsed the mainstay API format of SOAP. Yet, REST is still the dominant API style, used by 86% of respondents, followed by Webhooks (36%), GraphQL (29%) and SOAP (26%).

As such, the next step is for the supergraph to become protocol-agnostic rather than tied to GraphQL, said Gopal. It should be able to work with various protocols, since most enterprises use multiple API styles.
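One way to picture a protocol-agnostic layer is a set of adapters behind a common interface. The sketch below is an assumption-laden illustration, not a description of any vendor's design. The `SourceAdapter` interface, both adapter classes, the `plan` function and the example URLs are hypothetical, and the code only describes the request it would make rather than calling a network.

```python
from abc import ABC, abstractmethod
from typing import Dict


class SourceAdapter(ABC):
    """Hypothetical common interface the supergraph layer would call."""

    @abstractmethod
    def describe_request(self, entity: str, entity_id: str) -> Dict[str, str]:
        """Describe the upstream request needed to fetch one entity."""


class RestAdapter(SourceAdapter):
    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def describe_request(self, entity: str, entity_id: str) -> Dict[str, str]:
        # Assumes a conventional /<entity>s/<id> resource path.
        return {"method": "GET", "url": f"{self.base_url}/{entity}s/{entity_id}"}


class GraphQLAdapter(SourceAdapter):
    def __init__(self, endpoint: str) -> None:
        self.endpoint = endpoint

    def describe_request(self, entity: str, entity_id: str) -> Dict[str, str]:
        query = f'{{ {entity}(id: "{entity_id}") {{ id }} }}'
        return {"method": "POST", "url": self.endpoint, "body": query}


# Hypothetical registry: which adapter serves which entity.
ADAPTERS: Dict[str, SourceAdapter] = {
    "order": RestAdapter("https://orders.example.internal"),
    "product": GraphQLAdapter("https://catalog.example.internal/graphql"),
}


def plan(entity: str, entity_id: str) -> Dict[str, str]:
    """Route a lookup to the adapter registered for the entity."""
    return ADAPTERS[entity].describe_request(entity, entity_id)


order_plan = plan("order", "42")
product_plan = plan("product", "p1")
```

Callers of `plan` never see whether an entity is backed by REST or GraphQL, so a SOAP or webhook-driven source could, in principle, be added as another adapter without changing consumers.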