Last year we demonstrated the Grafbase Gateway's excellent performance with the graphql-gateways-benchmark from The Guild. Now we're back with an expanded test suite that goes beyond simple throughput measurements. Our September 2025 benchmarks evaluate five federation gateways: Grafbase Gateway, Apollo Router, Cosmo Router, Hive Gateway, and Hive Router. Beyond the original benchmark, we added scenarios that mirror production challenges: complex query planning, massive response payloads, and request de-duplication.
Our benchmark suite runs on a Linux machine with an AMD Ryzen 9 7950X3D (16 cores) and 94 GiB of RAM. To ensure fair and reproducible results:
- All gateways run in Docker containers with `--network host` to minimize overhead
- CPU boost is disabled to prevent frequency scaling from skewing results
- Subgraphs are optimized for speed and serve responses mostly from cache
- Every request includes a unique `authorization` header to prevent gateways from abusing the repetitive nature of the benchmark, except for the de-duplication scenario (see the K6 sketch after this list).
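To make the setup concrete, here is a minimal sketch of a K6 script that sends a unique `authorization` header with every request. The endpoint, query, and token format are assumptions for illustration, not the exact script used in the suite.

```typescript
import http from 'k6/http';
import exec from 'k6/execution';

// Hypothetical gateway endpoint and query; the real suite defines its own per scenario.
const GATEWAY_URL = 'http://localhost:4000/graphql';
const BODY = JSON.stringify({ query: '{ __typename }' });

export default function () {
  http.post(GATEWAY_URL, BODY, {
    headers: {
      'content-type': 'application/json',
      // Unique per request, so a gateway cannot reuse cached work across requests
      // (kept constant only in the de-duplication scenario).
      authorization: `Bearer ${__VU}-${exec.scenario.iterationInTest}`,
    },
  });
}
```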
We use K6 for load testing, running each scenario for 60 seconds. The benchmarks measure:
- Response latencies and request count/rate, through K6.
- Resource usage (CPU & memory), through `docker stats` during the load test.
- Requests/core.s and Requests/GB.s, computed from the maximum CPU and memory usage (see the sketch below).
- The total count of subgraph requests, measured by the subgraphs themselves and reported through K6.
In the following tables we always show the maximum CPU & memory used but the average is also available in the full report.
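To make the derived metrics concrete, the sketch below shows one way Requests/core.s and Requests/GB.s can be computed from the raw numbers. The helper names and the example figures are hypothetical, and the report generator may differ in details such as GiB vs GB conversion or rounding.

```typescript
// Derived efficiency metrics: requests served per second of a fully used CPU
// core, and per GB of memory, both based on the peak usage observed.
interface RunStats {
  totalRequests: number;   // successful requests counted by K6
  durationSeconds: number; // load-test duration (60s in these benchmarks)
  maxCpuPercent: number;   // peak CPU from `docker stats`, 100% == one core
  maxMemoryMiB: number;    // peak memory from `docker stats`
}

function requestsPerCoreSecond(s: RunStats): number {
  const cores = s.maxCpuPercent / 100;
  return s.totalRequests / (s.durationSeconds * cores);
}

function requestsPerGbSecond(s: RunStats): number {
  const gb = s.maxMemoryMiB / 1024;
  return s.totalRequests / (s.durationSeconds * gb);
}

// Hypothetical example: 30,000 requests in 60s, peaking at 150% CPU and 128 MiB.
const example: RunStats = {
  totalRequests: 30_000,
  durationSeconds: 60,
  maxCpuPercent: 150,
  maxMemoryMiB: 128,
};
console.log(requestsPerCoreSecond(example).toFixed(1)); // 333.3
console.log(requestsPerGbSecond(example).toFixed(1));   // 4000.0
```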
Gateways:
- Grafbase Gateway 0.49.0
- Apollo Router 2.6.0. For the disabled-cache variant we used a build from a bit before 2.6.0.
- Cosmo Router 0.249.0
- Hive Gateway 1.16.3
- Hive Router 0.0.8
The full benchmark suite is open source at github.com/grafbase/graphql-federation-benchmarks. The whole report and charts are generated by the embedded CLI app, and you can choose to run specific gateways and scenarios.
All the data presented here comes from the auto-generated full report.
The `many-plans` scenario uses 7 subgraphs with very similar schemas and executes deep queries to stress the query planner, forcing it to consider many possible paths. It's particularly relevant for measuring cold-start performance after gateway re-deployments, as many queries will need to be planned from scratch (a sketch of what such a deep query can look like follows below). We ran the gateways both with and without the planning cache:
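To give a feel for what "deep" means here, the snippet below generates a hypothetical nested query of arbitrary depth. The actual scenario's schemas and field names differ; this only illustrates the shape of a query where the planner can cross subgraph boundaries at every level.

```typescript
// Builds a hypothetical deeply nested query. In a federated graph where several
// subgraphs expose similar types, every `related` hop can be resolved through
// different subgraphs, so the planner has to weigh many alternative paths.
function deepQuery(depth: number): string {
  let selection = 'id';
  for (let i = 0; i < depth; i++) {
    selection = `id related { ${selection} }`;
  }
  return `query { node { ${selection} } }`;
}

console.log(deepQuery(3));
// query { node { id related { id related { id related { id } } } } }
```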
Without Planning Cache:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|---|---|---|
Grafbase Gateway | 19.5 | 20.6 | 21.9 | 108% | 134 MiB | 46.9 | 387.2 |
Cosmo Router | 372.1 | 380.3 | 383.6 | 178% | 83 MiB | 1.5 | 33.2 |
Apollo Router | 3338.9 | 3385.2 | 3405.2 | 101% | 4101 MiB | 0.3 | 0.1 |
Hive Gateway | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
Hive Router | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
With Planning Cache:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|---|---|---|
Grafbase Gateway | 2.0 | 2.3 | 2.6 | 167% | 56 MiB | 282.5 | 8558.2 |
Apollo Router | 9.3 | 10.5 | 11.9 | 175% | 801 MiB | 56.8 | 126.9 |
Cosmo Router | 17.9 | 19.7 | 20.4 | 513% | 79 MiB | 10.7 | 717.6 |
Hive Gateway | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
Hive Router | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
We also tested Hive Gateway and Hive Router, but neither managed to return a valid response. Hive Router was particularly slow, only managing to return a single response within the 60s load test.
Note that query planning is a complex problem, and this benchmark is only one worst-case scenario. But it has the benefit of having no simple workarounds: the planner really has to consider many different paths, and some are better than others. In that regard, here is the number of subgraph requests executed by each gateway:
Gateway | Subgraph Requests |
---|---|
Grafbase Gateway | ~78 |
Cosmo Router | ~190 |
Apollo Router | 203 |
The Grafbase Gateway generated the fewest subgraph requests, producing a better query plan. Both Grafbase Gateway and Cosmo Router have in-flight request de-duplication activated by default; without it, the Grafbase Gateway issued 80 requests. We have not yet tested Cosmo Router without de-duplication, nor Apollo Router with it, in this case. An improvement for the future!
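For readers unfamiliar with the term, in-flight request de-duplication means that identical subgraph fetches that overlap in time are coalesced into a single request whose result is shared. The sketch below is a generic illustration of the idea; it is not how any of the benchmarked gateways actually implements it.

```typescript
// Minimal in-flight de-duplication: identical subgraph fetches that overlap in
// time share one underlying HTTP request.
const inflight = new Map<string, Promise<unknown>>();

async function fetchSubgraph(url: string, body: string): Promise<unknown> {
  const key = `${url}:${body}`; // a real gateway would also key on relevant headers
  const existing = inflight.get(key);
  if (existing) return existing; // join the request already in flight

  const promise = fetch(url, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body,
  })
    .then((res) => res.json())
    .finally(() => inflight.delete(key)); // only concurrent calls are de-duplicated

  inflight.set(key, promise);
  return promise;
}
```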
The `big-response` scenario tests how gateways handle large GraphQL responses (~8 MiB) containing a mix of lists, objects, strings, floats, and integers. K6 runs with a single virtual user to measure best-case latencies.
Latencies:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) |
---|---|---|---|
Hive Router | 25.5 | 29.6 | 32.3 |
Grafbase Gateway | 29.9 | 33.9 | 36.2 |
Cosmo Router | 72.3 | 83.0 | 87.1 |
Apollo Router | 123.9 | 132.0 | 137.5 |
Hive Gateway | 161.6 | 176.9 | 190.8 |
Resources:
Gateway | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|
Hive Router | 81% | 178 MiB | 44.0 | 204.7 |
Grafbase Gateway | 89% | 82 MiB | 34.6 | 385.7 |
Cosmo Router | 267% | 144 MiB | 5.0 | 95.6 |
Apollo Router | 97% | 569 MiB | 8.1 | 14.2 |
Hive Gateway | 122% | 569 MiB | 4.9 | 10.8 |
Both Hive Router and Grafbase Gateway demonstrate excellent performance for large payloads, with Cosmo Router following.
The `long-lived-big-response` scenario is a variant of the previous one that adds a small extra subgraph request taking 100 ms. This forces gateways to keep responses in memory longer, giving a better measure of how efficiently they use memory for response data. The load test is executed with 10 VUs.
Latencies:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) |
---|---|---|---|
Hive Router | 141.0 | 159.1 | 167.6 |
Grafbase Gateway | 159.3 | 181.1 | 193.5 |
Cosmo Router | 205.5 | 241.5 | 258.3 |
Apollo Router | 302.4 | 385.3 | 466.3 |
Hive Gateway | 1665.3 | 2543.4 | 3342.1 |
Resources:
Gateway | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|
Hive Router | 245% | 568 MiB | 28.4 | 125.2 |
Grafbase Gateway | 234% | 365 MiB | 26.2 | 172.1 |
Apollo Router | 510% | 2563 MiB | 6.3 | 12.8 |
Cosmo Router | 991% | 934 MiB | 4.8 | 52.5 |
Hive Gateway | 133% | 766 MiB | 4.4 | 7.9 |
We see similar results as before. Cosmo's memory consumption increased significantly, while Hive Gateway's, on the contrary, did not. However, Hive Router continued to de-duplicate requests despite the unique authorization headers, making this comparison not entirely equivalent. We could have disabled de-duplication here, but didn't think of it at the time of the benchmark; Hive Router would likely consume more resources without it.
This scenario executes the complex query from The Guild's graphql-gateways-benchmark, requiring multiple subgraph requests with duplicate patterns. We have two variants (a K6 configuration sketch follows the list):
- `query`: a constant throughput of 500 requests/s with a unique authorization header to prevent abusing the repetitive nature of the benchmark.
- `deduplication`: a constant throughput of 1000 requests/s with de-duplication enabled on all gateways and a constant authorization header.
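As a rough illustration of how such constant-throughput variants can be expressed in K6, here is a sketch using the constant-arrival-rate executor. The rates, durations, and header behavior follow the description above, but the endpoint, query, and pre-allocated VU counts are assumptions, not the suite's actual configuration.

```typescript
import http from 'k6/http';
import exec from 'k6/execution';

// Hypothetical endpoint and payload.
const GATEWAY_URL = 'http://localhost:4000/graphql';
const BODY = JSON.stringify({ query: '{ __typename }' });

export const options = {
  scenarios: {
    // `query` variant: 500 requests/s with a unique authorization header.
    query: {
      executor: 'constant-arrival-rate',
      rate: 500,
      timeUnit: '1s',
      duration: '60s',
      preAllocatedVUs: 200,
    },
    // `deduplication` variant: 1000 requests/s with a constant header
    // (and de-duplication enabled in each gateway's configuration).
    deduplication: {
      executor: 'constant-arrival-rate',
      rate: 1000,
      timeUnit: '1s',
      duration: '60s',
      preAllocatedVUs: 400,
      startTime: '60s', // run after the first variant in this sketch
    },
  },
};

export default function () {
  const unique = exec.scenario.name === 'query';
  http.post(GATEWAY_URL, BODY, {
    headers: {
      'content-type': 'application/json',
      authorization: unique
        ? `Bearer ${__VU}-${exec.scenario.iterationInTest}` // unique per request
        : 'Bearer constant-token', // shared, so subgraph fetches can be de-duplicated
    },
  });
}
```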
Query Latencies:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) | Avg Subgraph Requests |
---|---|---|---|---|
Hive Router | 40.4 | 46.4 | 47.5 | 1.28 |
Grafbase Gateway | 45.4 | 47.2 | 47.9 | 13.0 |
Cosmo Router | 46.9 | 48.8 | 49.9 | 8.01 |
Apollo Router | 48.2 | 50.2 | 51.0 | 16.0 |
Hive Gateway | 1314.4 | 1725.0 | 10926.4 | 7.00 |
Here again, Hive Router used in-flight request de-duplication despite the random authorization header. Contrary to before, there may be duplicate subgraph requests here, and it's definitely a good strategy to de-duplicate them within the same gateway request; we think that's what Cosmo does here. So whether we disable or enable Hive Router's de-duplication, it's not an apples-to-apples comparison.
Interestingly, Grafbase Gateway makes more requests than Cosmo here. This is because we don't yet detect and use the pattern provided by the fragments, which is something we're working on adding soon.
Query Resources:
Gateway | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|
Hive Router | 50% | 170 MiB | 998.9 | 3008.6 |
Grafbase Gateway | 71% | 78 MiB | 704.8 | 6560.9 |
Apollo Router | 295% | 184 MiB | 169.1 | 2779.9 |
Cosmo Router | 349% | 65 MiB | 143.1 | 7857.5 |
Hive Gateway | 141% | 703 MiB | 94.1 | 192.8 |
Deduplication Latencies:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) | Avg Subgraph Requests |
---|---|---|---|---|
Hive Router | 39.9 | 46.1 | 47.3 | 0.65 |
Grafbase Gateway | 41.5 | 47.1 | 48.2 | 0.89 |
Cosmo Router | 43.6 | 51.6 | 54.1 | 0.63 |
Hive Gateway | 1326.1 | 1775.3 | 11125.8 | 7.00 |
Apollo Router | ✗ | ✗ | ✗ | ✗ |
Deduplication Resources:
Gateway | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|
Hive Router | 83% | 199 MiB | 1206.4 | 5135.3 |
Grafbase Gateway | 120% | 106 MiB | 832.4 | 9634.4 |
Cosmo Router | 599% | 97 MiB | 166.9 | 10543.9 |
Hive Gateway | 142% | 765 MiB | 93.3 | 176.9 |
Apollo Router | ✗ | ✗ | ✗ | ✗ |
Both Hive Router and Cosmo achieve the lowest number of subgraph requests per gateway request, with Grafbase following behind. For some reason, Apollo Router had errors with query de-duplication enabled.
These benchmarks demonstrate that Grafbase Gateway delivers exceptional performance across diverse workloads. With the lowest memory footprint, fastest query planning, and competitive latencies, it's engineered for production environments where efficiency and reliability matter.
The full benchmark suite is open source at github.com/grafbase/graphql-federation-benchmarks. We encourage the community to run these benchmarks in their own environments and contribute additional scenarios.
As always, the best benchmark is one that reflects your specific workload. We're committed to continuous improvement and welcome feedback on scenarios that matter to your use cases.