Last year we demonstrated the Grafbase Gateway's excellent performance with the graphql-gateways-benchmark from The Guild. Now we're back with an expanded test suite that goes beyond simple throughput measurements. Our September 2025 benchmarks evaluate five federation gateways: Grafbase Gateway, Apollo Router, Cosmo Router, Hive Gateway, and Hive Router. Beyond the original benchmark, we added scenarios that mirror production challenges: complex query planning, massive response payloads, and request de-duplication.
Our benchmark suite runs on a Linux machine with an AMD Ryzen 9 7950X3D (16 cores) and 94 GiB of RAM. To ensure fair and reproducible results:
- All gateways run in Docker containers with `--network host` to minimize overhead
- CPU boost is disabled to prevent frequency scaling from skewing results
- Subgraphs are optimized for speed and serve responses mostly from cache
- Every request includes a unique `authorization` header to prevent gateways from abusing the repetitive nature of the benchmark, except for the de-duplication scenario (see the K6 sketch after this list).
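To make the setup concrete, here is a minimal sketch of a K6 script that sends a unique `authorization` header with every request. The endpoint, query, and token format are assumptions for illustration, not the exact script used in the suite.

```typescript
import http from 'k6/http';
import exec from 'k6/execution';

// Hypothetical gateway endpoint and query; the real suite defines its own per scenario.
const GATEWAY_URL = 'http://localhost:4000/graphql';
const BODY = JSON.stringify({ query: '{ __typename }' });

export default function () {
  http.post(GATEWAY_URL, BODY, {
    headers: {
      'content-type': 'application/json',
      // Unique per request, so a gateway cannot reuse cached work across requests
      // (kept constant only in the de-duplication scenario).
      authorization: `Bearer ${__VU}-${exec.scenario.iterationInTest}`,
    },
  });
}
```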
We use K6 for load testing, running each scenario for 60 seconds. The benchmarks measure:
- Response latencies and request count/rate, through K6.
- Resource usage (CPU & memory), through `docker stats` during the load test.
- Requests/core.s and Requests/GB.s, computed from the maximum CPU and memory usage (see the sketch below).
- The total count of subgraph requests, measured by the subgraphs themselves and reported through K6.
In the following tables we always show the maximum CPU & memory used but the average is also available in the full report.
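To make the derived metrics concrete, the sketch below shows one way Requests/core.s and Requests/GB.s can be computed from the raw numbers. The helper names and the example figures are hypothetical, and the report generator may differ in details such as GiB vs GB conversion or rounding.

```typescript
// Derived efficiency metrics: requests served per second of a fully used CPU
// core, and per GB of memory, both based on the peak usage observed.
interface RunStats {
  totalRequests: number;   // successful requests counted by K6
  durationSeconds: number; // load-test duration (60s in these benchmarks)
  maxCpuPercent: number;   // peak CPU from `docker stats`, 100% == one core
  maxMemoryMiB: number;    // peak memory from `docker stats`
}

function requestsPerCoreSecond(s: RunStats): number {
  const cores = s.maxCpuPercent / 100;
  return s.totalRequests / (s.durationSeconds * cores);
}

function requestsPerGbSecond(s: RunStats): number {
  const gb = s.maxMemoryMiB / 1024;
  return s.totalRequests / (s.durationSeconds * gb);
}

// Hypothetical example: 30,000 requests in 60s, peaking at 150% CPU and 128 MiB.
const example: RunStats = {
  totalRequests: 30_000,
  durationSeconds: 60,
  maxCpuPercent: 150,
  maxMemoryMiB: 128,
};
console.log(requestsPerCoreSecond(example).toFixed(1)); // 333.3
console.log(requestsPerGbSecond(example).toFixed(1));   // 4000.0
```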
Gateways:
- Grafbase Gateway 0.49.0
- Apollo Router 2.6.0. For the disabled-cache variant we used a build from a bit before 2.6.0.
- Cosmo Router 0.249.0
- Hive Gateway 1.16.3
- Hive Router 0.0.8
The full benchmark suite is open source at github.com/grafbase/graphql-federation-benchmarks. The whole report and charts are generated by the embedded CLI app, and you can choose to run specific gateways and scenarios.
All the data presented here comes from the auto-generated full report.
The `many-plans` scenario uses 7 subgraphs with very similar schemas and executes deep queries to stress the query planner, forcing it to consider many possible paths. It's particularly relevant for measuring cold-start performance after gateway re-deployments, as many queries will need to be planned from scratch (a sketch of what such a deep query can look like follows below). We ran the gateways both with and without the planning cache:
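To give a feel for what "deep" means here, the snippet below generates a hypothetical nested query of arbitrary depth. The actual scenario's schemas and field names differ; this only illustrates the shape of a query where the planner can cross subgraph boundaries at every level.

```typescript
// Builds a hypothetical deeply nested query. In a federated graph where several
// subgraphs expose similar types, every `related` hop can be resolved through
// different subgraphs, so the planner has to weigh many alternative paths.
function deepQuery(depth: number): string {
  let selection = 'id';
  for (let i = 0; i < depth; i++) {
    selection = `id related { ${selection} }`;
  }
  return `query { node { ${selection} } }`;
}

console.log(deepQuery(3));
// query { node { id related { id related { id related { id } } } } }
```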
Without Planning Cache:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|---|---|---|
Grafbase Gateway | 19.5 | 20.6 | 21.9 | 108% | 134 MiB | 46.9 | 387.2 |
Cosmo Router | 372.1 | 380.3 | 383.6 | 178% | 83 MiB | 1.5 | 33.2 |
Apollo Router | 3338.9 | 3385.2 | 3405.2 | 101% | 4101 MiB | 0.3 | 0.1 |
Hive Gateway | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
Hive Router | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
With Planning Cache:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|---|---|---|
Grafbase Gateway | 2.0 | 2.3 | 2.6 | 167% | 56 MiB | 282.5 | 8558.2 |
Apollo Router | 9.3 | 10.5 | 11.9 | 175% | 801 MiB | 56.8 | 126.9 |
Cosmo Router | 17.9 | 19.7 | 20.4 | 513% | 79 MiB | 10.7 | 717.6 |
Hive Gateway | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
Hive Router | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
We also tested Hive Gateway and Hive Router, but neither managed to return a valid response. Hive Router was particularly slow, only managing to return a single response within the 60s load test.
Note that query planning is a complex problem, and this benchmark is only one worst-case scenario. But it has the benefit of having no simple workarounds: the planner really has to consider many different paths, and some are better than others. In that regard, here is the number of subgraph requests executed by each gateway:
Gateway | Subgraph Requests |
---|---|
Grafbase Gateway | ~78 |
Cosmo Router | ~190 |
Apollo Router | 203 |
The Grafbase Gateway generated the fewest subgraph requests, producing a better query plan. Both Grafbase Gateway and Cosmo Router have in-flight request de-duplication activated by default; without it, the Grafbase Gateway issued 80 requests. We have not yet tested Cosmo Router without de-duplication, nor Apollo Router with it, in this case. An improvement for the future!
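For readers unfamiliar with the term, in-flight request de-duplication means that identical subgraph fetches that overlap in time are coalesced into a single request whose result is shared. The sketch below is a generic illustration of the idea; it is not how any of the benchmarked gateways actually implements it.

```typescript
// Minimal in-flight de-duplication: identical subgraph fetches that overlap in
// time share one underlying HTTP request.
const inflight = new Map<string, Promise<unknown>>();

async function fetchSubgraph(url: string, body: string): Promise<unknown> {
  const key = `${url}:${body}`; // a real gateway would also key on relevant headers
  const existing = inflight.get(key);
  if (existing) return existing; // join the request already in flight

  const promise = fetch(url, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body,
  })
    .then((res) => res.json())
    .finally(() => inflight.delete(key)); // only concurrent calls are de-duplicated

  inflight.set(key, promise);
  return promise;
}
```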
The `big-response` scenario tests how gateways handle large GraphQL responses (~8 MiB) containing a mix of lists, objects, strings, floats, and integers. K6 runs with a single virtual user to measure best-case latencies.
Latencies:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) |
---|---|---|---|
Hive Router | 25.5 | 29.6 | 32.3 |
Grafbase Gateway | 29.9 | 33.9 | 36.2 |
Cosmo Router | 72.3 | 83.0 | 87.1 |
Apollo Router | 123.9 | 132.0 | 137.5 |
Hive Gateway | 161.6 | 176.9 | 190.8 |
Resources:
Gateway | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|
Hive Router | 81% | 178 MiB | 44.0 | 204.7 |
Grafbase Gateway | 89% | 82 MiB | 34.6 | 385.7 |
Cosmo Router | 267% | 144 MiB | 5.0 | 95.6 |
Apollo Router | 97% | 569 MiB | 8.1 | 14.2 |
Hive Gateway | 122% | 569 MiB | 4.9 | 10.8 |
Both Hive Router and Grafbase Gateway demonstrate excellent performance for large payloads, with Cosmo Router following.
The `long-lived-big-response` scenario is a variant of the previous one that adds a small extra subgraph request taking 100 ms. This forces gateways to keep responses in memory longer, giving a better measure of how efficiently they use memory for response data. The load test is executed with 10 VUs.
Latencies:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) |
---|---|---|---|
Hive Router | 141.0 | 159.1 | 167.6 |
Grafbase Gateway | 159.3 | 181.1 | 193.5 |
Cosmo Router | 205.5 | 241.5 | 258.3 |
Apollo Router | 302.4 | 385.3 | 466.3 |
Hive Gateway | 1665.3 | 2543.4 | 3342.1 |
Resources:
Gateway | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|
Hive Router | 245% | 568 MiB | 28.4 | 125.2 |
Grafbase Gateway | 234% | 365 MiB | 26.2 | 172.1 |
Apollo Router | 510% | 2563 MiB | 6.3 | 12.8 |
Cosmo Router | 991% | 934 MiB | 4.8 | 52.5 |
Hive Gateway | 133% | 766 MiB | 4.4 | 7.9 |
We see similar results as before. Cosmo's memory consumption increased significantly, while Hive Gateway's, on the contrary, did not. However, Hive Router continued to de-duplicate requests despite the unique authorization headers, making this comparison not entirely equivalent. We could have disabled de-duplication here, but didn't think of it at the time of the benchmark; Hive Router would likely consume more resources without it.
This scenario executes the complex query from The Guild's graphql-gateways-benchmark, requiring multiple subgraph requests with duplicate patterns. We have two variants (a K6 configuration sketch follows the list):
- `query`: a constant throughput of 500 requests/s with a unique authorization header to prevent abusing the repetitive nature of the benchmark.
- `deduplication`: a constant throughput of 1000 requests/s with de-duplication enabled on all gateways and a constant authorization header.
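As a rough illustration of how such constant-throughput variants can be expressed in K6, here is a sketch using the constant-arrival-rate executor. The rates, durations, and header behavior follow the description above, but the endpoint, query, and pre-allocated VU counts are assumptions, not the suite's actual configuration.

```typescript
import http from 'k6/http';
import exec from 'k6/execution';

// Hypothetical endpoint and payload.
const GATEWAY_URL = 'http://localhost:4000/graphql';
const BODY = JSON.stringify({ query: '{ __typename }' });

export const options = {
  scenarios: {
    // `query` variant: 500 requests/s with a unique authorization header.
    query: {
      executor: 'constant-arrival-rate',
      rate: 500,
      timeUnit: '1s',
      duration: '60s',
      preAllocatedVUs: 200,
    },
    // `deduplication` variant: 1000 requests/s with a constant header
    // (and de-duplication enabled in each gateway's configuration).
    deduplication: {
      executor: 'constant-arrival-rate',
      rate: 1000,
      timeUnit: '1s',
      duration: '60s',
      preAllocatedVUs: 400,
      startTime: '60s', // run after the first variant in this sketch
    },
  },
};

export default function () {
  const unique = exec.scenario.name === 'query';
  http.post(GATEWAY_URL, BODY, {
    headers: {
      'content-type': 'application/json',
      authorization: unique
        ? `Bearer ${__VU}-${exec.scenario.iterationInTest}` // unique per request
        : 'Bearer constant-token', // shared, so subgraph fetches can be de-duplicated
    },
  });
}
```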
Query Latencies:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) | Avg Subgraph Requests |
---|---|---|---|---|
Hive Router | 40.4 | 46.4 | 47.5 | 1.28 |
Grafbase Gateway | 45.4 | 47.2 | 47.9 | 13.0 |
Cosmo Router | 46.9 | 48.8 | 49.9 | 8.01 |
Apollo Router | 48.2 | 50.2 | 51.0 | 16.0 |
Hive Gateway | 1314.4 | 1725.0 | 10926.4 | 7.00 |
Here again, Hive Router used in-flight request de-duplication despite the random authorization header. Contrary to before, there may be duplicate subgraph requests here, and it's definitely a good strategy to de-duplicate them within the same gateway request; we think that's what Cosmo does here. So whether we disable or enable Hive Router's de-duplication, it's not an apples-to-apples comparison.
Interestingly, Grafbase Gateway makes more requests than Cosmo here. This is because we don't yet detect and use the pattern provided by the fragments, which is something we're working on adding soon.
Query Resources:
Gateway | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|
Hive Router | 50% | 170 MiB | 998.9 | 3008.6 |
Grafbase Gateway | 71% | 78 MiB | 704.8 | 6560.9 |
Apollo Router | 295% | 184 MiB | 169.1 | 2779.9 |
Cosmo Router | 349% | 65 MiB | 143.1 | 7857.5 |
Hive Gateway | 141% | 703 MiB | 94.1 | 192.8 |
Deduplication Latencies:
Gateway | P50 (ms) | P95 (ms) | P99 (ms) | Avg Subgraph Requests |
---|---|---|---|---|
Hive Router | 39.9 | 46.1 | 47.3 | 0.65 |
Grafbase Gateway | 41.5 | 47.1 | 48.2 | 0.89 |
Cosmo Router | 43.6 | 51.6 | 54.1 | 0.63 |
Hive Gateway | 1326.1 | 1775.3 | 11125.8 | 7.00 |
Apollo Router | ✗ | ✗ | ✗ | ✗ |
Deduplication Resources:
Gateway | CPU max | MEM max | Requests/core.s | Requests/GB.s |
---|---|---|---|---|
Hive Router | 83% | 199 MiB | 1206.4 | 5135.3 |
Grafbase Gateway | 120% | 106 MiB | 832.4 | 9634.4 |
Cosmo Router | 599% | 97 MiB | 166.9 | 10543.9 |
Hive Gateway | 142% | 765 MiB | 93.3 | 176.9 |
Apollo Router | ✗ | ✗ | ✗ | ✗ |
Both Hive Router and Cosmo achieve the lowest number of subgraph requests per gateway request, with Grafbase following behind. For some reason, Apollo Router had errors with query de-duplication enabled.
These benchmarks demonstrate that Grafbase Gateway delivers exceptional performance across diverse workloads. With the lowest memory footprint, fastest query planning, and competitive latencies, it's engineered for production environments where efficiency and reliability matter.
The full benchmark suite is open source at github.com/grafbase/graphql-federation-benchmarks. We encourage the community to run these benchmarks in their own environments and contribute additional scenarios.
As always, the best benchmark is one that reflects your specific workload. We're committed to continuous improvement and welcome feedback on scenarios that matter to your use cases.