Announcing the MongoDB Connector

We’re excited to announce the MongoDB connector!

MongoDB is a popular NoSQL database that developers love thanks to its ease of use. Its popularity, coupled with its document-oriented nature, made it a no-brainer to support at Grafbase.

The MongoDB connector makes it seamless to deploy a GraphQL API with edge-caching, custom resolvers and authentication from your MongoDB database.

Connectors

Companies with many data sources choose GraphQL to unify their data layer and provide a single, consistent API surface that accelerates development.

Connecting data sources is a time-consuming task that Grafbase dramatically accelerates by providing first-class connectors to popular data sources.

Simply configure each data source and Grafbase will instantly generate GraphQL queries and mutations with a namespace to avoid collisions across data sources.

Earlier this year we launched support for OpenAPI and GraphQL connectors, which have been very popular. We are now focusing on supporting popular database connectors, starting with MongoDB.

Data API

The Grafbase platform runs on an edge network to achieve low latency globally. To communicate with MongoDB we chose HTTP to optimize for fast cold starts.

The Grafbase Edge Gateway is written in Rust and compiled to WebAssembly (Wasm), which runs in V8 isolates. TCP support in that environment is still not ready for production use, so we use the fetch API to communicate with data sources and keep connection negotiation times short.
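
To make the HTTP approach concrete, here is a minimal sketch of what a single Data API call could look like from a fetch-capable runtime. It is not the gateway's actual code (which is written in Rust). The helper name buildFindRequest and the placeholder values are assumptions for illustration; the /action/find path, the api-key header and the body fields follow the Atlas Data API as we understand it.

```typescript
// Connection settings, named after the connector options used later in this post.
interface DataApiConfig {
  url: string
  apiKey: string
  dataSource: string
  database: string
}

// A plain description of an HTTP request that can be handed to fetch().
interface DataApiRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Hypothetical helper: describe a `find` call against one collection.
function buildFindRequest(
  cfg: DataApiConfig,
  collection: string,
  filter: Record<string, unknown>,
  limit: number,
): DataApiRequest {
  return {
    url: `${cfg.url}/action/find`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', 'api-key': cfg.apiKey },
      body: JSON.stringify({
        dataSource: cfg.dataSource,
        database: cfg.database,
        collection,
        filter,
        limit,
      }),
    },
  }
}

// Placeholder values, matching the configuration example in this post.
const findUsers = buildFindRequest(
  {
    url: 'https://data.mongodb-api.com/app/data-asdf/endpoint/data/v1',
    apiKey: 'SECRET',
    dataSource: 'myDatasource',
    database: 'blogPlatform',
  },
  'users',
  { age: { $gte: 18 } },
  2,
)

// In a fetch-capable runtime the request would be sent with:
//   const response = await fetch(findUsers.url, findUsers.init)
console.log(findUsers.url)
```

Every call is a stateless HTTPS request, so there is no connection pool to warm up on a cold start.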

Connector setup

The MongoDB Atlas Data API connector adds to the existing OpenAPI and GraphQL connectors. The configuration is very similar to the other connectors, with the addition of hand-written models specific to MongoDB to provide client-side type-safety.

We start by defining the MongoDB connector using TypeScript configuration:


import { config, connector, graph } from '@grafbase/sdk'

const g = graph.Standalone()

const mongodb = connector.MongoDB('MongoDB', {
  url: 'https://data.mongodb-api.com/app/data-asdf/endpoint/data/v1',
  apiKey: 'SECRET',
  dataSource: 'myDatasource',
  database: 'blogPlatform',
})

g.datasource(mongodb)

export default config({
  graph: g,
})

To find the url and apiKey parameters, the Data API must first be enabled from the Atlas dashboard. Once it is enabled, the dashboard shows the URL, and an API key can be created from the users menu.

The dataSource value is listed in the Database menu; opening it shows all the databases. You do not need to create a new database when starting a new project: giving it a distinct name in the Grafbase configuration is enough, and the database is created after the first call to the endpoint.

Models for the MongoDB connector are created through the connector object. The _id field can be omitted; it is implicitly assumed to be included in every document, and to be of ObjectId type.


const address = g.type('Address', {
  street: g.string().mapped('street_name'),
})

mongodb
  .model('User', {
    name: g.string(),
    age: g.int().optional(),
    address: g.ref(address),
  })
  .collection('users')

The model definition is similar to the Grafbase Database, with the added ability to give a field a mapped attribute and to set a collection.

The collection setting names the collection that stores the User documents in the database. The mapped attribute renames a field, which is useful when the stored name contains characters that are not allowed in the Grafbase GraphQL schema. References created with ref are nested objects in the document; relations are not yet supported by the MongoDB connector.
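
As a rough illustration of what a mapped field implies, the sketch below renames keys between the GraphQL shape and the stored document. The helpers toDocument and fromDocument are hypothetical and are not part of the Grafbase SDK; the connector performs the equivalent translation internally.

```typescript
// GraphQL field name -> key stored in the MongoDB document.
type FieldMapping = Record<string, string>

// Rename GraphQL field names to the keys used in the stored document.
function toDocument(
  input: Record<string, unknown>,
  mapping: FieldMapping,
): Record<string, unknown> {
  const doc: Record<string, unknown> = {}
  for (const [field, value] of Object.entries(input)) {
    // Fields without a mapping keep their name.
    doc[mapping[field] ?? field] = value
  }
  return doc
}

// Rename stored keys back to GraphQL field names by inverting the mapping.
function fromDocument(
  doc: Record<string, unknown>,
  mapping: FieldMapping,
): Record<string, unknown> {
  const inverse: FieldMapping = {}
  for (const [field, key] of Object.entries(mapping)) {
    inverse[key] = field
  }
  return toDocument(doc, inverse)
}

// Mirrors `street: g.string().mapped('street_name')` from the Address type.
const addressMapping: FieldMapping = { street: 'street_name' }
const stored = toDocument({ street: 'Main Street 1' }, addressMapping)
console.log(stored) // { street_name: 'Main Street 1' }
console.log(fromDocument(stored, addressMapping)) // { street: 'Main Street 1' }
```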

Having all models defined from the connector allows us to differentiate connectors and their features, utilizing the type system to define the features supported by MongoDB.

Relay-style pagination

Our MongoDB connector uses Relay-style pagination for collections. This method calculates a cursor for every document in the response, which can be used to navigate backward and forward in the collection.

A cursor value for MongoDB is calculated either from the _id of the document or, when ordering is used, from the _id together with all the fields in the orderBy definition. When loading the first users from the database, we can include the special PageInfo type in the selection. It tells us whether there is more data in the collection and gives the cursor of the last document in the response:


query Users {
  mongo {
    userCollection(first: 2) {
      edges {
        node {
          name
          age
        }
      }
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
}

If the value of hasNextPage is true, we can load the next page by setting the after parameter to the value of endCursor:


query Users {
  mongo {
    userCollection(
      first: 2
      after: "ZmllbGRzAG5hbWUAA19pZAB2YWx1ZQBPYmplY3RJZAAYNjRlNzgwMmZjZWI2Nzg5NGI0NDk0ODQzAAEkAQEBHhRkaXJlY3Rpb24ACUFzY2VuZGluZwADFlFIAwEDEVEgFBQkAQckAWcBAQEHKAIkAQ"
    ) {
      edges {
        node {
          name
          age
        }
      }
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
}
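
On the client side, the two queries above combine into a simple loop: request a page, collect the nodes, and continue from endCursor while hasNextPage is true. The following is a minimal sketch of that loop. The Page shape, the fetchAll helper and the in-memory fakeFetchPage stand-in are assumptions for illustration; in a real client, fetchPage would send the GraphQL query shown above.

```typescript
// The parts of a connection response the loop cares about.
interface Page<T> {
  nodes: T[]
  hasNextPage: boolean
  endCursor: string | null
}

// Loads `first` items after the given cursor (null means "from the start").
type FetchPage<T> = (first: number, after: string | null) => Promise<Page<T>>

// Walk the collection forward until the server reports no further pages.
async function fetchAll<T>(fetchPage: FetchPage<T>, pageSize: number): Promise<T[]> {
  const all: T[] = []
  let after: string | null = null
  while (true) {
    const page: Page<T> = await fetchPage(pageSize, after)
    all.push(...page.nodes)
    if (!page.hasNextPage || page.endCursor === null) {
      return all
    }
    after = page.endCursor
  }
}

// In-memory stand-in for the GraphQL call; the cursor is just an array index.
const names = ['Alice', 'Bob', 'Carol', 'Dave', 'Eve']
const fakeFetchPage: FetchPage<string> = async (first, after) => {
  const start = after === null ? 0 : Number(after) + 1
  const nodes = names.slice(start, start + first)
  const last = start + nodes.length - 1
  return {
    nodes,
    hasNextPage: last < names.length - 1,
    endCursor: nodes.length > 0 ? String(last) : null,
  }
}

fetchAll(fakeFetchPage, 2).then(all => console.log(all))
```

Real cursors are opaque strings like the one in the query above, so a client should pass them back unchanged rather than try to interpret them.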

This style of pagination ensures that no documents are repeated while paginating, even if data is added to the collection between requests. The underlying queries get more complex when the results are sorted, so let’s expand our first query with ordering:


query Users {
  mongo {
    userCollection(first: 2, orderBy: [{ name: ASC }, { age: DESC }]) {
      edges {
        node {
          name
          age
        }
      }
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
}

When we load the data, we see that the cursor is now considerably longer:


{
  "data": {
    "mongo": {
      "userCollection": {
        "pageInfo": {
          "hasNextPage": true,
          "endCursor": "ZmllbGRzAG5hbWUAA2FnZQB2YWx1ZQBQb3NJbnQAAQgBAQELCGRpcmVjdGlvbgAKRGVzY2VuZGluZwADFzYtAwEDEjYhFBQkBG5hbWUAU3RyaW5nAANCb2IAAQ0BAQEJFAlBc2NlbmRpbmcAA0hnXgMBAxErFhQUJANfaWQAT2JqZWN0SWQAGDY0ZTc4MDJmY2ViNjc4OTRiNDQ5NDg0MwABJAEBAR4UCUFzY2VuZGluZwADj66lAwEDEUEWFBQkA39PCSQkJAHIAQEBCygCJAE"
        }
      }
    }
  }
}

For pagination to work with arbitrary ordering, the cursor must include the value of every field in the orderBy statement, the sort direction of each of those fields, and the _id value of the document. When we paginate using this cursor, the request for the next page can define a filter that loads the following documents in the correct order.
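
To show why the cursor needs all of these values, here is a minimal sketch of how a decoded cursor could be turned into a MongoDB filter for the next page. This is a common keyset-pagination technique, not necessarily the exact filter the connector generates. The CursorField shape, the filterAfter helper and the age value of 30 are assumptions; the sketch also ignores missing or null values, and a real _id would have to be sent as an ObjectId rather than a plain string.

```typescript
type SortDirection = 'ASC' | 'DESC'

// One entry per orderBy field, followed by _id as the tie-breaker.
interface CursorField {
  field: string
  value: unknown
  direction: SortDirection
}

// Build a filter matching the documents that sort strictly after the cursor.
function filterAfter(cursor: CursorField[]): Record<string, unknown> {
  const branches = cursor.map((current, index) => {
    const branch: Record<string, unknown> = {}
    // All earlier sort fields must tie with the cursor values...
    for (const earlier of cursor.slice(0, index)) {
      branch[earlier.field] = { $eq: earlier.value }
    }
    // ...and the current field must be past the cursor in its sort direction.
    branch[current.field] =
      current.direction === 'ASC' ? { $gt: current.value } : { $lt: current.value }
    return branch
  })
  return { $or: branches }
}

// Cursor for orderBy: [{ name: ASC }, { age: DESC }], pointing at a user named Bob.
const nextPageFilter = filterAfter([
  { field: 'name', value: 'Bob', direction: 'ASC' },
  { field: 'age', value: 30, direction: 'DESC' },
  { field: '_id', value: '64e7802fceb67894b4494843', direction: 'ASC' },
])

console.log(JSON.stringify(nextPageFilter, null, 2))
```

Each additional orderBy field adds one branch to the $or, which is why both the cursor and the underlying query grow with the complexity of the ordering.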

Relay-style pagination with MongoDB assumes that the fields used for ordering are backed by an index. When implementing pagination with complex sorting, make sure the corresponding indexes are in place for the best performance.
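
As a sketch of what "corresponding" means here, the helper below derives a compound index key specification from an orderBy argument, appending _id as the final tie-breaker. The indexKeysFor name is hypothetical, and the assumption that _id sorts ascending as the last key is ours; check the query plans of your own workload before relying on it.

```typescript
type OrderDirection = 'ASC' | 'DESC'

// Translate an orderBy list into MongoDB index keys (1 = ascending, -1 = descending).
function indexKeysFor(
  orderBy: Array<Record<string, OrderDirection>>,
): Record<string, 1 | -1> {
  const keys: Record<string, 1 | -1> = {}
  for (const entry of orderBy) {
    for (const [field, direction] of Object.entries(entry)) {
      keys[field] = direction === 'ASC' ? 1 : -1
    }
  }
  // The cursor always ends with _id, so the index should too.
  if (!('_id' in keys)) {
    keys['_id'] = 1
  }
  return keys
}

// For orderBy: [{ name: ASC }, { age: DESC }]
const userIndexKeys = indexKeysFor([{ name: 'ASC' }, { age: 'DESC' }])
console.log(userIndexKeys) // { name: 1, age: -1, _id: 1 }
```

The resulting object can be passed to createIndex in mongosh or in an official MongoDB driver to create the index on the users collection.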

Authentication

Grafbase supports several authentication strategies, all of which work with the MongoDB connector. You can control who is able to access the connected MongoDB database, from the connector level down to the field level.

With a private global rule, all access to the MongoDB database is restricted to signed-in users:


import { auth, config, graph } from '@grafbase/sdk'

const g = graph.Standalone()

const provider = auth.OpenIDConnect({
  issuer: g.env('ISSUER_URL'),
})

export default config({
  graph: g,
  auth: {
    providers: [provider],
    rules: rules => {
      rules.private()
    },
  },
})

Rules can also be more granular and be defined directly on the model:


mongodb
  .model('User', {
    name: g.string(),
    age: g.int().optional(),
    address: g.ref(address),
  })
  .collection('users')
  .auth(rules => rules.private().read())

Or on an individual field:


mongodb
  .model('User', {
    name: g.string().auth(rules => rules.groups(['admin'])),
    age: g.int().optional(),
    address: g.ref(address),
  })
  .collection('users')

Grafbase supports many different authentication providers, and you can read more about them in our documentation.

Edge Caching

Caching is available for all Grafbase connectors. It reduces latency by delivering cached responses directly from the edge, which improves the end-user experience and protects the database from traffic surges.

It’s easy to define caching rules directly in the MongoDB model definition:


mongodb
  .model('User', {
    name: g.string().auth(rules => rules.groups(['admin'])),
    age: g.int().optional(),
    address: g.ref(address),
  })
  .collection('users')
  .cache({ maxAge: 60 })

Now the first query fetches the models from the database, and subsequent queries are served from the cache for the next 60 seconds. Caches can also be defined at the field level:
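
The effect of maxAge can be pictured with a small time-to-live cache. The sketch below only illustrates the semantics described above and says nothing about how the Grafbase edge cache is implemented; the TtlCache class and its injectable clock are assumptions made to keep the example self-contained.

```typescript
// A tiny time-to-live cache: entries are served until maxAge seconds have passed.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>()
  private readonly maxAgeMs: number
  private readonly now: () => number

  // `now` returns the current time in milliseconds; injectable for testing.
  constructor(maxAgeSeconds: number, now: () => number = Date.now) {
    this.maxAgeMs = maxAgeSeconds * 1000
    this.now = now
  }

  // Return the cached value, or call `load` and cache its result.
  get(key: string, load: () => V): V {
    const time = this.now()
    const hit = this.entries.get(key)
    if (hit !== undefined && time < hit.expiresAt) {
      return hit.value
    }
    const value = load()
    this.entries.set(key, { value, expiresAt: time + this.maxAgeMs })
    return value
  }
}

// Simulated clock and a loader that counts how often the "database" is hit.
let clockMs = 0
let databaseLoads = 0
const loadUsers = (): string[] => {
  databaseLoads += 1
  return ['Alice', 'Bob']
}

const userCache = new TtlCache<string[]>(60, () => clockMs)
userCache.get('users', loadUsers) // first query: loads from the database
clockMs = 59_000
userCache.get('users', loadUsers) // within 60 seconds: served from the cache
clockMs = 60_000
userCache.get('users', loadUsers) // entry expired: loads from the database again
console.log(databaseLoads) // 2
```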


mongodb
  .model('User', {
    name: g.string().auth(rules => rules.groups(['admin'])),
    age: g.int().optional(),
    address: g.ref(address).cache({ maxAge: 60 }),
  })
  .collection('users')

Read more about different caching strategies in our documentation.

Local Data API for tests

Building a connector to an API that’s only available in the cloud makes it harder to write tests against it. You have the network round-trips, together with limits on how much data you move and how many requests you make. After considering our options, we decided to build our own Data API container to run tests against.

The mongodb-data-api project was created on a Friday afternoon to solve this problem. It’s not meant for production usage, but it can be launched together with an instance of MongoDB in a docker-compose setup. The project is built using the Axum web server, together with Tokio and the MongoDB Rust driver, and is fast enough to give us a speedy feedback loop.

The API can be found on Docker Hub, and the included docker-compose file should be enough to start building on Grafbase together with good integration tests locally and in CI.

Summary

The MongoDB connector is available in the latest versions of the Grafbase CLI and the Grafbase SDK. We’re continuing our work on MongoDB, and we are considering support for its GraphQL API and, eventually, for TCP connections.

We’d love to hear your feedback on the MongoDB connector and learn which features you’d like to see us implement next. Join our Discord community and chime in!