Use of apollo-datasource-mongodb vs Mongoose APIs
Context and Problem Statement
When querying MongoDB, we need to decide whether to use apollo-datasource-mongodb (data-loader) or the native Mongoose APIs. This decision depends on factors such as the function's usage scope (resolvers vs backend logic), the need for field projections, datetime comparisons, and nested field population.
Decision Drivers
- Query Performance: We need to minimize redundant database queries and avoid the N+1 problem in GraphQL resolvers.
- Code Maintainability: The approach should make it easy to manage queries across resolvers and backend logic.
- Feature Requirements: Some queries require field projection, datetime filtering, or population of nested fields.
- Consistency: Queries should be handled in a predictable and structured way across our service.
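To make the N+1 driver concrete, here is a minimal sketch that counts simulated round trips for a naive per-document lookup versus a single batched lookup. It uses hypothetical in-memory stand-ins rather than a real MongoDB driver, Mongoose, or DataLoader; the sample data and all function names are assumptions for illustration only.

```typescript
// Hypothetical in-memory stand-ins for two collections; not a real driver.
type Applicant = { id: string; name: string };
type CredCase = { id: string; applicantId: string };

const applicants: Record<string, Applicant> = {
  a1: { id: "a1", name: "Ada" },
  a2: { id: "a2", name: "Lin" },
};
const credCases: CredCase[] = [
  { id: "c1", applicantId: "a1" },
  { id: "c2", applicantId: "a2" },
  { id: "c3", applicantId: "a1" },
];

// Every call to a find* function below simulates one database round trip.
let queryCount = 0;

function findCredCases(): CredCase[] {
  queryCount += 1;
  return credCases;
}

function findApplicantById(id: string): Applicant | undefined {
  queryCount += 1;
  return applicants[id];
}

function findApplicantsByIds(ids: string[]): Applicant[] {
  queryCount += 1;
  return ids
    .map((id) => applicants[id])
    .filter((a): a is Applicant => a !== undefined);
}

// Naive: one query for the list, then one query per case (N+1).
function countNaiveQueries(): number {
  queryCount = 0;
  for (const credCase of findCredCases()) {
    findApplicantById(credCase.applicantId);
  }
  return queryCount;
}

// Batched: one query for the list, one query for all distinct applicant ids.
function countBatchedQueries(): number {
  queryCount = 0;
  const allIds = findCredCases().map((credCase) => credCase.applicantId);
  const distinctIds = allIds.filter((id, i) => allIds.indexOf(id) === i);
  findApplicantsByIds(distinctIds);
  return queryCount;
}
```

With three cases, the naive path issues one list query plus three lookups, while the batched path issues two queries regardless of how many cases are returned. A data loader applies the same idea by collecting keys requested during one resolver pass and fetching them together.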
Considered Options
- Use apollo-datasource-mongodb
- Use Mongoose APIs
- Separate functions for resolvers and backend logic
Decision Outcome
The decision is based on the following criteria:
1. If the function is only used by resolvers:
- Check if the query requires processing or manipulation of the returned documents beyond simple retrieval:
  - If yes, with advanced operations such as a field projection like { id: 1 } or a datetime comparison such as { $lte: now }, use Mongoose APIs.
  - If yes, with nested field population such as find({}).populate(['applicant']), use apollo-datasource-mongodb and remove populate().
  - If no, use apollo-datasource-mongodb.
2. If the function is used in both resolvers and backend logic:
- If no nested fields need to be populated, use apollo-datasource-mongodb.
- If nested fields need to be populated, separate the function into two:
  - One for backend logic using Mongoose APIs, named with the suffix With<fieldName> (e.g., getCredCaseWithApplicant).
  - One for resolvers using apollo-datasource-mongodb, but without populate(), since resolvers will handle population.
3. If the function is only used in backend logic, use Mongoose APIs.
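The three criteria above can be read as a small decision function. The sketch below is one possible encoding, not part of the codebase: the type names, field names, and chooseQueryApi are hypothetical. It also assumes that, for resolver-only functions, the advanced-operations check takes precedence when a query needs both advanced operations and population, which the criteria do not state explicitly.

```typescript
type UsageScope = "resolvers" | "both" | "backend";
type QueryApi = "apollo-datasource-mongodb" | "mongoose";

// Hypothetical description of a data-access function.
interface QueryTraits {
  usedBy: UsageScope;
  // Field projection such as { id: 1 } or datetime comparison such as { $lte: now }.
  needsAdvancedOps: boolean;
  // Nested field population such as populate(['applicant']).
  needsPopulate: boolean;
}

interface Recommendation {
  resolverApi: QueryApi | null;
  backendApi: QueryApi | null;
  // True when the function should be split into a resolver and a backend variant.
  splitFunction: boolean;
  // True when populate() should be dropped from the resolver-facing query.
  removePopulateInResolver: boolean;
}

const DATASOURCE: QueryApi = "apollo-datasource-mongodb";
const MONGOOSE: QueryApi = "mongoose";

function chooseQueryApi(q: QueryTraits): Recommendation {
  // Criterion 3: backend-only functions use Mongoose APIs.
  if (q.usedBy === "backend") {
    return { resolverApi: null, backendApi: MONGOOSE, splitFunction: false, removePopulateInResolver: false };
  }

  // Criterion 1: resolver-only functions.
  if (q.usedBy === "resolvers") {
    if (q.needsAdvancedOps) {
      // Assumption: advanced operations win over population when both apply.
      return { resolverApi: MONGOOSE, backendApi: null, splitFunction: false, removePopulateInResolver: false };
    }
    return {
      resolverApi: DATASOURCE,
      backendApi: null,
      splitFunction: false,
      removePopulateInResolver: q.needsPopulate,
    };
  }

  // Criterion 2: functions shared by resolvers and backend logic.
  if (!q.needsPopulate) {
    return { resolverApi: DATASOURCE, backendApi: DATASOURCE, splitFunction: false, removePopulateInResolver: false };
  }
  return { resolverApi: DATASOURCE, backendApi: MONGOOSE, splitFunction: true, removePopulateInResolver: true };
}
```

For example, a shared function that populates applicant yields splitFunction: true, which corresponds to keeping a resolver variant without populate() alongside a backend variant such as getCredCaseWithApplicant.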
Decision
We will follow the structured approach outlined above to determine when to use apollo-datasource-mongodb versus Mongoose APIs. This ensures:
- Optimized query performance.
- No unnecessary data fetching in resolvers.
- Proper separation of concerns between backend logic and GraphQL resolvers.
Pros and Cons of the Options
Use apollo-datasource-mongodb
✅ Improves UI performance thanks to the benefits of data-loader.
❌ Can lead to N+1 query issues.
❌ Not ideal for complex filtering, projections, or nested field population.
Use Mongoose APIs
✅ Allows full control over queries, including projections, filtering, and population.
✅ More efficient for backend-only functions that don't require GraphQL resolvers.
❌ Does not leverage the benefits of Apollo's data loader.
Separate Functions for Resolvers and Backend Logic
✅ Ensures resolvers don't perform unnecessary populate() operations.
✅ Keeps backend logic optimized for different use cases.
❌ Requires maintaining two versions of the same function.
Status
Accepted. We will apply this approach when implementing MongoDB queries in our data-access.
Consequences
- Developers must assess each query based on its usage context.
- Functions used in both backend and resolvers need to be split appropriately.
- Queries in resolvers should avoid populate(), relying on GraphQL resolvers to handle data population.
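The split described in these consequences might look roughly like the sketch below. It uses in-memory stand-ins so that it runs on its own: in the real service the resolver-facing function would go through apollo-datasource-mongodb and the backend-facing one through a Mongoose query with populate(). Only getCredCaseWithApplicant and the applicant field come from this record; getCredCase, the types, the resolver map shape, and the sample data are assumptions.

```typescript
// Hypothetical document shapes; applicant on CredCase holds a reference id.
interface Applicant { id: string; name: string }
interface CredCase { id: string; applicant: string }
interface CredCaseWithApplicant { id: string; applicant: Applicant | null }

// In-memory stand-ins for the two collections (illustrative data only).
const applicantsById: Record<string, Applicant> = {
  a1: { id: "a1", name: "Ada" },
};
const credCasesById: Record<string, CredCase> = {
  c1: { id: "c1", applicant: "a1" },
};

// Resolver-facing function: simple retrieval, no population.
function getCredCase(id: string): CredCase | null {
  return credCasesById[id] ?? null;
}

// Backend-facing function: returns the document with applicant populated,
// named with the With<fieldName> suffix.
function getCredCaseWithApplicant(id: string): CredCaseWithApplicant | null {
  const doc = credCasesById[id];
  if (!doc) return null;
  return { id: doc.id, applicant: applicantsById[doc.applicant] ?? null };
}

// Field resolver: the applicant is loaded only when a client selects the field,
// so the parent query does not need populate().
const resolvers = {
  CredCase: {
    applicant: (parent: CredCase): Applicant | null =>
      applicantsById[parent.applicant] ?? null,
  },
};
```

Backend code that always needs the applicant calls getCredCaseWithApplicant, while the GraphQL layer calls getCredCase and leaves the nested lookup to the CredCase.applicant field resolver.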
More Information
- Decision Tree for choosing between apollo-datasource-mongodb and Mongoose APIs: The diagram illustrates the decision process for selecting the appropriate API based on query complexity and performance requirements.
- Performance comparison between apollo-datasource-mongodb and Mongoose APIs:
  - Result has N documents with one nested field populated: Compares data retrieval performance when executing queries that fetch multiple documents, each with one nested field populated.
  - Result has 1 document with no fields populated: Highlights the baseline performance of both approaches when no nested data is included.
  - Result has 1 document with N nested fields populated: Demonstrates the performance overhead when handling queries that return 1 document and require populating multiple nested fields.
  - Time added by a redundant populate() call: Illustrates the additional time incurred by an unnecessary populate() operation, comparing the performance cost between the implementations.