diff --git a/README.md b/README.md --- a/README.md +++ b/README.md @@ -8,10 +8,10 @@ This is the class that holds all the logic and configuration of your, well, app ;) -The basic constructor is: +The basic constructor is: ```js -const newApp = new Sealious.App(config, manifest) +const newApp = new Sealious.App(config, manifest); ``` `config` and `manifest` both influence how the app will be set up. @@ -20,15 +20,15 @@ `config` is considered secret, and contains information on the infrastructure, as well as SMTP passwords and the like. -It's best to be kept in a separate json/yml file, and imported with +It's best to be kept in a separate json/yml file, and imported with ```js -const config = require("./config.json") +const config = require("./config.json"); ``` in your app. -The default config is: +The default config is: ```json { @@ -65,16 +65,16 @@ * `name` (string) - the name of your app * `logo` (string) - path to an image with the logo of your app -* `version` (string) - the version of your app +* `version` (string) - the version of your app * `colors.primary` (string) - the primary color of your brand * `default_language` (string) - the default language for your app. Email templates use this * `admin_email` (string) - the email address of the admin. It might be publicly revealed within the app. Used to create the initial admin account. Whenever the app starts and there's no user with that email, a registration intent is created, causing an email to be sent to this address. -You can also include your own fields/values, so they can be easily shared across different modules on both back-end and front-end. +You can also include your own fields/values, so they can be easily shared across different modules on both back-end and front-end. ## Configuring your app -Every Sealious application has an `App.ConfigManager` interface through which you can configure settings available throughout all the various components of your application. 
It comes with some sane defaults. +Every Sealious application has an `App.ConfigManager` interface through which you can configure settings available throughout all the various components of your application. It comes with some sane defaults. ### Changing the app settings @@ -108,38 +108,37 @@ app.setDefault("my-module.port", 8080); ``` - ## Sending emails This synopsis is self-explanatory: ```js const message = await TestApp.EmailTemplates.Simple(TestApp, { - to: "test@example.com", - subject: "Congratulations!", - text: "Enlarge your 'seal' with herbal supplements", + to: "test@example.com", + subject: "Congratulations!", + text: "Enlarge your 'seal' with herbal supplements", }); await message.send(TestApp); ``` -To send emails via smtp, set the following config: +To send emails via smtp, set the following config: ```js email: { - from_name: "Sealious app", - from_address: "sealious@example.com", + from_name: "Sealious app", + from_address: "sealious@example.com", }, ``` ## Filtering resources -When reading a list of resources in a collection, you can use *filtering* to limit the list to resources that match certain criteria. +When reading a list of resources in a collection, you can use _filtering_ to limit the list to resources that match certain criteria. Let's say you have a Users collection with an additional "age" field. 
Now, to get all the users with age set to `42`, you can call: ``` app.run_action(context, ["collections", "users"], "show", { - filter: { age: 42 }, + filter: { age: 42 }, }); ``` @@ -153,11 +152,11 @@ ``` app.run_action(context, ["collections", "users"], "show", { - filter: { age: { ">": 50 } }, + filter: { age: { ">": 50 } }, }); ``` -Or, via HTTP: +Or, via HTTP: ``` GET /api/v1/collections/users?filter[age][>]=50 @@ -169,7 +168,7 @@ ``` app.run_action(context, ["collections", "users"], "show", { - filter: { age: { ">": 50 }, country: "Poland" }, + filter: { age: { ">": 50 }, country: "Poland" }, }); ``` @@ -180,3 +179,156 @@ ### Implementation Each field can implement its own filtering logic, by the means of the `filter_to_query` method. It's goal is to transform user input (like `{">": 52}`) into a Mongo `$match` operator (like `{$gt: 52}`). It should be an `async` function. + +## AND and OR access strategy optimization + +Sealious communicates with MongoDB mainly through aggregation pipelines, which are represented as arrays of stages. The two main pipeline stages are `lookups` and `matches` (the equivalents of SQL `joins` and `wheres`). Lookups are quite expensive compared to matches, so we would like to run them as late as possible. However, we cannot simply place them at the end, because some matches depend on lookups: they use fields fetched by those lookups. In addition, some lookups can only run after one or more other lookups have taken place. Hence, we have to build a dependency graph and run a kind of priority-first search algorithm on it. + +The construction of the dependency graph is straightforward. First, each pipeline stage is inserted into the graph as a separate node. If the stage is a `match` that queries more than one field, it is split into one node per field. For instance: + +``` +{ + $match: { + weight: {$gt: 200}, + date_of_birth: {$lt : ISODate("2005-01-01T00:00:00Z")}, + } +} +``` + +will create two separate nodes.
The split is done for optimization reasons: some fields within a single `match` may depend on other nodes, while others are dependency-free. Then, if the node has dependencies, an edge from its direct dependency is added. This is enough, because any additional dependencies are necessarily ancestors of the direct dependency (it simply means that a field requires several lookups to access it). + +The order of visiting the nodes depends on two sets: the _front_, denoted by _F_, and the _candidates_, denoted by _C_. _F_ contains the nodes that have already been visited but have at least one child still to be visited. To simplify the notation, we introduce a dummy node _Ø_, which is the parent of all nodes without dependencies. Consequently, _C_ contains all unvisited direct children of the nodes in _F_. Thus, while traversing the graph, we evaluate the next step from the perspective of the whole front instead of a single node. + +Let's run our algorithm on a simple example. + +
+1.                                            2.
+ +-------Ø--------+                            +-------Ø--------+
+ |       |        |                            |       |        |
+ |       |        |                            |       |        |
+ v       v        v                            v       v        v
+L1       M3* +----L4----+                     L1*      M3  +----L4----+
+ +           |          |                      +           |          |
+ |           |          |                      |           |          |
+ v           v          v                      v           v          v
+M2           L5        M6                     M2           L5        M6
+                        +                                             +
+                        |                                             |
+                        v                                             v
+                       M7                                            M7
+F: Ø                                           F: Ø
+C: L1 M3 L4                                    C: L1 L4
+
+ +For the first two steps, _F_ contains only the dummy node. The first node visited is `M3`, as matches have the highest priority. It doesn't become a part of the front because it has no children. The second node is chosen using an additional fitness measure: the average priority of each candidate's children. The fitness of a single match is definitely better than the average of a lookup and a match, so `L1` is our choice. + +
+3.                                            4.
+ +-------Ø--------+                            +-------Ø--------+
+ |       |        |                            |       |        |
+ |       |        |                            |       |        |
+ v       v        v                            v       v        v
+L1       M3  +----L4----+                     L1       M3  +----L4*---+
+ +           |          |                      +           |          |
+ |           |          |                      |           |          |
+ v           v          v                      v           v          v
+M2*          L5        M6                     M2           L5        M6
+                        +                                             +
+                        |                                             |
+                        v                                             v
+                       M7                                            M7
+F: Ø L1                                        F: Ø
+C: M2 L4                                       C: L4
+
+ +Again, steps 3 and 4 are not complicated. `M2` is picked up first and then it's time for `L4`. + +
+5.                                            6.
+ +-------Ø--------+                            +-------Ø--------+
+ |       |        |                            |       |        |
+ |       |        |                            |       |        |
+ v       v        v                            v       v        v
+L1       M3  +----L4----+                     L1       M3  +----L4----+
+ +           |          |                      +           |          |
+ |           |          |                      |           |          |
+ v           v          v                      v           v          v
+M2           L5        M6*                    M2           L5        M6
+                        +                                             +
+                        |                                             |
+                        v                                             v
+                       M7                                            M7*
+F: L4                                          F: L4 M6
+C: L5 M6                                       C: L5 M7
+
+ +At last _Ø_ leaves _F_. The front moves down the right subtree of `L4`. + +
+7.
+ +-------Ø--------+
+ |       |        |
+ |       |        |
+ v       v        v
+L1       M3  +----L4----+
+ +           |          |
+ |           |          |
+ v           v          v
+M2           L5*       M6
+                        +
+                        |
+                        v
+                       M7
+F: L4
+C: L5
+
+ +The only node left in the candidates is `L5`, so the algorithm picks it. We have traversed the whole graph, so that's it. + +## Query class + +Whenever possible, we try to use the `Query` class instead of raw MongoDB queries. The following classes extend the `Query` class (their names are rather self-explanatory): + +* `Query.And` +* `Query.Or` +* `Query.Not` +* `Query.DenyAll` +* `Query.AllowAll` + +Every class that belongs to the `Query` group has to expose the functions below. Usage examples can be found in `lib/datastore/query.test.js`. + +### `lookup(body)` + +Adds a lookup to the query. + +Returns the hexadecimal hash of the passed lookup. + +### `match(body)` + +Adds a match to the query. + +### `dump()` + +Usually, other queries are supplied with its return value, so it is mostly used internally by classes implementing `Query`. + +Returns the inner representation of the query. + +### `toPipeline()` + +Returns the MongoDB aggregation pipeline. + +### `fromSingleMatch(body)` + +Returns the query object on which `match(body)` has been called. + +### `fromCustomPipeline(pipeline)` + +Returns the query object equivalent to the given pipeline. + +--- + +Classes that implement operators requiring multiple subqueries also expose the following: + +### `addQuery(query)` + +Adds the given query as another operand of the operator represented by the base query (`and`, `or`, etc.).
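Editor's note on the README walkthrough above: the front/candidates traversal can be sketched in a few lines of plain JavaScript. This is a simplified illustration, not the `Graph` class added in `lib/datastore/graph.js`. The names `bestFirstOrder`, `MATCH` and `LOOKUP` are hypothetical, and the sketch assumes the dependency graph is acyclic and that a node becomes a candidate once all of its parents have been visited.

```js
// Simplified sketch of the front/candidates traversal from the README.
// Assumption: lower number means higher priority (matches before lookups).
const MATCH = 0;
const LOOKUP = 1;

// nodes: object mapping node id -> priority
// edges: array of [parent, child] pairs
function bestFirstOrder(nodes, edges) {
  const ids = Object.keys(nodes);
  const children = new Map(ids.map(id => [id, []]));
  const parents = new Map(ids.map(id => [id, []]));
  for (const [parent, child] of edges) {
    children.get(parent).push(child);
    parents.get(child).push(parent);
  }

  const visited = new Set();
  // Tie-breaker: mean priority of the not-yet-visited children (0 if none).
  const meanChildPriority = id => {
    const kids = children.get(id).filter(kid => !visited.has(kid));
    if (kids.length === 0) return 0;
    return kids.reduce((sum, kid) => sum + nodes[kid], 0) / kids.length;
  };

  const order = [];
  while (order.length < ids.length) {
    // Candidates: unvisited nodes whose parents have all been visited.
    // Nodes without parents play the role of children of the dummy node.
    const candidates = ids.filter(
      id => !visited.has(id) && parents.get(id).every(p => visited.has(p))
    );
    if (candidates.length === 0) {
      throw new Error("Dependency graph contains a cycle");
    }
    let best = candidates[0];
    for (const id of candidates.slice(1)) {
      const better =
        nodes[id] < nodes[best] ||
        (nodes[id] === nodes[best] &&
          meanChildPriority(id) < meanChildPriority(best));
      if (better) best = id;
    }
    visited.add(best);
    order.push(best);
  }
  return order;
}

// The example graph from the README walkthrough.
const nodes = {
  L1: LOOKUP, M2: MATCH, M3: MATCH, L4: LOOKUP,
  L5: LOOKUP, M6: MATCH, M7: MATCH,
};
const edges = [["L1", "M2"], ["L4", "L5"], ["L4", "M6"], ["M6", "M7"]];

console.log(bestFirstOrder(nodes, edges).join(" "));
// prints "M3 L1 M2 L4 M6 M7 L5"
```

On this example the sketch visits the nodes in the same order as the seven steps of the walkthrough. On more complex graphs it may order nodes differently from the `Graph` class in this change, which handles parentless nodes and successors of the front separately.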
diff --git a/lib/app/base-chips/access-strategy-types/and.js b/lib/app/base-chips/access-strategy-types/and.js --- a/lib/app/base-chips/access-strategy-types/and.js +++ b/lib/app/base-chips/access-strategy-types/and.js @@ -16,14 +16,7 @@ const queries = await Promise.map(access_strategies, strategy => strategy.getRestrictingQuery(context) ); - if (queries.some(query => query instanceof Query.DenyAll)) { - return new Query.DenyAll(); - } - const aggregated_pipeline = queries.reduce( - (acc, query) => acc.concat(query.toPipeline()), - [] - ); - return Query.fromCustomPipeline(aggregated_pipeline); + return new Query.And(...queries); }, item_sensitive: function(params) { const access_strategies = parse_params(app, params); diff --git a/lib/app/base-chips/access-strategy-types/and.subtest.js b/lib/app/base-chips/access-strategy-types/and.subtest.js --- a/lib/app/base-chips/access-strategy-types/and.subtest.js +++ b/lib/app/base-chips/access-strategy-types/and.subtest.js @@ -26,6 +26,14 @@ const collections = [ { + name: + "collection-and(nested-and(allow, public), nested-or(allow, noone))", + strategies: [ + ["and", ["complex-allow-pipeline", "public"]], + ["or", ["complex-allow-pipeline", "noone"]], + ], + }, + { name: "collection-and(complex-allow-pipeline, noone)", strategies: ["complex-allow-pipeline", "noone"], }, @@ -78,6 +86,16 @@ } } + it("return everything for collection-and(nested-and(allow, public), nested-or(allow, noone))", () => + with_running_app(async ({ app }) => { + await setup(app); + return get_collection_as({ + collection: + "collection-and(nested-and(allow, public), nested-or(allow, noone))", + port, + }).then(data => assert.equal(data.length, 3)); + })); + it("returns nothing for and(complex-allow-pipeline, noone)", () => with_running_app(async ({ app }) => { await setup(app); diff --git a/lib/app/base-chips/access-strategy-types/or.subtest.js b/lib/app/base-chips/access-strategy-types/or.subtest.js --- 
a/lib/app/base-chips/access-strategy-types/or.subtest.js +++ b/lib/app/base-chips/access-strategy-types/or.subtest.js @@ -26,10 +26,11 @@ const collections = [ { - name: "collection-or(complex-allow-pipeline, noone)", + name: + "collection-or(nested-or(allow, noone), nested-and(allow, public))", strategies: [ ["or", ["complex-allow-pipeline", "noone"]], - ["or", ["complex-allow-pipeline", "noone"]], + ["and", ["complex-allow-pipeline", "public"]], ], }, { @@ -85,11 +86,12 @@ } } - it("returns everything for wrapped or(complex-allow-pipeline, noone)", () => + it("returns everything for collection-or(nested-or(allow, noone), nested-and(allow, public))", () => with_running_app(async ({ app }) => { await setup(app); return get_collection_as({ - collection: "collection-or(complex-allow-pipeline, noone)", + collection: + "collection-or(nested-or(allow, noone), nested-and(allow, public))", port, }).then(data => assert.equal(data.length, 3)); })); diff --git a/lib/app/base-chips/access-strategy-types/same-as-for-resource-in-field.subtest.js b/lib/app/base-chips/access-strategy-types/same-as-for-resource-in-field.subtest.js --- a/lib/app/base-chips/access-strategy-types/same-as-for-resource-in-field.subtest.js +++ b/lib/app/base-chips/access-strategy-types/same-as-for-resource-in-field.subtest.js @@ -3,7 +3,7 @@ const { with_running_app } = locreq("test_utils/with-test-app.js"); const assert_throws_async = locreq("test_utils/assert_throws_async.js"); -describe("SameAsReferencedInFieldStrategy", () => { +describe("SameAsForResourceInField", () => { let port; let numbers; const sessions = {}; diff --git a/lib/app/base-chips/access-strategy-types/user-referenced-in-field.js b/lib/app/base-chips/access-strategy-types/user-referenced-in-field.js --- a/lib/app/base-chips/access-strategy-types/user-referenced-in-field.js +++ b/lib/app/base-chips/access-strategy-types/user-referenced-in-field.js @@ -2,7 +2,7 @@ name: "user-referenced-in-field", getRestrictingQuery: async (context, 
field_name) => { if (!context.user_id) return new app.Query.DenyAll(); - return new app.Query().match({ + return app.Query.fromSingleMatch({ [`body.${field_name}`]: context.user_id, }); }, diff --git a/lib/app/base-chips/access-strategy-types/when.js b/lib/app/base-chips/access-strategy-types/when.js --- a/lib/app/base-chips/access-strategy-types/when.js +++ b/lib/app/base-chips/access-strategy-types/when.js @@ -14,13 +14,14 @@ const special_filter = collection.get_named_filter(special_filter_name); const when_true = new app.Sealious.AccessStrategy(app, when_true_name); const when_false = new app.Sealious.AccessStrategy(app, when_false_name); + const filtering_query = await special_filter.getFilteringQuery(collection); return new Query.Or( new Query.And( - await special_filter.getFilteringQuery(collection), + filtering_query, await when_true.getRestrictingQuery(context) ), new Query.And( - new Query.Not(await special_filter.getFilteringQuery(collection)), + new Query.Not(filtering_query), await when_false.getRestrictingQuery(context) ) ); @@ -56,22 +57,20 @@ ], item ) { + const query = await construct_query( + app, + context, + collection_name, + special_filter_name, + when_true_name, + when_false_name + ); + query.match({ sealious_id: item.id }); const results = await app.Datastore.aggregate( item.collection_name, - (await construct_query( - app, - context, - collection_name, - special_filter_name, - when_true_name, - when_false_name - )) - .match({ sealious_id: item.id }) - .toPipeline() + query.toPipeline() ); - if (results.length) { - return Promise.resolve(); - } else { + if (!results.length) { return Promise.reject("No access"); } }, diff --git a/lib/app/base-chips/access-strategy-types/when.subtest.js b/lib/app/base-chips/access-strategy-types/when.subtest.js --- a/lib/app/base-chips/access-strategy-types/when.subtest.js +++ b/lib/app/base-chips/access-strategy-types/when.subtest.js @@ -2,7 +2,6 @@ const assert = require("assert"); const { with_stopped_app } 
= locreq("test_utils/with-test-app.js"); const assert_throws_async = locreq("test_utils/assert_throws_async.js"); -const axios = require("axios"); describe("when", () => { async function create_resources(app) { diff --git a/lib/datastore/graph.js b/lib/datastore/graph.js new file mode 100644 --- /dev/null +++ b/lib/datastore/graph.js @@ -0,0 +1,164 @@ +class Graph { + constructor() { + this.adjacency_matrix = []; + this.node_ids = []; + this.nodes = []; + this.indexes = []; + } + addNode(id, priority) { + this.adjacency_matrix.push(Array(this.getNoOfNodes()).fill(0)); + for (const row of this.adjacency_matrix) { + row.push(0); + } + this.node_ids.push(id); + this.nodes.push({ id, priority }); + this.indexes.push(this.nodes.length - 1); + } + getNoOfNodes() { + return this.nodes.length; + } + addEdge(id_i, id_j) { + const [i, j] = this._getIndexesOfNodePair(id_i, id_j); + this.adjacency_matrix[i][j] = 1; + } + _getIndexesOfNodePair(id_i, id_j) { + return [this.node_ids.indexOf(id_i), this.node_ids.indexOf(id_j)]; + } + pathExists(id_i, id_j) { + const [i, j] = this._getIndexesOfNodePair(id_i, id_j); + return this._pathExists(i, j); + } + _pathExists(i, j) { + if (this.adjacency_matrix[i][j]) { + return true; + } + for (let k = 0; k < this.getNoOfNodes(); ++k) { + if (this.adjacency_matrix[i][k]) { + return this._pathExists(k, j); + } + } + return false; + } + bestFirstSearch() { + this.front = []; + this.visited = []; + while (this.visited.length < this.nodes.length) { + const { front_node, next_node } = this._getNextNode(); + this.visited.push(next_node); + + if (front_node !== null) { + if (this._areAllSuccessorsVisited(front_node)) { + const index = this.front.indexOf(front_node); + this.front.splice(index, 1); + } + } + + if (!this._areAllSuccessorsVisited(next_node)) { + this.front.push(next_node); + } + } + return this.visited.map(i => this.nodes[i].id); + } + _areAllSuccessorsVisited(i) { + for (let j = 0; j < this.nodes.length; ++j) { + if 
(this.adjacency_matrix[i][j] && !this._isVisited(j)) { + return false; + } + } + return true; + } + _isVisited(i) { + return this.visited.includes(i); + } + _isNodeWithoutPredecessors(i) { + for (let j = 0; j < this.nodes.length; ++j) { + if (this.adjacency_matrix[j][i]) { + return false; + } + } + return true; + } + _getNextNode() { + const nodesWithoutPredecessorsYetToBeVisited = this.indexes.filter( + i => this._isNodeWithoutPredecessors(i) && !this._isVisited(i) + ); + + const candidate1 = this._lookForNextNodeInCandidates( + nodesWithoutPredecessorsYetToBeVisited + ); + if (candidate1.priority === Graph.MAX_PRIORITY) { + return { front_node: null, next_node: candidate1.index }; + } + + const successorsYetToBeVisited = this.front.reduce((successors, i) => { + this.indexes + .filter(j => this.adjacency_matrix[i][j] && !this._isVisited(j)) + .map(j => successors.add(j)); + return successors; + }, new Set()); + + const candidate2 = this._lookForNextNodeInCandidates( + successorsYetToBeVisited + ); + + if (candidate1.priority < candidate2.priority) { + return { front_node: null, next_node: candidate1.index }; + } + + if (candidate1.priority === candidate2.priority) { + if ( + candidate1.mean_priority_of_succcessors < + candidate2.mean_priority_of_succcessors + ) { + return { front_node: null, next_node: candidate1.index }; + } + } + + const front_node = this.indexes.find( + i => this.adjacency_matrix[i][candidate2.index] + ); + return { front_node, next_node: candidate2.index }; + } + _lookForNextNodeInCandidates(candidates) { + let next_node = null, + best_priority = Infinity, + current_mean, + best_mean = Infinity; + for (const candidate of candidates) { + if (this.nodes[candidate].priority < best_priority) { + best_priority = this.nodes[candidate].priority; + best_mean = this._meanPriorityOfSuccessors(candidate); + next_node = candidate; + if (this.nodes[candidate].priority === Graph.MAX_PRIORITY) { + break; + } + } else if (this.nodes[candidate].priority === 
best_priority) { + current_mean = this._meanPriorityOfSuccessors(candidate); + if (current_mean < best_mean) { + best_mean = current_mean; + next_node = candidate; + } + } + } + return { + index: next_node, + priority: best_priority, + mean_priority_of_succcessors: best_mean, + }; + } + _meanPriorityOfSuccessors(i) { + let sum = 0, + length = 0; + for (let j of this.indexes) { + if (this.adjacency_matrix[i][j] && !this._isVisited(j)) { + sum += this.nodes[j].priority; + ++length; + } + } + return length > 0 ? sum / length : 0; + } +} + +Graph.MAX_PRIORITY = 0; + +module.exports = Graph; diff --git a/lib/datastore/graph.test.js b/lib/datastore/graph.test.js new file mode 100644 --- /dev/null +++ b/lib/datastore/graph.test.js @@ -0,0 +1,123 @@ +const Graph = require("./graph.js"); +const assert = require("assert"); + +describe("graph", () => { + let graph; + beforeEach(() => { + graph = new Graph(); + }); + + it("Adding nodes and edges works correctly", () => { + graph.addNode(1, 0); + graph.addNode(2, 0); + graph.addNode(3, 0); + graph.addNode(4, 0); + graph.addNode(5, 1); + graph.addNode(6, 1); + graph.addNode(7, 0); + graph.addEdge(2, 3); + graph.addEdge(2, 4); + graph.addEdge(4, 5); + graph.addEdge(6, 7); + + assert.deepEqual(graph.adjacency_matrix, [ + [0, 0, 0, 0, 0, 0, 0], + [0, 0, 1, 1, 0, 0, 0], + [0, 0, 0, 0, 0, 0, 0], + [0, 0, 0, 0, 1, 0, 0], + [0, 0, 0, 0, 0, 0, 0], + [0, 0, 0, 0, 0, 0, 1], + [0, 0, 0, 0, 0, 0, 0], + ]); + }); + + // L1 M3 +----L4----+ + // + | | + // | | | + // v v v + // M2 L5 M6 + // + + // | + // v + // M7 + + it("Correctly runs best-first search on simple graph", () => { + graph.addNode("L1", 1); + graph.addNode("M2", 0); + graph.addNode("M3", 0); + graph.addNode("L4", 1); + graph.addNode("L5", 1); + graph.addNode("M6", 0); + graph.addNode("M7", 0); + graph.addEdge("L1", "M2"); + graph.addEdge("L4", "L5"); + graph.addEdge("L4", "M6"); + graph.addEdge("M6", "M7"); + + assert.deepEqual( + ["M3", "L1", "M2", "L4", "M6", "M7", "L5"], + 
graph.bestFirstSearch() + ); + }); + + // L1 M5 L6 +-----L12----+ + // + + | | + // | | | | + // v v v v + // +-----L2----+ +-----O7-----+ M13 L14 + // | | | | + + // | | | | | + // v v v v v + // M3 M4 +---L8---+ M11 M15 + // | | + // v v + // M9 M10 + + it("Correctly runs best-first search on complex graph", () => { + graph.addNode("L1", 1); + graph.addNode("L2", 1); + graph.addNode("M3", 0); + graph.addNode("M4", 0); + graph.addNode("M5", 0); + graph.addNode("L6", 1); + graph.addNode("O7", 2); + graph.addNode("L8", 1); + graph.addNode("M9", 0); + graph.addNode("M10", 0); + graph.addNode("M11", 0); + graph.addNode("L12", 1); + graph.addNode("M13", 0); + graph.addNode("L14", 1); + graph.addNode("M15", 0); + graph.addEdge("L1", "L2"); + graph.addEdge("L2", "M3"); + graph.addEdge("L2", "M4"); + graph.addEdge("L6", "O7"); + graph.addEdge("O7", "L8"); + graph.addEdge("L8", "M9"); + graph.addEdge("L8", "M10"); + graph.addEdge("O7", "M11"); + graph.addEdge("L12", "M13"); + graph.addEdge("L12", "L14"); + graph.addEdge("L14", "M15"); + + const expectedOrder = [ + "M5", + "L12", + "M13", + "L14", + "M15", + "L1", + "L2", + "M3", + "M4", + "L6", + "O7", + "M11", + "L8", + "M9", + "M10", + ]; + assert.deepEqual(expectedOrder, graph.bestFirstSearch()); + }); +}); diff --git a/lib/datastore/query-step.js b/lib/datastore/query-step.js new file mode 100644 --- /dev/null +++ b/lib/datastore/query-step.js @@ -0,0 +1,95 @@ +const object_hash = require("object-hash"); + +class QueryStep { + constructor(body) { + this.body = body; + } + hash() { + return QueryStep.hashBody(this.body); + } + static fromStage(stage, unwind = true) { + if (stage.$lookup) { + const clonedStageBody = Object.assign({}, stage.$lookup); + clonedStageBody.unwind = unwind; + return [new QueryStep.Lookup(clonedStageBody)]; + } else if (stage.$match) { + return Object.keys(stage.$match).map( + field => new QueryStep.Match({ [field]: stage.$match[field] }) + ); + } + throw new Error("Unsupported stage: " + 
JSON.stringify(stage)); + } + pushDump(dumps) { + dumps.push(this.body); + return dumps; + } + static hashBody(body) { + return object_hash(body, { + algorithm: "md5", + excludeKeys: key => key === "as", + }); + } + getUsedFields() { + throw new Error("Cannot be used on base QueryStep class"); + } +} + +QueryStep.Lookup = class extends QueryStep { + constructor(body) { + const cleared_body = { + from: body.from, + localField: body.localField, + foreignField: body.foreignField, + }; + cleared_body.as = QueryStep.hashBody(cleared_body); + super(cleared_body); + this.unwind = body.unwind; + } + hash() { + return this.body.as; + } + pushStage(pipeline) { + pipeline.push({ $lookup: this.body }); + if (this.unwind) { + pipeline.push({ $unwind: "$" + this.body.as }); + } + return pipeline; + } + getUsedFields() { + return this.body.localField.split("."); + } + getCost() { + return 8; + } +}; + +QueryStep.Match = class extends QueryStep { + pushStage(pipeline) { + pipeline.push({ $match: this.body }); + return pipeline; + } + getUsedFields() { + return getAllKeys(this.body) + .map(path => path.split(".")) + .reduce((acc, fields) => + acc.concat(fields.filter(field => !field.startsWith("$"))) + ); + } + getCost() { + return this.body.$or ? 
2 : 0; + } +}; + +function getAllKeys(obj) { + return Object.keys(obj).reduce((acc, key) => { + if (obj[key] instanceof Object) { + acc.push(...getAllKeys(obj[key])); + } + if (!Array.isArray(obj)) { + acc.push(key); + } + return acc; + }, []); +} + +module.exports = QueryStep; diff --git a/lib/datastore/query.js b/lib/datastore/query.js --- a/lib/datastore/query.js +++ b/lib/datastore/query.js @@ -1,56 +1,93 @@ "use strict"; -const Promise = require("bluebird"); -const hash_item = value => - require("object-hash")(value, { - algorithm: "md5", - excludeKeys: key => key === "as", - }); +const object_hash = require("object-hash"); +const QueryStep = require("./query-step.js"); +const transformObject = require("../utils/transform-object.js"); class Query { constructor() { - this.stages = []; + this.steps = []; } - lookup(body, unwind = true) { - body.as = hash_item(body); - this.stages.push({ $lookup: body, unwinds: unwind }); - return body.as; + lookup(body) { + const lookup_step = new QueryStep.Lookup(body); + this.steps.push(lookup_step); + return lookup_step.hash(); } match(body) { - this.stages.push({ $match: body }); - return this; + for (let key of Object.keys(body)) { + this.steps.push(new QueryStep.Match({ [key]: body[key] })); + } } dump() { - return this.stages; + return this.steps; } toPipeline() { - return this.stages.reduce( - (acc, stage) => this._pushToPipeline(acc, stage), + return this.steps.reduce( + (pipeline, query_step) => query_step.pushStage(pipeline), [] ); } - _pushToPipeline(pipeline, stage) { - if (!stage.$lookup) { - pipeline.push(stage); - } else { - pipeline.push({ $lookup: stage.$lookup }); - if (stage.unwinds) { - pipeline.push({ $unwind: "$" + stage.$lookup.as }); - } - } - return pipeline; - } static fromSingleMatch(body) { const query = new Query(); - return query.match(body); + query.match(body); + return query; } static fromCustomPipeline(stages) { const query = new Query(); - query.stages = stages; + let steps; + const 
field_as_to_hash = {}; + for (let i = 0; i < stages.length; ++i) { + if (stages[i].$unwind) { + continue; + } + const stage = transformObject( + stages[i], + prop => { + if (prop.startsWith("$")) { + return prop; + } + const fields = prop.split("."); + return fields + .map(field => field_as_to_hash[field] || field) + .join("."); + }, + (prop, value) => { + let fields; + if (typeof value !== "string") { + return value; + } + if (prop === "localField") { + fields = value.split("."); + } else if (value.startsWith("$")) { + fields = value.substring(1).split("."); + } else { + return value; + } + return fields + .map(field => field_as_to_hash[field] || field) + .join("."); + } + ); + steps = QueryStep.fromStage(stage, query._isUnwindStage(stages, i)); + if (stage.$lookup) { + const field_as = stage.$lookup.as; + field_as_to_hash[field_as] = steps[0].hash(); + } + + query.steps.push(...steps); + } return query; } + _isUnwindStage(stages, i) { + if (!stages[i].$lookup) { + return false; + } + return stages[i + 1] && stages[i + 1].$unwind; + } } +module.exports = Query; + Query.DenyAll = class extends Query { constructor() { super(); @@ -77,62 +114,7 @@ } }; -Query.Or = class extends Query { - constructor(...queries) { - super(); - this.lookups = {}; - this.matches = []; - for (let query of queries) { - this.addQuery(query); - } - } - addQuery(query) { - let stages = query.dump(); - const combined_match = {}; - for (let stage of stages) { - if (stage.$lookup) { - this._lookup(stage); - } else if (stage.$match) { - Object.assign(combined_match, stage.$match); - } else { - throw new Error("Unsupported query: " + Object.keys(stage)); - } - } - if (Object.keys(combined_match).length) - this.matches.push(combined_match); - } - _lookup(stage) { - const id = stage.$lookup.as; - this.lookups[id] = stage; - } - dump() { - return Object.values(this.lookups).concat({ - $match: { $or: this.matches }, - }); - } - toPipeline() { - return Object.values(this.lookups) - .reduce((acc, 
stage) => this._pushToPipeline(acc, stage), []) - .concat({ - $match: { $or: this.matches }, - }); - } - match(body) { - return Query.fromCustomPipeline([ - { $match: body }, - ...this.toPipeline(), - ]); - } -}; - -Query.And = class extends Query { - constructor(...queries) { - super(); - this.stages = queries - .map(query => query.stages) - .reduce((acc, stages) => acc.concat(stages), []); - } -}; +Query.Or = require("./query_or.js"); Query.Not = class extends Query { constructor(query) { @@ -150,4 +132,4 @@ } }; -module.exports = Query; +Query.And = require("./query_and.js"); diff --git a/lib/datastore/query.test.js b/lib/datastore/query.test.js new file mode 100644 --- /dev/null +++ b/lib/datastore/query.test.js @@ -0,0 +1,413 @@ +const Query = require("./query.js"); +const assert = require("assert"); +const QueryStep = require("./query-step.js"); + +describe("Query", () => { + describe("Query general", () => { + it("Creates correct query from custom pipeline", () => { + const pipeline = [ + { $match: { title: { $ne: "The Joy of PHP" }, edition: 1 } }, + { + $lookup: { + from: "authors", + localField: "author", + foreignField: "_id", + as: "author_item", + }, + }, + { + $unwind: "$author_item", + }, + { $match: { "author_item.name": { $regex: "some_regex" } } }, + { + $lookup: { + from: "states", + localField: "author.state", + foreignField: "_id", + as: "state_item", + }, + }, + { $unwind: "$state_item" }, + { + $match: { + $or: [ + { "author_item.age": { $le: 30 } }, + { edition: { $gt: 3 } }, + ], + "state_item.abbrevation": { $eq: "PL" }, + }, + }, + ]; + + const query = Query.fromCustomPipeline(pipeline); + + const authors_hash = hashLookup(pipeline[1]); + const states_hash = hashLookup(pipeline[4]); + const expected_pipeline = [ + { $match: { title: { $ne: "The Joy of PHP" } } }, + { $match: { edition: 1 } }, + { + $lookup: { + from: "authors", + localField: "author", + foreignField: "_id", + as: authors_hash, + }, + }, + { + $unwind: "$" + authors_hash, 
+ }, + { + $match: { + [`${authors_hash}.name`]: { + $regex: "some_regex", + }, + }, + }, + { + $lookup: { + from: "states", + localField: "author.state", + foreignField: "_id", + as: states_hash, + }, + }, + { $unwind: "$" + states_hash }, + { + $match: { + $or: [ + { [`${authors_hash}.age`]: { $le: 30 } }, + { edition: { $gt: 3 } }, + ], + }, + }, + { $match: { [`${states_hash}.abbrevation`]: { $eq: "PL" } } }, + ]; + + assert.deepEqual(query.toPipeline(), expected_pipeline); + }); + }); + describe("Query.Or", () => { + it("Returns correct pipeline stages for simple case", () => { + const queries = []; + + const M1 = { + title: { $ne: "The Joy of PHP" }, + }; + queries.push(Query.fromSingleMatch(M1)); + + let query = new Query(); + const L2 = { + from: "authors", + localField: "author", + foreignField: "_id", + unwind: true, + }; + const L2_id = query.lookup(L2); + const M3 = { + [`${L2_id}.last_name`]: { $in: ["Scott", "Dostoyevsky"] }, + }; + query.match(M3); + queries.push(query); + + const or = new Query.Or(...queries); + + const expected_pipeline = [ + { + $lookup: { + from: L2.from, + localField: L2.localField, + foreignField: L2.foreignField, + as: L2_id, + }, + }, + { $unwind: `$${L2_id}` }, + { $match: { $or: [M1, M3] } }, + ]; + assert.deepEqual(or.toPipeline(), expected_pipeline); + }); + + it("Returns correct pipeline stages when And query is provided", () => { + let queries = []; + let subquery = new Query(); + + const L1 = { + from: "authors", + localField: "author", + foreignField: "_id", + unwind: true, + }; + const L1_id = subquery.lookup(L1); + const M2 = { + [`${L1_id}.last_name`]: { $in: ["Christie", "Rowling"] }, + }; + subquery.match(M2); + queries.push(subquery); + + const M3 = { + title: { $ne: "The Joy of PHP" }, + }; + queries.push(Query.fromSingleMatch(M3)); + const and_1 = new Query.And(...queries); + + queries = []; + subquery = new Query(); + const L4 = { + from: "authors", + localField: "author", + foreignField: "_id", + unwind: 
true, + }; + const L4_id = subquery.lookup(L4); + const M4 = { + [`${L4_id}.middle_name`]: { $in: ["Brown", "Black"] }, + }; + subquery.match(M4); + queries.push(subquery); + + subquery = new Query(); + + subquery.lookup(L4); + const L5 = { + from: "publisher", + localField: `${L4_id}.publisher`, + foreignField: "publisher_id", + unwind: true, + }; + const L5_id = subquery.lookup(L5); + + const M6 = { + $or: [ + { [`${L4_id}.first_name`]: "Ann" }, + { [`${L5_id}.income`]: { $gt: 1000 } }, + ], + }; + subquery.match(M6); + + const M7 = { + price: { $lte: 100 }, + }; + subquery.match(M7); + queries.push(subquery); + const and_2 = new Query.And(...queries); + + const query = new Query.Or(and_1, and_2); + + const expected_pipeline = makeQueryFromStageBodies([ + L1, + L4, + L5, + { + $or: [{ $and: [M3, M2] }, { $and: [M7, M4, M6] }], + }, + ]).toPipeline(); + assert.deepEqual(expected_pipeline, query.toPipeline()); + }); + }); + describe("Query.And", () => { + it("Returns pipeline stages in correct order for simple case", () => { + const queries = []; + let query = new Query(); + + const L1 = { + from: "authors", + localField: "author", + foreignField: "_id", + unwind: true, + }; + const L1_id = query.lookup(L1); + const M2 = { + [`${L1_id}.last_name`]: { $in: ["Christie", "Rowling"] }, + }; + query.match(M2); + queries.push(query); + + const M3 = { + title: { $ne: "The Joy of PHP" }, + }; + queries.push(Query.fromSingleMatch(M3)); + + const and = new Query.And(...queries); + const stageBodies = [M3, L1, M2]; + assertStagesAreCorrectlyOrdered(stageBodies, and.toPipeline()); + assert.deepEqual(makeSteps(stageBodies), and.dump()); + }); + + function assertStagesAreCorrectlyOrdered( + expectedRawPipeline, + actualPipeline + ) { + const query = makeQueryFromStageBodies(expectedRawPipeline); + assert.deepEqual(actualPipeline, query.toPipeline()); + } + + function makeSteps(stageBodies) { + return stageBodies.reduce((acc, stageBody) => { + if (stageBody instanceof Query.Or) { 
+ return acc.concat(stageBody.dump()); + } + if (stageBody.from) { + return acc.concat(new QueryStep.Lookup(stageBody)); + } + return acc.concat(Query.fromSingleMatch(stageBody).dump()); + }, []); + } + + it("Returns pipeline stages in correct order for complex case", () => { + const queries = []; + let query = new Query(); + + const L1 = { + from: "authors", + localField: "author", + foreignField: "_id", + unwind: true, + }; + const L1_id = query.lookup(L1); + + const L2 = { + from: "publisher", + localField: `${L1_id}.publisher`, + foreignField: "publisher_id", + unwind: true, + }; + const L2_id = query.lookup(L2); + + const M3_4 = { + [`${L2_id}.city`]: { $in: ["A", "B"] }, + $or: [ + { [`${L1_id}.first_name`]: "Ann" }, + { [`${L2_id}.income`]: { $gt: 1000 } }, + ], + }; + query.match(M3_4); + queries.push(query); + + query = new Query(); + const M5 = { + title: { $ne: "The Joy of PHP" }, + }; + query.match(M5); + queries.push(query); + + let subquery1 = new Query(); + const O6_L1 = { + from: "libraries", + localField: "first_library", + foreignField: "library_id", + }; + const O6_L1_id = subquery1.lookup(O6_L1); + + const O6_M1 = { + [`${O6_L1_id}.street`]: { $in: ["A street", "B street"] }, + [`${O6_L1_id}.open_at_night`]: { $eq: true }, + }; + subquery1.match(O6_M1); + + const O6_M2 = { + books_count: { $lte: 30 }, + }; + let subquery2 = Query.fromSingleMatch(O6_M2); + const O6 = new Query.Or(subquery1, subquery2); + queries.push(O6); + + const O7_M1 = { + title: { + $in: ["PHP - Python Has Power", "The Good Parts of JS"], + }, + }; + const O7_M2 = O6_M2; + const O7 = new Query.Or( + Query.fromSingleMatch(O7_M1), + Query.fromSingleMatch(O7_M2) + ); + queries.push(O7); + + query = new Query(); + const L8 = { + from: "cover_types", + localField: "cover", + foreignField: "cover_type_id", + unwind: true, + }; + const L8_id = query.lookup(L8); + + const M9 = { + [`${L8_id}.name`]: { $ne: "hard" }, + }; + query.match(M9); + queries.push(query); + + query = new 
Query(); + // check if hashing is order insensitive + const L10 = { + localField: "cover", + from: "cover_types", + foreignField: "cover_type_id", + unwind: true, + }; + const L10_id = query.lookup(L10); + const M11 = { + [`${L10_id}.name`]: { $ne: "no_cover" }, + }; + query.match(M11); + queries.push(query); + + const stageBodies = [M5, O7, L8, M9, M11, O6, L1, L2, M3_4]; + let and = new Query.And(...queries); + assertStagesAreCorrectlyOrdered(stageBodies, and.toPipeline()); + assert.deepEqual(makeSteps(stageBodies), and.dump()); + }); + it("Returns deny all pipeline when provided Query.DenyAll", () => { + const queries = []; + let query = new Query(); + + const L1 = { + from: "authors", + localField: "author", + foreignField: "_id", + unwind: true, + }; + const L1_id = query.lookup(L1); + const M2 = { + [`${L1_id}.last_name`]: { $in: ["Christie", "Rowling"] }, + }; + query.match(M2); + queries.push(query); + + const deny_all_query = new Query.DenyAll(); + queries.push(deny_all_query); + + const M3 = { + title: { $ne: "The Joy of PHP" }, + }; + queries.push(Query.fromSingleMatch(M3)); + + const and = new Query.And(...queries); + assert.deepEqual(and.toPipeline(), deny_all_query.toPipeline()); + assert.deepEqual(and.dump(), deny_all_query.dump()); + }); + }); +}); + +function makeQueryFromStageBodies(stageBodies) { + const query = new Query(); + for (let i = 0; i < stageBodies.length; ++i) { + const stage = stageBodies[i]; + if (stage instanceof Query) { + query.steps.push(...stage.dump()); + } else if (stage.from) { + query.lookup(stage); + } else { + for (let step of Object.keys(stage)) { + query.match({ [step]: stage[step] }); + } + } + } + return query; +} + +function hashLookup({ $lookup }) { + const { as, ...lookup_without_as } = $lookup; + return QueryStep.hashBody(lookup_without_as); +} diff --git a/lib/datastore/query_and.js b/lib/datastore/query_and.js new file mode 100644 --- /dev/null +++ b/lib/datastore/query_and.js @@ -0,0 +1,87 @@ +const Query = 
require("./query.js"); +const QueryStep = require("./query-step.js"); +const Graph = require("./graph.js"); + +module.exports = class extends Query { + constructor(...queries) { + super(); + this._reset(); + for (let query of queries) { + this.addQuery(query); + } + } + _reset() { + this.graph = new Graph(); + this.aggregation_steps = {}; + this.received_deny_all = false; + } + addQuery(query) { + if (this.received_deny_all) { + return; + } + if (query instanceof Query.DenyAll) { + this._reset(); + this.received_deny_all = true; + } + const steps = query.dump(); + for (let step of steps) { + const id = step.hash(); + if (this._isInGraph(id)) { + continue; + } + + this._addToAggregationSteps(id, step); + this._addDependenciesInGraph(id, step); + } + } + _isInGraph(key) { + return key.length === 32 && this.graph.node_ids.includes(key); + } + _addToAggregationSteps(id, step) { + this.graph.addNode(id, step.getCost()); + this.aggregation_steps[id] = step; + } + _addDependenciesInGraph(id, step) { + let dependencies = step + .getUsedFields() + .filter(field => this._isInGraph(field)); + + if (step instanceof QueryStep.Match) { + dependencies = dependencies.filter(d1 => + this._isNotDependencyForAnyInGroup(d1, dependencies) + ); + } + + for (let dependency of dependencies) { + this.graph.addEdge(dependency, id); + } + } + _isNotDependencyForAnyInGroup(id, nodeGroup) { + return !nodeGroup.some( + node => id !== node && this.graph.pathExists(id, node) + ); + } + dump() { + const sortedStepIds = this.graph.bestFirstSearch(); + return sortedStepIds.reduce((steps, id) => { + if (Array.isArray(this.aggregation_steps[id])) { + steps.push(...this.aggregation_steps[id]); + } else { + steps.push(this.aggregation_steps[id]); + } + return steps; + }, []); + } + toPipeline() { + const sortedStepIds = this.graph.bestFirstSearch(); + return sortedStepIds.reduce((pipeline, id) => { + if (Array.isArray(this.aggregation_steps[id])) { + for (let step of this.aggregation_steps[id]) { + 
step.pushStage(pipeline); + } + return pipeline; + } + return this.aggregation_steps[id].pushStage(pipeline); + }, []); + } +}; diff --git a/lib/datastore/query_or.js b/lib/datastore/query_or.js new file mode 100644 --- /dev/null +++ b/lib/datastore/query_or.js @@ -0,0 +1,44 @@ +const Query = require("./query.js"); +const QueryStep = require("./query-step.js"); + +module.exports = class extends Query { + constructor(...queries) { + super(); + this.lookup_steps = []; + for (let query of queries) { + this.addQuery(query); + } + } + addQuery(query) { + const steps = query.dump(); + this.lookup_steps.push( + ...steps.filter(step => step instanceof QueryStep.Lookup) + ); + const match_stage_bodies = []; + steps + .filter(step => step instanceof QueryStep.Match) + .forEach(step => step.pushDump(match_stage_bodies)); + + const match_stage = + match_stage_bodies.length > 1 + ? { $and: match_stage_bodies } + : match_stage_bodies[0]; + this.steps.push(new QueryStep.Match(match_stage)); + } + dump() { + return this.lookup_steps.concat( + new QueryStep.Match({ $or: this._getMatchExpressions() }) + ); + } + toPipeline() { + const lookups = this.lookup_steps.reduce( + (acc, step) => step.pushStage(acc), + [] + ); + + return lookups.concat({ $match: { $or: this._getMatchExpressions() } }); + } + _getMatchExpressions() { + return this.steps.reduce((acc, step) => step.pushDump(acc), []); + } +}; diff --git a/lib/utils/transform-object.js b/lib/utils/transform-object.js new file mode 100644 --- /dev/null +++ b/lib/utils/transform-object.js @@ -0,0 +1,13 @@ +function transformObject(obj, prop_tranformer, value_transformer) { + return Object.keys(obj).reduce((new_obj, prop) => { + let new_prop = prop_tranformer(prop); + new_obj[new_prop] = + obj[prop] instanceof Object + ? transformObject(obj[prop], prop_tranformer, value_transformer) + : value_transformer(prop, obj[prop]); + + return new_obj; + }, Array.isArray(obj) ? 
[] : {}); +} + +module.exports = transformObject; diff --git a/package.json b/package.json --- a/package.json +++ b/package.json @@ -5,15 +5,13 @@ "description": "A declarative framework for fast & easy app development.", "main": "./lib/main.js", "scripts": { - "test": "mocha setup-test.js lib/**/*.test.js" + "test": "mocha setup-test.js \"./lib/**/*.test.js\"" }, "repository": { "type": "git", "url": "https://github.com/sealcode/sealious" }, - "keywords": [ - "sealious" - ], + "keywords": ["sealious"], "author": "The Sealious team (http://github.com/Sealious)", "license": "BSD-2-Clause", "bugs": { diff --git a/test_utils/access-strategy-types/create_strategies_with_complex_pipeline.js b/test_utils/access-strategy-types/create_strategies_with_complex_pipeline.js --- a/test_utils/access-strategy-types/create_strategies_with_complex_pipeline.js +++ b/test_utils/access-strategy-types/create_strategies_with_complex_pipeline.js @@ -17,11 +17,12 @@ localField: "body.number", foreignField: "sealious_id", }); - return query.match({ + query.match({ [`${id}._id`]: { $exists: strategy === "complex-allow-pipeline", }, }); + return query; }, checker_function: function() { return Promise.resolve();