Aggregation Framework Operators

$match operator

  • A filter operator: only documents that satisfy the specified criteria pass on to the next stage of the pipeline.

  • If $match is the first stage of the pipeline, it can take advantage of indexes.

  • We can use multiple $match stages in a single pipeline.

  • The $where operator cannot be used inside $match.

  • The $text query operator can be used only if the $match stage is the first stage of the pipeline.

  • $match operator does not allow projections.

  • Example:

    • A sample query on a collection named movies with the following criteria:

      • imdb.rating is at least 7

      • genres does not contain "Crime" or "Horror"

      • rated is either "PG" or "G"

      • languages contains "English" and "Japanese"

          db.movies.aggregate([
          { $match: {
              "imdb.rating": { "$gte": 7 },
              "genres": { "$nin": [ "Crime", "Horror" ] } ,
              "rated": { "$in": ["PG", "G" ] },
              "languages": { "$all": [ "English", "Japanese" ] }
              }
          }])
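
  • The $text restriction above can be sketched as follows. This is a minimal illustration, assuming the movies collection has a text index (hypothetically on the title field); the search term is made up. The aggregate call is guarded so the snippet also loads outside the mongo shell.

```javascript
// Assumption: a text index exists, e.g. db.movies.createIndex({ title: "text" })
const textPipeline = [
  // $text is only accepted when this $match is the first stage
  { $match: { $text: { $search: "samurai" } } },
  // A later $match can still apply ordinary query operators
  { $match: { "imdb.rating": { $gte: 7 } } }
];

// Run only where a shell `db` object is available
if (typeof db !== "undefined") {
  db.movies.aggregate(textPipeline).forEach(doc => printjson(doc));
}
```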

$project operator

  • A map-like operator that applies a transformation to every document passing through the stage.

  • Once we specify one field to retain, we must specify every field we want to retain. The _id field is the only exception: it is kept unless explicitly excluded.

  • It lets us add new fields.

  • This operator can be used as many times as required within an aggregation pipeline.

  • It can also be used to reassign values to existing field names and to derive entirely new fields.

  • Example:

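    • Adding a $project stage to the example above:

        var pipeline = [
          {
            $match: {
              "imdb.rating": { "$gte": 7 },
              "genres": { "$nin": [ "Crime", "Horror" ] },
              "rated": { "$in": [ "PG", "G" ] },
              "languages": { "$all": [ "English", "Japanese" ] }
            }
          },
          {
            // Keep only title and rated; _id is dropped explicitly
            "$project": { "_id": 0, "title": 1, "rated": 1 }
          }
        ]
        db.movies.aggregate(pipeline)

  • Example: keep only movies whose title is a single word.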

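        db.movies.aggregate([
          {
            // Only string titles can be passed to $split
            $match: {
              title: { $type: "string" }
            }
          },
          {
            // Replace title with the array of its space-separated words
            $project: {
              title: { $split: ["$title", " "] },
              _id: 0
            }
          },
          {
            // Keep documents whose title array has exactly one element
            $match: {
              title: { $size: 1 }
            }
          }
        ])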
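
  • Reassigning an existing field and deriving a new one, as described in the bullets above, might look like the sketch below. The field name titleWordCount is hypothetical, and the aggregate call is guarded so the snippet also loads outside the mongo shell.

```javascript
const derivePipeline = [
  // Keep only documents whose title is a string, so $split cannot fail
  { $match: { title: { $type: "string" } } },
  {
    $project: {
      _id: 0,                         // _id must be excluded explicitly
      title: 1,                       // retain an existing field unchanged
      rated: { $toLower: "$rated" },  // reassign an existing field name
      // derive a new field (hypothetical name): number of words in the title
      titleWordCount: { $size: { $split: ["$title", " "] } }
    }
  }
];

// Run only where a shell `db` object is available
if (typeof db !== "undefined") {
  db.movies.aggregate(derivePipeline).forEach(doc => printjson(doc));
}
```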

$addFields operator

  • Adds fields to a document.

  • It's a transformation operator, just like $project, but it keeps all existing fields of the incoming document.

  • It can modify existing fields or add new computed fields to incoming documents.
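
  • A minimal sketch of $addFields, assuming the same movies collection as the earlier examples. The field name ratingOutOfFive and the "UNRATED" default are hypothetical. Every field not mentioned in the stage is passed through untouched.

```javascript
const addFieldsPipeline = [
  // Guard: $divide fails on non-numeric input, so keep numeric ratings only
  { $match: { "imdb.rating": { $type: "number" } } },
  {
    $addFields: {
      // modify an existing field: fill in a default when rated is missing
      rated: { $ifNull: ["$rated", "UNRATED"] },
      // add a new computed field (hypothetical name)
      ratingOutOfFive: { $divide: ["$imdb.rating", 2] }
    }
  }
];

// Run only where a shell `db` object is available
if (typeof db !== "undefined") {
  db.movies.aggregate(addFieldsPipeline).forEach(doc => printjson(doc));
}
```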

$geoNear operator

  • Performs geospatial queries within an aggregation pipeline (like the $near query operator).

  • It must be the first stage in the pipeline.

  • It can be used with sharded collections.

  • When performing this operation on a collection, the collection must have a geospatial index. If it has more than one, the key option must name the indexed field to use.

  • If using 2dsphere, the distance returned is in meters. If using legacy coordinates, the distance returned is in radians.

  • Syntax
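
    • A hedged sketch of the stage's general shape. The places collection, its location field (assumed to carry a 2dsphere index), the coordinates, and the query filter are all hypothetical; only the option names come from the $geoNear stage itself.

```javascript
const geoPipeline = [
  {
    // $geoNear has to be the first stage
    $geoNear: {
      near: { type: "Point", coordinates: [-73.98, 40.75] }, // GeoJSON: [longitude, latitude]
      distanceField: "distanceFromPoint", // required: output field for the computed distance
      key: "location",                    // which indexed field to use (needed if several geo indexes exist)
      maxDistance: 5000,                  // meters, since `near` is GeoJSON
      query: { type: "Hospital" },        // optional extra filter (hypothetical field)
      spherical: true
    }
  },
  { $limit: 5 }
];

// Run only where a shell `db` object is available
if (typeof db !== "undefined") {
  db.places.aggregate(geoPipeline).forEach(doc => printjson(doc));
}
```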

Cursor-like stages

  • $count, $skip, $sort, $limit

  • Syntax

  • For the $sort stage, we can specify multiple fields to sort on.

  • If $sort is near the beginning of the pipeline, i.e., before any $project, $unwind, or $group stage, it can take advantage of indexes. Otherwise, it performs an in-memory sort, which increases memory consumption.

  • A sort operation in an aggregation pipeline is limited by default to 100 MB of RAM. To handle larger datasets, disk use must be allowed by passing allowDiskUse: true to the aggregation.

    db.movies.aggregate([
        // ...earlier stages (stage1, stage2) go here...
        { "$sort": { "field": 1 } }
    ], { allowDiskUse: true })

  • The allowDiskUse option lets the sort spill to disk once it exceeds 100 MB of memory. Without it, the server terminates the operation.
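
  • The four cursor-like stages listed above can be sketched as follows, assuming the same movies collection. The output field name pgMovies and the skip/limit values are illustrative only. $sort is placed first so that it has a chance to use an index.

```javascript
// Paging through the top-rated titles
const pagePipeline = [
  { $sort: { "imdb.rating": -1, title: 1 } }, // several sort keys; first stage, so index-eligible
  { $skip: 10 },                              // drop the first 10 documents
  { $limit: 5 },                              // pass on at most 5 documents
  { $project: { _id: 0, title: 1, "imdb.rating": 1 } }
];

// Counting the documents that reach the stage
const countPipeline = [
  { $match: { rated: "PG" } },
  { $count: "pgMovies" } // emits a single document with one field named pgMovies
];

// Run only where a shell `db` object is available
if (typeof db !== "undefined") {
  db.movies.aggregate(pagePipeline, { allowDiskUse: true }).forEach(doc => printjson(doc));
  db.movies.aggregate(countPipeline).forEach(doc => printjson(doc));
}
```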
