MongoDB Aggregation operations
A beginner's guide to MongoDB aggregation operations
MongoDB is a document-oriented NoSQL database with many powerful features. One of the most useful is the ability to run aggregation operations.
Aggregation operations
Aggregation operations are a way of processing a large number of documents or records in a collection and returning computed results. Aggregation operations group values from multiple documents together and can perform a variety of operations on the grouped data to return a single result.
Aggregate query syntax
The basic syntax of an aggregate query is as follows.
db.COLLECTION_NAME.aggregate(pipeline, options)
Here,
COLLECTION_NAME: the name of the collection (roughly the counterpart of a table in a relational database)
pipeline: an array that defines the sequence of aggregation stages, applied in order
options: an optional document of additional settings for the aggregate query
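As a minimal sketch of how the two arguments fit together, the snippet below builds a pipeline array and an options object as plain JavaScript values. The Employees collection and its fields are the ones used in the examples later in this guide. allowDiskUse is a real option, picked here only for illustration, and the helper name runAggregate is hypothetical.

```javascript
// Each element of the pipeline array is one stage; stages run in order.
const pipeline = [
  { $match: { Designation: "SE" } },                       // stage 1: filter documents
  { $group: { _id: "$Designation", count: { $sum: 1 } } }  // stage 2: group and count
];

// Optional second argument; allowDiskUse lets large stages write temporary files to disk.
const options = { allowDiskUse: true };

// Hypothetical helper: `db` is assumed to be the database handle the mongo shell provides.
function runAggregate(db) {
  return db.Employees.aggregate(pipeline, options);
}
```

In the shell this amounts to calling runAggregate(db), or simply db.Employees.aggregate(pipeline, options).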
Aggregate examples
Before running an aggregate query, we have to insert some data. We can use insertOne or insertMany to insert data into a collection; the older insert method used below does the same job but is deprecated in recent versions of the shell.
Consider the following data in a collection named Employees.
db.Employees.insert({ _id: "1", Name: "John", Designation: "SE", Salary: 1000 });
db.Employees.insert({ _id: "2", Name: "Don", Designation: "SSE", Salary: 100 });
db.Employees.insert({ _id: "3", Name: "Smith", Designation: "JSE", Salary: 1000 });
db.Employees.insert({ _id: "4", Name: "Kumar", Designation: "SE", Salary: 100 });
db.Employees.insert({ _id: "5", Name: "Akash", Designation: "SE", Salary: 100 });
db.Employees.insert({ _id: "6", Name: "Roky", Designation: "SSE", Salary: 1000 });
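insertMany is mentioned above but not shown. A minimal sketch of the same six documents inserted in a single call might look like this; the helper name seedEmployees is hypothetical, and `db` is assumed to be the database handle available in the mongo shell.

```javascript
// The same six sample documents, collected in one array.
const employees = [
  { _id: "1", Name: "John",  Designation: "SE",  Salary: 1000 },
  { _id: "2", Name: "Don",   Designation: "SSE", Salary: 100 },
  { _id: "3", Name: "Smith", Designation: "JSE", Salary: 1000 },
  { _id: "4", Name: "Kumar", Designation: "SE",  Salary: 100 },
  { _id: "5", Name: "Akash", Designation: "SE",  Salary: 100 },
  { _id: "6", Name: "Roky",  Designation: "SSE", Salary: 1000 }
];

// Hypothetical helper: inserts all documents with one insertMany call.
function seedEmployees(db) {
  return db.Employees.insertMany(employees);
}
```

One call means one round trip to the server instead of six, which matters more as the data set grows.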
COUNT
With the data inserted, suppose we need the number of employees whose Designation is "SE".
We can do this in any of the following ways (in recent versions, countDocuments is preferred over the deprecated count):
db.getCollection('Employees').find({"Designation" : "SE"}).count()
or
db.getCollection('Employees').count({"Designation" : "SE"})
or
db.getCollection('Employees').find({"Designation" : "SE"}).length()
or
db.Employees.count({"Designation" : "SE"})
But what if you don't know the possible values of Designation in advance? In this case, an aggregate query solves the problem easily by counting the employees for every Designation at once.
db.Employees.aggregate(
  [
    {
      $group: {
        _id: "$Designation",
        count: {$sum: 1}
      }
    }
  ]
);
We will get the results as follows.
/* 1 */
{
  "_id" : "SE",
  "count" : 3.0
}

/* 2 */
{
  "_id" : "SSE",
  "count" : 2.0
}

/* 3 */
{
  "_id" : "JSE",
  "count" : 1.0
}
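To make the grouping less mysterious, here is a plain JavaScript sketch of what this $group stage computes on the sample data. It runs without a database and only illustrates the idea; it is not how MongoDB implements the stage. The function name countByDesignation is made up for this example.

```javascript
// Designation values of the six sample employees, in _id order "1" to "6".
const designations = ["SE", "SSE", "JSE", "SE", "SE", "SSE"];

// Keep one counter per distinct value, the same idea as count: {$sum: 1}.
function countByDesignation(values) {
  const counts = new Map();
  for (const d of values) {
    counts.set(d, (counts.get(d) || 0) + 1);
  }
  // Shape each entry like a $group result document.
  return Array.from(counts, ([_id, count]) => ({ _id, count }));
}

console.log(countByDesignation(designations));
// [ { _id: 'SE', count: 3 }, { _id: 'SSE', count: 2 }, { _id: 'JSE', count: 1 } ]
```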
SUM
From the above data set, we want to calculate the total salary for each Designation. We can use the following aggregate query.
db.Employees.aggregate(
  [
    {
      $group: {
        _id: "$Designation",
        count: {$sum: 1},
        totalSalary: {$sum: '$Salary'}
      }
    }
  ]
);
We will get the results as follows.
/* 1 */
{
  "_id" : "SSE",
  "count" : 2.0,
  "totalSalary" : 1100.0
}

/* 2 */
{
  "_id" : "JSE",
  "count" : 1.0,
  "totalSalary" : 1000.0
}

/* 3 */
{
  "_id" : "SE",
  "count" : 3.0,
  "totalSalary" : 1200.0
}
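If a single total for the whole collection is wanted rather than one per Designation, grouping on _id: null puts every document into one group. The sketch below shows such a pipeline next to a plain JavaScript check of the arithmetic on the six sample salaries; the variable names are chosen for this example only.

```javascript
// Pipeline sketch: _id: null places all documents in a single group.
const grandTotalPipeline = [
  { $group: { _id: null, count: { $sum: 1 }, totalSalary: { $sum: "$Salary" } } }
];

// Salaries of the six sample employees, in _id order "1" to "6".
const salaries = [1000, 100, 1000, 100, 100, 1000];

// The same sum and count computed by hand, without a database.
const grandTotal = {
  _id: null,
  count: salaries.length,
  totalSalary: salaries.reduce((sum, s) => sum + s, 0)
};

console.log(grandTotal); // { _id: null, count: 6, totalSalary: 3300 }
```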
AVERAGE
If we need the average salary for each Designation, how can we calculate it?
We can use the following aggregate query.
db.Employees.aggregate(
  [
    {
      $group: {
        _id: "$Designation",
        count: {$sum: 1},
        totalSalary: {$sum: '$Salary'},
        // $avg can read the field directly; no inner $sum is needed
        averageSalary: {$avg: '$Salary'}
      }
    }
  ]
);
We will get the results as follows.
/* 1 */
{
  "_id" : "SE",
  "count" : 3.0,
  "totalSalary" : 1200.0,
  "averageSalary" : 400.0
}

/* 2 */
{
  "_id" : "SSE",
  "count" : 2.0,
  "totalSalary" : 1100.0,
  "averageSalary" : 550.0
}

/* 3 */
{
  "_id" : "JSE",
  "count" : 1.0,
  "totalSalary" : 1000.0,
  "averageSalary" : 1000.0
}
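A quick way to sanity-check these numbers is that each average should equal totalSalary divided by count. The plain JavaScript sketch below redoes that division on the totals and counts from the results above; it is a cross-check only, and the variable names are made up for this example.

```javascript
// Totals and counts per Designation, taken from the results above.
const groups = [
  { _id: "SE",  count: 3, totalSalary: 1200 },
  { _id: "SSE", count: 2, totalSalary: 1100 },
  { _id: "JSE", count: 1, totalSalary: 1000 }
];

// average = total / count for each group.
const withAverage = groups.map((g) => ({
  ...g,
  averageSalary: g.totalSalary / g.count
}));

console.log(withAverage.map((g) => g.averageSalary)); // [ 400, 550, 1000 ]
```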
MATCH
Suppose we want the total and average salary for each Designation, counting only employees whose salary is greater than or equal to 100 and less than 1000.
We can use the following aggregate query.
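db.Employees.aggregate(
  [
    // keep only documents with Salary < 1000
    {
      $match: {"Salary": {"$lt": 1000}}
    },
    // of those, keep only documents with Salary >= 100
    {
      $match: {"Salary": {"$gte": 100}}
    },
    // group whatever is left by Designation
    {
      $group: {
        _id: "$Designation",
        count: {$sum: 1},
        totalSalary: {$sum: '$Salary'},
        averageSalary: {$avg: '$Salary'}
      }
    }
  ]
);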
We will get the results as follows.
/* 1 */
{
  "_id" : "SSE",
  "count" : 1.0,
  "totalSalary" : 100.0,
  "averageSalary" : 100.0
}

/* 2 */
{
  "_id" : "SE",
  "count" : 2.0,
  "totalSalary" : 200.0,
  "averageSalary" : 100.0
}
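The two range conditions can also be written as one $match stage, since both apply to the same field. The sketch below shows that single-stage form, plus a plain JavaScript filter that mimics the condition on the sample data to show which employees survive it. The filter is an illustration only, and the names matchStage and inRange are made up for this example.

```javascript
// Single-stage form: both bounds on Salary in one $match.
const matchStage = { $match: { Salary: { $gte: 100, $lt: 1000 } } };

const employees = [
  { Name: "John",  Designation: "SE",  Salary: 1000 },
  { Name: "Don",   Designation: "SSE", Salary: 100 },
  { Name: "Smith", Designation: "JSE", Salary: 1000 },
  { Name: "Kumar", Designation: "SE",  Salary: 100 },
  { Name: "Akash", Designation: "SE",  Salary: 100 },
  { Name: "Roky",  Designation: "SSE", Salary: 1000 }
];

// Same condition in plain JavaScript: 100 <= Salary < 1000.
const inRange = (e) => e.Salary >= 100 && e.Salary < 1000;

const matched = employees.filter(inRange).map((e) => e.Name);
console.log(matched); // [ 'Don', 'Kumar', 'Akash' ]
```

Only one SSE and two SE employees pass the filter, which is why JSE does not appear in the results at all.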
MAX
If we want the maximum salary for each Designation, we can use the following aggregate query.
db.Employees.aggregate(
  [
    {
      $group: {
        _id: "$Designation",
        count: {$sum: 1},
        maxSalary: {$max: '$Salary'}
      }
    }
  ]
);
We will get the results as follows.
/* 1 */
{
  "_id" : "SSE",
  "count" : 2.0,
  "maxSalary" : 1000.0
}

/* 2 */
{
  "_id" : "SE",
  "count" : 3.0,
  "maxSalary" : 1000.0
}

/* 3 */
{
  "_id" : "JSE",
  "count" : 1.0,
  "maxSalary" : 1000.0
}
MIN
If we want the minimum salary for each Designation, we can use the following aggregate query.
db.Employees.aggregate(
  [
    {
      $group: {
        _id: "$Designation",
        count: {$sum: 1},
        minSalary: {$min: '$Salary'}
      }
    }
  ]
);
We will get the results as follows.
/* 1 */
{
  "_id" : "SSE",
  "count" : 2.0,
  "minSalary" : 100.0
}

/* 2 */
{
  "_id" : "SE",
  "count" : 3.0,
  "minSalary" : 100.0
}

/* 3 */
{
  "_id" : "JSE",
  "count" : 1.0,
  "minSalary" : 1000.0
}
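$max and $min do not need separate queries: both accumulators can sit in the same $group stage. The sketch below shows such a pipeline, followed by a plain JavaScript cross-check on the sample salaries. The cross-check is an illustration and not MongoDB's implementation, and the variable names are made up for this example.

```javascript
// Pipeline sketch: maximum and minimum salary per Designation in one pass.
const maxMinPipeline = [
  {
    $group: {
      _id: "$Designation",
      maxSalary: { $max: "$Salary" },
      minSalary: { $min: "$Salary" }
    }
  }
];

// Sample salaries, already arranged by Designation.
const salariesByDesignation = {
  SE: [1000, 100, 100],
  SSE: [100, 1000],
  JSE: [1000]
};

// The same maximum and minimum computed by hand.
const ranges = Object.entries(salariesByDesignation).map(([_id, s]) => ({
  _id,
  maxSalary: Math.max(...s),
  minSalary: Math.min(...s)
}));

console.log(ranges);
```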
Summary
We discussed some basic aggregate operations of MongoDB. There are many other aggregation stages and operators as well. If you are worried about the performance of an aggregate query, it helps to know that the aggregation pipeline has an optimization phase that automatically reshapes the pipeline with the aim of improving its performance.
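If you are curious about what the server plans to do with a pipeline, you can ask for its execution plan instead of its results. The sketch below wraps the shell's explain() call in a hypothetical helper named explainPipeline; `db` is assumed to be the database handle available in the mongo shell, and the exact shape of the returned plan depends on the server version, so it is not shown here.

```javascript
// The MATCH pipeline from this guide, with a shorter $group for brevity.
const matchPipeline = [
  { $match: { Salary: { $lt: 1000 } } },
  { $match: { Salary: { $gte: 100 } } },
  { $group: { _id: "$Designation", count: { $sum: 1 } } }
];

// Hypothetical helper: returns the plan for the pipeline rather than its results.
function explainPipeline(db) {
  return db.Employees.explain().aggregate(matchPipeline);
}
```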
So Cheers!!