Quick Reference · document database & aggregation

MongoDB cheat sheet 8.0 / 8.2

MongoDB stores documents (BSON) in collections. Everything is one of: shape the store (structure), write data, read/query data, transform via the aggregation pipeline, or operate the cluster (admin). MongoDB 8.0 added a cross-collection bulkWrite, range Queryable Encryption, and big time-series & sharding gains. Cards tagged 8.0+ mark recent additions; Atlas marks Atlas-only features.
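As a rough illustration of the cross-collection bulkWrite mentioned above, the sketch below builds the 8.0 `bulkWrite` database command as a plain object. The namespaces `shop.orders` and `shop.audit` and all field values are hypothetical, and the command shape (`ops` referring to `nsInfo` by index) reflects the 8.0 command reference as best understood here; check it against your server version before relying on it.

```javascript
// Sketch of the MongoDB 8.0 bulkWrite database command: one request, two collections.
// "shop.orders" and "shop.audit" are hypothetical namespaces.
const bulkWriteCommand = {
  bulkWrite: 1,
  // Each op names its target collection by index into nsInfo.
  ops: [
    { insert: 0, document: { _id: 1, item: "pen", qty: 3 } },
    { update: 0, filter: { _id: 1 }, updateMods: { $inc: { qty: 1 } } },
    { insert: 1, document: { event: "order-created", orderId: 1 } }
  ],
  nsInfo: [{ ns: "shop.orders" }, { ns: "shop.audit" }],
  ordered: true
};

// Only executes inside mongosh (where `db` exists) against an 8.0+ deployment.
if (typeof db !== "undefined") {
  printjson(db.adminCommand(bulkWriteCommand));
}
```

Outside mongosh the guard is skipped, so the block only defines the command document.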


Distilled & cross-checked against: mongodb.com/docs/manual · MongoDB 8.0 / 8.1 / 8.2 release notes · aggregation-stages reference · mongosh reference · Queryable Encryption docs · quickref.me

The aggregation pipeline — documents flow stage → stage, each reshaping the stream
db.coll.aggregate([ ... ])
collection (raw docs) → $match (filter early) → $group / $setWindowFields (aggregate) → $lookup (join) → $project (reshape) → $sort / $limit (order + page) → result (or $merge / $out)

Pipeline rules that decide correctness & speed
• Put $match (and $sort on an indexed field) FIRST: as a rule, only $match/$sort at the front of a pipeline can use an index.
• Each blocking stage is capped at 100 MB of RAM. Since MongoDB 6.0 a stage that exceeds the cap spills to disk by default (allowDiskUseByDefault); on older versions, or with that default turned off, pass { allowDiskUse: true } for large $group/$sort, or shrink docs with $project early.
• $group collapses the stream; $setWindowFields computes across a window but keeps every document (running totals, ranks).
• $out / $merge must be LAST and write to a collection; $search / $vectorSearch must be FIRST (Atlas only).
• $lookup joins another collection; MongoDB 8.0 lets it run inside a transaction even when targeting a sharded collection.
• Documents pass by value: a stage never mutates the source collection unless it's $out or $merge.
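The stage order described above can be sketched as a single pipeline. This is a minimal example, assuming a hypothetical `orders` collection with `status`, `customerId`, `amount`, and `createdAt` fields, a hypothetical `customers` collection, and a hypothetical output collection `top_customers`.

```javascript
// Hypothetical collections and field names throughout.
const pipeline = [
  // 1. Filter first so an index such as { status: 1, createdAt: -1 } can be used.
  { $match: { status: "paid", createdAt: { $gte: new Date("2025-01-01") } } },
  // 2. Collapse the stream to one document per customer.
  { $group: { _id: "$customerId", total: { $sum: "$amount" }, orders: { $sum: 1 } } },
  // 3. Join customer details from another collection.
  { $lookup: { from: "customers", localField: "_id", foreignField: "_id", as: "customer" } },
  // 4. Reshape, then order and page.
  { $project: { total: 1, orders: 1, name: { $first: "$customer.name" } } },
  { $sort: { total: -1 } },
  { $limit: 10 },
  // 5. A write stage, when present, goes last.
  { $merge: { into: "top_customers", whenMatched: "replace", whenNotMatched: "insert" } }
];

// Only executes inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  db.orders.aggregate(pipeline, { allowDiskUse: true });
}
```

Because the pipeline ends in `$merge`, the aggregate call returns no documents; drop the last stage to read the top ten directly.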
01 Connect & Shell · mongosh
02 Databases · structure
03 Collections · structure
04 Insert · write
05 Find: Basics · read
06 Query Operators · read · inside the filter
07 Projection · read · shape the output
08 Sort, Skip & Limit · read · cursor modifiers
09 Update Operators · write · inside the update
10 Update Methods · write
11 Delete · write · handle with care
12 Bulk Write (8.0+) · write · batched ops
13 Pipeline: Core Stages · read · aggregate()
14 Pipeline: Reshape & Combine · read / write-out
15 Window & Analytics (5.0+) · read · per-window calc
16 Expression Operators · used inside pipeline stages
17 Accumulators · used in $group / $setWindowFields
18 Search & Vector (Atlas) · read · must be first stage
19 Indexes · structure · speed up reads
20 Index Types · structure
21 Geospatial · read · needs 2dsphere index
22 Text Search · read · native (non-Atlas)
23 BSON Data Types · what a field can hold

Scalars

Double / Int32 / Long: numeric; Long via NumberLong()
Decimal128: exact decimal (money, precise finance)
String: UTF-8 text
Boolean / Null: true/false; explicit null

Structured

Object: embedded document
Array: ordered list, mixed types allowed

Identity & Time

ObjectId: 12-byte default _id, time-ordered
Date: ms since epoch; ISODate()
Timestamp: internal replication use

Special

BinData: binary; subtype 4 = UUID
Regex / MinKey / MaxKey: pattern; comparison sentinels
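Two of the type notes above are easy to check in plain JavaScript. The sketch below decodes the creation time from an ObjectId's leading 4 bytes (which is why ObjectIds sort roughly by time), and shows the double-precision limits that motivate Decimal128 and Long. The helper name `objectIdTimestamp` is hypothetical; in mongosh the built-in `ObjectId(...).getTimestamp()` does the same job.

```javascript
// An ObjectId's first 4 bytes are a big-endian Unix timestamp in seconds.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) throw new Error("expected 24 hex characters");
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp("507f1f77bcf86cd799439011");
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z

// Doubles cannot represent most decimal fractions exactly (hence Decimal128 for money).
console.log(0.1 + 0.2 === 0.3); // false

// Doubles also lose integer precision above 2^53 (hence Long for 64-bit integers).
console.log(Number.MAX_SAFE_INTEGER + 1 === Number.MAX_SAFE_INTEGER + 2); // true
```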
24 Schema Validation · structure
25 Time Series · structure · 5.0+, big 8.0 gains
26 Change Streams · read · react to writes
27 Transactions · admin · multi-doc ACID
28 Read / Write Concern · admin · durability & consistency
29 Users & Roles · admin · RBAC
30 Queryable Encryption (8.0+) · admin · encrypt + query
31 Backup, Restore & Import · shell tools · not in mongosh
32 Replication & Sharding · admin · scale & HA
33 Explain & Performance · read · tuning
Operators Quick Map · $ prefix by context

Three concepts worth picturing

The mental models that trip people up most in MongoDB 8.x — drawn from the aggregation reference and 8.0 release notes.

$group vs. $setWindowFields

$group collapses the stream to one doc per key; $setWindowFields computes across a window but keeps every doc.

$group: eng 120, eng 100, sales 90, sales 60 → eng 220, sales 150
$setWindowFields: eng 120 (∑120), eng 100 (∑220), sales 90 (∑90), sales 60 (∑150); running total, all rows kept
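The difference can be reproduced in memory with the same four rows. This is a plain-JavaScript stand-in for the two stages, not the server's implementation; the field names `team`, `amount`, and `seq` are assumptions, as are the helper names.

```javascript
// Same four rows as the picture above; `seq` is a hypothetical ordering field.
const docs = [
  { seq: 1, team: "eng", amount: 120 },
  { seq: 2, team: "eng", amount: 100 },
  { seq: 3, team: "sales", amount: 90 },
  { seq: 4, team: "sales", amount: 60 }
];

// Stand-in for { $group: { _id: "$team", total: { $sum: "$amount" } } }.
function groupTotals(rows) {
  const totals = new Map();
  for (const { team, amount } of rows) {
    totals.set(team, (totals.get(team) ?? 0) + amount);
  }
  return [...totals].map(([team, total]) => ({ _id: team, total }));
}

// Stand-in for a running $sum per partition; rows are assumed already sorted by seq.
function runningTotals(rows) {
  const seen = new Map();
  return rows.map((row) => {
    const running = (seen.get(row.team) ?? 0) + row.amount;
    seen.set(row.team, running);
    return { ...row, running };
  });
}

// The equivalent server-side stage (a window over documents requires sortBy).
const windowStage = {
  $setWindowFields: {
    partitionBy: "$team",
    sortBy: { seq: 1 },
    output: {
      running: { $sum: "$amount", window: { documents: ["unbounded", "current"] } }
    }
  }
};

console.log(groupTotals(docs).length);   // 2 documents: one per team
console.log(runningTotals(docs).length); // 4 documents: every row kept
```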

Embed vs. $lookup reference

Embed data you read together; reference (and $lookup) when it's large, shared, or grows unbounded.

embed: user { name, orders: [ {..}, {..} ] } (one read)
reference + $lookup: users { _id: 7 } ← orders { uid: 7 }, joined via $lookup
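The two shapes can be sketched side by side. The field names `name`, `sku`, and `uid` and the helper `lookupInMemory` are hypothetical; the helper is only an in-memory stand-in that mimics what an equality `$lookup` returns, to show that the joined result has the same shape as the embedded document.

```javascript
// Embedded: one document, one read.
const embeddedUser = { _id: 7, name: "Ada", orders: [{ sku: "A1" }, { sku: "B2" }] };

// Referenced: two collections linked by orders.uid -> users._id.
const users = [{ _id: 7, name: "Ada" }];
const orders = [
  { _id: 1, uid: 7, sku: "A1" },
  { _id: 2, uid: 7, sku: "B2" },
  { _id: 3, uid: 8, sku: "C3" }
];

// The stage you would pass to db.users.aggregate([...]).
const lookupStage = {
  $lookup: { from: "orders", localField: "_id", foreignField: "uid", as: "orders" }
};

// In-memory stand-in for an equality $lookup.
function lookupInMemory(left, right, { localField, foreignField, as }) {
  return left.map((doc) => ({
    ...doc,
    [as]: right.filter((r) => r[foreignField] === doc[localField])
  }));
}

const joined = lookupInMemory(users, orders, lookupStage.$lookup);
console.log(joined[0].orders.map((o) => o.sku)); // [ 'A1', 'B2' ]
```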

Queryable Encryption range 8.0

Fields are encrypted client-side, stored as opaque BinData — yet range queries still run server-side.

client (plain): amount: 1500 → encrypt → server (cipher): BinData(0x9f..)
find({amount: {$gt:1000}}) matches without decrypting on the server
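A hedged sketch of how a range-queryable field might be declared. The collection name `payments`, the `min`/`max` bounds, and the variable names are assumptions; the `encryptedFields` layout follows the Queryable Encryption docs as best understood here, so verify option names against your driver and server version. Key management and the auto-encryption client setup are omitted.

```javascript
// Hypothetical encrypted-field declaration for a "payments" collection.
const encryptedFields = {
  fields: [
    {
      path: "amount",
      bsonType: "int",
      // Range queries need explicit bounds on the plaintext domain (assumed values).
      queries: { queryType: "range", min: 0, max: 1000000 }
    }
  ]
};

// The filter is written as usual; a client configured for automatic encryption
// encrypts the bound before it leaves the application.
const rangeFilter = { amount: { $gt: 1000 } };

console.log(encryptedFields.fields[0].queries.queryType); // range
```

The filter looks identical to an unencrypted query; the difference lies entirely in how the client connection and the collection are configured.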

Worth memorizing

$match early: as a rule, only $match / $sort at the front of a pipeline can use an index
$group vs $setWindowFields: $group collapses rows; $setWindowFields keeps them (running totals, ranks)
100 MB / stage: since MongoDB 6.0 a stage that exceeds it spills to disk by default; on older versions, or with that default off, it errors unless you pass allowDiskUse: true
$out/$merge last: write stages end a pipeline; $search / $vectorSearch must be first (Atlas)
ESR index rule: compound index order is Equality fields, then Sort, then Range
IXSCAN not COLLSCAN: explain() showing COLLSCAN means no index is being used
embed vs reference: embed what you read together; reference what's large, shared, or unbounded
countDocuments: accurate count; the old count() is deprecated
bulkWrite command (8.0): write across multiple collections in one request; MongoDB reports ~56% faster bulk writes
QE range (8.0): $gt/$lt/$gte/$lte now work on encrypted fields, GA
txns need a repl set: multi-doc transactions require a replica set / sharded cluster
$field vs $$ROOT: "$x" = value of field x; "$$ROOT" = the whole current document
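The ESR rule above can be made concrete with a small helper. `esrIndexKey` is a hypothetical function (not part of mongosh or any driver) that orders index fields as Equality, then Sort, then Range; the `orders` collection and its fields are likewise assumptions.

```javascript
// Hypothetical helper: builds a compound index key in Equality, Sort, Range order.
function esrIndexKey({ equality = [], sort = {}, range = [] }) {
  const key = {};
  for (const field of equality) key[field] = 1;
  for (const [field, dir] of Object.entries(sort)) {
    if (!(field in key)) key[field] = dir;
  }
  for (const field of range) {
    if (!(field in key)) key[field] = 1;
  }
  return key;
}

// Query shape: find({ status: "paid", amount: { $gt: 100 } }).sort({ createdAt: -1 })
const key = esrIndexKey({
  equality: ["status"],
  sort: { createdAt: -1 },
  range: ["amount"]
});
console.log(JSON.stringify(key)); // {"status":1,"createdAt":-1,"amount":1}

// Only executes inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  db.orders.createIndex(key);
  const plan = db.orders
    .find({ status: "paid", amount: { $gt: 100 } })
    .sort({ createdAt: -1 })
    .explain("executionStats");
  // Look for IXSCAN rather than COLLSCAN in the winning plan.
  printjson(plan.queryPlanner.winningPlan);
}
```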