Quick Reference · document-oriented NoSQL database

MongoDB cheat sheet

Everything is a JSON-like document inside a collection inside a database: no fixed schema, no CREATE TABLE. Every command falls into one bucket: it shapes a collection (structure), writes documents, reads documents, or administers the cluster. Learn the buckets and the method names stop being arbitrary.

Legend: connect / inspect · structure (collections/indexes) · write (insert/update) · read (find/aggregate) · admin (users/tx/ops) · destructive · most common

Distilled & cross-checked against: mongodb.com/docs · mongosh reference · quickref.me · GeeksforGeeks · Codecademy · community mongosh gists

The aggregation pipeline — documents flow through stages, left to right
orders collection (all documents)
→ $match: filter documents (uses indexes, so put this first)
→ $lookup: left outer join with another collection
→ $unwind: explode an array into one doc per element
→ $group: bucket by _id expression, with $sum / $avg / $push accumulators
→ $project: reshape output fields with 1 / 0 / computed expression
→ $sort + $limit: order and cap the final set

Why stage order matters
• Each stage's output becomes the next stage's input; think Unix pipes, not a single SQL clause list.
• Put $match (and $sort before $limit) as early as possible so MongoDB can use indexes and shrink the working set fast.
• A pipeline is just an array: db.orders.aggregate([ {$match:{...}}, {$group:{...}}, {$sort:{...}} ])
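A minimal sketch of the stage ordering described above, written for mongosh. The field names (status, customerId, total) are hypothetical and not taken from this sheet; the `db` guard only exists so the snippet also loads in plain Node, where no database handle is defined.

```javascript
// Hypothetical fields: status, customerId, total.
const pipeline = [
  { $match: { status: "paid" } },   // filter first so an index on status can be used
  {
    $group: {
      _id: "$customerId",           // one bucket per customer
      spent: { $sum: "$total" },    // accumulator: total per bucket
      orders: { $sum: 1 },          // accumulator: document count per bucket
    },
  },
  { $sort: { spent: -1 } },         // order the grouped results
  { $limit: 5 },                    // keep only the top five
];

// `db` exists only inside mongosh; skip the call elsewhere.
if (typeof db !== "undefined") {
  db.orders.aggregate(pipeline).forEach((doc) => printjson(doc));
}
```

Because the pipeline is a plain array, it can be built, inspected, and reused before it is ever sent to the server.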
01 Connect & Shell (mongosh)
02 Databases (created on first write)
03 Collections (structure)
04 Insert (write · create)
05 Find: Basics (read)
06 Query Operators (inside find()'s filter)
07 Projection (shape the returned fields)
08 Sort, Skip & Limit (read · chained on a cursor)
09 Update Operators (inside the update document)
10 Update Methods (write)
11 Delete (handle with care)
12 Aggregation Stages (read · pipeline building blocks)
13 Indexes (structure · speed up reads)
14 BSON Data Types (what a field can hold)

Core

ObjectId · 12-byte id, first 4 bytes = creation time
String · UTF-8 text
Boolean · true / false
Null · explicit "no value"
Array / Object · nested lists & embedded documents

Numeric

Int32 / NumberInt · standard 32-bit integer
NumberLong · 64-bit integer; use for big counters/ids
Double · default JS number, floating point
NumberDecimal · exact decimal; use for money
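A short illustration of why the numeric type matters, assuming a shell that stores plain JavaScript numbers as Double. The first part is ordinary JavaScript; the collection name `prices` and its fields are hypothetical.

```javascript
// Doubles cannot represent most decimal fractions exactly.
const asDouble = 0.1 + 0.2;
console.log(asDouble === 0.3);   // false

// Doubles also stop representing every integer above 2^53.
const bigA = 2 ** 53;
const bigB = 2 ** 53 + 1;
console.log(bigA === bigB);      // true: the +1 is lost

// In mongosh, build exact values from strings instead of JS numbers.
if (typeof db !== "undefined") {
  db.prices.insertOne({
    sku: "demo",                                   // hypothetical field
    price: NumberDecimal("0.30"),                  // exact decimal
    views: NumberLong("9007199254740993"),         // 64-bit integer beyond 2^53
  });
}
```

Passing the value as a string keeps it from being rounded as a JavaScript number before it reaches the database.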

Date & Time

Date / ISODate() · ms since epoch; sortable, comparable
Timestamp · internal, used by the oplog

Other

Binary · raw bytes / files (small ones)
Regex · a stored regular expression
GeoJSON · Point/Polygon embedded documents for geospatial queries (a document convention, not a distinct BSON type)
15 Schema Validation (structure · optional guardrails)
16 Transactions (admin · multi-document ACID)
17 Users & Roles (admin · access control)
18 Backup, Restore & Import (shell · not run inside mongosh)
19 Replication & Sharding (admin · cluster ops)
Operators Quick Map (the $ prefix always means "operator")

Modeling relationships: embed vs. reference

MongoDB does not join or enforce relationships when data is written, so the choice between embedding and referencing is made up front, at schema design time. Based on the embedding/referencing guidance in the official MongoDB Data Modeling guide.

Embed (one-to-few)

Related data lives inside the parent document. One read gets everything — best when children are small, bounded, and always fetched together.

Diagram: user { name: "Max", addresses: [ {city: "Nellore"}, {city: "Chennai"} ] }
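The embedded shape above could look like this in mongosh. This is a sketch only: the collection name `users` is an assumption, and the `db` guard is there so the snippet also loads outside the shell.

```javascript
// Parent document with its children embedded as an array.
const user = {
  name: "Max",
  addresses: [{ city: "Nellore" }, { city: "Chennai" }],  // small, bounded, read with the user
};

if (typeof db !== "undefined") {
  db.users.insertOne(user);

  // One read returns the user together with every address.
  printjson(db.users.findOne({ name: "Max" }));

  // Dot notation reaches into the embedded array.
  printjson(db.users.findOne({ "addresses.city": "Chennai" }));
}
```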

Reference (one-to-many)

Child documents live in their own collection and store the parent's ObjectId. Best when children are numerous, grow unbounded, or are queried on their own.

Diagram: users { _id: 501 } ← orders { userId: 501 }, orders { userId: 501 }, orders { userId: 501 }
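A sketch of the referenced layout, using the integer id 501 from the diagram (a real _id would usually be an ObjectId). The `item` field and the index on userId are illustrative assumptions, not part of this sheet.

```javascript
// Parent lives in one collection...
const user = { _id: 501, name: "Max" };

// ...children live in another and point back at the parent's _id.
const orders = [
  { userId: 501, item: "keyboard" },   // `item` is a hypothetical field
  { userId: 501, item: "mouse" },
  { userId: 501, item: "monitor" },
];

if (typeof db !== "undefined") {
  db.users.insertOne(user);
  db.orders.insertMany(orders);

  // An index on the reference keeps "orders for this user" lookups fast.
  db.orders.createIndex({ userId: 1 });

  // Children can be queried on their own, without touching users.
  db.orders.find({ userId: 501 }).forEach((doc) => printjson(doc));
}
```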

$lookup at query time

Referenced data can still be pulled together with an aggregation join — the cost is paid at read time, not write time.

Diagram: users + orders → $lookup → user + orders: [...]
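A sketch of the read-time join, assuming a `users` collection keyed by _id and an `orders` collection that stores userId. The plain-JavaScript part only imitates the shape of the result for illustration; it is not how the server executes $lookup, and the second user is hypothetical.

```javascript
// mongosh pipeline: attach each user's orders as an array field.
const joinOrders = [
  { $match: { _id: 501 } },
  { $lookup: { from: "orders", localField: "_id", foreignField: "userId", as: "orders" } },
];

if (typeof db !== "undefined") {
  db.users.aggregate(joinOrders).forEach((doc) => printjson(doc));
}

// Plain-JS imitation of the left outer join result shape.
const users = [{ _id: 501, name: "Max" }, { _id: 502, name: "Ana" }];
const orders = [{ userId: 501, n: 1 }, { userId: 501, n: 2 }];
const joined = users.map((u) => ({
  ...u,
  orders: orders.filter((o) => o.userId === u._id),  // no match leaves an empty array
}));
console.log(JSON.stringify(joined));
```

A user with no matching orders is still returned, with an empty `orders` array, which is what makes the join "left outer".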

Worth memorizing

find() ≠ findOne() · find returns a cursor of many docs; findOne returns one doc or null
update() is legacy · always prefer updateOne / updateMany / replaceOne today
deleteMany({}) · wipes every document but keeps the collection & its indexes
_id is immutable · auto-indexed, unique per collection, can't be changed after insert
no operator = replace · replaceOne(filter, plainDoc) replaces the whole document body except _id; updateOne / updateMany reject an update document that has no $ operators
countDocuments vs estimated · countDocuments is accurate; estimatedDocumentCount is a fast metadata guess
upsert: true · insert-if-missing in one round trip; skips a separate exists check
ObjectId has a timestamp · first 4 bytes encode creation time, so ids sort roughly chronologically
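The ObjectId note above can be checked by hand. The helper below is a hypothetical illustration, not a MongoDB API: it reads the first 4 bytes (8 hex characters) of an id as seconds since the Unix epoch. The sample id is an arbitrary value.

```javascript
// Decode the creation time stored in the first 4 bytes of an ObjectId.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);  // big-endian seconds since epoch
  return new Date(seconds * 1000);                // Date expects milliseconds
}

const sampleId = "507f1f77bcf86cd799439011";      // arbitrary sample value
console.log(objectIdToDate(sampleId).toISOString());
```

Inside mongosh the built-in equivalent is ObjectId(sampleId).getTimestamp(), which avoids parsing the hex string by hand.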