MongoDB

Links#

https://www.mongodb.com/docs/manual/
https://www.mongodb.com/docs/manual/administration/production-notes/
https://www.mongodb.com/docs/manual/administration/security-checklist/
https://www.mongodb.com/docs/manual/core/security-encryption-at-rest/
https://www.mongodb.com/docs/manual/replication/
https://www.mongodb.com/docs/manual/sharding/
https://www.mongodb.com/docs/manual/reference/command/serverStatus/
https://www.mongodb.com/docs/manual/tutorial/manage-the-database-profiler/
https://www.mongodb.com/docs/manual/core/indexes/index-types/index-compound/create-compound-index/
https://github.com/percona/mongodb_exporter

1. Important Points#

MongoDB is a document database, not a relational database:
    good fit:
        document model
        schema evolves over time
        high write/read OLTP
        nested data
        event / profile / catalog / content / metadata

    poor fit:
        heavy relational joins
        strong cross-document transactions on every request
        ad-hoc analytical queries on huge datasets
        unbounded array growth
        workloads run without index discipline

core principles:
    schema design should follow read/write access pattern
    index is part of application design
    replica set is production baseline
    backup restore must be tested
    slow query / profiler / explain should be daily tools
    do not run production without auth / TLS / least privilege

2. Service Configuration#

deployment mode#

Mode	When To Use	Notes
Standalone	local dev / temporary test	not recommended for production
Replica Set	most production workload	HA baseline, supports election and failover
Sharded Cluster	dataset / throughput exceeds one replica set	operational complexity much higher

production baseline:
    replica set with odd voting members
    at least 3 data-bearing nodes; 2 data-bearing + 1 arbiter only for limited cases (w: majority writes block while one data-bearing node is down)
    auth enabled
    TLS enabled
    backup enabled
    monitoring enabled
    slow query log reviewed

mongod config#

storage:
  dbPath: /var/lib/mongo
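  # storage.journal.enabled was removed in MongoDB 6.1 (journaling is always on);
  # keep the next two lines only on older versions
  journal:
    enabled: true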
  wiredTiger:
    engineConfig:
      cacheSizeGB: 4

systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true

net:
  port: 27017
  bindIp: 127.0.0.1,10.0.1.10
  tls:
    mode: requireTLS
    certificateKeyFile: /etc/mongodb/tls/server.pem
    CAFile: /etc/mongodb/tls/ca.pem

security:
  authorization: enabled
  keyFile: /etc/mongodb/keyfile

replication:
  replSetName: rs0

operationProfiling:
  slowOpThresholdMs: 100

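config notes:
    do not bind to 0.0.0.0 on a host that is reachable from the public internet
    keyFile is used for replica set / sharded cluster internal authentication
    tls should be required for production
    WiredTiger cache size is auto-calculated by default; set it explicitly in containers or on hosts shared with other services
    tune slowOpThresholdMs to the service SLO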
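
To make the auto-calculated default concrete: the MongoDB manual describes the default WiredTiger internal cache as the larger of 50% of (RAM - 1 GB) and 256 MB. The helper name `defaultWiredTigerCacheGB` below is hypothetical; it only reproduces that arithmetic so the result can be compared with an explicit cacheSizeGB.

```javascript
// Hypothetical helper, not a MongoDB API: reproduces the documented default
// sizing rule, max(50% of (RAM - 1 GB), 256 MB).
function defaultWiredTigerCacheGB(ramGB) {
  const halfOfRemainder = 0.5 * (ramGB - 1);
  return Math.max(halfOfRemainder, 0.25);
}

// A 16 GB host defaults to 7.5 GB of cache; a 1 GB host falls back to 0.25 GB.
console.log(defaultWiredTigerCacheGB(16), defaultWiredTigerCacheGB(1));
```

If mongod shares the host with other processes, or runs in a container with less memory than the host, this default can be too large, which is the reason for setting cacheSizeGB explicitly.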

storage#

storage checklist:
    use SSD / low latency disk
    filesystem: XFS is commonly recommended for WiredTiger
    enough IOPS for peak write and checkpoint
    separate data volume from root volume
    monitor disk latency, not only disk usage
    never let disk reach 100%

capacity estimate:
    data size
    index size
    oplog size
    journal / temporary file
    backup snapshot
    growth rate
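
The items above can be combined into a rough sizing calculation. This is only a sketch: the function name, the input names, the compound monthly growth model, and the 0.85 utilization cap are all assumptions for illustration. Backup snapshots are left out on the assumption that they are stored on a separate volume.

```javascript
// Rough disk sizing sketch. Inputs and the 0.85 utilization cap are
// illustrative assumptions, not values provided by MongoDB.
function requiredDiskGB({
  dataGB, indexGB, oplogGB, overheadGB,
  monthlyGrowthRate, months, maxUtilization = 0.85,
}) {
  // Data and indexes grow; the oplog is capped and the overhead
  // (journal, temporary files) is treated as fixed.
  const growth = Math.pow(1 + monthlyGrowthRate, months);
  const projected = (dataGB + indexGB) * growth + oplogGB + overheadGB;
  // Leave headroom so the volume stays below the utilization cap.
  return Math.ceil(projected / maxUtilization);
}

const estimate = requiredDiskGB({
  dataGB: 200, indexGB: 50, oplogGB: 20, overheadGB: 10,
  monthlyGrowthRate: 0.05, months: 12,
});
console.log(`provision at least ${estimate} GB`);
```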

oplog#

oplog:
    capped collection for replication
    secondaries replicate by reading oplog
    oplog window must be longer than expected outage / maintenance window

watch:
    replication lag
    oplog window
    secondary can catch up after restart

common issue:
    oplog too small -> secondary falls too far behind -> initial sync required
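
The oplog window rule above can be sketched as a small check. The function names and the safety factor of 2 are assumptions; the two timestamps stand for the oldest and newest oplog entries, which mongosh reports through `rs.printReplicationInfo()`.

```javascript
// Hypothetical check: is the oplog window long enough for planned downtime?
// firstEntry / lastEntry are the times of the oldest and newest oplog entries.
function oplogWindowHours(firstEntry, lastEntry) {
  return (lastEntry.getTime() - firstEntry.getTime()) / 3600000;
}

function canSurviveOutage(windowHours, outageHours, safetyFactor = 2) {
  // Require a margin, because the window shrinks when the write rate rises.
  return windowHours >= outageHours * safetyFactor;
}

const windowHours = oplogWindowHours(
  new Date("2026-05-28T00:00:00Z"),
  new Date("2026-05-29T12:00:00Z")
);
console.log(windowHours, canSurviveOutage(windowHours, 8)); // 36 true
```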

3. Data Modeling Best Practices#

document design#

embed when:
    one-to-few
    data is read together
    child lifecycle depends on parent
    update frequency is not high

reference when:
    one-to-many / many-to-many
    child grows without bound
    child is queried independently
    child changes frequently

avoid:
    unbounded arrays
    huge documents close to the 16 MB BSON document limit
    deeply nested structures that are hard to index
    storing large binary files directly in a collection (use GridFS or object storage)
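
The embed / reference choice can be illustrated with plain objects. The field names and the `toReferenced` helper are hypothetical; the sketch only shows how an embedded child array turns into separate child documents that point back at the parent.

```javascript
// Illustrative shapes only; field names are assumptions, not a real schema.

// Embedded: a bounded, one-to-few child list that is read with the parent.
const orderEmbedded = {
  _id: "o-1",
  user_id: "u-1001",
  items: [
    { sku: "A", qty: 1 },
    { sku: "B", qty: 2 },
  ],
};

// Referenced: each child becomes its own document pointing at the parent,
// so the child set can grow without growing the parent document.
function toReferenced(parentDoc, arrayField, parentKey) {
  const { [arrayField]: childList = [], ...rest } = parentDoc;
  const childDocs = childList.map((child) => ({ ...child, [parentKey]: parentDoc._id }));
  return { parent: rest, children: childDocs };
}

const { parent: orderDoc, children: itemDocs } = toReferenced(orderEmbedded, "items", "order_id");
console.log(orderDoc, itemDocs);
```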

schema#

MongoDB has a flexible schema, not no schema:
    define required fields in application
    use JSON Schema validation for important collections
    keep version field for schema evolution
    plan migration / backfill

db.createCollection("orders", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["order_id", "user_id", "status", "created_at"],
      properties: {
        order_id: { bsonType: "string" },
        user_id: { bsonType: "string" },
        status: { enum: ["PENDING", "PAID", "CANCELLED"] },
        created_at: { bsonType: "date" },
        amount: { bsonType: "decimal" }
      }
    }
  }
})
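
For the version field and migration points above, one possible approach is to upgrade documents lazily when they are read. This is a sketch under assumptions: the field names `schema_version`, `total`, and `amount_cents` and the upgrade table are hypothetical, and a bulk backfill may suit some collections better.

```javascript
// Hypothetical lazy migration: bring a document to the current schema
// version when it is read. All field names here are illustrative.
const CURRENT_SCHEMA_VERSION = 2;

const upgrades = {
  // v1 stored a floating-point `total`; v2 stores integer cents.
  1: (doc) => {
    const { total, ...rest } = doc;
    return { ...rest, amount_cents: Math.round(total * 100), schema_version: 2 };
  },
};

function upgradeToCurrent(doc) {
  let current = { schema_version: 1, ...doc }; // a missing version means v1
  while (current.schema_version < CURRENT_SCHEMA_VERSION) {
    const step = upgrades[current.schema_version];
    if (!step) throw new Error(`no upgrade from v${current.schema_version}`);
    current = step(current);
  }
  return current;
}

console.log(upgradeToCurrent({ order_id: "o-1", total: 12.5 }));
```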

indexes#

index principles:
    index supports query shape
    every high-QPS query should have explain plan reviewed
    compound index order matters
    avoid too many indexes
    indexes speed reads but slow writes and cost storage
    remove unused indexes after observing usage

ESR rule for compound index:
    Equality
    Sort
    Range

example query:
    user_id equality
    sort by created_at
    range by created_at

index:
    { user_id: 1, created_at: -1 }

db.orders.createIndex(
  { user_id: 1, created_at: -1 },
  { name: "idx_user_created_at" }
)

db.orders.createIndex(
  { status: 1, created_at: 1 },
  {
    name: "idx_pending_created_at",
    partialFilterExpression: { status: "PENDING" }
  }
)
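
The ESR ordering can be sketched as a helper that assembles the key document for `createIndex`. The helper name `buildEsrIndex` and its input shape are assumptions; it places equality fields first, then sort fields, then range fields, and skips a range field that is already covered by the sort.

```javascript
// Hypothetical helper: order compound index keys as Equality, Sort, Range.
function buildEsrIndex({ equality = [], sort = {}, range = [] }) {
  const keys = {};
  for (const field of equality) keys[field] = 1;
  for (const [field, direction] of Object.entries(sort)) {
    if (!(field in keys)) keys[field] = direction;
  }
  for (const field of range) {
    // A field already used for sorting also serves the range predicate.
    if (!(field in keys)) keys[field] = 1;
  }
  return keys;
}

const esrKeys = buildEsrIndex({
  equality: ["user_id"],
  sort: { created_at: -1 },
  range: ["created_at"],
});
console.log(JSON.stringify(esrKeys)); // {"user_id":1,"created_at":-1}
```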

shard key#

shard key should:
    have high cardinality
    distribute writes
    support common queries
    avoid monotonically increasing values that send every insert to one hot shard

bad shard key:
    created_at only
    status
    country
    low-cardinality tenant

common choice:
    hashed shard key for even distribution
    compound shard key for query targeting

warning:
    shard key is hard to change (resharding exists since MongoDB 5.0 but is an expensive operation)
    choose after measuring workload
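
The hot-shard problem can be shown with a toy simulation. This is not MongoDB's routing logic: the range router, the split points, and the FNV-1a hash are stand-ins chosen for illustration. It only demonstrates that new monotonically increasing keys all fall into the last range, while hashed keys spread out.

```javascript
// Toy simulation, not MongoDB's routing. fnv1a is a stand-in hash;
// MongoDB's hashed indexes use their own hash function.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Range routing: each shard owns one contiguous key range; bounds ascend.
function rangeShard(key, upperBounds) {
  const index = upperBounds.findIndex((bound) => key < bound);
  return index === -1 ? upperBounds.length : index;
}

function countByShard(keys, route, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const key of keys) counts[route(key)] += 1;
  return counts;
}

const shardCount = 4;
const bounds = [250000, 500000, 750000]; // existing split points
// New timestamp-like keys are all larger than every existing split point.
const newKeys = Array.from({ length: 1000 }, (_, i) => 1000000 + i);

const byRange = countByShard(newKeys, (k) => rangeShard(k, bounds), shardCount);
const byHash = countByShard(newKeys, (k) => fnv1a(String(k)) % shardCount, shardCount);
console.log({ byRange, byHash });
```

With range routing all 1000 inserts land on the last shard; the hashed variant trades that hot spot for scatter-gather range queries on the key.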

4. Query / Write Best Practices#

query#

good query:
    uses index
    limits result size
    returns only needed fields
    has stable sort
    avoids large skip

bad query:
    collection scan on production API
    unanchored or case-insensitive regex (only an anchored, case-sensitive prefix regex can use an index efficiently)
    sort without index
    large skip pagination
    $in with huge list

db.orders.find(
  {
    user_id: "u-1001",
    created_at: {
      $gte: ISODate("2026-05-01T00:00:00Z"),
      $lt: ISODate("2026-06-01T00:00:00Z")
    }
  },
  {
    _id: 0,
    order_id: 1,
    status: 1,
    amount: 1,
    created_at: 1
  }
).sort({ created_at: -1 }).limit(50)

pagination#

avoid:
    skip huge offset

prefer:
    cursor based pagination
    stable sort key
    created_at + _id

db.orders.find({
  user_id: "u-1001",
  $or: [
    { created_at: { $lt: ISODate("2026-05-29T10:00:00Z") } },
    {
      created_at: ISODate("2026-05-29T10:00:00Z"),
      _id: { $lt: ObjectId("665800000000000000000000") }
    }
  ]
}).sort({ created_at: -1, _id: -1 }).limit(50)
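
The query above can be generated from the last document of the previous page. The helper name `nextPageFilter` is hypothetical, and the `_id` is shown as a plain string for simplicity; in a real application it would be the ObjectId taken from the last returned document.

```javascript
// Hypothetical helper: build the filter for the page after `last`, where
// `last` is the final document of the previous page and results are sorted
// by { created_at: -1, _id: -1 }.
function nextPageFilter(baseFilter, last) {
  if (!last) return { ...baseFilter }; // first page: no cursor yet
  return {
    ...baseFilter,
    $or: [
      { created_at: { $lt: last.created_at } },
      // Tie-breaker for documents sharing the same created_at.
      { created_at: last.created_at, _id: { $lt: last._id } },
    ],
  };
}

const pageFilter = nextPageFilter(
  { user_id: "u-1001" },
  { created_at: new Date("2026-05-29T10:00:00Z"), _id: "665800000000000000000000" }
);
console.log(JSON.stringify(pageFilter, null, 2));
```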

write concern#

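writeConcern:
    w: 1
        acknowledged by the primary only
        lower latency, but the write can be rolled back after a failover

    w: majority
        acknowledged by a majority of data-bearing voting members
        safer for critical writes; implicit default since MongoDB 5.0 for most topologies

    j: true
        acknowledged only after the write reaches the on-disk journal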

production default:
    critical data use w: majority
    tune per workload, not globally by guess
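
How many acknowledgements w: majority needs depends on the topology. The sketch below follows the rule in the MongoDB manual (the smaller of the majority of all voting members, arbiters included, and the number of data-bearing voting members). The function names are hypothetical, and the second function ignores election details.

```javascript
// Sketch of the acknowledgement count needed for w: "majority".
function writeMajority(dataBearingVoters, arbiters = 0) {
  const voters = dataBearingVoters + arbiters;
  const votingMajority = Math.floor(voters / 2) + 1;
  return Math.min(votingMajority, dataBearingVoters);
}

// How many data-bearing members may be down while majority writes still succeed.
function toleratedDataNodeFailures(dataBearingVoters, arbiters = 0) {
  return dataBearingVoters - writeMajority(dataBearingVoters, arbiters);
}

console.log(writeMajority(3), toleratedDataNodeFailures(3));       // 2 1
console.log(writeMajority(2, 1), toleratedDataNodeFailures(2, 1)); // 2 0
```

The second line shows why a 2 data-bearing + 1 arbiter set is a limited-case option: losing one data-bearing node leaves majority writes unable to complete.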

read concern / read preference#

readConcern:
    local:
        fastest, may return data that is later rolled back

    majority:
        returns only data acknowledged by a majority of members

readPreference:
    primary:
        strongest consistency with primary writes

    secondaryPreferred:
        good for read-heavy non-critical workload
        must tolerate replication lag

warning:
    reading from secondary can return stale data
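
One way to bound staleness for secondary reads is the `maxStalenessSeconds` connection string option, which must be at least 90. The helper name `buildReadUri` and the host names below are placeholders for illustration.

```javascript
// Hypothetical helper: build a connection string for lag-tolerant reads.
// readPreference and maxStalenessSeconds are standard connection string
// options; the host names are placeholders.
function buildReadUri(hosts, replicaSet, maxStalenessSeconds) {
  // Values below 90 are not accepted for maxStalenessSeconds.
  if (maxStalenessSeconds < 90) {
    throw new RangeError("maxStalenessSeconds must be at least 90");
  }
  const params = new URLSearchParams({
    replicaSet,
    readPreference: "secondaryPreferred",
    maxStalenessSeconds: String(maxStalenessSeconds),
  });
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

console.log(buildReadUri(["mongo-1:27017", "mongo-2:27017"], "rs0", 120));
```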

transactions#

transactions:
    use when multi-document atomicity is really needed
    keep short
    avoid user interaction inside transaction
    watch lock / conflict / latency

practical rule:
    if every request needs transaction, revisit schema design

5. Security Best Practices#

authentication#

production:
    enable authorization
    use SCRAM or x.509 depending on environment
    no shared application admin user
    separate app user / migration user / backup user / readonly user
    rotate password / keyFile

use admin

db.createUser({
  user: "order_app",
  pwd: passwordPrompt(),
  roles: [
    { role: "readWrite", db: "order" }
  ]
})

authorization#

least privilege:
    application should not use root
    backup user only needs backup related roles
    readonly dashboard user should be read-only
    avoid clusterAdmin for application
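
A least-privilege review can be partly automated. The sketch below assumes user documents in the shape returned by the `usersInfo` command (`user`, `db`, `roles: [{ role, db }]`); the helper name and the list of roles treated as too broad for applications are assumptions to adapt.

```javascript
// Hypothetical audit helper: flag users holding broad built-in roles.
const BROAD_ROLES = new Set([
  "root", "clusterAdmin", "dbOwner",
  "userAdminAnyDatabase", "dbAdminAnyDatabase", "readWriteAnyDatabase",
]);

function findOverPrivileged(users) {
  return users
    .map((u) => ({
      user: u.user,
      broadRoles: u.roles.filter((r) => BROAD_ROLES.has(r.role)).map((r) => r.role),
    }))
    .filter((entry) => entry.broadRoles.length > 0);
}

const findings = findOverPrivileged([
  { user: "order_app", db: "admin", roles: [{ role: "readWrite", db: "order" }] },
  { user: "legacy_app", db: "admin", roles: [{ role: "root", db: "admin" }] },
]);
console.log(findings);
```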

network#

network checklist:
    do not expose MongoDB to public internet
    security group only allows app subnet / admin bastion / backup system
    enable TLS
    bind to private IP
    separate internal replica traffic from public access where possible

encryption#

in transit:
    enable TLS
    verify CA

at rest:
    self-managed MongoDB Community usually relies on disk encryption
    MongoDB Enterprise supports storage engine encryption at rest
    cloud / VM disk encryption should be enabled

field level:
    use client-side field level encryption for sensitive fields when needed
    encrypted fields may affect query capability

audit#

audit checklist:
    log authentication failure
    log privilege changes
    log user creation/deletion
    log unusual admin command
    ship logs to central logging system

note:
    detailed audit logging is an Enterprise feature
    Community deployments usually rely on logs + OS/network audit

6. Backup / Restore#

backup methods:
    filesystem snapshot with consistency guarantee
    mongodump / mongorestore for logical backup
    oplog-based backup for point-in-time restore
    operator/vendor backup if running on Kubernetes

backup rule:
    backup without restore test is not backup
    restore into isolated environment regularly
    define RPO / RTO
    keep backup credentials separate from app credentials

# keep the database out of the URI path: a database named there limits the dump to it
mongodump \
  --uri "mongodb://backup_user@mongo-1:27017,mongo-2:27017,mongo-3:27017/?replicaSet=rs0&authSource=admin" \
  --archive="/backup/order-$(date +%F).archive" \
  --gzip

# restore into an isolated environment; authSource selects the credential database
mongorestore \
  --uri "mongodb://restore_user@restore-mongo:27017/?authSource=admin" \
  --archive="/backup/order-2026-05-29.archive" \
  --gzip

7. Monitoring#

important metrics#

Area	Metrics / Command	What To Watch
Availability	`up`, connection failure	node down / primary unavailable
Connections	`serverStatus.connections`	current / available / rejected
Operations	`opcounters`, `opcountersRepl`	query / insert / update / delete rate
Latency	`opLatencies`, slow query log	p95 / p99 read write command latency
Query quality	profiler, `explain()`	COLLSCAN, IXSCAN, docs examined / returned
Cache	`wiredTiger.cache`	dirty bytes, bytes read into cache, eviction
Lock	`globalLock`, lock stats	lock pressure and queue
Replication	`rs.printSecondaryReplicationInfo()`	replication lag of each secondary
Oplog	`rs.printReplicationInfo()`	oplog size and time range
Storage	`dbStats`, `collStats`	data size, index size, collection growth
Errors	logs, assertions	network, replication, storage, auth errors
Cursors	`metrics.cursor`	timed-out and open cursors

alert rules#

critical:
    no primary for > 1m
    primary changed frequently
    replication lag > RPO
    disk usage > 85%
    disk latency high
    WiredTiger cache pressure high
    connections near limit
    high rate of authentication failure

warning:
    slow query spike
    COLLSCAN appears in high-QPS path
    docs examined / returned ratio too high
    index size grows faster than data size
    oplog window lower than maintenance window
    secondary stale for long time
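
A few of the rules above can be sketched as a pure function over collected metrics. The metric names, the shape of `limits`, and every threshold except the 85% disk rule are assumptions; in practice these checks usually live in the alerting system rather than in application code.

```javascript
// Illustrative evaluation of some rules from the lists above.
// `m` holds current metrics, `limits` holds service-specific thresholds.
function evaluateAlerts(m, limits) {
  const alerts = [];
  if (m.replicationLagSec > limits.rpoSec) alerts.push("critical: replication lag > RPO");
  if (m.diskUsedRatio > 0.85) alerts.push("critical: disk usage > 85%");
  // serverStatus.connections reports current and available separately.
  const connectionRatio = m.connectionsCurrent / (m.connectionsCurrent + m.connectionsAvailable);
  if (connectionRatio > limits.connectionRatio) alerts.push("critical: connections near limit");
  if (m.oplogWindowHours < limits.maintenanceWindowHours) {
    alerts.push("warning: oplog window lower than maintenance window");
  }
  // Guard against division by zero when nothing was returned.
  const examinedPerReturned = m.docsExamined / Math.max(m.docsReturned, 1);
  if (examinedPerReturned > limits.examinedPerReturned) {
    alerts.push("warning: docs examined / returned ratio too high");
  }
  return alerts;
}

const sampleAlerts = evaluateAlerts(
  {
    replicationLagSec: 5, diskUsedRatio: 0.9,
    connectionsCurrent: 100, connectionsAvailable: 900,
    oplogWindowHours: 4, docsExamined: 50000, docsReturned: 50,
  },
  { rpoSec: 60, connectionRatio: 0.8, maintenanceWindowHours: 8, examinedPerReturned: 100 }
);
console.log(sampleAlerts);
```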

dashboard#

dashboard should include:
    node health and primary/secondary state
    operation rate by type
    read/write/command latency
    slow query count
    connections current / available
    replication lag
    oplog window
    WiredTiger cache usage / dirty bytes / eviction
    disk usage / disk latency / filesystem available
    network in/out
    asserts / errors / restarts

Prometheus exporter#

common exporter:
    percona/mongodb_exporter

notes:
    create monitoring user with least privilege
    scrape every mongod node
    label replica set / role / environment
    monitor exporter errors too

use admin

db.createUser({
  user: "mongodb_exporter",
  pwd: passwordPrompt(),
  roles: [
    { role: "clusterMonitor", db: "admin" },
    { role: "read", db: "local" }
  ]
})

scrape_configs:
  - job_name: mongodb
    static_configs:
      - targets:
          - mongo-1:9216
          - mongo-2:9216
          - mongo-3:9216
        labels:
          service: mongodb
          env: prod

8. Hands-on#

explain query#

db.orders.find({
  user_id: "u-1001",
  created_at: {
    $gte: ISODate("2026-05-01T00:00:00Z"),
    $lt: ISODate("2026-06-01T00:00:00Z")
  }
}).sort({ created_at: -1 }).explain("executionStats")

check:
    winningPlan uses IXSCAN
    totalDocsExamined close to nReturned
    no COLLSCAN
    executionTimeMillis within SLO
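
The checks above can be applied to the explain document programmatically. This is a sketch: the function names and the default ratio limit of 10 are assumptions, the sample explain object is trimmed and made up, and output from a sharded cluster (which nests plans per shard) is not handled.

```javascript
// Hypothetical checker for the output of explain("executionStats").
// Walks the winning plan tree (inputStage / inputStages) and collects stages.
function collectStages(plan, stages = []) {
  if (!plan) return stages;
  // Some server versions nest the plan tree under `queryPlan`.
  if (plan.queryPlan) return collectStages(plan.queryPlan, stages);
  if (plan.stage) stages.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, stages);
  for (const child of plan.inputStages || []) collectStages(child, stages);
  return stages;
}

function reviewExplain(explain, sloMillis, maxExaminedPerReturned = 10) {
  const stages = collectStages(explain.queryPlanner.winningPlan);
  const stats = explain.executionStats;
  const ratio = stats.totalDocsExamined / Math.max(stats.nReturned, 1);
  return {
    usesIndex: stages.includes("IXSCAN"),
    hasCollScan: stages.includes("COLLSCAN"),
    examinedPerReturned: ratio,
    selective: ratio <= maxExaminedPerReturned,
    withinSlo: stats.executionTimeMillis <= sloMillis,
  };
}

// Trimmed, made-up explain output for illustration.
const sampleExplain = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "idx_user_created_at" } },
  },
  executionStats: { nReturned: 50, totalDocsExamined: 50, executionTimeMillis: 3 },
};
console.log(reviewExplain(sampleExplain, 100));
```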

find slow queries#

db.setProfilingLevel(1, { slowms: 100 })

db.system.profile.find({
  millis: { $gt: 100 }
}).sort({ ts: -1 }).limit(20).pretty()

check replica set#

rs.status()
rs.printReplicationInfo()
rs.printSecondaryReplicationInfo()

check collection stats#

db.orders.stats()
db.orders.aggregate([
  { $indexStats: {} }
])

create index#

db.orders.createIndex(
  { user_id: 1, created_at: -1 },
  {
    name: "idx_user_created_at"
  }
)

backup#

# --db must not conflict with a database in the URI path, so the path is left empty
mongodump \
  --uri "mongodb://backup_user@mongo-1:27017,mongo-2:27017,mongo-3:27017/?replicaSet=rs0&authSource=admin" \
  --db order \
  --archive="/backup/order-$(date +%F).archive" \
  --gzip

9. Production Checklist#

before launch:
    access patterns reviewed
    schema reviewed
    indexes reviewed with explain
    replica set deployed
    auth enabled
    TLS enabled
    firewall / security group restricted
    backup configured
    restore tested
    monitoring and alerts created
    slow query threshold decided
    oplog window checked
    disk size / IOPS checked
    connection pool settings reviewed

when incident happens:
    check primary availability
    check recent election
    check replication lag
    check slow query / COLLSCAN
    check lock / cache / disk latency
    check connection saturation
    check recent deploy / migration / index build