1. Important Points

EC2 is the most fundamental AWS compute service. It suits workloads that need full OS control, a custom runtime, host-level agents, legacy software, or special network or storage configurations.

Use EC2 to:
    run VM workloads
    run self-managed containers, Kubernetes, or databases
    host legacy applications
    use GPU, local NVMe, or high-network-throughput instance types
    build bastions, runners, test boxes, and rescue hosts

EC2 is a poor fit when:
    you do not want to manage OS patching
    you do not want to handle the instance lifecycle
    the application can run directly on ECS, Lambda, or App Runner
    the workload does not need host-level control

Core principles:

access first:
    Session Manager or SSH fallback must work
    instance profile must be attached
    IMDS and SSM endpoint access must be healthy
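
The access checks above can be sketched as a small pure function. This is a minimal sketch, not a definitive health check: it assumes `instance` is one entry from the EC2 DescribeInstances response and `ssm_ping_status` is the `PingStatus` field from SSM DescribeInstanceInformation (or None when the instance is not registered). The function name `access_problems` is hypothetical.

```python
from typing import List, Optional


def access_problems(instance: dict, ssm_ping_status: Optional[str]) -> List[str]:
    """Return readable access problems; an empty list means none were found."""
    problems = []

    # The notes require an instance profile, so flag instances without one.
    if "IamInstanceProfile" not in instance:
        problems.append("no instance profile attached")

    # IMDS serves the role's temporary credentials to software on the host.
    metadata = instance.get("MetadataOptions", {})
    if metadata.get("HttpEndpoint") != "enabled":
        problems.append("IMDS endpoint is not enabled")

    # PingStatus is "Online" when the SSM Agent has checked in recently.
    if ssm_ping_status != "Online":
        problems.append(f"SSM agent not online (status: {ssm_ping_status})")

    return problems


# Example input shaped like a DescribeInstances entry (values are made up).
healthy = {
    "InstanceId": "i-0123456789abcdef0",
    "IamInstanceProfile": {"Arn": "arn:aws:iam::123456789012:instance-profile/example"},
    "MetadataOptions": {"HttpEndpoint": "enabled", "HttpTokens": "required"},
}
print(access_problems(healthy, "Online"))
```

Keeping the check free of API calls makes it easy to run against saved responses; with boto3 the inputs would typically come from `describe_instances` and `describe_instance_information`.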

state and recovery:
    root volume snapshot before risky change
    data volume separated from root volume when possible
    rescue process documented
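
As a rough illustration of the snapshot-before-change rule, the sketch below separates root and data volumes and prepares the arguments for a snapshot. It assumes `instance` is one entry from the EC2 DescribeInstances response; `split_volumes`, `snapshot_request`, and the description text are hypothetical names chosen for this example.

```python
from typing import Dict, List


def split_volumes(instance: dict) -> Dict[str, List[str]]:
    """Split attached EBS volume IDs into root and data using RootDeviceName."""
    root_device = instance.get("RootDeviceName")
    volumes: Dict[str, List[str]] = {"root": [], "data": []}
    for mapping in instance.get("BlockDeviceMappings", []):
        ebs = mapping.get("Ebs")
        if not ebs:
            continue  # skip anything that is not an EBS mapping
        kind = "root" if mapping.get("DeviceName") == root_device else "data"
        volumes[kind].append(ebs["VolumeId"])
    return volumes


def snapshot_request(volume_id: str, reason: str) -> dict:
    """Build keyword arguments for the EC2 CreateSnapshot call."""
    return {
        "VolumeId": volume_id,
        "Description": f"pre-change snapshot: {reason}",
    }


# Example input shaped like a DescribeInstances entry (values are made up).
example = {
    "RootDeviceName": "/dev/xvda",
    "BlockDeviceMappings": [
        {"DeviceName": "/dev/xvda", "Ebs": {"VolumeId": "vol-0aaa"}},
        {"DeviceName": "/dev/sdf", "Ebs": {"VolumeId": "vol-0bbb"}},
    ],
}
volumes = split_volumes(example)
print(volumes)
print(snapshot_request(volumes["root"][0], "kernel upgrade"))
```

With boto3, the returned dictionary could be passed as `ec2.create_snapshot(**snapshot_request(...))`. An empty `data` list is a hint that state lives on the root volume.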

operations:
    monitor CPU, memory, disk, status check, network
    keep CloudWatch Agent / SSM Agent healthy
    avoid auto-starting risky containers before access is verified
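
One way to cover the status check item is a CloudWatch alarm on the `StatusCheckFailed` metric. The sketch below only builds the arguments for the PutMetricAlarm call; the alarm name pattern, period, and evaluation count are assumptions to adjust. Memory and disk usage are not in the default `AWS/EC2` namespace and are normally published by the CloudWatch Agent, which is one reason to keep the agent healthy.

```python
def status_check_alarm(instance_id: str) -> dict:
    """Build keyword arguments for CloudWatch PutMetricAlarm on StatusCheckFailed."""
    return {
        # Hypothetical naming scheme, not taken from the notes.
        "AlarmName": f"ec2-status-check-failed-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        # The metric is 0 or 1, so Maximum >= 1 means a check failed in the period.
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 2,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }


alarm = status_check_alarm("i-0123456789abcdef0")
print(alarm["AlarmName"])
```

With boto3, this would typically be applied as `cloudwatch.put_metric_alarm(**alarm)`, usually with an `AlarmActions` entry added for notification.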

2. Common Topics

Topic             Notes
Access            Session Manager, SSH, EC2 Instance Connect
Instance Profile  IAM role for AWS API access from the instance
IMDS              Metadata and temporary credentials
EBS               Root volume, data volume, snapshot, attach/detach
User Data         Bootstrap script and first boot behavior
Networking        ENI, security group, route table, public/private subnet
Monitoring        Status checks, CloudWatch metrics, CloudWatch Agent
Rescue            Recover from broken OS, broken network, bad startup service
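
To make the IMDS row concrete, here is a minimal sketch of the IMDSv2 request flow using only the Python standard library. The helper names are hypothetical. The requests only succeed when sent from inside an EC2 instance, so the sketch builds them without sending anything.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Build the PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path: str, token: str) -> urllib.request.Request:
    """Build a GET request for a metadata path, authenticated with the token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch(request: urllib.request.Request, timeout: float = 2.0) -> str:
    """Send a request and return the body as text (works only on an instance)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode()


# On an instance, the flow would be:
#   token = fetch(token_request())
#   instance_id = fetch(metadata_request("instance-id", token))
# Temporary credentials live under "iam/security-credentials/<role-name>".
print(token_request().get_method(), token_request().full_url)
```

A short timeout is a deliberate choice here: when IMDS is unreachable the call should fail quickly rather than hang a health check.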

3. Issue Cases