AWS Route 53


https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/Welcome.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zones-private.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zone-private-considerations.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zone-private-creating.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zone-private-associate-vpcs-different-accounts.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resolver-query-logs.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/monitoring-resolver-with-cloudwatch.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/DNSLimitations.html

1. Important Points

Route 53 is the AWS managed DNS service:
    public hosted zone:
        internet DNS
        domain delegation from registrar

    private hosted zone:
        private DNS namespace inside associated VPCs
        resolved by VPC Resolver
        not directly queried from internet

    Route 53 Resolver:
        VPC DNS resolver
        hybrid DNS inbound / outbound endpoint
        Resolver rule forwards selected domains to custom DNS servers

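Core principles:
    keep public DNS and private DNS in clearly separate layers
    choose the private hosted zone name deliberately; a bad name affects DNS resolution for the entire VPC
    PHZ association controls which VPCs can resolve the zone; it is not an IAM boundary
    design overlapping namespaces up front
    enable Resolver query logging for production VPCs

Common private hosted zone pitfalls:
    the VPC must have both enableDnsHostnames and enableDnsSupport set to true
    a VPC cannot be associated with two hosted zones that have exactly the same name
    if a PHZ matches the query name but has no matching record, VPC Resolver returns NXDOMAIN and does not fall back to public DNS
    when a PHZ and a Resolver rule have the same domain name, the Resolver rule takes precedence
    with overlapping PHZs, the most specific match wins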
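The precedence rules above can be sketched as a small shell model. This is only a toy for reasoning about designs, not how VPC Resolver is implemented. The helper names `longest_suffix` and `resolution_path` are hypothetical, and the behavior when a PHZ is more specific than a Resolver rule is an assumption extrapolated from the most-specific-match rule, so it should be verified in a test VPC.

```shell
# Toy model of the lookup order described above; not the real Resolver logic.

# longest_suffix QUERY DOMAIN...
# Prints the longest DOMAIN that equals QUERY or is a parent of it.
longest_suffix() (
  query=${1%.}
  shift
  best=
  for domain in "$@"; do
    domain=${domain%.}
    case "$query" in
      "$domain" | *."$domain")
        if [ "${#domain}" -gt "${#best}" ]; then best=$domain; fi
        ;;
    esac
  done
  printf '%s\n' "$best"
)

# resolution_path QUERY "RULE_DOMAINS" "PHZ_NAMES" (space-separated lists)
resolution_path() (
  query=$1
  # Word splitting on $2 and $3 is intentional: each is a list of domains.
  rule=$(longest_suffix "$query" $2)
  zone=$(longest_suffix "$query" $3)
  if [ -n "$rule" ] && [ "${#rule}" -ge "${#zone}" ]; then
    # Same name as the PHZ (or more specific): the Resolver rule wins.
    echo "resolver-rule $rule"
  elif [ -n "$zone" ]; then
    # The PHZ answers; a missing record is NXDOMAIN with no public fallback.
    echo "private-hosted-zone $zone"
  else
    echo "public-dns"
  fi
)

resolution_path db.corp.example.com "corp.example.com" "corp.example.com aws.example.com"
resolution_path api.prod.aws.example.com "corp.example.com" "aws.example.com prod.aws.example.com"
resolution_path www.example.org "corp.example.com" "aws.example.com"
```

The first call models the same-name tie (rule wins), the second the most specific PHZ, and the third a name that no rule or PHZ covers.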

2. Service Configuration

hosted zone

Item                   Recommendation
Public hosted zone     only for internet-facing domains
Private hosted zone    only for VPC-internal names
PHZ VPC settings       enableDnsHostnames=true, enableDnsSupport=true
PHZ association        associate only the VPCs that need this namespace
Cross-account VPC      use authorization + programmatic association
Large VPC sharing      review Route 53 Profiles when the PHZ association count grows
Tags                   env, service, owner, cost-center, dns-scope

hosted zone checklist:
    zone owner is clear
    namespace owner is clear
    public/private split is intentional
    VPC association list is reviewed
    Resolver rules do not accidentally override PHZ
    query logging is enabled for production VPCs
    records are managed by IaC
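The cross-account row in the table above (authorization + programmatic association) could look roughly like this with the AWS CLI. This is a sketch under assumptions: the function name `associate_cross_account` and the profile names `zone-owner` and `vpc-owner` are placeholders, and the IDs in the example call are the dummy values used elsewhere in these notes.

```shell
# Sketch: associate a VPC in one account with a PHZ owned by another account.
# associate_cross_account ZONE_ID VPC_REGION VPC_ID ZONE_PROFILE VPC_PROFILE
associate_cross_account() {
  vpc_arg="VPCRegion=$2,VPCId=$3"

  # 1. The zone owner authorizes the VPC to be associated.
  aws route53 create-vpc-association-authorization \
    --hosted-zone-id "$1" --vpc "$vpc_arg" --profile "$4" || return 1

  # 2. The VPC owner performs the association (CLI / SDK / API only).
  aws route53 associate-vpc-with-hosted-zone \
    --hosted-zone-id "$1" --vpc "$vpc_arg" --profile "$5" || return 1

  # 3. The zone owner deletes the authorization; the association remains.
  aws route53 delete-vpc-association-authorization \
    --hosted-zone-id "$1" --vpc "$vpc_arg" --profile "$4"
}

# Example call (placeholder IDs and profiles, not run here):
# associate_cross_account Z0123456789ABCDEFG ap-east-1 vpc-0123456789abcdef0 zone-owner vpc-owner
```

Deleting the authorization afterwards keeps stale authorizations from accumulating and makes each new association an explicit request.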

record

Record      Use Case
A / AAAA    IP address record
CNAME       alias one DNS name to another DNS name; not allowed at the zone apex
Alias       AWS target such as ALB, NLB, CloudFront, S3 website, API Gateway
TXT         verification / SPF / misc metadata
MX          mail routing
SRV         service discovery when clients support SRV

record checklist:
    prefer Alias for AWS resources when supported
    keep TTL low during migration
    increase TTL after records become stable
    avoid wildcard record unless ownership and blast radius are clear
    do not put secrets in TXT records
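As an illustration of "prefer Alias for AWS resources", the helper below prints a change batch for an Alias A record. It is a sketch: `alias_change_batch` is a hypothetical name, and the load balancer DNS name and zone ID in the example are made-up placeholders. An Alias record carries no TTL or ResourceRecords; the AliasTarget block replaces them.

```shell
# alias_change_batch RECORD_NAME TARGET_DNS_NAME TARGET_ZONE_ID
# Prints a change batch that UPSERTs an Alias A record.
alias_change_batch() {
  cat <<EOF
{
  "Comment": "alias record for $1",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "$1",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "$3",
          "DNSName": "$2",
          "EvaluateTargetHealth": true
        }
      }
    }
  ]
}
EOF
}

# Placeholder values: TARGET_ZONE_ID is the hosted zone ID of the target
# (for a load balancer, its canonical hosted zone ID), not your own zone ID.
alias_change_batch order-api.prod.ap-east-1.aws.example.com \
  internal-order-alb-123456789.ap-east-1.elb.amazonaws.com \
  ZEXAMPLEALBZONE
```

The output can be written to a file and passed to `aws route53 change-resource-record-sets --change-batch file://...` in the same way as a plain A record.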

3. Private Hosted Zone Naming Best Practices

namespace pattern

Start from a company-owned root domain:
    example.com

Carve out a dedicated private namespace for AWS private DNS:
    aws.example.com
    internal.example.com
    corp.example.com

Then subdivide by env / region / platform / service:
    prod.ap-east-1.aws.example.com
    dev.ap-east-1.aws.example.com
    shared.ap-east-1.aws.example.com
    data.prod.ap-east-1.aws.example.com

record examples:
    orders.prod.ap-east-1.aws.example.com
    mysql.order.prod.ap-east-1.aws.example.com
    redis.order.prod.ap-east-1.aws.example.com
    api.shared.ap-east-1.aws.example.com

Recommended format:
    <service>.<env>.<region>.aws.example.com
    <component>.<service>.<env>.<region>.aws.example.com

examples:
    order-api.prod.ap-east-1.aws.example.com
    writer.mysql.order.prod.ap-east-1.aws.example.com
    reader.mysql.order.prod.ap-east-1.aws.example.com
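The recommended format can be encoded in a small helper so that scripts and IaC wrappers build names the same way. A minimal sketch; `private_fqdn` and its argument order are assumptions made for illustration.

```shell
# private_fqdn BASE ENV REGION SERVICE [COMPONENT]
# Builds <service>.<env>.<region>.<base>, or
# <component>.<service>.<env>.<region>.<base> when COMPONENT is given.
private_fqdn() {
  name="$4.$2.$3.$1"
  if [ -n "${5:-}" ]; then name="$5.$name"; fi
  # DNS names are case-insensitive; emit lowercase for consistency.
  printf '%s\n' "$name" | tr '[:upper:]' '[:lower:]'
}

private_fqdn aws.example.com prod ap-east-1 order-api
# order-api.prod.ap-east-1.aws.example.com
private_fqdn aws.example.com prod ap-east-1 order mysql
# mysql.order.prod.ap-east-1.aws.example.com
```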

zone boundary

Do not make the PHZ too broad:
    bad:
        example.com

    reason:
        a query for www.example.com from inside the VPC matches the PHZ example.com
        but the PHZ has no record for www (it exists only in public DNS)
        Resolver returns NXDOMAIN
        the public website can break for clients inside the VPC

Prefer a narrower zone:
    better:
        aws.example.com
        prod.ap-east-1.aws.example.com
        order.prod.ap-east-1.aws.example.com

When to use a broad zone:
    the team has central DNS ownership
    all subdomain records are managed by IaC
    public/private split-view DNS is an explicit requirement
    the migration / rollback plan has been validated

When to use a narrow zone:
    multiple teams, accounts, or VPCs
    only one app / platform / environment namespace should be exposed
    a missing record must not affect the parent domain

overlapping namespace

overlapping namespace example:
    PHZ 1:
        aws.example.com

    PHZ 2:
        order.prod.ap-east-1.aws.example.com

query:
    mysql.order.prod.ap-east-1.aws.example.com

result:
    VPC Resolver chooses the most specific matching PHZ
    here it uses order.prod.ap-east-1.aws.example.com

overlap design rule:
    parent zone is owned by platform team
    child zone is owned by application / domain team
    child zone must not silently shadow parent records
    document which VPCs associate parent / child zones
    test NXDOMAIN behavior before production association
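To check the "child zone must not silently shadow parent records" rule before associating a child zone, one option is to list the parent-zone record names that fall inside the child zone. In a VPC associated with both zones, those names are answered from the child zone (most specific match), so the parent records would no longer be used. The helper name `shadowed_records` is hypothetical, and the record list is passed in by hand here.

```shell
# shadowed_records CHILD_ZONE RECORD_NAME...
# Prints each record name that equals CHILD_ZONE or sits below it.
shadowed_records() (
  child=${1%.}
  shift
  for record in "$@"; do
    record=${record%.}
    case "$record" in
      "$child" | *."$child") printf '%s\n' "$record" ;;
    esac
  done
)

# Parent zone aws.example.com holds these records; the child zone is
# order.prod.ap-east-1.aws.example.com.
shadowed_records order.prod.ap-east-1.aws.example.com \
  api.shared.ap-east-1.aws.example.com \
  mysql.order.prod.ap-east-1.aws.example.com \
  redis.order.prod.ap-east-1.aws.example.com
```

Only the mysql and redis names are printed. In practice the record names could come from `aws route53 list-resource-record-sets` for the parent zone.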

names to avoid

avoid:
    example.com as PHZ unless doing intentional split-view DNS
    public production domain root as broad PHZ
    .local because many networks use it for mDNS / local discovery
    random fake TLD such as .corp / .internal if company policy does not control it
    VPC ID in record names unless DNS clients really need VPC-specific names
    account ID in user-facing DNS names unless it is an operations-only namespace
    uppercase / underscore in host labels

prefer:
    domain you own
    lowercase
    short labels
    stable business names
    env / region only when they help routing and operations
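Some of the naming rules above (lowercase, no underscore, no .local) are easy to lint before a record reaches Route 53. The check below is a sketch: `valid_private_name` is a hypothetical helper, it only covers plain host names (not wildcard records), and the 63-character label limit and hyphen rules are the usual hostname conventions rather than anything stated in these notes.

```shell
# valid_private_name NAME
# Returns 0 when every label uses only lowercase letters, digits and hyphens,
# is 1-63 characters long, does not start or end with a hyphen,
# and the name does not end in .local.
valid_private_name() {
  case "${1%.}" in
    *.local | local) return 1 ;;   # .local is widely used for mDNS
  esac
  printf '%s\n' "${1%.}" | grep -Eq \
    '^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?(\.[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)*$'
}

for name in \
  order-api.prod.ap-east-1.aws.example.com \
  Order_API.prod.ap-east-1.aws.example.com \
  printer.local
do
  if valid_private_name "$name"; then echo "ok  $name"; else echo "bad $name"; fi
done
```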

split-view DNS

split-view DNS:
    same domain name exists in public hosted zone and private hosted zone
    public users resolve public records
    VPC clients resolve private records

good use cases:
    api.example.com public -> public ALB / CloudFront
    api.example.com private -> internal ALB

risk:
    PHZ missing record returns NXDOMAIN inside VPC
    internal and public answers can drift
    certificate / TLS name must still match the DNS name

4. Resolver Architecture

VPC Resolver

VPC Resolver:
    default DNS resolver in VPC
    resolves:
        private hosted zones associated with the VPC
        public DNS names
        AWS internal names
        Resolver rules

VPC setting:
    enableDnsSupport:
        enables Amazon-provided DNS resolver

    enableDnsHostnames:
        gives instances public DNS hostnames when appropriate
        required for private hosted zone usage

hybrid DNS

Pattern              Direction                    Use Case
Inbound endpoint     on-prem -> AWS               on-prem resolves PHZ / AWS private names
Outbound endpoint    AWS -> on-prem               VPC workloads resolve on-prem domains
Resolver rule        selected domain forwarding   forward corp.example.com to on-prem DNS

hybrid DNS checklist:
    endpoints in at least two subnets / AZs
    security group allows DNS TCP/UDP 53 from trusted CIDR
    forwarding rules are narrow
    no loop between on-prem DNS and Route 53 Resolver
    query logging enabled before migration

5. Security Best Practices

least privilege:
    separate permissions:
        hosted zone admin
        record writer
        VPC association admin
        query log admin

    prefer CI/IaC role for record changes
    avoid manual console record edits in production
    use IAM condition keys for VPC association control when needed

private DNS security:
    PHZ is not authentication
    still enforce authn/authz at application / load balancer / service mesh layer
    do not rely on unguessable DNS names
    protect Resolver endpoints with security groups and network ACLs
    review wildcard records
    log DNS queries for incident response

6. Reliability / Migration

TTL strategy

before migration:
    lower TTL to 30-60 seconds
    wait at least the old TTL so that cached answers expire
    change record
    verify from target networks

after stable:
    increase TTL:
        internal app record: 60-300 seconds
        stable infra record: 300-900 seconds
        external stable record: 300-3600 seconds
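The waiting step is simple arithmetic: a resolver may have cached the record just before the TTL was lowered, so the old value can survive for up to the old TTL after that moment. A minimal sketch with a hypothetical helper name and Unix timestamps as input:

```shell
# seconds_until_cutover OLD_TTL LOWERED_AT NOW
#   OLD_TTL     TTL in seconds that the record had before it was lowered
#   LOWERED_AT  Unix timestamp when the TTL was lowered
#   NOW         current Unix timestamp
# Prints how many more seconds to wait before changing the record (0 = safe now).
seconds_until_cutover() {
  remaining=$(( $2 + $1 - $3 ))
  if [ "$remaining" -lt 0 ]; then remaining=0; fi
  echo "$remaining"
}

# The TTL was 3600 s and was lowered 1200 s ago, so 2400 s remain.
seconds_until_cutover 3600 1700000000 1700001200
```

In a real run the last argument would be `$(date +%s)`.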

migration checklist

DNS migration:
    create new records before switching clients
    test from each VPC / subnet / on-prem path
    test A, AAAA, CNAME, Alias behavior
    test NXDOMAIN for missing names
    confirm TLS certificate covers the final hostname
    keep rollback record target ready
    monitor query logs and application errors

7. Monitoring

Route 53 Resolver query logs:
    log VPC-originated DNS queries
    log inbound endpoint queries from on-prem
    log outbound endpoint recursive queries
    log DNS Firewall actions

useful fields:
    query_name
    query_type
    rcode
    vpc_id
    srcaddr
    srcids.instance

CloudWatch metrics for Resolver endpoints:
    InboundQueryVolume
    OutboundQueryAggregateVolume
    P90ResponseTime if detailed metrics enabled
    target name server health / response when enabled

alerts:
    sudden SERVFAIL / NXDOMAIN increase
    inbound / outbound query volume drops to zero
    endpoint IP imbalance
    high response time
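For the SERVFAIL / NXDOMAIN alert, a quick way to see the response-code mix in exported query logs is to count the response code field. A rough sketch, assuming each log record is JSON with a top-level "rcode" string; `rcode_counts` is a hypothetical helper and the three sample records are invented for illustration.

```shell
# rcode_counts
# Reads Resolver query log records (JSON) on stdin and prints
# "<rcode> <count>" for each response code found.
rcode_counts() {
  grep -o '"rcode": *"[A-Z]*"' | cut -d'"' -f4 | sort | uniq -c | awk '{print $2, $1}'
}

# Invented sample records, trimmed to a few fields.
rcode_counts <<'EOF'
{"query_name":"order-api.prod.ap-east-1.aws.example.com.","query_type":"A","rcode":"NOERROR","vpc_id":"vpc-0123456789abcdef0","srcaddr":"10.0.10.7"}
{"query_name":"missing.prod.ap-east-1.aws.example.com.","query_type":"A","rcode":"NXDOMAIN","vpc_id":"vpc-0123456789abcdef0","srcaddr":"10.0.10.7"}
{"query_name":"www.example.com.","query_type":"A","rcode":"NOERROR","vpc_id":"vpc-0123456789abcdef0","srcaddr":"10.0.10.8"}
EOF
```

For ongoing alerting, a metric filter or a log query on the same field is a better fit than a shell pipeline; this form is mainly useful during an incident or a migration window.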

8. Hands-on

create private hosted zone

export AWS_PAGER=""
export AWS_REGION="ap-east-1"
export VPC_ID="vpc-0123456789abcdef0"
export ZONE_NAME="prod.ap-east-1.aws.example.com"

# both VPC DNS attributes are required; each one needs its own call
aws ec2 modify-vpc-attribute \
  --vpc-id "$VPC_ID" \
  --enable-dns-support Value=true

aws ec2 modify-vpc-attribute \
  --vpc-id "$VPC_ID" \
  --enable-dns-hostnames Value=true

# passing --vpc creates the zone as private and associates it with that VPC
# --caller-reference must be unique for every create request
aws route53 create-hosted-zone \
  --name "$ZONE_NAME" \
  --vpc VPCRegion="$AWS_REGION",VPCId="$VPC_ID" \
  --hosted-zone-config Comment="prod private dns in ap-east-1",PrivateZone=true \
  --caller-reference "$(date +%Y%m%d%H%M%S)"

create record

save the following as change-record.json:
{
  "Comment": "create order-api private record",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "order-api.prod.ap-east-1.aws.example.com",
        "Type": "A",
        "TTL": 60,
        "ResourceRecords": [
          {
            "Value": "10.0.10.25"
          }
        ]
      }
    }
  ]
}

# placeholder: use the hosted zone ID returned by create-hosted-zone
export ZONE_ID="Z0123456789ABCDEFG"

aws route53 change-resource-record-sets \
  --hosted-zone-id "$ZONE_ID" \
  --change-batch file://change-record.json

test from VPC

dig order-api.prod.ap-east-1.aws.example.com
dig missing.prod.ap-east-1.aws.example.com
dig www.example.com

verify:
    expected private record returns private IP / Alias target
    missing private name returns NXDOMAIN
    public name still resolves if PHZ does not shadow it
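The three checks above can be turned into a pass/fail script for a runbook. A minimal sketch, assuming `dig` is installed on an instance inside the VPC; `dns_status` and `expect_status` are hypothetical helper names.

```shell
# dns_status NAME
# Prints the response status that dig reports (NOERROR, NXDOMAIN, SERVFAIL, ...).
dns_status() {
  dig +noall +comments "$1" | sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# expect_status NAME EXPECTED
# Prints PASS or FAIL and returns non-zero on a mismatch.
expect_status() {
  actual=$(dns_status "$1")
  if [ "$actual" = "$2" ]; then
    echo "PASS $1 -> $actual"
  else
    echo "FAIL $1 -> ${actual:-no response} (expected $2)"
    return 1
  fi
}

# Run from an instance inside the VPC:
# expect_status order-api.prod.ap-east-1.aws.example.com NOERROR
# expect_status missing.prod.ap-east-1.aws.example.com NXDOMAIN
# expect_status www.example.com NOERROR
```

A NOERROR status only shows that the name exists; checking that the answer is the expected private IP still needs a look at the `dig +short` output.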

9. Production Checklist

namespace:
    company-owned domain is used
    PHZ is not too broad
    naming pattern is documented
    overlap behavior is tested

association:
    VPC DNS attributes enabled
    VPC association list reviewed
    cross-account association process documented
    Resolver rules reviewed for precedence

records:
    records managed by IaC
    TTL policy defined
    wildcard records reviewed
    Alias used for AWS targets where possible

security:
    least privilege IAM configured
    Resolver endpoint security groups reviewed
    DNS is not treated as auth boundary

operations:
    Resolver query logs enabled
    CloudWatch metrics / alarms configured
    migration rollback plan exists
    incident runbook includes dig / nslookup tests from affected VPCs