Links#
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/Welcome.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zones-private.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zone-private-considerations.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zone-private-creating.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zone-private-associate-vpcs-different-accounts.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resolver-query-logs.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/monitoring-resolver-with-cloudwatch.html
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/DNSLimitations.html
1. Important Points#
Route 53 is the AWS managed DNS service:
  public hosted zone:
    internet-facing DNS
    domain delegation from the registrar
  private hosted zone:
    private DNS namespace inside associated VPCs
    resolved by VPC Resolver
    not directly queryable from the internet
  Route 53 Resolver:
    VPC DNS resolver
    hybrid DNS through inbound / outbound endpoints
    Resolver rules forward selected domains to custom DNS servers
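Core principles:
  keep public DNS and private DNS in clearly separated layers
  choose the private hosted zone name deliberately; a bad name can affect DNS resolution for the whole VPC
  PHZ association controls which VPCs can resolve the zone; it is not an IAM boundary
  design overlapping namespaces up front
  enable Resolver query logging for production VPCs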
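Common private hosted zone pitfalls:
  the VPC must have both enableDnsHostnames and enableDnsSupport enabled
  a VPC cannot be associated with two hosted zones that have exactly the same name
  if a PHZ matches the query name but has no matching record, VPC Resolver returns NXDOMAIN and does not fall back to public DNS
  when a PHZ and a Resolver rule cover the same domain name, the Resolver rule takes precedence
  overlapping PHZs are resolved by most specific match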
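The first pitfall can be checked before a zone is associated. A minimal sketch, assuming AWS CLI v2 with credentials and a default region already configured; the helper names `vpc_dns_attr` and `vpc_ready_for_phz` are made up for this example.

```shell
# vpc_dns_attr VPC_ID ATTRIBUTE RESPONSE_KEY -> prints the attribute value
vpc_dns_attr() {
  aws ec2 describe-vpc-attribute \
    --vpc-id "$1" \
    --attribute "$2" \
    --query "$3.Value" \
    --output text
}

# vpc_ready_for_phz VPC_ID -> exit 0 only when both DNS attributes are enabled
vpc_ready_for_phz() {
  support=$(vpc_dns_attr "$1" enableDnsSupport EnableDnsSupport)
  hostnames=$(vpc_dns_attr "$1" enableDnsHostnames EnableDnsHostnames)
  case "$support:$hostnames" in
    True:True|true:true) return 0 ;;
    *)
      # Report what was found so the missing attribute is obvious.
      echo "enableDnsSupport=$support enableDnsHostnames=$hostnames" >&2
      return 1 ;;
  esac
}
```

Usage: `vpc_ready_for_phz "$VPC_ID" && echo "VPC is ready for a private hosted zone"`.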
2. Service Configuration#
hosted zone#
| Item | Recommendation |
| --- | --- |
| Public hosted zone | only for internet-facing domains |
| Private hosted zone | only for VPC-internal names |
| PHZ VPC settings | enableDnsHostnames=true, enableDnsSupport=true |
| PHZ association | associate only the VPCs that need this namespace |
| Cross-account VPC | use authorization + programmatic association |
| Large VPC sharing | review Route 53 Profiles when the PHZ association count grows |
| Tags | env, service, owner, cost-center, dns-scope |
hosted zone checklist:
  zone owner is clear
  namespace owner is clear
  public/private split is intentional
  VPC association list is reviewed
  Resolver rules do not accidentally override PHZ
  query logging is enabled for production VPCs
  records are managed by IaC
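The "authorization + programmatic association" flow for a cross-account VPC can be sketched as three CLI calls. This assumes AWS CLI v2; `zone-owner` and `vpc-owner` are hypothetical named profiles for the two accounts, and `associate_cross_account` is a helper name invented here.

```shell
# associate_cross_account ZONE_ID VPC_REGION VPC_ID
associate_cross_account() {
  zone_id=$1
  vpc_spec="VPCRegion=$2,VPCId=$3"

  # 1. The zone owner account authorizes the VPC.
  aws route53 create-vpc-association-authorization \
    --profile zone-owner --hosted-zone-id "$zone_id" --vpc "$vpc_spec" || return 1

  # 2. The VPC owner account performs the association.
  aws route53 associate-vpc-with-hosted-zone \
    --profile vpc-owner --hosted-zone-id "$zone_id" --vpc "$vpc_spec" || return 1

  # 3. The zone owner removes the authorization; the association stays in place.
  aws route53 delete-vpc-association-authorization \
    --profile zone-owner --hosted-zone-id "$zone_id" --vpc "$vpc_spec"
}
```

Usage: `associate_cross_account Z0123456789ABCDEFG ap-east-1 vpc-0123456789abcdef0` (placeholder IDs).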
record#
| Record | Use Case |
| --- | --- |
| A / AAAA | IPv4 / IPv6 address record |
| CNAME | alias one DNS name to another DNS name; not allowed at the zone apex |
| Alias | AWS target such as ALB, NLB, CloudFront, S3 website, API Gateway |
| TXT | verification / SPF / misc metadata |
| MX | mail routing |
| SRV | service discovery when clients support SRV |
record checklist:
  prefer Alias for AWS resources when supported
  keep TTL low during migration
  increase TTL after records become stable
  avoid wildcard records unless ownership and blast radius are clear
  do not put secrets in TXT records
3. Private Hosted Zone Naming Best Practices#
namespace pattern#
Start with a company-owned root domain:
  example.com
Then carve out a dedicated private namespace for AWS private DNS:
  aws.example.com
  internal.example.com
  corp.example.com
Then subdivide by env / region / platform / service:
  prod.ap-east-1.aws.example.com
  dev.ap-east-1.aws.example.com
  shared.ap-east-1.aws.example.com
  data.prod.ap-east-1.aws.example.com
record examples:
  orders.prod.ap-east-1.aws.example.com
  mysql.order.prod.ap-east-1.aws.example.com
  redis.order.prod.ap-east-1.aws.example.com
  api.shared.ap-east-1.aws.example.com
Recommended format:
  <service>.<env>.<region>.aws.example.com
  <component>.<service>.<env>.<region>.aws.example.com
examples:
  order-api.prod.ap-east-1.aws.example.com
  writer.mysql.order.prod.ap-east-1.aws.example.com
  reader.mysql.order.prod.ap-east-1.aws.example.com
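The recommended format can be enforced with a small helper so record names are generated instead of typed by hand. A sketch only: `is_valid_label`, `private_fqdn`, and the `PRIVATE_BASE_DOMAIN` variable are hypothetical names, and the base domain defaults to the `aws.example.com` namespace used in these notes.

```shell
# is_valid_label LABEL -> exit 0 for 1-63 lowercase letters, digits, or hyphens,
# with no leading or trailing hyphen
is_valid_label() {
  printf '%s\n' "$1" |
    LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$'
}

# private_fqdn SERVICE ENV REGION [COMPONENT] -> prints the record name
private_fqdn() {
  for label in "$1" "$2" "$3" ${4:+"$4"}; do
    is_valid_label "$label" || { echo "invalid label: $label" >&2; return 1; }
  done
  # The optional component goes in front of the service label.
  echo "${4:+$4.}$1.$2.$3.${PRIVATE_BASE_DOMAIN:-aws.example.com}"
}
```

Usage: `private_fqdn order prod ap-east-1 mysql` prints `mysql.order.prod.ap-east-1.aws.example.com`; a label such as `Order_API` is rejected.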
zone boundary#
Do not make the PHZ too broad:
  bad:
    example.com
  reason:
    a query for www.example.com from inside the VPC matches the PHZ example.com
    the PHZ has no record for www, which exists only in public DNS
    Resolver returns NXDOMAIN
    the public website can break for clients inside the VPC
Prefer narrower zones:
  better:
    aws.example.com
    prod.ap-east-1.aws.example.com
    order.prod.ap-east-1.aws.example.com
When to use a broad zone:
  the team has central DNS ownership
  every subdomain record is managed by IaC
  public/private split-view DNS is an explicit requirement
  the migration / rollback plan has been validated
When to use a narrow zone:
  multiple teams, accounts, and VPCs
  only one app / platform / environment namespace should be exposed
  a missing record must not affect the parent domain
overlapping namespace#
overlapping namespace example:
  PHZ 1:
    aws.example.com
  PHZ 2:
    order.prod.ap-east-1.aws.example.com
  query:
    mysql.order.prod.ap-east-1.aws.example.com
  result:
    VPC Resolver chooses the most specific matching PHZ
    here it uses order.prod.ap-east-1.aws.example.com
overlap design rule:
  parent zone is owned by platform team
  child zone is owned by application / domain team
  child zone must not silently shadow parent records
  document which VPCs associate parent / child zones
  test NXDOMAIN behavior before production association
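The most-specific-match rule above can be modeled as a longest-suffix lookup, which is handy for reviewing a planned set of zones offline. This is an illustration of the rule, not Resolver's implementation; it ignores Resolver rules, and `most_specific_zone` is a name invented for this sketch.

```shell
# most_specific_zone QUERY_NAME ZONE... -> prints the longest zone name that
# equals the query name or is a parent of it; exit 1 when no zone matches
most_specific_zone() {
  query=${1%.}; shift          # drop a trailing dot if present
  best=""
  for zone in "$@"; do
    zone=${zone%.}
    case "$query" in
      "$zone"|*."$zone")
        # Every match is a suffix of the query, so longer means more specific.
        [ "${#zone}" -gt "${#best}" ] && best=$zone ;;
    esac
  done
  [ -n "$best" ] && echo "$best"
}
```

Usage: `most_specific_zone mysql.order.prod.ap-east-1.aws.example.com aws.example.com order.prod.ap-east-1.aws.example.com` prints the child zone, so a record missing there yields NXDOMAIN even if the parent zone has it.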
names to avoid#
avoid:
  example.com as PHZ unless doing intentional split-view DNS
  public production domain root as a broad PHZ
  .local, because many networks use it for mDNS / local discovery
  made-up TLDs such as .corp / .internal if company policy does not control them
  VPC ID in record names unless DNS clients really need VPC-specific names
  account ID in user-facing DNS names unless it is an operations-only namespace
  uppercase / underscore in host labels
prefer:
  domain you own
  lowercase
  short labels
  stable business names
  env / region only when they help routing and operations
split-view DNS#
split-view DNS:
  same domain name exists in a public hosted zone and a private hosted zone
  public users resolve public records
  VPC clients resolve private records
good use cases:
  api.example.com public -> public ALB / CloudFront
  api.example.com private -> internal ALB
risk:
  a record missing from the PHZ returns NXDOMAIN inside the VPC
  internal and public answers can drift
  certificate / TLS name must still match the DNS name
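Drift between the two views is easiest to spot by asking both resolvers for the same name. A rough sketch, assuming `dig` is installed, the host runs inside the VPC (so its default resolver is VPC Resolver), and it can reach a public resolver; `1.1.1.1` is only an example default and `split_view_answers` is a hypothetical helper name.

```shell
# split_view_answers NAME [PUBLIC_RESOLVER]
# Prints the A answers seen through the default resolver and a public resolver.
split_view_answers() {
  name=$1
  public_resolver=${2:-1.1.1.1}   # assumption: any reachable public resolver
  private_answer=$(dig +short "$name" A | sort | paste -s -d ' ' -)
  public_answer=$(dig +short "$name" A "@$public_resolver" | sort | paste -s -d ' ' -)
  echo "private: ${private_answer:-<no answer>}"
  echo "public:  ${public_answer:-<no answer>}"
}
```

Usage: `split_view_answers api.example.com`. An empty private answer next to a populated public one is the missing-record NXDOMAIN case listed under risk.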
4. Resolver Architecture#
VPC Resolver#
VPC Resolver:
  default DNS resolver in the VPC
  resolves:
    private hosted zones associated with the VPC
    public DNS names
    AWS internal names
    Resolver rules
VPC setting:
  enableDnsSupport:
    enables the Amazon-provided DNS resolver
  enableDnsHostnames:
    gives instances public DNS hostnames when appropriate
    required for private hosted zone usage
hybrid DNS#
| Pattern | Direction | Use Case |
| --- | --- | --- |
| Inbound endpoint | on-prem -> AWS | on-prem resolves PHZ / AWS private names |
| Outbound endpoint | AWS -> on-prem | VPC workloads resolve on-prem domains |
| Resolver rule | selected domain forwarding | forward corp.example.com to on-prem DNS |
hybrid DNS checklist:
  endpoints in at least two subnets / AZs
  security group allows DNS TCP/UDP 53 from trusted CIDRs
  forwarding rules are narrow
  no loop between on-prem DNS and Route 53 Resolver
  query logging enabled before migration
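A narrow forwarding rule for one on-prem domain can be sketched as follows. It assumes AWS CLI v2, an existing outbound endpoint, and on-prem DNS servers listening on port 53; `forward_domain` and the `fwd-` naming are invented for this example.

```shell
# forward_domain DOMAIN OUTBOUND_ENDPOINT_ID VPC_ID TARGET_IP...
forward_domain() {
  domain=$1; endpoint_id=$2; vpc_id=$3; shift 3

  # Build the space-separated shorthand list expected by --target-ips.
  targets=""
  for ip in "$@"; do
    targets="$targets Ip=$ip,Port=53"
  done

  # $targets is intentionally unquoted so each target becomes its own argument.
  rule_id=$(aws route53resolver create-resolver-rule \
    --creator-request-id "fwd-$domain-$(date +%Y%m%d%H%M%S)" \
    --name "fwd-$(echo "$domain" | tr '.' '-')" \
    --rule-type FORWARD \
    --domain-name "$domain" \
    --resolver-endpoint-id "$endpoint_id" \
    --target-ips $targets \
    --query 'ResolverRule.Id' --output text) || return 1

  # A rule only takes effect in the VPCs it is associated with.
  aws route53resolver associate-resolver-rule \
    --resolver-rule-id "$rule_id" --vpc-id "$vpc_id"
}
```

Usage: `forward_domain corp.example.com "$OUTBOUND_ENDPOINT_ID" "$VPC_ID" 10.50.0.10 10.50.1.10`, where the endpoint ID and target IPs are placeholders.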
5. Security Best Practices#
least privilege:
  separate permissions:
    hosted zone admin
    record writer
    VPC association admin
    query log admin
  prefer a CI/IaC role for record changes
  avoid manual console record edits in production
  use IAM condition keys for VPC association control when needed
private DNS security:
  a PHZ is not authentication
  still enforce authn/authz at the application / load balancer / service mesh layer
  do not rely on unguessable DNS names
  protect Resolver endpoints with security groups and network ACLs
  review wildcard records
  log DNS queries for incident response
6. Reliability / Migration#
TTL strategy#
before migration:
  lower TTL to 30-60 seconds
  wait for the old TTL to expire so caches pick up the lower value
  change the record
  verify from target networks
after stable:
  increase TTL:
    internal app record: 60-300 seconds
    stable infra record: 300-900 seconds
    external stable record: 300-3600 seconds
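The "wait for the old TTL" step is simple arithmetic: a cache that fetched the record just before the TTL was lowered can keep it for one full old TTL. A small sketch with a hypothetical helper, `seconds_until_cutover`, working in epoch seconds:

```shell
# seconds_until_cutover LOWERED_AT_EPOCH OLD_TTL_SECONDS [NOW_EPOCH]
# Prints how many seconds remain until every cache holding the old-TTL answer
# must have expired it; prints 0 once the wait is over.
seconds_until_cutover() {
  now=${3:-$(date +%s)}
  remaining=$(( $1 + $2 - $now ))
  [ "$remaining" -gt 0 ] || remaining=0
  echo "$remaining"
}
```

For example, if a 3600-second TTL was lowered at epoch 1000 and it is now epoch 2000, `seconds_until_cutover 1000 3600 2000` prints `2600`.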
migration checklist#
DNS migration:
  create new records before switching clients
  test from each VPC / subnet / on-prem path
  test A, AAAA, CNAME, Alias behavior
  test NXDOMAIN for missing names
  confirm the TLS certificate covers the final hostname
  keep the rollback record target ready
  monitor query logs and application errors
7. Monitoring#
Route 53 Resolver query logs:
  log VPC-originated DNS queries
  log inbound endpoint queries from on-prem
  log outbound endpoint recursive queries
  log DNS Firewall actions
  useful fields:
    query_name
    query_type
    rcode (response code, e.g. NOERROR, NXDOMAIN, SERVFAIL)
    vpc_id
    srcaddr
    srcids.instance (source instance ID, when the query came from an instance)
CloudWatch metrics for Resolver endpoints:
  InboundQueryVolume
  OutboundQueryAggregateVolume
  P90ResponseTime (requires detailed metrics)
  target name server health / response metrics, when enabled
alerts:
  sudden increase in SERVFAIL / NXDOMAIN
  inbound / outbound query volume drops to zero
  endpoint IP imbalance
  high response time
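Query logging has to be created once and then associated with each VPC. A minimal sketch, assuming AWS CLI v2 and an existing destination (a CloudWatch Logs log group, S3 bucket, or Firehose delivery stream) whose ARN you pass in; `enable_query_logging` is a helper name invented here.

```shell
# enable_query_logging CONFIG_NAME DESTINATION_ARN VPC_ID
enable_query_logging() {
  # Create the query log configuration and capture its ID.
  config_id=$(aws route53resolver create-resolver-query-log-config \
    --name "$1" \
    --destination-arn "$2" \
    --creator-request-id "$1-$(date +%Y%m%d%H%M%S)" \
    --query 'ResolverQueryLogConfig.Id' --output text) || return 1

  # Logging starts only after the configuration is associated with the VPC.
  aws route53resolver associate-resolver-query-log-config \
    --resolver-query-log-config-id "$config_id" \
    --resource-id "$3"
}
```

Usage: `enable_query_logging prod-dns-logs "$LOG_GROUP_ARN" "$VPC_ID"`, where `LOG_GROUP_ARN` is a placeholder for your destination.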
8. Hands-on#
create private hosted zone#
    # Disable the CLI pager; the IDs and names below are placeholders.
    export AWS_PAGER=""
    export AWS_REGION="ap-east-1"
    export VPC_ID="vpc-0123456789abcdef0"
    export ZONE_NAME="prod.ap-east-1.aws.example.com"

    # Enable both VPC DNS attributes; each call can change only one attribute.
    aws ec2 modify-vpc-attribute \
      --vpc-id "$VPC_ID" \
      --enable-dns-support Value=true
    aws ec2 modify-vpc-attribute \
      --vpc-id "$VPC_ID" \
      --enable-dns-hostnames Value=true

    # Create the private hosted zone and associate it with the VPC in one call.
    aws route53 create-hosted-zone \
      --name "$ZONE_NAME" \
      --vpc VPCRegion="$AWS_REGION",VPCId="$VPC_ID" \
      --hosted-zone-config Comment="prod private dns in ap-east-1",PrivateZone=true \
      --caller-reference "$(date +%Y%m%d%H%M%S)"
create record#
Save the change batch as change-record.json:
    {
      "Comment": "create order-api private record",
      "Changes": [
        {
          "Action": "UPSERT",
          "ResourceRecordSet": {
            "Name": "order-api.prod.ap-east-1.aws.example.com",
            "Type": "A",
            "TTL": 60,
            "ResourceRecords": [
              {
                "Value": "10.0.10.25"
              }
            ]
          }
        }
      ]
    }
Apply it to the zone (ZONE_ID is a placeholder for the ID returned by create-hosted-zone):
    export ZONE_ID="Z0123456789ABCDEFG"
    aws route53 change-resource-record-sets \
      --hosted-zone-id "$ZONE_ID" \
      --change-batch file://change-record.json
test from VPC#
    dig order-api.prod.ap-east-1.aws.example.com
    dig missing.prod.ap-east-1.aws.example.com
    dig www.example.com
verify:
  the expected private record returns a private IP / Alias target
  a missing private name returns NXDOMAIN
  a public name still resolves if no PHZ shadows it
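The three checks above can be scripted so they run the same way from every VPC. A sketch assuming `dig` is installed on the test host; `dns_status` and `expect_status` are hypothetical helper names.

```shell
# dns_status NAME [TYPE] -> prints the response status, e.g. NOERROR or NXDOMAIN
dns_status() {
  dig +noall +comments "$1" "${2:-A}" |
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# expect_status NAME EXPECTED -> prints one result line; exit 1 on mismatch
expect_status() {
  actual=$(dns_status "$1")
  if [ "$actual" = "$2" ]; then
    echo "ok   $1 -> $actual"
  else
    echo "FAIL $1 -> ${actual:-no response} (expected $2)"
    return 1
  fi
}
```

Usage, mirroring the dig commands above:

```shell
expect_status order-api.prod.ap-east-1.aws.example.com NOERROR
expect_status missing.prod.ap-east-1.aws.example.com NXDOMAIN
expect_status www.example.com NOERROR
```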
9. Production Checklist#
namespace:
  company-owned domain is used
  PHZ is not too broad
  naming pattern is documented
  overlap behavior is tested
association:
  VPC DNS attributes enabled
  VPC association list reviewed
  cross-account association process documented
  Resolver rules reviewed for precedence
records:
  records managed by IaC
  TTL policy defined
  wildcard records reviewed
  Alias used for AWS targets where possible
security:
  least privilege IAM configured
  Resolver endpoint security groups reviewed
  DNS is not treated as an auth boundary
operations:
  Resolver query logs enabled
  CloudWatch metrics / alarms configured
  migration rollback plan exists
  incident runbook includes dig / nslookup tests from affected VPCs