Links#
https://docs.aws.amazon.com/vpc/latest/privatelink/concepts.html
https://docs.aws.amazon.com/vpc/latest/privatelink/privatelink-access-aws-services.html
https://docs.aws.amazon.com/vpc/latest/privatelink/gateway-endpoints.html
https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-s3.html
https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-ddb.html
https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html
https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-access.html
https://aws.amazon.com/privatelink/pricing/
https://aws.amazon.com/vpc/pricing/
1. Important Points#
A VPC endpoint lets resources inside a VPC reach AWS services and endpoint services privately:
  traffic stays on the AWS network
  private subnets can call supported services without a public IP
  reduces NAT Gateway dependency for supported AWS service traffic
  tightens the security boundary through endpoint policies / security groups
A VPC endpoint is not:
  general internet egress
  a full NAT Gateway replacement
  a firewall
  a magic cross-service route
  automatic access to every AWS service
Core principles:
  an endpoint is per service / per region / per VPC
  prefer a gateway endpoint for S3 / DynamoDB when access from inside the VPC is enough
  use an interface endpoint for supported AWS APIs and PrivateLink services
  enable private DNS for AWS service interface endpoints unless there is a clear reason not to
  an endpoint policy is an extra guardrail, not a replacement for IAM
  cost comparisons must include hourly + per-GB + cross-AZ / data transfer charges
2. Service Configuration#
endpoint types#
| Type | Use Case | Routing / DNS | Cost Shape |
| --- | --- | --- | --- |
| Gateway endpoint | S3 / DynamoDB | route table target with AWS prefix list | no additional endpoint hourly / data processing charge |
| Interface endpoint | AWS services / PrivateLink endpoint service / SaaS | ENI private IP + DNS/private DNS | hourly per endpoint ENI/AZ + per-GB processing |
| Gateway Load Balancer endpoint | inline firewall / appliance | route table target | GWLB endpoint pricing |
| Resource endpoint | shared VPC resource through VPC Lattice resource configuration | private resource access | PrivateLink / VPC Lattice pricing |
| Service network endpoint | VPC Lattice service network | service network access | VPC Lattice pricing |
most common:
  S3:
    gateway endpoint for VPC workloads
    interface endpoint when an on-prem / TGW / cross-VPC private access pattern requires it
  DynamoDB:
    gateway endpoint for VPC workloads
  AWS APIs:
    interface endpoint
    examples:
      sts
      ecr.api
      ecr.dkr
      logs
      monitoring
      secretsmanager
      kms
      ssm
      ec2
gateway endpoint#
gateway endpoint:
  attach the endpoint to selected route tables
  AWS adds a route:
    destination = AWS-managed prefix list
    target = gateway endpoint
supported services:
  Amazon S3
  DynamoDB
security controls:
  route table association
  endpoint policy
  bucket policy / DynamoDB IAM condition
interface endpoint#
interface endpoint:
  creates an endpoint network interface (ENI) in each selected subnet
  each endpoint ENI has a private IP
  a security group controls inbound traffic to the endpoint ENI
  private DNS can make the normal service hostname resolve to the endpoint private IPs
production baseline:
  one subnet per AZ where clients run
  private DNS enabled for AWS service endpoints
  endpoint security group allows 443 from client security groups
  endpoint policy scoped to required actions/resources
3. VPC Endpoint And NAT Gateway#
short answer#
Not having a VPC endpoint does not mean a NAT Gateway is required.
It depends on whether the workload needs outbound access:
  no outbound requirement:
    no NAT Gateway needed
    no VPC endpoint needed
  private subnet needs a supported AWS service:
    use a VPC endpoint when possible
    NAT Gateway is not required for that service traffic
  private subnet needs public internet / third-party API / unsupported AWS public endpoint:
    NAT Gateway, NAT instance, or proxy is required for IPv4 egress
  public subnet instance with a public IP:
    can use the Internet Gateway directly
    NAT Gateway is not required for that instance
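The decision list above can be condensed into a small shell helper. This is only a sketch: the function name `egress_path` and the labels it accepts and prints are hypothetical names chosen for illustration, not AWS terms.

```shell
# egress_path SUBNET_TYPE DESTINATION
#   SUBNET_TYPE: private | public-with-ip           (hypothetical labels)
#   DESTINATION: none | aws-supported | aws-unsupported | internet
# Prints the egress component that the decision list above calls for.
egress_path() {
  subnet_type=$1
  destination=$2

  # No outbound traffic: neither a NAT Gateway nor an endpoint is needed.
  if [ "$destination" = "none" ]; then
    echo "nothing"
    return 0
  fi

  case "$subnet_type:$destination" in
    public-with-ip:*)        echo "internet-gateway" ;;
    private:aws-supported)   echo "vpc-endpoint" ;;
    private:aws-unsupported) echo "nat-or-proxy" ;;
    private:internet)        echo "nat-or-proxy" ;;
    *)                       echo "unknown"; return 1 ;;
  esac
}

egress_path private aws-supported    # prints: vpc-endpoint
egress_path private internet         # prints: nat-or-proxy
egress_path public-with-ip internet  # prints: internet-gateway
```

Classifying every outbound dependency this way before touching route tables makes it clear which traffic a NAT Gateway still has to carry.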
with VPC endpoint#
| Traffic | Need NAT Gateway? | Notes |
| --- | --- | --- |
| Private subnet -> S3 through gateway endpoint | No | route table sends S3 prefix list to endpoint |
| Private subnet -> DynamoDB through gateway endpoint | No | route table sends DynamoDB prefix list to endpoint |
| Private subnet -> supported AWS API through interface endpoint | No, for that service | private DNS sends SDK call to endpoint ENI |
| Private subnet -> public internet | Yes, if IPv4 and no other egress path | endpoint does not handle general internet |
| Private subnet -> unsupported AWS service public endpoint | Usually yes | unless another private connectivity pattern exists |
| Private subnet -> on-prem over VPN/DX/TGW | No NAT for private route | needs route/security design, not endpoint |
With a VPC endpoint in place, the following are no longer needed:
  for S3/DynamoDB gateway endpoint traffic:
    a NAT Gateway path
    an Internet Gateway path
    a public IP on instances
  for interface endpoint supported service traffic:
    a NAT Gateway path for that service
    a public IP on instances
    app code changes, if private DNS is enabled
What may still be needed:
  NAT Gateway for general internet egress
  NAT Gateway for OS package downloads / external SaaS APIs
  Internet Gateway for public subnets and for the NAT Gateway itself
  VPC DNS hostnames and DNS resolution enabled, for private DNS
  security group rule allowing clients to reach the endpoint ENI, for interface endpoints
without VPC endpoint#
| Traffic | Need NAT Gateway? | Notes |
| --- | --- | --- |
| Private subnet -> public AWS service endpoint | Yes, for IPv4 egress path | typical path: private subnet -> NAT Gateway -> IGW -> AWS public endpoint |
| Private subnet -> S3/DynamoDB public endpoint | Yes, unless route uses gateway endpoint | NAT data processing applies if routed through NAT |
| Private subnet -> public internet | Yes | NAT Gateway / NAT instance / proxy |
| Public subnet instance with public IP -> AWS public endpoint | No NAT Gateway | route through Internet Gateway |
| No outbound traffic | No NAT Gateway | no endpoint required either |
Without a VPC endpoint, a NAT Gateway is not necessarily required:
  if the workload has no outbound traffic:
    no NAT
  if the workload is in a public subnet and has a public IP:
    use the Internet Gateway directly
    no NAT
  if the workload only accesses services inside the VPC:
    use local route / peering / TGW / private IP
    no NAT
Without a VPC endpoint, a NAT Gateway is usually required when:
  a private subnet workload needs IPv4 outbound access to public endpoints
  examples:
    call STS / ECR / CloudWatch Logs without an interface endpoint
    pull public packages from the internet
    call a third-party API
    access S3/DynamoDB without a gateway endpoint
cost relationship#
NAT Gateway cost shape:
  hourly charge while provisioned
  per-GB data processing charge for traffic through the NAT
  possible standard data transfer charges
  cross-AZ traffic can add data transfer cost if the instance and the NAT are in different AZs
Gateway endpoint cost shape:
  S3 / DynamoDB gateway endpoint:
    no additional endpoint hourly charge
    no endpoint data processing charge
    normal service charges still apply
Interface endpoint cost shape:
  hourly charge per endpoint ENI / AZ
  per-GB data processing charge
  possible cross-region / data transfer charges, depending on the path
Is NAT Gateway traffic always more expensive than VPC endpoint traffic?
  S3 / DynamoDB gateway endpoint:
    usually yes for that traffic path
    gateway endpoint avoids the NAT Gateway data processing charge
    gateway endpoint has no additional endpoint hourly/data processing charge
  interface endpoint:
    not always
    compare:
      NAT hourly + NAT per-GB + data transfer
      vs
      interface endpoint hourly per AZ + PrivateLink per-GB + data transfer
    low traffic, many services:
      many interface endpoints can cost more than one NAT Gateway
    high traffic to supported AWS services:
      endpoints often reduce NAT data processing cost and improve private security posture
practical cost rule:
  always create an S3 gateway endpoint for private subnets that access S3
  always create a DynamoDB gateway endpoint for private subnets that access DynamoDB
  create interface endpoints for high-volume or security-sensitive AWS APIs
  do not create every possible interface endpoint blindly
  keep the NAT Gateway for remaining internet egress
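The hourly + per-GB comparison above can be worked through with a small calculator. This is a rough sketch: the function name `monthly_cost` is hypothetical, the rates are placeholder numbers for illustration only (not quoted AWS prices), a 730-hour month is assumed, and data transfer charges are left out.

```shell
# monthly_cost HOURLY_RATE UNITS PER_GB_RATE GB
#   HOURLY_RATE  price per unit-hour (one NAT Gateway, or one endpoint ENI in one AZ)
#   UNITS        number of billed units (NAT Gateways, or endpoints x AZs)
#   PER_GB_RATE  data processing price per GB
#   GB           GB processed per month
# Assumes a 730-hour month and ignores data transfer charges.
monthly_cost() {
  awk -v h="$1" -v n="$2" -v g="$3" -v gb="$4" \
    'BEGIN { printf "%.2f\n", h * 730 * n + g * gb }'
}

# Placeholder rates for illustration only; check the pricing pages for your region.
NAT_HOURLY=0.045
NAT_PER_GB=0.045
VPCE_HOURLY=0.01
VPCE_PER_GB=0.01

# One NAT Gateway vs. 3 interface endpoints in 2 AZs (6 ENIs) at 1000 GB/month.
echo "NAT, 1000 GB:       $(monthly_cost "$NAT_HOURLY" 1 "$NAT_PER_GB" 1000)"
echo "Endpoints, 1000 GB: $(monthly_cost "$VPCE_HOURLY" 6 "$VPCE_PER_GB" 1000)"

# Same layout at only 10 GB/month: the fixed hourly part dominates.
echo "NAT, 10 GB:         $(monthly_cost "$NAT_HOURLY" 1 "$NAT_PER_GB" 10)"
echo "Endpoints, 10 GB:   $(monthly_cost "$VPCE_HOURLY" 6 "$VPCE_PER_GB" 10)"
```

With these placeholder rates the endpoints are cheaper at high volume and more expensive at low volume, which is the "not always" case described above. If the NAT Gateway has to stay for internet egress anyway, only its per-GB part is saved by moving traffic to endpoints.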
4. NAT Gateway Subnet And Route Tables#
where NAT Gateway lives#
A public NAT Gateway is created in one subnet:
  the subnet must be a public subnet
  the public subnet route table must have 0.0.0.0/0 -> Internet Gateway
  the NAT Gateway has an Elastic IP for public IPv4 egress
A NAT Gateway does not automatically serve all subnets:
  other subnets use it only if their route table points to it
  the route table decides the traffic path
  a subnet's AZ does not automatically bind it to the NAT Gateway in that AZ
example:
  public-subnet-1a:
    NAT Gateway nat-aaa lives here
    route:
      0.0.0.0/0 -> igw-xxx
  private-subnet-1a route table:
    0.0.0.0/0 -> nat-aaa
  private-subnet-1c route table:
    0.0.0.0/0 -> nat-aaa
  result:
    private-subnet-1c can route to the NAT Gateway in public-subnet-1a
    this works if the route table, NACL, and security path allow it
same AZ vs cross AZ#
| Design | Works? | Recommendation | Why |
| --- | --- | --- | --- |
| private subnet 1a -> NAT Gateway 1a | Yes | Recommended | AZ-local, better failure isolation |
| private subnet 1c -> NAT Gateway 1a | Yes | Avoid for production if possible | cross-AZ dependency and possible cross-AZ data transfer cost |
| all private subnets -> one NAT Gateway | Yes | acceptable for dev / low criticality | cheaper hourly, weaker AZ resilience |
| one NAT Gateway per AZ | Yes | production baseline | each AZ keeps egress if another AZ/NAT fails |
best practice:
  create one NAT Gateway in each AZ that has private workloads
  private subnet in AZ-a routes 0.0.0.0/0 to the NAT Gateway in AZ-a
  private subnet in AZ-c routes 0.0.0.0/0 to the NAT Gateway in AZ-c
why:
  reduce cross-AZ traffic
  avoid one NAT Gateway becoming a cross-AZ dependency
  if AZ-a fails, AZ-c workloads still have AZ-local egress
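The per-AZ routing above comes down to one default route per private route table. The sketch below only prints the AWS CLI commands so they can be reviewed first; the helper name `nat_route_cmd` and all resource IDs are hypothetical placeholders.

```shell
# nat_route_cmd ROUTE_TABLE_ID NAT_GATEWAY_ID
# Prints (does not run) the AWS CLI command that points a private route
# table's IPv4 default route at a NAT Gateway.
nat_route_cmd() {
  echo "aws ec2 create-route --route-table-id $1 --destination-cidr-block 0.0.0.0/0 --nat-gateway-id $2"
}

# Hypothetical IDs: one private route table and one NAT Gateway per AZ.
nat_route_cmd rtb-private-1a nat-aaa
nat_route_cmd rtb-private-1c nat-ccc
```

After review, the printed lines can be piped to `sh`. If a default route already exists in the route table, `aws ec2 replace-route` with the same arguments is the usual alternative to `create-route`.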
relationship with VPC endpoint#
route priority:
  the most specific matching route wins
  example route table:
    com.amazonaws.ap-east-1.s3 prefix list -> vpce-s3
    0.0.0.0/0 -> nat-aaa
  result:
    S3 traffic goes to the VPC endpoint
    other public IPv4 egress goes to the NAT Gateway
NAT Gateway and VPC endpoint can coexist:
  endpoint handles selected AWS service traffic
  NAT handles remaining internet / unsupported service traffic
A VPC endpoint does not choose the NAT AZ:
  endpoint route/DNS and NAT route are independent routing decisions
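The "most specific route wins" rule can be illustrated with a tiny longest-prefix matcher. This is a simplified sketch, not how AWS evaluates routes internally: `pick_route` is a hypothetical helper, and `198.51.100.0/24` is a documentation range standing in for one entry of the S3 prefix list.

```shell
# pick_route DEST_IP
# Reads route lines of the form "CIDR TARGET" on stdin and prints the target
# of the matching route with the longest prefix, or "none" if nothing matches.
pick_route() {
  awk -v dest="$1" '
    # Convert a dotted-quad IPv4 address to an integer.
    function to_int(ip,    p) {
      split(ip, p, ".")
      return p[1] * 16777216 + p[2] * 65536 + p[3] * 256 + p[4]
    }
    BEGIN { best = -1; target = "none" }
    {
      split($1, c, "/")
      size = 2 ^ (32 - c[2])
      # A route matches when both addresses fall in the same CIDR block;
      # keep it only if its prefix is longer than the best match so far.
      if (int(to_int(c[1]) / size) == int(to_int(dest) / size) && c[2] + 0 > best) {
        best = c[2] + 0
        target = $2
      }
    }
    END { print target }
  '
}

# Hypothetical route table: a stand-in S3 prefix-list entry plus the default route.
routes='198.51.100.0/24 vpce-s3
0.0.0.0/0 nat-aaa'

echo "$routes" | pick_route 198.51.100.7   # prints: vpce-s3
echo "$routes" | pick_route 203.0.113.5    # prints: nat-aaa
```

Both routes match the first address, but the /24 is more specific than /0, so the endpoint wins. Real route tables also contain local and propagated routes, which this sketch ignores.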
common mistakes#
mistakes:
  NAT Gateway created in a private subnet
  public subnet for the NAT has no route to the Internet Gateway
  private subnet route table has no 0.0.0.0/0 -> NAT Gateway
  all AZs route to one NAT Gateway without accepting the cross-AZ dependency
  S3/DynamoDB traffic still goes through NAT because the gateway endpoint was not associated with the route table
5. Routing / DNS Best Practices#
gateway endpoint route table#
route table:
  destination:
    com.amazonaws.<region>.s3 prefix list
    com.amazonaws.<region>.dynamodb prefix list
  target:
    vpce-xxxxxxxx
  effect:
    only subnets associated with this route table use the gateway endpoint
aws ec2 describe-prefix-lists \
  --filters "Name=prefix-list-name,Values=com.amazonaws.ap-east-1.s3"
interface endpoint private DNS#
private DNS enabled:
  the normal AWS service hostname resolves to private endpoint IPs inside the VPC
  SDK can keep using:
    https://secretsmanager.ap-east-1.amazonaws.com
    https://logs.ap-east-1.amazonaws.com
  requirements:
    VPC enableDnsHostnames=true
    VPC enableDnsSupport=true
without private DNS:
  app must use the vpce-specific DNS name
  harder to operate
security group#
interface endpoint security group:
  inbound:
    tcp/443 from client security group or subnet CIDR
  outbound:
    usually the default is enough
client security group:
  outbound:
    tcp/443 to endpoint security group or endpoint subnet CIDR
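The inbound rule on the endpoint security group can be expressed as a single AWS CLI call. As a cautious sketch, the helper below only prints the command for review; `endpoint_sg_rule_cmd` and both security group IDs are hypothetical.

```shell
# endpoint_sg_rule_cmd ENDPOINT_SG_ID CLIENT_SG_ID
# Prints (does not run) the AWS CLI command that lets clients in CLIENT_SG_ID
# reach the endpoint ENIs on tcp/443.
endpoint_sg_rule_cmd() {
  echo "aws ec2 authorize-security-group-ingress --group-id $1 --protocol tcp --port 443 --source-group $2"
}

# Hypothetical security group IDs.
endpoint_sg_rule_cmd sg-endpoint0123 sg-client0456
```

Referencing the client security group rather than a subnet CIDR keeps the rule valid when clients move between subnets.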
6. Security Best Practices#
endpoint policy#
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::order-prod-bucket/*"
      ]
    }
  ]
}
endpoint policy:
  controls what can be accessed through the endpoint
  does not grant permissions by itself
  the IAM principal still needs its own permissions
S3 bucket policy can restrict access to the endpoint with condition keys:
  aws:SourceVpce
  aws:VpcSourceIp
DynamoDB IAM policies can use endpoint-related conditions where supported
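A bucket policy that uses `aws:SourceVpce` might look like the output of the sketch below. The helper name `vpce_only_bucket_policy`, the `Sid`, and the endpoint ID are hypothetical; the deny-unless-endpoint shape is one common pattern, not the only way to write it.

```shell
# vpce_only_bucket_policy BUCKET VPCE_ID
# Prints a bucket policy that denies every S3 action on the bucket unless the
# request arrives through the given VPC endpoint.
vpce_only_bucket_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnlessThroughVpce",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::$1",
        "arn:aws:s3:::$1/*"
      ],
      "Condition": {
        "StringNotEquals": { "aws:SourceVpce": "$2" }
      }
    }
  ]
}
EOF
}

# Hypothetical endpoint ID; the bucket name is the one used in this section.
vpce_only_bucket_policy order-prod-bucket vpce-0123456789abcdef0
```

A blanket deny like this also blocks console and administrator access that does not come through the endpoint, so it is worth testing on a non-production bucket first. The output can be saved to a file and applied with `aws s3api put-bucket-policy --bucket order-prod-bucket --policy file://policy.json`.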
common security mistakes#
mistakes:
  endpoint policy left as allow-all (*)
  endpoint security group allows 0.0.0.0/0
  private DNS disabled without a reason
  app still routes through NAT because DNS / route table was not updated
  S3 bucket policy still uses aws:SourceIp after moving to the endpoint (requests now arrive from private IPs)
  no CloudTrail review of the access path
7. Reliability / Design#
interface endpoint HA:
  create the endpoint in every AZ where clients run
  avoid cross-AZ endpoint traffic when possible
  endpoint ENI IP is stable for the endpoint lifetime
gateway endpoint reliability:
  regional service routed through the route table
  route table association decides coverage
deployment notes:
  creating or modifying an S3 gateway endpoint can interrupt existing connections to S3 from the affected subnets
  roll out during a low-risk window
  applications should retry idempotent requests
8. Monitoring#
what to monitor:
  NAT Gateway (CloudWatch metrics):
    BytesInFromSource
    BytesOutToDestination
    PacketsDropCount
    ErrorPortAllocation
  VPC endpoint:
    CloudTrail for CreateVpcEndpoint / ModifyVpcEndpoint / DeleteVpcEndpoint
    VPC Flow Logs for endpoint ENI traffic
  service-side metrics:
    S3 / DynamoDB / CloudWatch Logs / Secrets Manager / ECR
  cost (billing usage types, usually prefixed with a region code):
    NatGateway-Bytes
    NatGateway-Hours
    VpcEndpoint-Hours
    VpcEndpoint-Bytes
cost investigation:
  check NAT Gateway bytes first
  identify the top private subnets / ENIs with VPC Flow Logs
  map each destination service:
    S3 / DynamoDB:
      add a gateway endpoint
    AWS API supported by PrivateLink:
      consider an interface endpoint
    public internet:
      NAT still required
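For the "check NAT Gateway bytes first" step, a CloudWatch query along these lines is one option. The helper name `nat_bytes_cmd`, the NAT Gateway ID, and the date range are hypothetical; the function only prints the command so it can be reviewed before running.

```shell
# nat_bytes_cmd NAT_GATEWAY_ID START_TIME END_TIME
# Prints (does not run) an AWS CLI command that sums daily
# BytesOutToDestination for one NAT Gateway. Times are ISO 8601 strings.
nat_bytes_cmd() {
  echo "aws cloudwatch get-metric-statistics" \
    "--namespace AWS/NATGateway" \
    "--metric-name BytesOutToDestination" \
    "--dimensions Name=NatGatewayId,Value=$1" \
    "--start-time $2 --end-time $3" \
    "--period 86400 --statistics Sum"
}

# Hypothetical NAT Gateway ID and date range.
nat_bytes_cmd nat-0123456789abcdef0 2024-01-01T00:00:00Z 2024-01-08T00:00:00Z
```

Running the same query before and after an endpoint rollout gives a direct view of how much traffic left the NAT path.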
9. Hands-on#
create S3 gateway endpoint#
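# Disable the CLI pager so output is not held open in scripts.
export AWS_PAGER=""
export AWS_REGION="ap-east-1"
export VPC_ID="vpc-0123456789abcdef0"
# Route table used by the private subnets that should reach S3 privately.
export ROUTE_TABLE_ID="rtb-0123456789abcdef0"
# Create the S3 gateway endpoint and add its prefix-list route to the route table.
aws ec2 create-vpc-endpoint \
  --region "$AWS_REGION" \
  --vpc-id "$VPC_ID" \
  --service-name "com.amazonaws.${AWS_REGION}.s3" \
  --vpc-endpoint-type Gateway \
  --route-table-ids "$ROUTE_TABLE_ID"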
create DynamoDB gateway endpoint#
# Same as the S3 endpoint, but for the DynamoDB service name.
aws ec2 create-vpc-endpoint \
  --region "$AWS_REGION" \
  --vpc-id "$VPC_ID" \
  --service-name "com.amazonaws.${AWS_REGION}.dynamodb" \
  --vpc-endpoint-type Gateway \
  --route-table-ids "$ROUTE_TABLE_ID"
create Secrets Manager interface endpoint#
# One subnet per AZ where clients run.
export SUBNET_ID_1="subnet-11111111111111111"
export SUBNET_ID_2="subnet-22222222222222222"
# Security group that allows tcp/443 from the client security groups.
export ENDPOINT_SG_ID="sg-0123456789abcdef0"
aws ec2 create-vpc-endpoint \
  --region "$AWS_REGION" \
  --vpc-id "$VPC_ID" \
  --service-name "com.amazonaws.${AWS_REGION}.secretsmanager" \
  --vpc-endpoint-type Interface \
  --subnet-ids "$SUBNET_ID_1" "$SUBNET_ID_2" \
  --security-group-ids "$ENDPOINT_SG_ID" \
  --private-dns-enabled
verify route and DNS#
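# List the endpoints in the VPC with type, service, state, and private DNS flag.
aws ec2 describe-vpc-endpoints \
  --region "$AWS_REGION" \
  --filters "Name=vpc-id,Values=$VPC_ID" \
  --query "VpcEndpoints[*].{Id:VpcEndpointId,Type:VpcEndpointType,Service:ServiceName,State:State,PrivateDns:PrivateDnsEnabled}"
# Run from an instance inside the VPC: should return the endpoint ENI private IPs.
dig +short "secretsmanager.${AWS_REGION}.amazonaws.com"
# Exercise the S3 gateway endpoint and the Secrets Manager interface endpoint.
aws s3 ls s3://order-prod-bucket/
aws secretsmanager list-secrets --region "$AWS_REGION"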
verify:
  S3/DynamoDB route table has a prefix-list route to the gateway endpoint
  interface endpoint DNS resolves to a private IP inside the VPC
  NAT Gateway bytes decrease for the moved AWS service traffic
  app still reaches the public internet if NAT is still required
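The DNS check in the list above can be scripted. The sketch below assumes the VPC uses RFC 1918 address ranges and that `dig` is installed on the instance; `is_private_ipv4` and `check_endpoint_dns` are hypothetical helper names.

```shell
# is_private_ipv4 ADDRESS
# Succeeds when ADDRESS is in an RFC 1918 range (10/8, 172.16/12, 192.168/16).
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# check_endpoint_dns HOSTNAME
# Resolves HOSTNAME with dig and reports whether each returned address is private.
check_endpoint_dns() {
  dig +short "$1" A | while read -r addr; do
    # Skip CNAME targets and anything else that is not a dotted-quad address.
    case "$addr" in
      *[!0-9.]*) continue ;;
    esac
    if is_private_ipv4 "$addr"; then
      echo "$addr private (interface endpoint likely in use)"
    else
      echo "$addr NOT private (check private DNS and VPC DNS attributes)"
    fi
  done
}
```

From an instance inside the VPC, `check_endpoint_dns "secretsmanager.${AWS_REGION}.amazonaws.com"` should report only private addresses once the interface endpoint and private DNS are active.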
10. Production Checklist#
design:
  each outbound dependency classified:
    S3 / DynamoDB
    AWS API with PrivateLink
    public internet
    on-prem / private network
  NAT Gateway remains only for traffic that needs it
  endpoint list is not blindly copied across accounts
routing:
  gateway endpoint associated with the correct private subnet route tables
  interface endpoint created in client AZs
  private DNS enabled and VPC DNS attributes enabled
security:
  endpoint policy scoped
  endpoint security group scoped
  S3 bucket policy uses aws:SourceVpce / aws:VpcSourceIp where appropriate
  IAM least privilege still enforced
cost:
  NAT Gateway bytes monitored before/after endpoint rollout
  S3/DynamoDB traffic uses gateway endpoint
  interface endpoint hourly cost reviewed per AZ
  cross-AZ and cross-region paths reviewed
operations:
  VPC Flow Logs available for troubleshooting
  endpoint deletion is change-controlled
  runbook explains DNS and route verification