AWS VPC Endpoint


https://docs.aws.amazon.com/vpc/latest/privatelink/concepts.html
https://docs.aws.amazon.com/vpc/latest/privatelink/privatelink-access-aws-services.html
https://docs.aws.amazon.com/vpc/latest/privatelink/gateway-endpoints.html
https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-s3.html
https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-ddb.html
https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html
https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-access.html
https://aws.amazon.com/privatelink/pricing/
https://aws.amazon.com/vpc/pricing/

1. Important Points#

VPC Endpoint lets resources inside a VPC privately access AWS services and endpoint services:
    traffic stays on AWS network
    private subnet can call supported service without public IP
    can reduce NAT Gateway dependency for supported AWS service traffic
    can improve security boundary with endpoint policy / security group

VPC Endpoint is not:
    general internet egress
    NAT Gateway full replacement
    firewall
    cross-service magic route
    automatic access to every AWS service

core principles:
    endpoint is per service / per region / per VPC
    use gateway endpoint first for S3 / DynamoDB when VPC-local access is enough
    use interface endpoint for supported AWS APIs and PrivateLink services
    enable private DNS for AWS service interface endpoints unless there is a clear reason not to
    endpoint policy is an extra guardrail, not a replacement for IAM
    cost comparison must include hourly + per-GB + cross-AZ / data transfer path

2. Service Configuration#

endpoint types#

Type                           | Use Case                                                       | Routing / DNS                           | Cost Shape
-------------------------------+----------------------------------------------------------------+-----------------------------------------+-------------------------------------------------------
Gateway endpoint               | S3 / DynamoDB                                                  | route table target with AWS prefix list | no additional endpoint hourly / data processing charge
Interface endpoint             | AWS services / PrivateLink endpoint service / SaaS             | ENI private IP + DNS/private DNS        | hourly per endpoint ENI/AZ + per-GB processing
Gateway Load Balancer endpoint | inline firewall / appliance                                    | route table target                      | GWLB endpoint pricing
Resource endpoint              | shared VPC resource through VPC Lattice resource configuration | private resource access                 | PrivateLink / VPC Lattice pricing
Service network endpoint       | VPC Lattice service network                                    | service network access                  | VPC Lattice pricing

most common:
    S3:
        gateway endpoint for VPC workloads
        interface endpoint when on-prem / TGW / cross-VPC private access pattern requires it

    DynamoDB:
        gateway endpoint for VPC workloads

    AWS APIs:
        interface endpoint
        examples:
            sts
            ecr.api
            ecr.dkr
            logs
            monitoring
            secretsmanager
            kms
            ssm
            ec2
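
Not every service in the list above is offered as an interface endpoint in every Region, so it can help to check before designing around one. The sketch below assumes AWS CLI v2 with credentials configured; `list_interface_services` is a hypothetical helper name, not something from the AWS tooling.

```shell
# List the service names in a Region that can be reached through an interface endpoint.
# list_interface_services is a hypothetical helper name used only in this sketch.
list_interface_services() {
  local region="$1"
  aws ec2 describe-vpc-endpoint-services \
    --region "$region" \
    --filters "Name=service-type,Values=Interface" \
    --query "ServiceNames[]" \
    --output text | tr '\t' '\n' | sort
}

# Usage: is ecr.dkr offered in ap-east-1?
# list_interface_services ap-east-1 | grep -F "ecr.dkr"
```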

gateway endpoint#

gateway endpoint:
    attach endpoint to selected route tables
    AWS adds route:
        destination = AWS-managed prefix list
        target = gateway endpoint

supported services:
    Amazon S3
    DynamoDB

security controls:
    route table association
    endpoint policy
    bucket policy / DynamoDB IAM condition

interface endpoint#

interface endpoint:
    creates endpoint network interface in selected subnets
    endpoint ENI has private IP
    security group controls inbound traffic to endpoint ENI
    private DNS can make normal service hostname resolve to endpoint private IPs

production baseline:
    one subnet per AZ where clients run
    private DNS enabled for AWS service endpoint
    endpoint security group allows 443 from client security groups
    endpoint policy scoped to required actions/resources

3. VPC Endpoint And NAT Gateway#

short answer#

Not having a VPC Endpoint does not mean a NAT Gateway is required.

It depends on whether the workload needs outbound access:
    no outbound requirement:
        no NAT Gateway needed
        no VPC endpoint needed

    private subnet needs supported AWS service:
        use VPC endpoint when possible
        NAT Gateway is not required for that service traffic

    private subnet needs public internet / third-party API / unsupported AWS public endpoint:
        NAT Gateway or NAT instance or proxy is required for IPv4 egress

    public subnet instance with public IP:
        can use Internet Gateway directly
        NAT Gateway is not required for that instance
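
The decision list above can be written down as a small lookup, which is sometimes handy when classifying dependencies in a review. This is only a sketch: the category labels (`s3`, `aws-api-privatelink`, and so on) and the function name `nat_needed` are made up for illustration.

```shell
# Classify an outbound dependency and print whether a NAT path is needed.
# The category labels are hypothetical names chosen for this sketch.
nat_needed() {
  case "$1" in
    none)                    echo "no: workload has no outbound traffic" ;;
    s3|dynamodb)             echo "no: use a gateway endpoint" ;;
    aws-api-privatelink)     echo "no: use an interface endpoint" ;;
    public-subnet-public-ip) echo "no: route through the Internet Gateway" ;;
    internet|aws-api-public) echo "yes: NAT Gateway, NAT instance, or proxy for IPv4 egress" ;;
    *)                       echo "unknown: classify this dependency first"; return 1 ;;
  esac
}

for dep in s3 aws-api-privatelink internet; do
  printf '%-22s %s\n' "$dep" "$(nat_needed "$dep")"
done
```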

with VPC endpoint#

Traffic                                                        | Need NAT Gateway?                     | Notes
---------------------------------------------------------------+---------------------------------------+-----------------------------------------------------
Private subnet -> S3 through gateway endpoint                  | No                                    | route table sends S3 prefix list to endpoint
Private subnet -> DynamoDB through gateway endpoint            | No                                    | route table sends DynamoDB prefix list to endpoint
Private subnet -> supported AWS API through interface endpoint | No, for that service                  | private DNS sends SDK call to endpoint ENI
Private subnet -> public internet                              | Yes, if IPv4 and no other egress path | endpoint does not handle general internet
Private subnet -> unsupported AWS service public endpoint      | Usually yes                           | unless another private connectivity pattern exists
Private subnet -> on-prem over VPN/DX/TGW                      | No NAT for private route              | needs route/security design, not endpoint

what is no longer needed once a VPC Endpoint exists:
    for S3/DynamoDB gateway endpoint traffic:
        no NAT Gateway path
        no Internet Gateway path
        no public IP on instances

    for interface endpoint supported service traffic:
        no NAT Gateway path for that service
        no public IP on instances
        no app code change if private DNS is enabled

what may still be needed:
    NAT Gateway for general internet egress
    NAT Gateway for OS package download / external SaaS API
    Internet Gateway for public subnets / NAT Gateway itself
    DNS hostnames and DNS resolution enabled for private DNS
    security group rule to endpoint ENI for interface endpoint

without VPC endpoint#

Traffic                                                      | Need NAT Gateway?                       | Notes
-------------------------------------------------------------+-----------------------------------------+----------------------------------------------------------------------------
Private subnet -> public AWS service endpoint                | Yes, for IPv4 egress path               | typical path: private subnet -> NAT Gateway -> IGW -> AWS public endpoint
Private subnet -> S3/DynamoDB public endpoint                | Yes, unless route uses gateway endpoint | NAT data processing applies if routed through NAT
Private subnet -> public internet                            | Yes                                     | NAT Gateway / NAT instance / proxy
Public subnet instance with public IP -> AWS public endpoint | No NAT Gateway                          | route through Internet Gateway
No outbound traffic                                          | No NAT Gateway                          | no endpoint required either

without a VPC Endpoint, a NAT Gateway is not always needed:
    if the workload has no outbound traffic:
        no NAT

    if the workload is in a public subnet and has a public IP:
        use Internet Gateway directly
        no NAT

    if the workload only accesses services inside the VPC:
        use local route / peering / TGW / private IP
        no NAT

without a VPC Endpoint, a NAT Gateway is usually needed:
    private subnet workload needs IPv4 outbound to public endpoints
    examples:
        call STS / ECR / CloudWatch Logs without interface endpoint
        pull public package from internet
        call third-party API
        access S3/DynamoDB without gateway endpoint

cost relationship#

NAT Gateway cost shape:
    hourly charge while provisioned
    per-GB data processing charge for traffic through NAT
    possible standard data transfer charges
    cross-AZ traffic can add data transfer cost if instance and NAT are in different AZ

Gateway endpoint cost shape:
    S3 / DynamoDB gateway endpoint:
        no additional endpoint hourly charge
        no endpoint data processing charge
        normal service charges still apply

Interface endpoint cost shape:
    hourly charge per endpoint ENI / AZ
    per-GB data processing charge
    possible cross-region / data transfer charges depending on the path

is NAT Gateway traffic cost always higher than VPC Endpoint cost:
    S3 / DynamoDB gateway endpoint:
        usually yes for that traffic path
        gateway endpoint avoids NAT Gateway data processing charge
        gateway endpoint has no additional endpoint hourly/data processing charge

    interface endpoint:
        not always
        compare:
            NAT hourly + NAT per-GB + data transfer
            vs
            interface endpoint hourly per AZ + PrivateLink per-GB + data transfer

    low traffic many services:
        many interface endpoints can cost more than one NAT Gateway

    high traffic to supported AWS services:
        endpoints often reduce NAT data processing cost and improve private security posture

practical cost rule:
    always create S3 gateway endpoint for private subnets that access S3
    always create DynamoDB gateway endpoint for private subnets that access DynamoDB
    create interface endpoints for high-volume or security-sensitive AWS APIs
    do not create every possible interface endpoint blindly
    keep NAT Gateway for remaining internet egress
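
The comparison above is simple arithmetic: hourly rate times hours times billed units, plus per-GB rate times traffic. A rough sketch follows. Every rate in it is a placeholder assumption rather than a quoted price, so substitute the numbers from the pricing pages for your Region; `monthly_cost` is a hypothetical helper name.

```shell
# Rough monthly cost: hourly_rate * 730 hours * units + per_gb_rate * gb.
# All rates below are placeholder assumptions, not quoted prices.
monthly_cost() {
  awk -v h="$1" -v g="$2" -v n="$3" -v gb="$4" \
    'BEGIN { printf "%.2f\n", h * 730 * n + g * gb }'
}

NAT_HOURLY=0.045; NAT_PER_GB=0.045   # assumed NAT Gateway rates
VPCE_HOURLY=0.01; VPCE_PER_GB=0.01   # assumed interface endpoint rates, per AZ
GB=100                               # assumed monthly traffic to AWS APIs

echo "1 NAT Gateway:            $(monthly_cost "$NAT_HOURLY" "$NAT_PER_GB" 1 "$GB")"
# 3 services x 2 AZs = 6 endpoint ENIs billed hourly
echo "3 endpoints across 2 AZs: $(monthly_cost "$VPCE_HOURLY" "$VPCE_PER_GB" 6 "$GB")"
```

With these assumed rates the endpoints cost more at low volume and less at high volume, which matches the "not always" answer above. If the NAT Gateway has to stay for internet egress anyway, its hourly charge is paid either way and only the per-GB part moves.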

4. NAT Gateway Subnet And Route Tables#

where NAT Gateway lives#

NAT Gateway is created in one subnet:
    the subnet must be public subnet
    public subnet route table must have 0.0.0.0/0 -> Internet Gateway
    NAT Gateway has an Elastic IP for public IPv4 egress

NAT Gateway does not automatically serve all subnets:
    other subnets use it only if their route table points to it
    route table decides traffic path
    subnet AZ does not automatically bind to NAT Gateway AZ

example:
    public-subnet-1a:
        NAT Gateway nat-aaa lives here
        route:
            0.0.0.0/0 -> igw-xxx

    private-subnet-1a route table:
        0.0.0.0/0 -> nat-aaa

    private-subnet-1c route table:
        0.0.0.0/0 -> nat-aaa

result:
    private-subnet-1c can route to NAT Gateway in public-subnet-1a
    this works if route table, NACL, and security path allow it

same AZ vs cross AZ#

Design                                 | Works? | Recommendation                       | Why
---------------------------------------+--------+--------------------------------------+--------------------------------------------------------------
private subnet 1a -> NAT Gateway 1a    | Yes    | Recommended                          | AZ-local, better failure isolation
private subnet 1c -> NAT Gateway 1a    | Yes    | Avoid for production if possible     | cross-AZ dependency and possible cross-AZ data transfer cost
all private subnets -> one NAT Gateway | Yes    | acceptable for dev / low criticality | cheaper hourly, weaker AZ resilience
one NAT Gateway per AZ                 | Yes    | production baseline                  | each AZ keeps egress if another AZ/NAT fails

best practice:
    create one NAT Gateway in each AZ that has private workloads
    private subnet in AZ-a routes 0.0.0.0/0 to NAT Gateway in AZ-a
    private subnet in AZ-c routes 0.0.0.0/0 to NAT Gateway in AZ-c

why:
    reduce cross-AZ traffic
    avoid one NAT Gateway becoming cross-AZ dependency
    if AZ-a fails, AZ-c workloads still have AZ-local egress
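
To see whether a VPC actually follows the per-AZ layout, one option is to list every route table with the NAT Gateway its default route points at, then compare against where each NAT Gateway lives. A minimal sketch, assuming AWS CLI v2; both function names are hypothetical.

```shell
# Print each route table in a VPC with the NAT Gateway its 0.0.0.0/0 route targets.
# default_route_nat and nat_locations are hypothetical helper names.
default_route_nat() {
  local vpc_id="$1"
  aws ec2 describe-route-tables \
    --filters "Name=vpc-id,Values=${vpc_id}" \
    --query "RouteTables[].{Id:RouteTableId,Nat:Routes[?DestinationCidrBlock=='0.0.0.0/0'].NatGatewayId | [0]}" \
    --output table
}

# Print each NAT Gateway with the subnet it lives in (the subnet determines its AZ).
nat_locations() {
  local vpc_id="$1"
  aws ec2 describe-nat-gateways \
    --filter "Name=vpc-id,Values=${vpc_id}" \
    --query "NatGateways[].{Id:NatGatewayId,Subnet:SubnetId,State:State}" \
    --output table
}

# Usage:
# default_route_nat "$VPC_ID"
# nat_locations "$VPC_ID"
```

A route table whose subnets sit in one AZ but whose default route targets a NAT Gateway in another AZ is the cross-AZ case from the table above.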

relationship with VPC endpoint#

route priority:
    more specific route wins

example route table:
    com.amazonaws.ap-east-1.s3 prefix list -> vpce-s3
    0.0.0.0/0 -> nat-aaa

result:
    S3 traffic goes to VPC endpoint
    other public IPv4 egress goes to NAT Gateway

NAT Gateway and VPC Endpoint can coexist:
    endpoint handles selected AWS service traffic
    NAT handles remaining internet / unsupported service traffic

VPC Endpoint does not choose NAT AZ:
    endpoint route/DNS and NAT route are independent routing decisions

common mistakes#

mistakes:
    NAT Gateway created in private subnet
    public subnet for NAT has no route to Internet Gateway
    private subnet route table has no 0.0.0.0/0 -> NAT Gateway
    all AZs route to one NAT Gateway without accepting cross-AZ dependency
    S3/DynamoDB traffic still goes through NAT because gateway endpoint route table was not associated

5. Routing / DNS Best Practices#

gateway endpoint route table#

route table:
    destination:
        com.amazonaws.<region>.s3 prefix list
        com.amazonaws.<region>.dynamodb prefix list

    target:
        vpce-xxxxxxxx

effect:
    only subnets associated with this route table use the gateway endpoint

aws ec2 describe-prefix-lists \
  --filters "Name=prefix-list-name,Values=com.amazonaws.ap-east-1.s3"
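
To confirm that a given route table really carries the gateway endpoint route, its prefix-list routes can be listed directly. A small sketch, assuming AWS CLI v2; `gateway_routes` is a hypothetical helper name.

```shell
# Show prefix-list routes in one route table: prefix list ID and its target.
# For a gateway endpoint route the target appears as a vpce-... ID.
# gateway_routes is a hypothetical helper name.
gateway_routes() {
  local rtb_id="$1"
  aws ec2 describe-route-tables \
    --route-table-ids "$rtb_id" \
    --query "RouteTables[].Routes[] | [?DestinationPrefixListId].[DestinationPrefixListId, GatewayId]" \
    --output text
}

# Usage: gateway_routes rtb-0123456789abcdef0
```

An empty result for a private subnet's route table means its S3 / DynamoDB traffic is still following the default route.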

interface endpoint private DNS#

private DNS enabled:
    normal AWS service hostname resolves to private endpoint IPs inside VPC
    SDK can keep using:
        https://secretsmanager.ap-east-1.amazonaws.com
        https://logs.ap-east-1.amazonaws.com

requirements:
    VPC enableDnsHostnames=true
    VPC enableDnsSupport=true

without private DNS:
    app must use vpce-specific DNS name
    harder to operate
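
Both VPC attributes listed under requirements can be checked before enabling private DNS. The sketch below assumes AWS CLI v2; `vpc_dns_ready` is a hypothetical helper name.

```shell
# Return success only if both DNS attributes are enabled on the VPC.
# vpc_dns_ready is a hypothetical helper name.
vpc_dns_ready() {
  local vpc_id="$1" support hostnames
  support="$(aws ec2 describe-vpc-attribute --vpc-id "$vpc_id" \
    --attribute enableDnsSupport --query "EnableDnsSupport.Value" --output text)"
  hostnames="$(aws ec2 describe-vpc-attribute --vpc-id "$vpc_id" \
    --attribute enableDnsHostnames --query "EnableDnsHostnames.Value" --output text)"
  echo "enableDnsSupport=${support} enableDnsHostnames=${hostnames}"
  # Text output prints booleans as True/False, so compare case-insensitively.
  [ "$(printf '%s' "${support}${hostnames}" | tr 'A-Z' 'a-z')" = "truetrue" ]
}

# Usage: vpc_dns_ready vpc-0123456789abcdef0 || echo "fix VPC DNS attributes first"
```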

security group#

interface endpoint security group:
    inbound:
        tcp/443 from client security group or subnet CIDR

    outbound:
        usually default is enough

client security group:
    outbound:
        tcp/443 to endpoint security group or endpoint subnet CIDR
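
The inbound rule above can be added by referencing the client security group instead of a CIDR, which keeps the rule valid when subnets change. A sketch assuming AWS CLI v2; `allow_https_to_endpoint` is a hypothetical helper name and the group IDs in the usage line are placeholders.

```shell
# Allow tcp/443 into the endpoint security group from one client security group.
# allow_https_to_endpoint is a hypothetical helper name.
allow_https_to_endpoint() {
  local endpoint_sg="$1" client_sg="$2"
  aws ec2 authorize-security-group-ingress \
    --group-id "$endpoint_sg" \
    --protocol tcp \
    --port 443 \
    --source-group "$client_sg"
}

# Usage (placeholder IDs):
# allow_https_to_endpoint sg-0123456789abcdef0 sg-0fedcba9876543210
```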

6. Security Best Practices#

endpoint policy#

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::order-prod-bucket/*"
      ]
    }
  ]
}

endpoint policy:
    controls what can be accessed through the endpoint
    does not grant permissions by itself
    IAM principal still needs permission

S3 bucket policy can restrict access to endpoint:
    aws:SourceVpce
    aws:VpcSourceIp

DynamoDB IAM can use endpoint-related conditions where supported
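
One common shape for the bucket-side restriction is a Deny statement keyed on `aws:SourceVpce`. The sketch below only renders the JSON so it can be reviewed first; `render_bucket_policy` is a hypothetical helper name and the endpoint ID is a placeholder. Note that a blanket Deny like this also blocks console and other access that does not come through the endpoint, so it is an assumption that this is what you want.

```shell
# Print a bucket policy that denies all S3 actions unless the request
# arrives through the given VPC endpoint.
# render_bucket_policy is a hypothetical helper name.
render_bucket_policy() {
  local bucket="$1" vpce_id="$2"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyOutsideEndpoint",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::${bucket}",
        "arn:aws:s3:::${bucket}/*"
      ],
      "Condition": {
        "StringNotEquals": { "aws:SourceVpce": "${vpce_id}" }
      }
    }
  ]
}
EOF
}

# Placeholder endpoint ID; review the output before applying it.
render_bucket_policy order-prod-bucket vpce-0123456789abcdef0
# To apply after review:
# aws s3api put-bucket-policy --bucket order-prod-bucket --policy file://bucket-policy.json
```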

common security mistakes#

mistakes:
    endpoint policy = *
    endpoint security group allows 0.0.0.0/0
    private DNS disabled without reason
    app still routes through NAT because DNS / route table not updated
    S3 bucket policy still uses aws:SourceIp after moving to endpoint
    no CloudTrail review for access path

7. Reliability / Design#

interface endpoint HA:
    create endpoint in every AZ where clients run
    avoid cross-AZ endpoint traffic when possible
    endpoint ENI IP is stable for endpoint lifetime

gateway endpoint reliability:
    regional service route through route table
    route table association decides coverage

deployment notes:
    creating/modifying S3 gateway endpoint can reset existing TCP connections
    roll out during low-risk window
    application should retry idempotent requests
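
For scripts that call AWS during an endpoint rollout, a small retry wrapper with exponential backoff is one way to ride out a dropped connection. This is a generic sketch, not an AWS feature; `retry` is a hypothetical helper and should only wrap idempotent commands.

```shell
# Retry a command up to max attempts, doubling the delay after each failure.
# retry is a hypothetical helper; only wrap idempotent commands with it.
retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Usage: up to 5 attempts, starting with a 1 second delay
# retry 5 1 aws s3 ls s3://order-prod-bucket/
```

The AWS CLI and SDKs also retry on their own; the `AWS_MAX_ATTEMPTS` and `AWS_RETRY_MODE` environment variables tune that behavior for the CLI.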

8. Monitoring#

what to monitor:
    NAT Gateway:
        BytesInFromSource
        BytesOutToDestination
        PacketsDropCount
        ErrorPortAllocation

    VPC endpoint:
        CloudTrail for CreateVpcEndpoint / ModifyVpcEndpoint / DeleteVpcEndpoint
        VPC Flow Logs for endpoint ENI traffic
        service-side metrics:
            S3 / DynamoDB / CloudWatch Logs / Secrets Manager / ECR

    cost:
        NatGateway-Bytes
        NatGateway-Hours
        VpcEndpoint-Hours
        VpcEndpoint-Bytes
        (billing usage types; usually prefixed with a region code)

cost investigation:
    check NAT Gateway bytes first
    identify top private subnets / ENIs with VPC Flow Logs
    map destination service:
        S3 / DynamoDB:
            add gateway endpoint

        AWS API supported by PrivateLink:
            consider interface endpoint

        public internet:
            NAT still required
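
The first step, checking NAT Gateway bytes, can be done from CloudWatch with the metric named earlier. A sketch assuming AWS CLI v2; `nat_bytes_out` is a hypothetical helper name, and the NAT Gateway ID and dates in the usage line are placeholders.

```shell
# Daily sum of BytesOutToDestination for one NAT Gateway, oldest first.
# nat_bytes_out is a hypothetical helper name.
nat_bytes_out() {
  local nat_id="$1" start="$2" end="$3"
  aws cloudwatch get-metric-statistics \
    --namespace "AWS/NATGateway" \
    --metric-name BytesOutToDestination \
    --dimensions "Name=NatGatewayId,Value=${nat_id}" \
    --start-time "$start" \
    --end-time "$end" \
    --period 86400 \
    --statistics Sum \
    --query "sort_by(Datapoints, &Timestamp)[].[Timestamp, Sum]" \
    --output text
}

# Usage (placeholder ID and dates):
# nat_bytes_out nat-0123456789abcdef0 2024-01-01T00:00:00Z 2024-01-08T00:00:00Z
```

Running it for the week before and the week after an endpoint rollout gives a direct before/after comparison.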

9. Hands-on#

create S3 gateway endpoint#

export AWS_PAGER=""
export AWS_REGION="ap-east-1"
export VPC_ID="vpc-0123456789abcdef0"
export ROUTE_TABLE_ID="rtb-0123456789abcdef0"

aws ec2 create-vpc-endpoint \
  --region "$AWS_REGION" \
  --vpc-id "$VPC_ID" \
  --service-name "com.amazonaws.${AWS_REGION}.s3" \
  --vpc-endpoint-type Gateway \
  --route-table-ids "$ROUTE_TABLE_ID"

create DynamoDB gateway endpoint#

aws ec2 create-vpc-endpoint \
  --region "$AWS_REGION" \
  --vpc-id "$VPC_ID" \
  --service-name "com.amazonaws.${AWS_REGION}.dynamodb" \
  --vpc-endpoint-type Gateway \
  --route-table-ids "$ROUTE_TABLE_ID"

create Secrets Manager interface endpoint#

export SUBNET_ID_1="subnet-11111111111111111"
export SUBNET_ID_2="subnet-22222222222222222"
export ENDPOINT_SG_ID="sg-0123456789abcdef0"

aws ec2 create-vpc-endpoint \
  --region "$AWS_REGION" \
  --vpc-id "$VPC_ID" \
  --service-name "com.amazonaws.${AWS_REGION}.secretsmanager" \
  --vpc-endpoint-type Interface \
  --subnet-ids "$SUBNET_ID_1" "$SUBNET_ID_2" \
  --security-group-ids "$ENDPOINT_SG_ID" \
  --private-dns-enabled

verify route and DNS#

aws ec2 describe-vpc-endpoints \
  --filters "Name=vpc-id,Values=$VPC_ID" \
  --query "VpcEndpoints[*].{Id:VpcEndpointId,Type:VpcEndpointType,Service:ServiceName,State:State,PrivateDns:PrivateDnsEnabled}"

dig secretsmanager.ap-east-1.amazonaws.com
aws s3 ls s3://order-prod-bucket/
aws secretsmanager list-secrets --region ap-east-1

verify:
    S3/DynamoDB route table has prefix-list route to gateway endpoint
    interface endpoint DNS resolves to private IP inside VPC
    NAT Gateway bytes decrease for moved AWS service traffic
    app still reaches public internet if NAT is still required
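
The DNS check in the list above can be scripted so it fails loudly when a service hostname still resolves to public addresses. This sketch assumes the VPC uses RFC 1918 address space (a VPC with other CIDR ranges would need a different test) and that `dig` is installed; both function names are hypothetical.

```shell
# Succeed if the address is in 10/8, 172.16/12, or 192.168/16.
# Assumes the VPC CIDR is RFC 1918 space.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Resolve a hostname and report whether every A record is private.
# Returns 1 on the first public address, 2 if nothing resolved.
check_private_dns() {
  local host="$1" ips ip
  ips="$(dig +short "$host" | grep -E '^[0-9]+(\.[0-9]+){3}$' || true)"
  [ -n "$ips" ] || { echo "$host: no A records returned" >&2; return 2; }
  for ip in $ips; do
    if is_private_ipv4 "$ip"; then
      echo "$host -> $ip (private)"
    else
      echo "$host -> $ip (public; private DNS not in effect here)"
      return 1
    fi
  done
}

# Usage, from an instance inside the VPC:
# check_private_dns secretsmanager.ap-east-1.amazonaws.com
```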

10. Production Checklist#

design:
    each outbound dependency classified:
        S3 / DynamoDB
        AWS API with PrivateLink
        public internet
        on-prem / private network

    NAT Gateway remains only for traffic that needs it
    endpoint list is not blindly copied across accounts

routing:
    gateway endpoint associated with correct private subnet route tables
    interface endpoint created in client AZs
    private DNS enabled and VPC DNS attributes enabled

security:
    endpoint policy scoped
    endpoint security group scoped
    S3 bucket policy uses aws:SourceVpce / aws:VpcSourceIp where appropriate
    IAM least privilege still enforced

cost:
    NAT Gateway bytes monitored before/after endpoint rollout
    S3/DynamoDB traffic uses gateway endpoint
    interface endpoint hourly cost reviewed per AZ
    cross-AZ and cross-region paths reviewed

operations:
    VPC Flow Logs available for troubleshooting
    endpoint deletion is change-controlled
    runbook explains DNS and route verification