ECS Logs To VictoriaLogs


https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_firelens.html
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/firelens-taskdef.html
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_awslogs.html
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html
https://docs.aws.amazon.com/lambda/latest/dg/services-cloudwatchlogs.html
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html
https://docs.victoriametrics.com/victorialogs/
https://docs.victoriametrics.com/victorialogs/data-ingestion/
https://docs.victoriametrics.com/victorialogs/querying/

1. Important Points#

Goal:
    ECS application logs -> self-hosted VictoriaLogs

Recommended priority:
    1. ECS FireLens / Fluent Bit -> VictoriaLogs
        best for new ECS services
        works with Fargate and EC2
        no CloudWatch Logs ingestion cost for app logs

    2. awslogs -> CloudWatch Logs subscription -> Lambda -> VictoriaLogs
        best for existing services already using awslogs
        keeps CloudWatch Logs as buffer/inspection layer
        extra CloudWatch Logs + Lambda cost

    3. CloudWatch Agent on ECS EC2 -> CloudWatch Logs -> Lambda -> VictoriaLogs
        only for ECS EC2 host/file logs
        not for Fargate
        not ideal for normal container stdout

    4. sidecar log agent with shared volume -> VictoriaLogs
        useful when app writes files
        more moving parts than FireLens

Big picture:
    FireLens:
        container stdout/stderr -> log router sidecar -> VictoriaLogs

    awslogs subscription:
        container stdout/stderr -> CloudWatch Logs -> subscription Lambda -> VictoriaLogs

    CloudWatch Agent:
        EC2 host/file logs -> CloudWatch Logs -> subscription Lambda -> VictoriaLogs

Key decisions:
    If you control the ECS task definition:
        use FireLens direct

    If many services already use awslogs:
        use subscription bridge first, migrate later

    If CloudWatch Logs must be retained:
        use awslogs + subscription
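
The key decisions above can be sketched as a small helper. This is only an illustration: the function name, its parameters, and the order in which the rules are checked are assumptions, since the notes do not say which rule wins when several apply.

```python
def choose_log_path(controls_task_definition: bool,
                    has_existing_awslogs: bool,
                    must_keep_cloudwatch: bool) -> str:
    """Return the suggested delivery path for one ECS service.

    Assumed precedence (not stated in the notes): a hard requirement to keep
    CloudWatch Logs wins, then an existing awslogs footprint, then FireLens.
    """
    if must_keep_cloudwatch:
        return "awslogs + subscription"
    if has_existing_awslogs:
        return "subscription bridge first, migrate later"
    if controls_task_definition:
        return "FireLens direct"
    # None of the listed conditions applies; the notes give no default.
    return "undecided"


if __name__ == "__main__":
    print(choose_log_path(True, False, False))
```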

2. Architecture Options#

Option                  | Path                                                        | Fargate | ECS EC2 | Best For                      | Tradeoff
------------------------|-------------------------------------------------------------|---------|---------|-------------------------------|-----------------------------------------
FireLens direct         | app -> Fluent Bit -> VictoriaLogs                           | Yes     | Yes     | new services                  | need log router sidecar
awslogs bridge          | app -> CloudWatch Logs -> Lambda -> VictoriaLogs            | Yes     | Yes     | migration / dual-write pattern | extra cost and latency
CloudWatch Agent bridge | file/host logs -> CloudWatch Logs -> Lambda -> VictoriaLogs | No      | Yes     | host logs / legacy file logs  | not for Fargate
custom sidecar          | app file -> shared volume -> agent -> VictoriaLogs          | Yes     | Yes     | app writes files              | file lifecycle and backpressure complexity

recommendation:
    For ECS app stdout logs:
        FireLens direct is the cleanest.

    For audit/compliance requiring CloudWatch Logs copy:
        awslogs bridge is simpler.

    For EC2 instance system logs:
        CloudWatch Agent bridge is acceptable.

3. VictoriaLogs Ingest Basics#

VictoriaLogs commonly accepts:
    JSON line ingestion (/insert/jsonline)
    Elasticsearch bulk-compatible ingestion (/insert/elasticsearch/_bulk)
    syslog and other ingestion paths, depending on the deployment

example endpoint:
    http://victorialogs.internal:9428/insert/jsonline

recommended labels / stream fields:
    service
    env
    cluster
    task_definition
    container

verify VictoriaLogs#

curl -s http://victorialogs.internal:9428/health

insert one JSON line#

printf '{"_msg":"hello from ecs","service":"order-api","env":"dev","container":"app"}\n' \
  | curl -sS \
      -H 'content-type: application/stream+json' \
      --data-binary @- \
      'http://victorialogs.internal:9428/insert/jsonline?_stream_fields=service,env,container'

query#

curl -G 'http://victorialogs.internal:9428/select/logsql/query' \
  --data-urlencode 'query=service:order-api'
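
The insert and query calls above can also be prepared from Python. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and it assumes the query endpoint streams one JSON object per line, as the jsonline format suggests.

```python
import json
from urllib.parse import urlencode


def build_insert_url(base_url, stream_fields):
    """Build the /insert/jsonline URL with a _stream_fields query argument."""
    # safe="," keeps the comma-separated list readable in the URL.
    query = urlencode({"_stream_fields": ",".join(stream_fields)}, safe=",")
    return f"{base_url}/insert/jsonline?{query}"


def build_jsonline_body(records):
    """Serialize records as newline-terminated JSON lines (UTF-8 bytes)."""
    lines = [json.dumps(r, separators=(",", ":"), ensure_ascii=False) for r in records]
    return ("\n".join(lines) + "\n").encode("utf-8")


def parse_query_response(text):
    """Parse a JSON-lines query response, skipping empty lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    url = build_insert_url("http://victorialogs.internal:9428",
                           ["service", "env", "container"])
    body = build_jsonline_body([
        {"_msg": "hello from ecs", "service": "order-api", "env": "dev", "container": "app"},
    ])
    print(url)
    print(body.decode("utf-8"), end="")
```

The body can then be sent with any HTTP client using the same content-type header as the curl example.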

4. Option A: ECS FireLens Direct#

when to use#

use when:
    new ECS service
    app logs to stdout/stderr
    want direct delivery to VictoriaLogs
    want to avoid CloudWatch Logs as the main ingestion path

works with:
    ECS Fargate
    ECS EC2

network#

ECS task must reach VictoriaLogs:
    same VPC:
        use private IP / internal NLB / Cloud Map DNS

    other VPC:
        VPC peering / Transit Gateway / PrivateLink

    public endpoint:
        use NAT Gateway / egress proxy
        TLS and auth strongly recommended
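
Before debugging Fluent Bit itself, it can help to confirm the network path. A minimal sketch of a TCP reachability check follows; the function name is hypothetical, and it would need to run from somewhere with the same subnets and security groups as the ECS task (for example a debug container in the same task).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    print(can_connect("victorialogs.internal", 9428))
```

A False result points at DNS, security groups, NACLs, or routing rather than at the log router configuration.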

task execution role policy#

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPullImageAndWriteRouterLogs",
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}

why CloudWatch Logs permissions still appear:
    log_router container itself often uses awslogs for its own diagnostics
    application logs can still go directly to VictoriaLogs through FireLens

least privilege:
    ECR actions often need Resource="*"
    CloudWatch Logs can be scoped to router log group when you create it upfront
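
The least-privilege note can be sketched as a policy generator that keeps the ECR actions on Resource "*" and scopes the CloudWatch Logs actions to the router log group. The function name and the Sid values are illustrative assumptions; the log group must already exist because logs:CreateLogGroup is not granted.

```python
import json


def build_execution_role_policy(region, account_id, router_log_group):
    """Return an execution-role policy with logs scoped to one log group."""
    # The trailing ":*" covers the log streams inside the group.
    log_group_arn = (
        f"arn:aws:logs:{region}:{account_id}:log-group:{router_log_group}:*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPullImage",
                "Effect": "Allow",
                "Action": [
                    "ecr:GetAuthorizationToken",
                    "ecr:BatchCheckLayerAvailability",
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:BatchGetImage",
                ],
                "Resource": "*",
            },
            {
                "Sid": "AllowWriteRouterLogs",
                "Effect": "Allow",
                "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
                "Resource": log_group_arn,
            },
        ],
    }


if __name__ == "__main__":
    policy = build_execution_role_policy(
        "ap-east-1", "123456789012", "/ecs/order-api/firelens"
    )
    print(json.dumps(policy, indent=2))
```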

task definition sample#

{
  "family": "order-api",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecs-task-execution-role",
  "taskRoleArn": "arn:aws:iam::123456789012:role/order-api-task-role",
  "containerDefinitions": [
    {
      "name": "log_router",
      "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:stable",
      "essential": true,
      "firelensConfiguration": {
        "type": "fluentbit"
      },
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/order-api/firelens",
          "awslogs-region": "ap-east-1",
          "awslogs-stream-prefix": "firelens"
        }
      }
    },
    {
      "name": "app",
      "image": "123456789012.dkr.ecr.ap-east-1.amazonaws.com/order-api:2026-06-02",
      "essential": true,
      "portMappings": [
        {
          "containerPort": 3000,
          "protocol": "tcp"
        }
      ],
      "logConfiguration": {
        "logDriver": "awsfirelens",
        "options": {
          "Name": "http",
          "Host": "victorialogs.internal",
          "Port": "9428",
          "URI": "/insert/jsonline?_stream_fields=service,env,ecs_cluster,ecs_task_definition,container_name&_msg_field=log&_time_field=date",
          "Format": "json_lines",
          "json_date_key": "date",
          "json_date_format": "iso8601",
          "Header": "Content-Type application/stream+json"
        }
      },
      "environment": [
        {
          "name": "SERVICE",
          "value": "order-api"
        },
        {
          "name": "ENV",
          "value": "prod"
        }
      ]
    }
  ]
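}

notes:
    awsfirelens sends app stdout/stderr to the Fluent Bit log router.
    The http output forwards records to VictoriaLogs; newline-delimited JSON (Format json_lines) matches /insert/jsonline.
    FireLens puts each stdout line into a "log" field and adds metadata such as ecs_cluster, ecs_task_definition, and container_name, so set _msg_field=log in the URI.
    Fields inside the app's JSON (service, env) become separate fields only if the router parses the "log" field as JSON.
    log_router diagnostics still go to CloudWatch Logs.
    For a TLS endpoint, set Port to 443 and add "tls": "On"; configure CA and auth according to what your Fluent Bit image supports.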

app log format#

{
  "level": "info",
  "msg": "order created",
  "service": "order-api",
  "env": "prod",
  "order_id": "ord_001",
  "request_id": "req_001"
}

best practice:
    log JSON to stdout
    include service/env/version/request_id
    avoid secrets / tokens / raw PII
    keep high-cardinality fields out of stream fields
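
One way to follow these practices in a Python service is a JSON formatter on a stdout handler. This is a minimal sketch, not a prescribed logger: the class and function names, and the convention of passing per-event fields via extra={"fields": {...}}, are assumptions made for illustration.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def __init__(self, service, env):
        super().__init__()
        self.static = {"service": service, "env": env}

    def format(self, record):
        entry = {
            "event_time": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname.lower(),
            "msg": record.getMessage(),
            **self.static,
        }
        # Per-event fields (request_id, order_id, ...) arrive via `extra`.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry, separators=(",", ":"))


def build_logger(service, env, stream=sys.stdout):
    """Return a logger that writes one JSON object per line to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service, env))
    logger = logging.getLogger(service)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("order-api", "prod")
    log.info("order created",
             extra={"fields": {"order_id": "ord_001", "request_id": "req_001"}})
```

Fields such as request_id stay in the log body here; they should still be kept out of the stream fields.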

verify#

aws ecs describe-tasks \
  --cluster prod-app \
  --tasks arn:aws:ecs:ap-east-1:123456789012:task/prod-app/abc

curl -G 'http://victorialogs.internal:9428/select/logsql/query' \
  --data-urlencode 'query=service:order-api env:prod'

common failures#

log_router exits:
    check /ecs/order-api/firelens CloudWatch log group
    check Fluent Bit output plugin option names

no logs in VictoriaLogs:
    ECS task cannot reach victorialogs.internal:9428
    security group / NACL / route table blocked
    wrong URI or content-type

logs arrive but query is hard:
    missing service/env/container fields
    wrong stream fields
    app logs are plain text instead of JSON

5. Option B: awslogs -> CloudWatch Logs -> Lambda -> VictoriaLogs#

when to use#

use when:
    ECS service already uses awslogs
    you want CloudWatch Logs as source of truth for a while
    you want low-risk migration to VictoriaLogs
    compliance requires logs in CloudWatch Logs

tradeoff:
    extra CloudWatch Logs ingestion/storage cost
    Lambda forwarding latency
    retry/error handling belongs to Lambda

ECS awslogs config#

{
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-group": "/ecs/order-api",
      "awslogs-region": "ap-east-1",
      "awslogs-stream-prefix": "app"
    }
  }
}

Lambda execution role policy#

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowLambdaWriteOwnLogs",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:ap-east-1:123456789012:log-group:/aws/lambda/cwlogs-to-victorialogs:*"
    }
  ]
}

if Lambda runs in VPC:
    attach the AWSLambdaVPCAccessExecutionRole managed policy
    configure subnets / security group to reach VictoriaLogs

allow CloudWatch Logs to invoke Lambda#

aws lambda add-permission \
  --function-name cwlogs-to-victorialogs \
  --statement-id AllowCloudWatchLogsInvoke \
  --action lambda:InvokeFunction \
  --principal logs.ap-east-1.amazonaws.com \
  --source-arn arn:aws:logs:ap-east-1:123456789012:log-group:/ecs/order-api:*

create subscription filter#

aws logs put-subscription-filter \
  --log-group-name /ecs/order-api \
  --filter-name to-victorialogs \
  --filter-pattern "" \
  --destination-arn arn:aws:lambda:ap-east-1:123456789012:function:cwlogs-to-victorialogs

Lambda forwarder minimal code#

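import base64
import gzip
import json
import os
import urllib.request
from datetime import datetime, timezone


# Full ingest URL, including query args such as _stream_fields.
VICTORIALOGS_URL = os.environ["VICTORIALOGS_URL"]


def lambda_handler(event, context):
    # CloudWatch Logs delivers a base64-encoded, gzip-compressed JSON document.
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))

    # CONTROL_MESSAGE payloads only probe the destination; they carry no app logs.
    if payload.get("messageType") == "CONTROL_MESSAGE":
        return {"records": 0}

    lines = []
    for log_event in payload.get("logEvents", []):
        # CloudWatch timestamps are epoch milliseconds; send them as RFC 3339
        # in _time so VictoriaLogs keeps the original event time.
        event_time = datetime.fromtimestamp(
            log_event["timestamp"] / 1000, tz=timezone.utc
        )
        item = {
            "_msg": log_event.get("message", ""),
            "_time": event_time.isoformat(),
            "log_group": payload.get("logGroup"),
            "log_stream": payload.get("logStream"),
            "owner": payload.get("owner"),
            "subscription_filters": ",".join(payload.get("subscriptionFilters", [])),
        }
        lines.append(json.dumps(item, separators=(",", ":")))

    if not lines:
        return {"records": 0}

    # One JSON object per line, newline-terminated.
    body = ("\n".join(lines) + "\n").encode("utf-8")
    req = urllib.request.Request(
        VICTORIALOGS_URL,
        data=body,
        headers={"content-type": "application/stream+json"},
        method="POST",
    )

    # urlopen raises on non-2xx responses and timeouts, so the invocation fails
    # and Lambda retries the whole batch (one source of duplicated logs).
    with urllib.request.urlopen(req, timeout=5) as resp:
        resp.read()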

    return {"records": len(lines)}
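
To exercise a forwarder like this locally, it helps to build an event in the same envelope CloudWatch Logs uses (JSON, gzip-compressed, base64-encoded under awslogs.data). The sketch below is a test aid under that assumption; the helper names, the default start timestamp, and the sample values are hypothetical.

```python
import base64
import gzip
import json


def make_awslogs_event(log_group, log_stream, messages, start_ms=1700000000000):
    """Build a fake CloudWatch Logs subscription event for local testing."""
    payload = {
        "messageType": "DATA_MESSAGE",
        "owner": "123456789012",
        "logGroup": log_group,
        "logStream": log_stream,
        "subscriptionFilters": ["to-victorialogs"],
        "logEvents": [
            {"id": str(i), "timestamp": start_ms + i, "message": message}
            for i, message in enumerate(messages)
        ],
    }
    raw = json.dumps(payload).encode("utf-8")
    data = base64.b64encode(gzip.compress(raw)).decode("ascii")
    return {"awslogs": {"data": data}}


def decode_awslogs_event(event):
    """Reverse the envelope: base64-decode, gunzip, parse JSON."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


if __name__ == "__main__":
    event = make_awslogs_event("/ecs/order-api", "app/app/abc", ["order created"])
    print(decode_awslogs_event(event)["logEvents"])
```

The generated event can be passed straight to lambda_handler(event, None) once VICTORIALOGS_URL points at a test endpoint.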

Lambda environment#

VICTORIALOGS_URL=http://victorialogs.internal:9428/insert/jsonline?_stream_fields=log_group,log_stream

verify#

aws logs describe-subscription-filters \
  --log-group-name /ecs/order-api

aws logs tail /aws/lambda/cwlogs-to-victorialogs --follow

curl -G 'http://victorialogs.internal:9428/select/logsql/query' \
  --data-urlencode 'query=log_group:"/ecs/order-api"'

common failures#

subscription filter not invoking:
    Lambda permission source ARN wrong
    region mismatch
    filter is attached to wrong log group

Lambda timeout:
    VictoriaLogs network path blocked
    endpoint DNS cannot resolve inside VPC
    batch too large / timeout too low

duplicated logs:
    Lambda retry after partial failure
    design ingestion to tolerate duplicates
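
For the "batch too large" failure, one option is to split the JSON lines into several smaller request bodies before posting. This is a minimal sketch; the function name and the byte limit are assumptions, and a suitable limit depends on the Lambda timeout and the VictoriaLogs setup.

```python
def chunk_lines(lines, max_bytes):
    """Yield newline-terminated request bodies of at most max_bytes each.

    A single line longer than max_bytes is still emitted, alone in its own body.
    """
    batch, size = [], 0
    for line in lines:
        encoded = line.encode("utf-8") + b"\n"
        if batch and size + len(encoded) > max_bytes:
            yield b"".join(batch)
            batch, size = [], 0
        batch.append(encoded)
        size += len(encoded)
    if batch:
        yield b"".join(batch)


if __name__ == "__main__":
    for body in chunk_lines(['{"_msg":"a"}', '{"_msg":"b"}', '{"_msg":"c"}'], 30):
        print(len(body), body)
```

Smaller bodies shorten each request, but a failure partway through still means the retried invocation resends the chunks that already succeeded, so duplicates remain possible.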

6. Option C: CloudWatch Agent On ECS EC2#

when to use#

use when:
    ECS launch type is EC2
    logs are written to host files
    you need to collect /var/log/messages or custom app files

do not use for:
    Fargate
    normal container stdout/stderr
    replacing FireLens for new ECS app logs

CloudWatch Agent config#

{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/ecs/ecs-agent.log",
            "log_group_name": "/ecs/container-instance/ecs-agent",
            "log_stream_name": "{instance_id}",
            "timezone": "UTC"
          },
          {
            "file_path": "/var/log/order-api/*.log",
            "log_group_name": "/ecs/order-api/file",
            "log_stream_name": "{instance_id}",
            "timezone": "UTC"
          }
        ]
      }
    }
  }
}

then:
    CloudWatch Agent -> CloudWatch Logs
    CloudWatch Logs subscription -> Lambda -> VictoriaLogs

This uses the same subscription pattern as Option B.

EC2 instance role policy#

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudWatchAgentLogs",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": "arn:aws:logs:ap-east-1:123456789012:log-group:/ecs/*"
    }
  ]
}

7. Option D: Custom Sidecar Agent#

when to use#

use when:
    app writes logs to file
    app and sidecar can share a volume
    you need agent features not available through FireLens config

avoid when:
    app already writes JSON to stdout
    FireLens can solve the problem

task volume pattern#

{
  "volumes": [
    {
      "name": "app-logs"
    }
  ],
  "containerDefinitions": [
    {
      "name": "app",
      "mountPoints": [
        {
          "sourceVolume": "app-logs",
          "containerPath": "/var/log/order-api"
        }
      ]
    },
    {
      "name": "log-agent",
      "image": "fluent/fluent-bit:latest",
      "mountPoints": [
        {
          "sourceVolume": "app-logs",
          "containerPath": "/var/log/order-api",
          "readOnly": true
        }
      ]
    }
  ]
}

sidecar risks:
    file rotation must be correct
    sidecar must not fall behind silently
    shared volume lifetime is tied to task
    multi-line logs need explicit parser
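
The multi-line risk can be illustrated with a small joiner that treats any line not starting with a timestamp as a continuation of the previous record. This only sketches the idea that a real agent's multi-line parser implements; the function name and the timestamp pattern are assumptions and must match the app's actual log format.

```python
import re

# Assumed record start: "YYYY-MM-DD HH:MM:SS" or "YYYY-MM-DDTHH:MM:SS".
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")


def join_multiline(lines, start_pattern=RECORD_START):
    """Group raw lines into records; continuation lines join the previous one."""
    record = []
    for line in lines:
        line = line.rstrip("\n")
        if start_pattern.match(line) and record:
            yield "\n".join(record)
            record = []
        record.append(line)
    if record:
        yield "\n".join(record)


if __name__ == "__main__":
    raw = [
        "2026-06-02 10:00:00 ERROR request failed",
        "Traceback (most recent call last):",
        '  File "app.py", line 1, in handler',
        "2026-06-02 10:00:01 INFO recovered",
    ]
    for rec in join_multiline(raw):
        print(repr(rec))
```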

8. Best Practices#

log schema:
    JSON logs
    service
    env
    version
    request_id / trace_id
    level
    msg
    event_time

do not log:
    password
    access token
    refresh token
    raw authorization header
    full credit card number / identity document number

stream fields:
    good:
        service
        env
        cluster
        container

    bad:
        request_id
        user_id
        order_id
        full_url
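
A quick way to spot bad stream-field candidates is to count distinct values per field over a sample of logs. The sketch below assumes flat JSON records; the function names and the threshold of 100 distinct values are illustrative, not a VictoriaLogs limit.

```python
from collections import defaultdict


def field_cardinality(records):
    """Return {field: number of distinct values} across the sample."""
    values = defaultdict(set)
    for record in records:
        for key, value in record.items():
            values[key].add(str(value))
    return {key: len(seen) for key, seen in values.items()}


def risky_stream_fields(records, stream_fields, max_distinct=100):
    """List proposed stream fields whose cardinality exceeds max_distinct."""
    counts = field_cardinality(records)
    return [field for field in stream_fields if counts.get(field, 0) > max_distinct]


if __name__ == "__main__":
    sample = [
        {"service": "order-api", "env": "prod", "request_id": f"req_{i}"}
        for i in range(500)
    ]
    print(risky_stream_fields(sample, ["service", "env", "request_id"]))
```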

cost:
    FireLens direct:
        avoids CloudWatch Logs app ingestion cost
        requires operating log router path

    CloudWatch bridge:
        easier migration
        duplicates storage/ingestion path
        higher AWS cost

reliability:
    log delivery is usually at-least-once
    tolerate duplicates
    monitor backlog/errors
    keep local app logging non-blocking
    avoid application crash when logging backend is down
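
Keeping app logging non-blocking can be sketched with the standard library's QueueHandler and QueueListener: the request path only enqueues, a background thread does the write, and records are dropped (and counted) when the queue is full. The class and function names and the queue size are assumptions for illustration.

```python
import logging
import logging.handlers
import queue
import sys


class DroppingQueueHandler(logging.handlers.QueueHandler):
    """QueueHandler that counts and drops records instead of blocking when full."""

    def __init__(self, log_queue):
        super().__init__(log_queue)
        self.dropped = 0

    def enqueue(self, record):
        try:
            self.queue.put_nowait(record)
        except queue.Full:
            self.dropped += 1


def build_nonblocking_logger(name, stream=sys.stdout, max_queued=10000):
    """Return (logger, listener, handler); call listener.stop() on shutdown."""
    log_queue = queue.Queue(maxsize=max_queued)
    handler = DroppingQueueHandler(log_queue)
    # The listener thread performs the actual (possibly slow) write.
    listener = logging.handlers.QueueListener(
        log_queue, logging.StreamHandler(stream)
    )
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    listener.start()
    return logger, listener, handler


if __name__ == "__main__":
    logger, listener, handler = build_nonblocking_logger("order-api")
    logger.info("order created")
    listener.stop()  # flushes queued records before exit
    print("dropped:", handler.dropped)
```

Dropping under pressure trades completeness for availability; the dropped counter is worth exporting as a metric so silent loss is visible.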

9. Monitoring#

ECS / FireLens:
    log_router container health
    log_router CloudWatch diagnostic logs
    task stopped reason
    CPU/memory of log_router

CloudWatch bridge:
    Lambda Errors
    Lambda Throttles
    Lambda Duration
    Lambda IteratorAge does not apply (it is only emitted for stream event sources)
    subscription filter delivery errors (AWS/Logs DeliveryErrors, DeliveryThrottling)

VictoriaLogs:
    ingest request rate
    ingest errors
    disk usage
    query latency
    retention

alerts:
    log_router exits
    Lambda errors > 0
    VictoriaLogs ingest errors > 0
    no logs from service for N minutes
    VictoriaLogs disk free low
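
The "no logs from service for N minutes" alert reduces to comparing each service's most recent log timestamp with the current time. A minimal sketch of that check follows; how last_seen is obtained (for example from a periodic query per service) is left out, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def silent_services(last_seen, now, max_silence):
    """Return sorted names of services whose last log is older than max_silence."""
    return sorted(
        service
        for service, timestamp in last_seen.items()
        if now - timestamp > max_silence
    )


if __name__ == "__main__":
    now = datetime(2026, 6, 2, 12, 0, tzinfo=timezone.utc)
    last_seen = {
        "order-api": now - timedelta(minutes=2),
        "billing-api": now - timedelta(minutes=30),
    }
    print(silent_services(last_seen, now, timedelta(minutes=10)))
```

Services that log rarely by design need a larger N, otherwise this alert becomes noise.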

10. Production Checklist#

architecture:
    chosen path documented
    FireLens direct preferred for new ECS app logs
    CloudWatch bridge used only when CloudWatch retention/migration is needed
    CloudWatch Agent used only for ECS EC2 host/file logs

security:
    VictoriaLogs endpoint private or protected by TLS/auth
    task execution role least privilege
    Lambda role least privilege
    sensitive fields redacted at app or agent

operations:
    verify command documented
    log_router diagnostics enabled
    Lambda forwarder alarms enabled
    VictoriaLogs ingest/disk alarms enabled
    replay strategy exists for CloudWatch bridge

schema:
    JSON log format standardized
    service/env/container fields included
    high-cardinality fields not used as stream fields