Experimenting with Aurora Serverless, AWS Lambdas and AWS API Gateways

I was able to deploy the API using serverless.com on AWS. I used the HTTP API type with a Lambda proxy integration in API Gateway because it’s considerably cheaper than the REST API mode. Then I used Terraform to create a new VPC and subnets with custom ACLs, and deployed an Aurora Serverless cluster in the VPC. To allow the Lambda to talk to the database, I had to assign the Lambda to that VPC and its subnets and give it the correct security groups. There were a lot of quirks, but I finally managed to get it to work. Also, to connect to the DB locally, I had to set up a bastion host that tunnels to the DB.

Some things I noticed:

  • If you place a Lambda in a VPC, it won’t have access to the public internet. You’ll have to set up a NAT Gateway (which costs ~$40 a month) in a public subnet and place the Lambda in a private subnet to allow it to access the internet.

  • Aurora Serverless is great, but the cold starts can get pretty long (~30 seconds). You can keep it always on, but that will cost ~$80/month for Postgres and ~$40/month for MySQL (hopefully it’ll be $40 for Postgres too in the near future).

  • I like the secure-by-default option for Aurora Serverless. It disallows public access and can only be accessed from within a VPC. With other managed database services like Heroku, DigitalOcean, etc., the DB will have to be public to be accessed by Lambdas in AWS, which is not very secure. For secure DBs, you’ll likely have to stick to one cloud provider.

  • Deploying the API directly on AWS requires a little more effort but results in considerable cost savings. It’s a tradeoff between ease of deployment and cost.

  • The Data API is available for Aurora Serverless, and there is an issue to support it in Prisma: Add support for AWS Data API (AWS Aurora Serverless) · Issue #1964 · prisma/prisma · GitHub. This will be great for the connection limit and pooling issues that serverless functions run into. Until then, I guess the only option is to work around them (e.g. limit concurrent Lambda executions). The Data API will also remove the need to put the Lambda in the same VPC. You will be able to attach an IAM role to the Lambda and connect to the serverless DB without worrying about secrets, which is hopefully the ideal scenario.
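
A minimal sketch of the NAT Gateway setup from the first point, assuming the terraform-aws-modules/vpc/aws module. The private subnet CIDRs and the choice of a single shared NAT Gateway are assumptions for illustration, not a tested configuration:

```hcl
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "app-vpc"
  cidr = "10.0.0.0/16"

  azs            = ["us-east-2a", "us-east-2b"]
  public_subnets = ["10.0.101.0/24", "10.0.102.0/24"]

  # Assumed CIDRs: the Lambda would be placed in these private subnets.
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]

  # The NAT Gateway lives in a public subnet; the module routes
  # outbound traffic from the private subnets through it.
  enable_nat_gateway = true
  single_nat_gateway = true # one gateway shared by all AZs to keep the cost down
}
```

With a layout like this, the Lambda’s subnet IDs would point at the private subnets, and any network ACLs on the database subnets would need to allow the private subnet CIDRs as well.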

These are the main things I noticed. I’ll add more if something comes to mind. Very interested and happy to discuss this topic further.

thanks for this. definitely pertinent as neither Netlify nor Vercel will give you a static IP.

i have done something similar with the lambda in vpc - one challenge here is that you still have to have a mechanism for securely connecting to the DB. you can have the proxy lambda check an api key or something in the header (and i’ve done this at scale with AWS API Gateway - would not recommend), but then you need to safely store the secrets somewhere that’s not on the client.
the best i can figure out is a pattern where a netlify function stores the secrets, but it feels super clunky.

very interesting to think about Aurora. aside from the challenges you mention, it seems great.
can you post your Terraform code or more details here for interested folks?

another thing - I am looking this weekend at using Vercel integrations with Google Cloud to try and fill this gap. it seems promising. see: Integrations – Dashboard – Vercel
of course i barely know GCP but i’d take it to avoid the troubles you mention.

would also love your thoughts on my other post i am now going to shamelessly plug:

If your DB is in a VPC and you put the Lambda in the same VPC, you can easily connect to the DB using IAM authentication for normal RDS, or just a simple DB URL for Aurora Serverless. IAM authentication would be ideal since you don’t have to manage secrets. And yeah, I can post the Terraform code here.
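
For the IAM authentication route on a regular RDS instance, the Lambda’s role needs permission for the rds-db:connect action on the database user. A rough sketch, where account_id, db_resource_id and db_user are hypothetical placeholder variables (declared inline only to keep the snippet self-contained), and the instance itself is assumed to have iam_database_authentication_enabled = true:

```hcl
# Hypothetical inputs, declared here only so the snippet stands alone.
variable "region" { type = string }
variable "account_id" { type = string }
variable "db_resource_id" { type = string } # the instance's DbiResourceId
variable "db_user" { type = string }        # database user mapped to IAM auth

data "aws_iam_policy_document" "rds_connect" {
  statement {
    effect  = "Allow"
    actions = ["rds-db:connect"]
    resources = [
      "arn:aws:rds-db:${var.region}:${var.account_id}:dbuser:${var.db_resource_id}/${var.db_user}",
    ]
  }
}

# Assumed policy name; attach this policy to the Lambda's execution role.
resource "aws_iam_policy" "rds_connect" {
  name   = "lambda-rds-iam-connect"
  policy = data.aws_iam_policy_document.rds_connect.json
}
```

With the policy attached, the function requests a short-lived auth token at connect time instead of reading a stored password.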

Here’s my main.tf. Note that there are also a variables.tf and an outputs.tf, plus an optional backend.tf depending on whether you are using a remote backend. I used Terraform Cloud for my experiments.

provider "aws" {
  profile = "default"
  region  = var.region
}

data "aws_security_group" "default" {
  name   = "default"
  vpc_id = module.vpc.vpc_id
}

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "app-vpc"

  cidr = "10.0.0.0/16"

  azs              = ["us-east-2a", "us-east-2b", "us-east-2c"]
  public_subnets   = ["10.0.101.0/24", "10.0.102.0/24"]
  database_subnets = ["10.0.21.0/24", "10.0.22.0/24"]

  public_dedicated_network_acl = true
  public_inbound_acl_rules = concat(
    local.network_acls["private_inbound"],
    local.network_acls["public_inbound"],
  )
  public_outbound_acl_rules = concat(
    local.network_acls["private_outbound"],
    local.network_acls["public_outbound"],
  )

  database_dedicated_network_acl = true
  database_inbound_acl_rules     = local.network_acls["database_inbound"]
  database_outbound_acl_rules    = local.network_acls["database_outbound"]

  enable_ipv6          = false
  enable_nat_gateway   = false
  enable_dns_hostnames = true

  tags = {
    Owner       = "user"
    Environment = "dev"
  }

  vpc_tags = {
    Name = "app-vpc"
  }

  public_subnet_tags = {
    Type = "public"
  }
  public_acl_tags = {
    Type = "public"
  }

  private_subnet_tags = {
    Type = "private"
  }
  private_acl_tags = {
    Type = "private"
  }

  database_subnet_tags = {
    Type = "database"
  }
  database_acl_tags = {
    Type = "database"
  }
}

resource "aws_security_group" "home_ssh_access" {
  name        = "allow_ssh"
  description = "Allow ssh access from home network"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description = "Inbound ssh"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.home_cidr_block]
  }

  tags = {
    Name = "allow_ssh"
  }
}

resource "aws_security_group" "lambda_db_access" {
  name        = "allow_db_for_lambda"
  description = "Allow db access for lambda"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description = "Inbound ephemeral"
    from_port   = 1024
    to_port     = 65535
    protocol    = "tcp"
    cidr_blocks = ["10.0.21.0/24", "10.0.22.0/24"]
  }

  egress {
    description = "Outbound db"
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = ["10.0.21.0/24", "10.0.22.0/24"]
  }

  tags = {
    Name = "allow_db_for_lambda"
  }
}

module "db" {
  source  = "terraform-aws-modules/rds-aurora/aws"
  version = "~> 2.0"

  name                  = "aurora-serverless"
  engine                = "aurora-postgresql"
  engine_version        = "10.7"
  engine_mode           = "serverless"
  replica_scale_enabled = false
  replica_count         = 0

  backtrack_window = 10 # ignored in serverless

  subnets                         = module.vpc.database_subnets
  vpc_id                          = module.vpc.vpc_id
  monitoring_interval             = 60
  instance_type                   = "db.t3.medium"
  apply_immediately               = true
  skip_final_snapshot             = true
  storage_encrypted               = true
  db_parameter_group_name         = aws_db_parameter_group.aurora_db_postgres10_parameter_group.id
  db_cluster_parameter_group_name = aws_rds_cluster_parameter_group.aurora_cluster_postgres10_parameter_group.id

  scaling_configuration = {
    auto_pause               = true
    max_capacity             = 2
    min_capacity             = 2
    seconds_until_auto_pause = 300
    timeout_action           = "ForceApplyCapacityChange"
  }
}

resource "aws_db_parameter_group" "aurora_db_postgres10_parameter_group" {
  name        = "postgres10-parameter-group"
  family      = "aurora-postgresql10"
  description = "Postgres 10 parameter group"
}

resource "aws_rds_cluster_parameter_group" "aurora_cluster_postgres10_parameter_group" {
  name        = "postgres10-cluster-parameter-group"
  family      = "aurora-postgresql10"
  description = "Postgres 10 cluster parameter group"
}

resource "aws_security_group" "lambdas" {
  name        = "lambdas"
  description = "Security group for lambdas accessing the DB"
  vpc_id      = module.vpc.vpc_id
  tags = {
    Name = "allow_db_access"
  }
}

resource "aws_security_group_rule" "allow_access" {
  type                     = "ingress"
  from_port                = module.db.this_rds_cluster_port
  to_port                  = module.db.this_rds_cluster_port
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.lambdas.id
  security_group_id        = module.db.this_security_group_id
}

resource "aws_security_group_rule" "allow_access_outbound" {
  type                     = "egress"
  from_port                = 1024
  to_port                  = 65535
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.lambdas.id
  security_group_id        = module.db.this_security_group_id
}

locals {
  network_acls = {
    database_inbound = [
      {
        rule_number = 101
        rule_action = "allow"
        from_port   = 5432
        to_port     = 5432
        protocol    = "tcp"
        cidr_block  = "10.0.101.0/24"
      },
      {
        rule_number = 102
        rule_action = "allow"
        from_port   = 5432
        to_port     = 5432
        protocol    = "tcp"
        cidr_block  = "10.0.102.0/24"
      },
    ]
    database_outbound = [
      {
        rule_number = 201
        rule_action = "allow"
        from_port   = 1024
        to_port     = 65535
        protocol    = "tcp"
        cidr_block  = "10.0.101.0/24"
      },
      {
        rule_number = 202
        rule_action = "allow"
        from_port   = 1024
        to_port     = 65535
        protocol    = "tcp"
        cidr_block  = "10.0.102.0/24"
      },
    ]
    private_inbound = [
      {
        rule_number = 201
        rule_action = "allow"
        from_port   = 1024
        to_port     = 65535
        protocol    = "tcp"
        cidr_block  = "10.0.21.0/24"
      },
      {
        rule_number = 202
        rule_action = "allow"
        from_port   = 1024
        to_port     = 65535
        protocol    = "tcp"
        cidr_block  = "10.0.22.0/24"
      }
    ]
    private_outbound = [
      {
        rule_number = 101
        rule_action = "allow"
        from_port   = 5432
        to_port     = 5432
        protocol    = "tcp"
        cidr_block  = "10.0.21.0/24"
      },
      {
        rule_number = 102
        rule_action = "allow"
        from_port   = 5432
        to_port     = 5432
        protocol    = "tcp"
        cidr_block  = "10.0.22.0/24"
      }
    ]
    public_inbound = [
      {
        rule_number = 100
        rule_action = "allow"
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_block  = "0.0.0.0/0"
      },
      {
        rule_number = 110
        rule_action = "allow"
        from_port   = 443
        to_port     = 443
        protocol    = "tcp"
        cidr_block  = "0.0.0.0/0"
      },
      {
        rule_number = 120
        rule_action = "allow"
        from_port   = 22
        to_port     = 22
        protocol    = "tcp"
        cidr_block  = var.home_cidr_block
      }
    ]
    public_outbound = [
      {
        rule_number = 900
        rule_action = "allow"
        from_port   = 1024
        to_port     = 65535
        protocol    = "tcp"
        cidr_block  = "0.0.0.0/0"
      },
      {
        rule_number = 100
        rule_action = "allow"
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_block  = "0.0.0.0/0"
      },
      {
        rule_number = 110
        rule_action = "allow"
        from_port   = 443
        to_port     = 443
        protocol    = "tcp"
        cidr_block  = "0.0.0.0/0"
      }
    ]
  }
}
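
The bastion itself isn’t in the file above. A minimal sketch of what it could look like, assuming an EC2 instance in one of the public subnets. Here bastion_ami and bastion_key_name are hypothetical variables, and attaching the lambdas and lambda_db_access groups to the instance is an assumption about how it gets a path to the DB:

```hcl
variable "bastion_ami" { type = string }      # hypothetical: a Linux AMI ID for your region
variable "bastion_key_name" { type = string } # hypothetical: name of an existing EC2 key pair

resource "aws_instance" "bastion" {
  ami                         = var.bastion_ami
  instance_type               = "t3.micro"
  subnet_id                   = module.vpc.public_subnets[0]
  key_name                    = var.bastion_key_name
  associate_public_ip_address = true

  vpc_security_group_ids = [
    aws_security_group.home_ssh_access.id,  # SSH in from the home network
    aws_security_group.lambdas.id,          # source group the DB accepts traffic from
    aws_security_group.lambda_db_access.id, # egress to the DB subnets on 5432
  ]

  tags = {
    Name = "bastion"
  }
}

# Tunnel from a local machine (placeholders in angle brackets, user name
# depends on the AMI):
#   ssh -N -L 5432:<cluster-endpoint>:5432 ec2-user@<bastion-public-ip>
```

With the tunnel open, a local client can connect to localhost:5432 as if it were the cluster.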

Here’s my serverless.yml for the Lambdas:

service: app
org: your-org
app: your-app
plugins:
  - serverless-dotenv-plugin

custom:
  dotenv:
    include:
      - FIREBASE_PROJECT_ID

provider:
  name: aws
  runtime: nodejs12.x
  region: us-east-2
  httpApi:
    cors: true
    payload: '1.0'
  stackTags:
    source: serverless
    name: Lambda API with API Gateway
  tags:
    name: Lambda API with API Gateway

package:
  individually: true

functions:
  graphql:
    description: redwood on aws lambda
    package:
      artifact: zipball/graphql.zip
    memorySize: 1024
    timeout: 25
    tags:
      endpoint: graphql
    environment:
      DATABASE_URL: ${param:database_url}
      FIREBASE_PROJECT_ID: ${env:FIREBASE_PROJECT_ID}
    handler: graphql.handler
    vpc: # VPC IDs from Terraform
      securityGroupIds:
        - securityGroupId1
        - securityGroupId2
      subnetIds:
        - subnetId1
        - subnetId2
    events:
      - httpApi:
          path: /graphql
          method: GET
      - httpApi:
          path: /graphql
          method: POST
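
The placeholder IDs in the vpc block come from Terraform. One way to surface them is an outputs.tf along these lines. This is a sketch rather than the actual file: it assumes the Lambda runs in the public subnets (the ones the database ACLs allow) and uses both Lambda security groups defined in main.tf:

```hcl
output "lambda_subnet_ids" {
  description = "Subnet IDs for vpc.subnetIds in serverless.yml"
  value       = module.vpc.public_subnets
}

output "lambda_security_group_ids" {
  description = "Security group IDs for vpc.securityGroupIds in serverless.yml"
  value = [
    aws_security_group.lambdas.id,
    aws_security_group.lambda_db_access.id,
  ]
}

output "db_endpoint" {
  description = "Cluster endpoint to use as the host in DATABASE_URL"
  value       = module.db.this_rds_cluster_endpoint
}
```

Running terraform output after an apply prints the values to copy into serverless.yml.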

Yeah, integrations seem cool. I’ll have to check them out. Also, thanks for the link to the post, I’ll take a look.