August 22, 2018 Marie H.

Aurora Serverless: Auto-Scaling RDS Without the Headache

Provisioning RDS has always involved a judgment call: pick the instance size you think you'll need, add some headroom, and hope your traffic patterns don't surprise you. Aurora Serverless changes that model. The database scales its compute capacity automatically based on actual load, and it can pause completely when idle. For certain workloads, it's genuinely compelling. For others, it'll frustrate you. Here's the honest version.

What Aurora Serverless v1 actually does

Aurora Serverless runs the same Aurora MySQL engine you're used to (MySQL 5.6-compatible is the only option so far; PostgreSQL compatibility hasn't shipped yet), but instead of being backed by a fixed DB instance class, it's backed by a warm pool of capacity that Aurora manages for you. Capacity is measured in Aurora Capacity Units (ACUs). One ACU is approximately 2 GB of memory plus corresponding CPU and networking. You set a minimum and maximum ACU range, and Aurora scales within that range based on connections and load.

The two behaviors that define the product:

Scaling up: When Aurora detects sustained load, it scales to a higher ACU count within your defined maximum. Scale-up events take around 30-60 seconds. During a scaling event, Aurora waits for a gap in active transactions before applying the change — if your database is constantly in a transaction, scaling will be delayed until there's a quiet moment. This is usually fine but can be surprising under write-heavy load.

Pause and resume: If you configure it, Aurora Serverless will pause the database after a configurable period of inactivity (minimum 5 minutes). While paused, you pay only for storage, not compute. When a connection comes in, Aurora resumes — but resuming takes 20-30 seconds. Whoever's request triggered the resume gets a connection timeout or error unless your application handles it. More on this below.

Use cases where it actually fits

Variable or unpredictable workloads are the obvious fit. If your application has heavy usage during business hours and near-zero traffic overnight, you pay for capacity when you need it and not when you don't. With a provisioned db.r4.large, you pay the same rate at 3am as you do during your noon traffic peak.

Development and staging environments are probably the best use case. Dev databases often sit idle for 8-16 hours a day. With Aurora Serverless and auto-pause enabled, you pay essentially nothing during those idle hours. For a team running 3-4 dev databases, the savings add up fast. The cold start on resume is annoying, but it's a dev environment — nobody's SLA is affected.

Internal tools and low-traffic applications where occasional slowness on first connection is acceptable.

Where it doesn't fit: latency-sensitive production workloads where you can't tolerate a 25-second resume delay, applications that maintain persistent connection pools (more on this), and high-throughput write workloads where scaling event timing matters.

Provisioning via CLI

Creating an Aurora Serverless cluster:

$ aws rds create-db-cluster \
  --db-cluster-identifier my-serverless-db \
  --engine aurora \
  --engine-version 5.6.10a \
  --engine-mode serverless \
  --scaling-configuration \
    MinCapacity=1,MaxCapacity=16,AutoPause=true,SecondsUntilAutoPause=300 \
  --master-username admin \
  --master-user-password "$DB_PASSWORD" \
  --db-subnet-group-name my-db-subnet-group \
  --vpc-security-group-ids sg-abc123

The SecondsUntilAutoPause=300 means it'll pause after 5 minutes of no connections. MinCapacity=1 is the lowest ACU setting — Aurora will scale down to 1 ACU when load is minimal. MaxCapacity=16 caps your burst capacity and your cost ceiling.

You don't create a DB instance separately with the serverless engine mode — there are no instances to configure. The cluster IS the thing.
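If you manage infrastructure from code rather than the CLI, the same scaling settings map onto boto3's modify_db_cluster call. A minimal sketch (the helper names are mine, and boto3 is assumed to be installed and configured with credentials):

```python
def scaling_config(min_acu, max_acu, pause_after_seconds=None):
    """Build the ScalingConfiguration dict that modify_db_cluster expects.
    Auto-pause is enabled only when pause_after_seconds is given."""
    config = {"MinCapacity": min_acu, "MaxCapacity": max_acu}
    if pause_after_seconds is not None:
        config["AutoPause"] = True
        config["SecondsUntilAutoPause"] = pause_after_seconds
    else:
        config["AutoPause"] = False
    return config


def apply_scaling(cluster_id, min_acu, max_acu, pause_after_seconds=None):
    # Local import so scaling_config stays usable (and testable) without AWS creds.
    import boto3

    rds = boto3.client("rds")
    return rds.modify_db_cluster(
        DBClusterIdentifier=cluster_id,
        ScalingConfiguration=scaling_config(min_acu, max_acu, pause_after_seconds),
    )
```

This is the programmatic twin of the `--scaling-configuration` flag above; the same key names apply in both places.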

The Data API

Aurora Serverless supports the Data API, which is genuinely interesting: instead of connecting to the database with a traditional driver and maintaining a persistent TCP connection, you make HTTPS calls to an AWS API endpoint. The API handles connection management internally.

$ aws rds-data execute-statement \
  --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:my-serverless-db" \
  --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:db-creds" \
  --database "myapp" \
  --sql "SELECT id, name FROM users WHERE active = :active" \
  --parameters '[{"name":"active","value":{"booleanValue":true}}]'

This is useful for Lambda functions. Traditional Lambda-to-RDS connections are a mess: Lambda can scale to hundreds of concurrent executions, each wanting its own database connection, and RDS has a connection limit. You either use RDS Proxy (which adds complexity and cost) or you do connection management gymnastics in your Lambda code. With the Data API, you just make HTTP calls and AWS handles the pooling on the database side.
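From Python (say, inside a Lambda handler), the same query goes through boto3's rds-data client. The `to_data_api_params` helper below is mine, not part of any SDK — the Data API wants explicitly typed parameter values, which is the fiddliest part:

```python
def to_data_api_params(params):
    """Convert a plain dict like {"active": True} into the Data API's
    typed parameter list: [{"name": "active", "value": {"booleanValue": True}}]."""
    def field(v):
        if isinstance(v, bool):  # must check bool before int: True is an int too
            return {"booleanValue": v}
        if isinstance(v, int):
            return {"longValue": v}
        if isinstance(v, float):
            return {"doubleValue": v}
        if v is None:
            return {"isNull": True}
        return {"stringValue": str(v)}

    return [{"name": k, "value": field(v)} for k, v in params.items()]


def query(cluster_arn, secret_arn, database, sql, params):
    # Local import so the helper above works without AWS credentials present.
    import boto3

    client = boto3.client("rds-data")
    return client.execute_statement(
        resourceArn=cluster_arn,
        secretArn=secret_arn,
        database=database,
        sql=sql,
        parameters=to_data_api_params(params),
    )
```

Note there's no connection object anywhere — each call is an independent HTTPS request, which is exactly why this plays well with Lambda's concurrency model.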

Enable it on an existing cluster:

$ aws rds modify-db-cluster \
  --db-cluster-identifier my-serverless-db \
  --enable-http-endpoint

The trade-off: the Data API has higher per-query latency than a direct connection and a maximum result set size of 1MB. For most Lambda use cases that's fine. For bulk data work or complex multi-statement transactions, you'll want a direct connection.

Connection pooling considerations

For traditional applications connecting directly to Aurora Serverless (not via Data API), connection pooling works the same as with provisioned Aurora — you set a pool size in your application, and connections are reused. The gotcha is with auto-pause: when Aurora resumes from a paused state, existing connections in your pool are dead. Your pooler needs to handle connection drops and reconnect, which most modern connection pool libraries do with reconnect_on_failure or equivalent settings. Just make sure you have that configured.
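If your pool library doesn't handle this for you, a small retry wrapper around the initial connect is enough to ride out a resume. A sketch — `connect` stands in for whatever your driver's connect call is, and the timings are my assumptions tuned to the 20-30 second resume window:

```python
import time


def with_resume_retry(connect, attempts=4, base_delay=5.0):
    """Call connect(), retrying with linear backoff so a request that
    lands on a paused cluster survives until the resume completes."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return connect()
        except Exception as exc:  # in practice, your driver's OperationalError
            last_exc = exc
            if attempt + 1 < attempts:
                time.sleep(base_delay * (attempt + 1))
    raise last_exc
```

With the defaults, the waits between attempts sum to 30 seconds (5 + 10 + 15) before giving up, which covers a typical resume.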

Also: the connection limit in serverless mode tracks the ACU count. At 1 ACU (2 GB of memory) you have roughly 90 max connections for MySQL; at 8 ACUs the limit is correspondingly higher. If your application fleet spins up before Aurora has scaled to absorb the connection burst, you'll briefly see "too many connections" errors. Factor this into your startup sequencing.
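A back-of-envelope check for that startup sequencing problem, using the ~90-connections-at-1-ACU figure above and assuming the limit scales roughly linearly with ACUs (an assumption for planning purposes only — Aurora derives the real limit from instance memory):

```python
def estimated_max_connections(acus, per_acu_at_1=90):
    """Rough estimate: scale the ~90-connection figure at 1 ACU linearly.
    Treat as a planning number, not a guarantee."""
    return per_acu_at_1 * acus


def pool_size_fits(app_instances, pool_per_instance, current_acus):
    """Will the whole fleet's pools fit at the current capacity?
    The 20% headroom for admin/monitoring connections is my assumption."""
    demanded = app_instances * pool_per_instance
    return demanded <= 0.8 * estimated_max_connections(current_acus)
```

For example, ten app instances with 10-connection pools won't fit at 1 ACU, so either stagger your deploy or raise MinCapacity before the fleet comes up.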

Monitoring scaling events

Aurora Serverless doesn't expose instance metrics the way provisioned Aurora does (no CPU utilization for the underlying instance). Instead, watch these CloudWatch metrics:

  • ServerlessDatabaseCapacity — current ACU count. Compare it against your configured maximum: if it sits at MaxCapacity for long stretches, your ceiling is too low.
  • DatabaseConnections — connections to the cluster.

Set an alarm on ServerlessDatabaseCapacity hitting MaxCapacity for more than a few minutes. That's your signal that you've picked too low a ceiling and Aurora can't scale further.

$ aws cloudwatch put-metric-alarm \
  --alarm-name aurora-serverless-at-max-capacity \
  --metric-name ServerlessDatabaseCapacity \
  --namespace AWS/RDS \
  --statistic Maximum \
  --period 300 \
  --threshold 16 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --evaluation-periods 3 \
  --dimensions Name=DBClusterIdentifier,Value=my-serverless-db \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts

Cost comparison with provisioned Aurora

With provisioned Aurora you pay a fixed hourly rate for the DB instance plus storage. With Aurora Serverless you pay per ACU-hour (roughly $0.06/ACU-hour for MySQL) plus storage, and nothing for compute when paused.

For a database that runs at 4 ACUs during peak and auto-pauses 12 hours a day, the math is very different than for a db.r4.large ($0.29/hour) running 24/7. The crossover point depends on your actual utilization pattern. If your database is busy more than about 60-70% of the time and at reasonably consistent load, provisioned Aurora with Reserved pricing will likely be cheaper. If your load is spiky and you have large idle windows, serverless wins.
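To make that crossover concrete, here's the arithmetic from the example as a quick script. Prices are the us-east-1 figures quoted above; storage and I/O are omitted since they're billed the same either way:

```python
HOURS_PER_MONTH = 730  # AWS's usual monthly-hours convention


def serverless_compute_monthly(avg_acus_when_active, active_hours_per_day,
                               acu_hour_price=0.06):
    """Compute-only monthly cost; while paused, compute is free."""
    active_hours_per_month = active_hours_per_day * (HOURS_PER_MONTH / 24)
    return avg_acus_when_active * active_hours_per_month * acu_hour_price


def provisioned_compute_monthly(instance_hour_price=0.29):
    """A db.r4.large billed around the clock."""
    return instance_hour_price * HOURS_PER_MONTH


# 4 ACUs for a 12-hour daily active window, paused the rest:
# serverless ~= $87.60/month vs provisioned ~= $211.70/month
```

Even running 24 hours a day at 4 ACUs (about $175/month) undercuts the on-demand db.r4.large here, but Reserved pricing and steadier load shift the balance back toward provisioned, per the utilization rule of thumb above.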

The unpredictability of serverless pricing is worth noting: with provisioned you know your monthly bill. With serverless, a traffic spike can run up ACU-hours you didn't anticipate. Set MaxCapacity conservatively and increase it deliberately.

Honest assessment

Aurora Serverless is a good product for the right workload. Dev environments, internal tools, and applications with genuinely variable load profiles are solid fits. The auto-pause feature is the killer feature for dev databases — I've cut dev database costs significantly just by moving everything that can tolerate cold starts over to serverless.

For latency-sensitive production workloads as of August 2018, I'm not there yet. The 20-30 second resume delay, the scaling event timing behavior under write load, and the limited visibility into what the database is doing under the hood give me enough pause to stick with provisioned Aurora for anything where a brief degradation has real consequences. Check back in a year — this product is clearly still maturing and the gaps will close.