Automating DB Snapshots at Amazon RDS

Amazon RDS is a relational database service by Amazon Web Services. RDS offers an easy to administer environment and supports many known relational database engines such as PostgreSQL, MySQL and MariaDB.

The database is a crucial part for a company and it is very important to maintain the integrity of the data.
For this reason, you should consider creating a good backup strategy from day one.

RDS provides two backup methods: Automated backups and user-initiated DB snapshots. Automated backups are initiated during the creation of an RDS instance. You set the backup window and the retention period for the backups and you're ready.

Although automated backups seem attractive because they are easy to manage, they have some constraints.

  1. Retention: There is a 35-day retention period limit. After exceeding that limit, the snapshot is then deleted.

  2. Deleted database: If you accidentally delete a database for any reason, automated backups are going to be removed too.

  3. Disaster Recovery: Automated backups can only be restored from within the same region.
    If you have a DR strategy, you might want to move the snapshots between multiple regions.

These 3 constraints can be resolved with the DB snapshots.

  1. DB snapshots can be retained for as long as you wish.
  2. DB snapshots are not removed if you accidentally delete the database.
  3. DB snapshots can be moved from one region to another without any constraint.

Amazon Web Services gives us three ways to take manual DB Snapshots: the management console,
the awscli command line utility and the various Amazon SDKs.

Lets see how we can capture & delete DB snapshots from Amazon RDS with boto3, the Python SDK.
The benefit of using the SDK is that we can write extensible tools that fit our needs.

Source code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""dbsnap is a DB snapshot management tool for Amazon RDS.
Demo tool used for educational purposes in http://blog.codebender.cc/2015/12/08/automating-db-snapshots-at-amazon-rds/
"""

from __future__ import print_function

import boto3
import click
import datetime
import time
import sys

__version__ = '0.1.0'


class DBSnapshot(object):
    """DBSnapshot"""

    def __init__(self):
        self.client = boto3.client('rds')

    def create(self, db_instance, timestamp):
        """Creates a new DB snapshot"""
        snapshot = "{0}-{1}-{2}".format("mysnapshot", db_instance, timestamp)
        self.client.create_db_snapshot(DBSnapshotIdentifier=snapshot, DBInstanceIdentifier=db_instance)
        time.sleep(2)  # wait 2 seconds before status request
        current_status = None
        while True:
            current_status = self.__status(snapshot=snapshot)
            if current_status == 'available' or current_status == 'failed':
                break
        return current_status

    def delete(self, snapshot):
        """Deletes a user-specified DB snapshot"""
        try:
            current_status = self.__status(snapshot=snapshot)
            if current_status == 'available':
                self.client.delete_db_snapshot(DBSnapshotIdentifier=snapshot)
                current_status = self.__status(snapshot=snapshot)
        except:
            current_status = 'does not exist'
        return current_status

    def list_instances(self):
        """Lists the available RDS instances"""
        return self.client.describe_db_instances()['DBInstances']

    def __status(self, snapshot):
        """Returns the current status of the DB snapshot"""
        return self.client.describe_db_snapshots(DBSnapshotIdentifier=snapshot)['DBSnapshots'][0]['Status']


@click.group()
@click.version_option(version=__version__)
def cli():
    """dbsnap is a DB snapshot management tool for Amazon RDS."""
    pass


@cli.command()
def instances():
    """Returns the available RDS instances"""
    dbcon = DBSnapshot()
    db_instances = dbcon.list_instances()
    click.echo("Database Instances:")
    for instance in db_instances:
        print("\t- {0}".format(instance['DBInstanceIdentifier']))


@cli.command()
@click.option('--db-instance', help='Database instance')
def create(db_instance):
    """Creates a new DB snapshot"""
    if not db_instance:
        click.echo("Please specify a database using --db-instance option", err=True)
        return sys.exit(1)
    dbcon = DBSnapshot()
    date = datetime.datetime.now()
    timestamp = date.strftime("%Y-%m-%d")
    click.echo("Creating a new snapshot from {0} instance...".format(db_instance))
    response = dbcon.create(db_instance=db_instance, timestamp=timestamp)
    click.echo("Snapshot status: {0}".format(response))


@cli.command()
@click.option('--db-snapshot', help='Database snapshot')
def delete(db_snapshot):
    """Deletes a user-specified DB snapshot"""
    if not db_snapshot:
        click.echo("Please specify a database using --db-snapshot option", err=True)
        return sys.exit(1)
    dbcon = DBSnapshot()
    response = dbcon.delete(snapshot=db_snapshot)
    if response == 'does not exist':
        output = "Snapshot: {0} has been deleted".format(db_snapshot)
    else:
        output = "Snapshot: {0} deletion failed".format(db_snapshot)
    click.echo(output)

Our utility comprises a wrapper of the boto's RDS interface and uses click module which allows us to create simple command line utilities.

Lets go through the code to see some of the core calls to the RDS API.

Here's how we connect to boto's client for RDS.

boto3.client('rds')

When we run the create operation we create a basic naming convention for our snapshots.

snapshot = "{0}-{1}-{2}".format("mysnapshot", db_instance, timestamp) # mysnapshot-database-instance-1-YYYY-MM-DD

Then we call the API to create our snapshot:

create_db_snapshot(DBSnapshotIdentifier=snapshot, DBInstanceIdentifier=db_instance)

If we want to know the availability status of the snapshot we run describe_db_snapshots:

describe_db_snapshots(DBSnapshotIdentifier=snapshot)['DBSnapshots'][0]['Status']

We could also call the delete operation to remove a captured snapshot:

delete_db_snapshot(DBSnapshotIdentifier=snapshot)

If you have multiple RDS instances you might want to list them before you perform any action.
This is how we get a list with the available RDS instances:

describe_db_instances()['DBInstances']

Now that we have looked at what is going on under the hood, lets move onto some basic examples.

Get the help page with the available operations:

dbsnap.py --help

Output:

Usage: dbsnap.py [OPTIONS] COMMAND [ARGS]...

  dbsnap is a snapshot management tool for Amazon RDS.

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  create     Creates a new database snapshot
  delete     Deletes the specified snapshot
  instances  Returns the available database instances
Get a list with all your database instances:

dbsnap.py instances

Output:

Database Instances:
- database-instance-1
- database-instance-2
- database-instance-3
Create a new DB snapshot for database-instance-1:

dbsnap.py create --db-instance database-instance-1

Output:

Creating a new snapshot from database-instance-1 instance...
Snapshot status: available

If you check at your RDS dashboard now, you will see a new snapshot in the form of:
mysnapshot-database-instance-1-YYYY-MM-DD

Delete the DB snapshot that we created earlier:

dbsnap.py delete --db-snapshot mysnapshot-database-instance-1-YYYY-MM-DD

Output:

Snapshot: mysnapshot-database-instance-1-YYYY-MM-DD has been deleted

Based on the code and the examples you can extend the tool as you wish. You could add logging capabilities, run it daily to get DB snapshots for all your instances or add functionalities to delete the snapshots after a period of time.

It's worth mentioning that during the backup window, I/O freeze may occur on Single AZ (Availability Zone) database instances. You can avoid that freeze by enabling the Multi-AZ deployments which provides high availability.

To conclude, automated backups are very handy and if they are used in conjunction with automated DB snapshots, they can give a very reliable and stable backup strategy.

I hope you enjoyed the post.

Further information:
* DB Instance Backups
* Best Practices for Amazon RDS
* Pinterest recently open-sourced a collection of its in-house MySQL utilities. It is a good source to get some ideas.