Database Replication: AWS Database Migration Service

Amazon DMS

AWS DMS is an easy to use, cost-effective database replication tool

In March 2016, Amazon Web Services made Database Migration Service (DMS) generally available. DMS is database replication software that makes it easier to migrate data from a source database to a target destination within AWS, such as a data warehouse or data lake.

What is AWS Database Migration Service?

DMS supports migrations of data from multiple sources via a few different replication methods. Examples include like-for-like migrations such as Oracle to Oracle, and cross-engine migrations such as Oracle or Microsoft SQL Server to Amazon Aurora or Redshift. DMS also supports MySQL and Postgres database replication, among others.

In addition to replicating data from a database, AWS DMS allows you to continuously replicate your data with high availability and consolidate databases into cloud warehouses like Amazon Redshift or object storage like Amazon S3.

AWS publishes in-depth details on the supported source and target databases on its site.

Types of Database Replication: One time, On-going

Typically, you will either be doing a one-time migration or continuous data replication. In the case of a unique one-time replication, you may undertake this process to do a seed replication to a new system for testing or production.

In the case of continuous replication, you may run this process on a schedule, such as a nightly job, or undertake near real-time replication. Ongoing near real-time replication does have specific configuration requirements for read/write access to the source system.

If you only have read access to the source database, this will require an alternate replication pattern. Most real-time processes will require write operations to the source for replicating updates.

Another benefit of DMS is the opportunity to automate processing according to your specific requirements beyond the AWS user interface. For example, you can accomplish automation with AWS CLI, CloudFormation, or a third-party solution like Terraform.
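As a rough illustration of that kind of automation, the Python sketch below assembles the parameters for a DMS replication task. The key names follow the boto3 `create_replication_task` call and the DMS table-mapping JSON format. The task name, ARNs, and schema name are placeholders for illustration, not values from a real account, and the helper function names are hypothetical.

```python
import json


def build_table_mappings(schema, table="%"):
    """Return a DMS table-mapping document (as a JSON string) that
    includes every table matching the schema/table pattern."""
    rules = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-tables",
                "object-locator": {"schema-name": schema, "table-name": table},
                "rule-action": "include",
            }
        ]
    }
    return json.dumps(rules)


def build_task_request(name, source_arn, target_arn, instance_arn, schema,
                       continuous=False):
    """Build keyword arguments for create_replication_task.

    A one-time migration maps to "full-load"; continuous replication
    maps to "full-load-and-cdc" (initial load plus ongoing changes).
    """
    return {
        "ReplicationTaskIdentifier": name,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": "full-load-and-cdc" if continuous else "full-load",
        "TableMappings": build_table_mappings(schema),
    }


def create_task(request):
    """Submit the request to AWS. Requires boto3 and AWS credentials."""
    import boto3  # third-party dependency, imported only when needed

    return boto3.client("dms").create_replication_task(**request)


# Placeholder ARNs: replace with the ARNs from your own account.
request = build_task_request(
    name="nightly-orders-load",
    source_arn="arn:aws:dms:us-east-1:123456789012:endpoint:SOURCE",
    target_arn="arn:aws:dms:us-east-1:123456789012:endpoint:TARGET",
    instance_arn="arn:aws:dms:us-east-1:123456789012:rep:INSTANCE",
    schema="sales",
)
print(request["MigrationType"])
```

Keeping the request as a plain dictionary makes it easy to review or version-control before anything is sent to AWS. The same structure could be expressed in CloudFormation or Terraform instead.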

DMS and Data Lake Landing Zone

One of the more intriguing and less obvious use cases of DMS is using S3 as a target destination. Using the DMS S3 target destination creates a cost-effective and high-quality data lake landing zone for exported tables from a source system.

From your DMS source system landing zone, you can create scalable, zero-administration data pipelines to serverless query engines like Amazon Athena, or use a hybrid data lake and data warehouse model with tools like Redshift Spectrum.
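To make the landing zone idea more concrete, here is a minimal Python sketch of the settings for an S3 target endpoint. The key names follow the boto3 `create_endpoint` call and its `S3Settings` structure. The bucket, folder, role ARN, and endpoint identifier are placeholders, and both helper functions are hypothetical. The folder layout returned by `table_prefix` reflects how DMS is documented to organize full-load output (folder, then schema, then table), so verify it against your own bucket before pointing a query engine at it.

```python
def build_s3_target_endpoint(endpoint_id, bucket, folder, role_arn,
                             data_format="parquet"):
    """Build keyword arguments for a DMS create_endpoint call that
    targets S3. Only "csv" and "parquet" are accepted here."""
    if data_format not in ("csv", "parquet"):
        raise ValueError("data_format must be 'csv' or 'parquet'")
    return {
        "EndpointIdentifier": endpoint_id,
        "EndpointType": "target",
        "EngineName": "s3",
        "S3Settings": {
            "BucketName": bucket,
            "BucketFolder": folder,
            "ServiceAccessRoleArn": role_arn,
            "DataFormat": data_format,
        },
    }


def table_prefix(bucket, folder, schema, table):
    """Return the S3 prefix where files for one table are expected to
    land, usable as the location of an external table."""
    return f"s3://{bucket}/{folder}/{schema}/{table}/"


# Placeholder values for illustration only.
endpoint = build_s3_target_endpoint(
    endpoint_id="landing-zone",
    bucket="example-data-lake",
    folder="dms",
    role_arn="arn:aws:iam::123456789012:role/example-dms-s3-role",
)
print(table_prefix("example-data-lake", "dms", "sales", "orders"))
```

Writing Parquet rather than CSV is a design choice: columnar files tend to be cheaper to scan from Athena or Redshift Spectrum, while CSV is easier to inspect by hand.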

Pricing

When migrating databases to Amazon Aurora, Amazon Redshift, Amazon DynamoDB, or Amazon DocumentDB (with MongoDB compatibility), you can use DMS free for six months. However, while a free extended trial is helpful, in most cases, you have to budget for ongoing operations.

For less frequent (e.g., daily) batch replication operations, we will use a c4.large, which is $0.154 per hour. The SSD storage is $0.115 per GB/month. If we are running our daily process for 4 hours, this will be about $0.62 a day, or about $18.50 for a 30-day month.

Assuming we were consistently persisting about 500 GB of data within the DMS process, this would be another $58 a month. The total service costs would be about $76.

If we were running this 24/7, then the daily costs would be about $3.70, or about $111 for the month. The data storage would be the same at 500 GB. The total cost would be about $168.
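The arithmetic above can be checked with a short Python sketch. The hourly and storage rates are the ones quoted in this section. The 30-day month is an assumption, and the function name is hypothetical.

```python
INSTANCE_RATE_PER_HOUR = 0.154     # c4.large rate quoted above (USD)
STORAGE_RATE_PER_GB_MONTH = 0.115  # SSD storage rate quoted above (USD)
DAYS_PER_MONTH = 30                # assumption: a 30-day billing month


def monthly_cost(hours_per_day, storage_gb, days=DAYS_PER_MONTH):
    """Estimate the monthly DMS cost as instance hours plus storage."""
    compute = INSTANCE_RATE_PER_HOUR * hours_per_day * days
    storage = STORAGE_RATE_PER_GB_MONTH * storage_gb
    return {
        "compute": round(compute, 2),
        "storage": round(storage, 2),
        "total": round(compute + storage, 2),
    }


batch = monthly_cost(hours_per_day=4, storage_gb=500)
always_on = monthly_cost(hours_per_day=24, storage_gb=500)

print("4 hours a day:", batch)
print("24/7:", always_on)
```

Swapping in current rates for your region and instance class gives a quick estimate before committing to a configuration.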

Depending on the type of replication, replication schemes, and replication processes, your costs will vary upward or downward. However, regardless of the end configuration, DMS is an affordable option that offers value and powerful capabilities.

Database replication software comparison

Up until a couple of years ago, tools like AWS DMS were hard to come by or difficult to employ. The lack of cost-effective and quick setup solutions led several SaaS vendors like Fivetran, Stitch, Alooma, and Openbridge to roll out solutions.

Why did these companies build out solutions? Customers needed to support data replication from a source database like Postgres, MySQL, and others to a cloud warehouse like Redshift or BigQuery for data analytics. Moving data into a data lake or cloud warehouse opened new opportunities to use tools like Tableau, Looker, PowerBI, and others.

So why would you use a SaaS tool like Fivetran for data replication over DMS? Today, you likely would not unless you have heavily invested in Fivetran already. Given the emergence of DMS and the refinement of the product by AWS over the past 12–24 months, it is a go-to offering for database replication. The only use case where we still leverage our Openbridge replication tools is for read-only data sources.

So how much does a SaaS tool like Fivetran cost compared to AWS DMS? Fivetran costs will range from USD 36K to USD 120K per year. The Fivetran pricing model would likely be 20x more than a base DMS configuration. In fairness to Fivetran, you would never select them for database replication services alone, because the cost would be prohibitive for that use case. If you are thinking of Fivetran as a primary solution for database replication, you should explore DMS as an alternative first.

What about other SaaS vendors? Fivetran alternatives like Stitch also offer replication services. While Fivetran competitors like Stitch are less expensive, DMS still affords greater flexibility and cost efficiencies, especially given Stitch's pricing model (number of replicated rows).

For database replication, this is less about SaaS comparisons like Stitch vs. Fivetran and more about how these services compare to the latest AWS DMS offering. If you need to continuously replicate a read-only system, feel free to reach out to the Openbridge team for details on our service.

Getting Started

Getting started with DMS will require that you create an AWS account, set up a migration process, and provision the associated replication instance(s). In more sophisticated use cases, you may need to employ the AWS Schema Conversion Tool.
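As a starting point, the replication instance can be described in a few lines of Python. This is a sketch only: the key names follow the boto3 `create_replication_instance` call, the instance class and storage size mirror the pricing example earlier in this post, and the identifier and helper function name are hypothetical.

```python
def build_instance_request(identifier, instance_class="dms.c4.large",
                           storage_gb=500):
    """Build keyword arguments for a DMS create_replication_instance call."""
    if storage_gb <= 0:
        raise ValueError("storage_gb must be positive")
    return {
        "ReplicationInstanceIdentifier": identifier,
        "ReplicationInstanceClass": instance_class,
        "AllocatedStorage": storage_gb,  # in GB
    }


# Hypothetical identifier, for illustration only.
instance_request = build_instance_request("example-replication-instance")
print(instance_request)

# With boto3 installed and AWS credentials configured, the request could
# be submitted like this (left commented out so the sketch has no side effects):
# import boto3
# boto3.client("dms").create_replication_instance(**instance_request)
```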

As with any system migration of data from one location to another, make sure you have a database migration plan in place. This is critical for testing and signoff of the processes. Without this plan, subtle shifts, gaps, or bugs can corrupt the downstream processing.

Openbridge provides a fully managed AWS Database Migration Service (DMS) offering to customers. The typical pattern for our customers is to use DMS to deliver data to an AWS S3 landing zone. We then ingest the data into a curated data lake, register everything in a data catalog, and create corresponding tables/views in Athena or Redshift Spectrum.

Want to use AWS DMS? Need a platform and team of experts to kickstart your data and analytics efforts? We can help! Getting traction when adopting new technologies can be a roadblock to success, especially if it means your team is working in different and unfamiliar ways. This is especially true in a self-service-only world. If you want to discuss a proof-of-concept, pilot, project, or any other effort, the Openbridge platform and team of data experts are ready to help.

Reach out to us at hello@openbridge.com. Prefer to talk to someone? Set up a call with our team of data experts.

Visit us at www.openbridge.com to learn how we are helping other companies break down their data silos.

Database Replication: AWS Database Migration Service was originally published in Openbridge on Medium, where people are continuing the conversation by highlighting and responding to this story.



source https://blog.openbridge.com/database-replication-aws-database-migration-service-b3cffcd0eff?source=rss----4c5221789b3---4
