The right cloud for the job: Multicloud database processing is here

David Linthicum

The idea is pretty simple and actually pretty old: Use a distributed architecture on large databases to quickly return the data requested. This approach runs the database query across many servers at the same time, then combines the results as they return from the hundreds, perhaps thousands, of servers in the cluster.
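
To make the pattern concrete, here is a minimal sketch of running one query across several servers at once and combining the results. It assumes in-memory lists standing in for the servers; `SHARDS`, `query_shard`, and `scatter_gather` are hypothetical names chosen for illustration, and a real system would make network calls where this sketch filters a local list.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each inner list stands in for the rows held by one server.
SHARDS = [
    [{"id": 1, "region": "east", "amount": 40}, {"id": 2, "region": "west", "amount": 15}],
    [{"id": 3, "region": "east", "amount": 25}],
    [{"id": 4, "region": "west", "amount": 60}, {"id": 5, "region": "east", "amount": 5}],
]


def query_shard(rows, predicate):
    """Run the filter against one shard (a stand-in for a remote call)."""
    return [row for row in rows if predicate(row)]


def scatter_gather(shards, predicate):
    """Send the same query to every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda rows: query_shard(rows, predicate), shards)
        merged = [row for partial in partials for row in partial]
    # Sort so the combined result does not depend on which shard held which row.
    return sorted(merged, key=lambda row: row["id"])


east_rows = scatter_gather(SHARDS, lambda row: row["region"] == "east")
```

Each shard only ever sees its own rows, so adding servers adds capacity without changing the query itself.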

This approach has gotten new attention because it is the core idea behind MapReduce, the parallel processing model used by Hadoop in big data analytics. These types of distributed workloads have run for years, typically on a homogeneous server cluster, meaning a cluster built from many identical servers. That homogeneity restricts you to one server cluster or one cloud -- thus, one resource type and one cost. But not anymore.
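
For readers who have not seen MapReduce, the model can be sketched in a few lines. This is a single-process illustration of the map, shuffle, and reduce steps, not Hadoop's API; the function names and the sample records are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical input, already split into partitions as it would be across servers.
PARTITIONS = [
    [{"region": "east", "amount": 40}, {"region": "west", "amount": 15}],
    [{"region": "east", "amount": 25}],
    [{"region": "west", "amount": 60}, {"region": "east", "amount": 5}],
]


def map_phase(record):
    """Emit one (key, value) pair per input record."""
    yield record["region"], record["amount"]


def shuffle(pairs):
    """Group every emitted value under its key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


def map_reduce(partitions):
    # In a real cluster each partition is mapped on a different server;
    # here the partitions are simply processed one after another.
    pairs = [pair for part in partitions for rec in part for pair in map_phase(rec)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


totals = map_reduce(PARTITIONS)
```

Because the map step looks at one record at a time, it can run wherever the data lives; only the grouped values need to travel to the reduce step.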

In the emerging multicloud approach, data-processing workloads run on the public or private cloud services that best fit their needs. That also makes it possible to run each workload on the cloud service that is most cost-efficient.
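
A placement decision of this kind might be sketched as follows: filter the available clouds down to those that can run the workload, then pick the cheapest. Everything here is hypothetical, including the provider names, capacities, and prices, which are invented for illustration and do not describe any real service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudOption:
    name: str
    cost_per_node_hour: float  # illustrative number, not a real price
    max_nodes: int
    supports_transactions: bool


@dataclass(frozen=True)
class Workload:
    name: str
    nodes_needed: int
    needs_transactions: bool


def place(workload, options):
    """Return the cheapest option that satisfies the workload's requirements."""
    candidates = [
        opt for opt in options
        if opt.max_nodes >= workload.nodes_needed
        and (opt.supports_transactions or not workload.needs_transactions)
    ]
    if not candidates:
        raise ValueError(f"no cloud can run workload {workload.name!r}")
    return min(candidates, key=lambda opt: opt.cost_per_node_hour)


# Hypothetical catalog: two public clouds and one small private cloud.
OPTIONS = [
    CloudOption("public_a", 0.10, 1000, False),
    CloudOption("public_b", 0.14, 200, True),
    CloudOption("private_c", 0.08, 20, True),
]

scan = place(Workload("analytics_scan", 500, False), OPTIONS)
orders = place(Workload("order_processing", 50, True), OPTIONS)
results = place(Workload("result_store", 10, False), OPTIONS)
```

With these made-up numbers the large scan lands on the biggest public cloud, the transactional workload on the one that supports transactions, and the small result store on the private cloud, which is the cheapest option that fits.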

For example, when processing a query, the client that launches the database query may reside at a managed service provider. However, it may make the request to many server instances on the Amazon Web Services public cloud service. The same client could also manage a transactional database on the Microsoft Azure cloud. Moreover, it could store the results of the database request on a local OpenStack private cloud. You get the idea.

The benefits are obvious: You can mix and match the cloud services to the workloads, both increasing performance and saving money. Indeed, you could move workloads from cloud to cloud as needed.

A lot of database processing happens in cloud computing services these days -- and it's not cheap. Moving workloads among cloud services gives those who manage large distributed databases the power to use only the providers who offer the best and most cost-effective service -- or the providers who are best suited to their database-processing needs.

Of course, the trade-off is complexity, and there will certainly be a need for management and automation. Cloud management platform tools should be useful here, because they provide the ability to manage some of what I'm describing today.

It's not as scary as it sounds, and it's always good to have options.

Source: InfoWorld