Data platform cloudification: technology lessons learned

Let’s be absolutely clear about this: moving your SAS platform to the cloud should be first and foremost a business decision. Nevertheless, if there’s one key lesson we have learned from our first successful attempts at SAS cloudification, it’s that you also need to get the technology right.

Getting the technology right in this case means knowing your SAS platform well enough to choose the cloud components that are best suited to it, from servers and storage systems to connectivity solutions. The important thing to realise at that point is that SAS is not just software. It is an actual platform, offering a wide array of products and solutions, modules and tools that all interact with one another.

KYP: Know Your Platform

Looking at our own experience: at LACO we service several dozen SAS customers, and no two of them own the same platform. In fact, no two of them have platforms that are even somewhat similar. Even though some may share the same ‘topology’, they have certainly built completely different solutions or applications on top of it, using different coding approaches and design patterns. This diversity is why it is so important that, as a first step, you really understand how your SAS platform is set up, how it is used and what exactly you are using it for, or wish to use it for. Once that essential analysis is done, you can choose the right cloud components to host your SAS platform. For, as we would like to show here, even within the cloud there are a lot of choices you can make.

Servers

Depending on the cloud platform or vendor you are working with, there are different types of servers or server configurations to choose from. Generally speaking, and taking your SAS platform into account, you could opt for CPU-optimised, memory-optimised or storage-optimised servers. To run complex algorithms that require a lot of calculation power, you will probably be best off with the first category. To host a high-performance in-memory reporting tool, you will likely end up in the second. And to cope with I/O-heavy ETL processes, the third category might be the best choice. This is only a first indication, which should be challenged on the basis of detailed monitoring of system resource usage. It also only determines the category from which to pick your detailed infrastructure. Believe us, when it comes to making the specific choices, details do matter!
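
To make that first indication a little more tangible, here is a minimal Python sketch of how monitored resource usage could be turned into a suggested server category. Everything in it is an assumption for illustration: the `UsageSample` fields, the threshold values, the use of a 95th percentile and the "general-purpose" fallback are ours, not part of any SAS or cloud vendor tooling, and they should be tuned to your own monitoring data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageSample:
    """One monitoring sample (hypothetical fields, all in percent)."""
    cpu_pct: float      # CPU utilisation
    mem_pct: float      # memory utilisation
    io_wait_pct: float  # share of time spent waiting on disk I/O


# Hypothetical thresholds above which a resource counts as "under pressure".
THRESHOLDS = {"cpu_pct": 80.0, "mem_pct": 80.0, "io_wait_pct": 20.0}

CATEGORIES = {
    "cpu_pct": "CPU-optimised",
    "mem_pct": "memory-optimised",
    "io_wait_pct": "storage-optimised",
}


def p95(values):
    """95th percentile using the nearest-rank method."""
    ordered = sorted(values)
    index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return ordered[index]


def suggest_category(samples):
    """Suggest a server category from the resource under most pressure."""
    if not samples:
        raise ValueError("at least one usage sample is required")
    pressure = {}
    for metric, limit in THRESHOLDS.items():
        peak = p95([getattr(sample, metric) for sample in samples])
        pressure[metric] = peak / limit  # >= 1.0 means above the threshold
    metric, ratio = max(pressure.items(), key=lambda item: item[1])
    # Assumption: fall back to a general-purpose server if nothing is stressed.
    return CATEGORIES[metric] if ratio >= 1.0 else "general-purpose"


# Example: a nightly ETL job that mostly waits on disk.
etl_samples = [UsageSample(cpu_pct=30, mem_pct=45, io_wait_pct=35)] * 20
print(suggest_category(etl_samples))
```

A rule of thumb like this only narrows down the category. The specific instance type still has to be chosen from the detailed measurements.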

Storage

As data platforms often process data in steps (cleaning them up, for example, before making them available for reporting), it can be useful to save some or all of the data on storage disks in between steps. It goes without saying that the more data you read and write, and the more complex those data, the more advisable it becomes to deploy very fast storage disks. They help shorten the lead time of this storage step, which is by its very nature slower than in-memory storage. In other contexts, however, no such specific type of storage disk is needed: when you have smaller volumes or less complex data to process, for instance, or when time isn’t a critical factor and the data can just as well be stored and processed at night. As the storage cost will have a significant impact on the overall financial case of any data platform, choosing the right combination of storage types is not a trivial task.
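
The trade-off between speed and cost can be sketched in a few lines of Python. The example below picks the cheapest storage tier that still moves a given data volume within the available time window. The tier names, throughput figures and prices are purely hypothetical placeholders, not quotes from any vendor, and the throughput model (volume divided by sustained MB/s) is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTier:
    """A storage option (all figures are hypothetical placeholders)."""
    name: str
    throughput_mb_s: float     # sustained read/write throughput in MB/s
    price_per_gb_month: float  # storage price per GB per month


def transfer_hours(volume_gb, tier):
    """Rough time in hours to read or write volume_gb on this tier."""
    return volume_gb * 1024 / tier.throughput_mb_s / 3600


def cheapest_tier_within_window(volume_gb, window_hours, tiers):
    """Return the cheapest tier that fits the time window, or None."""
    fitting = [t for t in tiers if transfer_hours(volume_gb, t) <= window_hours]
    if not fitting:
        return None
    return min(fitting, key=lambda t: t.price_per_gb_month)


# Placeholder tiers for illustration only.
TIERS = [
    StorageTier("fast-ssd", throughput_mb_s=1000, price_per_gb_month=0.20),
    StorageTier("standard", throughput_mb_s=200, price_per_gb_month=0.05),
]

# A 2 TB intermediate dataset: an overnight window versus a one-hour window.
for window in (8, 1):
    tier = cheapest_tier_within_window(2000, window, TIERS)
    print(window, "hour window:", tier.name if tier else "no tier fits")
```

With these placeholder numbers, the night-time batch gets by on the cheaper tier, while the one-hour window forces the faster and more expensive one. That is exactly why the combination of storage types deserves attention.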

Connectivity

A similar point applies to connectivity. A lot of data can be slowly batch-loaded into the cloud. But if, for example, you require near-real-time reporting because it could have an impact on your production process, then it is worth looking at the network speed too, to make sure that the necessary data reach the cloud quickly enough. And, by the way, did you know that the cost of downloading data from the cloud might be much higher than the cost of uploading data into it? A wrong design choice that leads to significant download volumes may give you a nasty surprise when your invoice comes in…
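
A quick back-of-the-envelope calculation shows how much the download side can weigh. The Python sketch below compares two hypothetical designs under assumed transfer rates: one that keeps reporting in the cloud and only downloads aggregated results, and one that pulls the raw data back on-premises. The rates and volumes are placeholders for illustration, so check your own vendor's price list before drawing conclusions.

```python
def monthly_transfer_cost(upload_gb, download_gb, ingress_rate, egress_rate):
    """Monthly data transfer cost for given volumes and per-GB rates."""
    if min(upload_gb, download_gb, ingress_rate, egress_rate) < 0:
        raise ValueError("volumes and rates must be non-negative")
    return upload_gb * ingress_rate + download_gb * egress_rate


# Assumed per-GB rates (placeholders): uploads free, downloads charged.
INGRESS_RATE = 0.00
EGRESS_RATE = 0.09

# Design A: reporting stays in the cloud, only aggregates are downloaded.
design_a = monthly_transfer_cost(5000, 50, INGRESS_RATE, EGRESS_RATE)

# Design B: the same raw data is pulled back on-premises every month.
design_b = monthly_transfer_cost(5000, 5000, INGRESS_RATE, EGRESS_RATE)

print(f"Design A: {design_a:.2f} per month")
print(f"Design B: {design_b:.2f} per month")
```

Both designs upload the same volume. The difference on the invoice comes entirely from what travels back out of the cloud.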

In short, depending on the context, such as your installed SAS software products and the way they are used, you will end up with a completely different setup and configuration of your SAS platform in the cloud. One size fits no one… Optimisation is key when it comes to SAS cloudification. Luckily, we at LACO are more than experienced when it comes to your specific needs.

Feel like taking a deeper dive into the cloud with SAS?

Samuel De Klerck, CDO at LACO
