Implementation of Nomad Cluster for Massively Parallel Computing  

  • AWS Cloud Cost optimization 
  • On-premises to AWS cloud migration 
  • Technical Consulting

Client background

Our client, S-Cube, is a software development company whose product builds Earth models using a waveform inversion algorithm. S-Cube has developed a proprietary algorithmic toolbox for Full Waveform Inversion, known as X-Wave. The product analyzes gigabytes to terabytes of data to build 3-D models for use in the energy industry.

Business challenge

Our customer had a solution designed to run on AWS. They used a custom scheduler to run parallel processing on hundreds of AWS instances. At that time, they had to manage instance creation and decommissioning themselves and respond to AWS spot instance withdrawals.

They wanted to improve the infrastructure in several ways:

  • Separate the software from the infrastructure so that the application no longer has to manage infrastructure itself
  • Run both containerized and non-containerized workloads, in both cloud and on-premises environments
  • Support multi-tenancy to run calculations for different clients
  • Use a technology set with a low barrier to entry for small infrastructure teams

Value delivered

Gart Solutions helped the customer make its SaaS platform more cost-efficient by restructuring its architecture with current cloud development techniques and technologies.

We also enabled the customer to avoid being tied to a single vendor by implementing third-party integrations that extend the platform's monitoring capabilities and improve the product offering.

Solution

The Gart Solutions team helped the customer build a new approach to infrastructure management.

We used Infrastructure as Code with a layered approach to create a flexible landscape, and we containerized the application to simplify its launch. As the next step, we built the infrastructure using HashiCorp Nomad as the workload orchestrator, together with the Nomad Autoscaler.
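
To illustrate the kind of decision an autoscaler makes, here is a minimal sketch of a queue-based sizing rule. It is not the customer's scaling policy and not the Nomad Autoscaler's own code; the function name and the parameters `jobs_per_node`, `min_nodes`, and `max_nodes` are assumptions made for this example.

```python
import math


def desired_node_count(pending_jobs, jobs_per_node, min_nodes=0, max_nodes=1000):
    """Return how many nodes to run for the current backlog.

    Hypothetical rule: one node handles `jobs_per_node` jobs at a time,
    and the result is clamped to the [min_nodes, max_nodes] range.
    """
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    needed = math.ceil(pending_jobs / jobs_per_node)
    return max(min_nodes, min(max_nodes, needed))


# Example: 250 queued jobs, 4 jobs per node.
print(desired_node_count(250, 4))
```

With an empty queue the rule returns `min_nodes`, so the cluster can shrink to zero when no calculations are pending.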

This combination orchestrates jobs across hundreds of compute nodes and resolves the issue of spot instance withdrawals, which the customer previously had to handle themselves.
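
The general idea behind surviving a spot instance withdrawal can be sketched as follows: when a node disappears, its job goes back to the queue and is placed on another node. This is a simplified toy model written for illustration, not Nomad's scheduler and not the customer's code; the class and method names are hypothetical.

```python
from collections import deque


class ToyScheduler:
    """Toy model: one job per node, withdrawn jobs are requeued."""

    def __init__(self, jobs):
        self.pending = deque(jobs)  # jobs waiting for a node
        self.running = {}           # node name -> job
        self.done = []              # completed jobs

    def assign(self, nodes):
        # Place pending jobs on any of the given nodes that are idle.
        for node in nodes:
            if node not in self.running and self.pending:
                self.running[node] = self.pending.popleft()

    def node_withdrawn(self, node):
        # The node was reclaimed: put its job back at the front of the queue.
        job = self.running.pop(node, None)
        if job is not None:
            self.pending.appendleft(job)

    def node_finished(self, node):
        # The node completed its job normally.
        job = self.running.pop(node, None)
        if job is not None:
            self.done.append(job)


scheduler = ToyScheduler(["job-a", "job-b", "job-c"])
scheduler.assign(["node-1", "node-2"])
scheduler.node_withdrawn("node-1")  # job-a returns to the queue
scheduler.assign(["node-3"])        # job-a is placed on node-3
print(scheduler.running)
```

No job is lost when `node-1` is withdrawn; the only cost is the time needed to rerun it elsewhere.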

Our tests confirmed that the system reacts well to large loads and can scale up to thousands of instances. They also showed that the scaling time depends only on the cloud solution, not on our system.

Results

We helped the customer build a new approach to infrastructure management and validated a proof of concept for this setup.
