How We Turbocharged Our Prediction Pipeline Using Dask
At Pigment, we provide a central platform for organizations to model and plan using their most valuable data. As part of that mission, enabling predictive insights is key. Our first implementation of predictions worked well for smaller datasets, but as more users began applying the feature to larger and more complex data models, we quickly hit the limits of a single-machine setup in terms of memory and compute.
To address this, we redesigned our infrastructure around a distributed Dask cluster, allowing us to scale horizontally and support much larger workloads. In this post, we’ll share the architecture of our new solution and key takeaways from adopting Dask in production.