Deploying Apache Spark on Kubernetes with S3 Support

Introduction

In this post we show how to deploy a “stateless” Apache Spark cluster on Kubernetes. Spark is a fast analytics engine designed for large-scale data processing. We then run analytics queries against data stored in S3, in our case StorageGRID Webscale. We use S3 as the data source/target because it is an elegant …