Spark Standalone Cluster Internals
Version: 1.0
Author(s): Sandeep Mahendru
Creation Date: Jan 15, 2018

Introduction

This article describes the internal workings of a Spark cluster operating in standalone mode. The primary motivation is to understand the internal architecture and topology of the Spark cluster execution platform.

I have experience building distributed systems with other clustering frameworks, such as Coherence and Hazelcast:

· Their architecture is based on a multicast group with master and slave nodes. The execution managers in these systems follow a task-execution model operating on an in-memory distributed key-value data structure [a ConcurrentHashMap].