Databricks Supports Apache Spark 2.4.0

November 20, 2018

Databricks is the first unified analytics vendor to support Apache Spark 2.4. It is supported as part of Databricks Runtime 5.0, which is now generally available. Databricks also introduced a key feature, HorovodRunner, within Runtime 5.0 to further simplify distributed deep learning.

The Apache Spark community made multiple valuable contributions to the Spark 2.4 release, which shipped on November 8, 2018. In this release, Project Hydrogen substantially improves the performance and fault recovery of distributed deep learning and machine learning frameworks on Spark. Project Hydrogen directly addresses a challenge data teams face: big data jobs and deep learning jobs execute in fundamentally different ways. Spark excels at data processing at massive scale by running independent, retryable tasks, whereas deep learning frameworks assume complete coordination and dependency among tasks and are optimized for constant communication rather than scalability and fault tolerance.

“Innovation continues to thrive within the Apache Spark community. Project Hydrogen is the most recent major initiative with an aim to provide first-class support for popular distributed machine learning frameworks on Apache Spark,” said Reynold Xin, co-founder at Databricks, Apache Spark PMC member and the top contributor to the project.

Within Apache Spark 2.4, Project Hydrogen introduces Barrier Execution, a new scheduling mode that allows practitioners to properly embed distributed deep learning training as an Apache Spark workload. Added Xin, “This is the largest change to Spark’s scheduler since the inception of the project. At Databricks, we also found additional opportunities to simplify the complexity of machine learning workloads. Within Databricks’ Unified Analytics Platform, which is powered by Spark 2.4, we created further optimizations to simplify distributed deep learning.”

Databricks Simplifies Distributed Deep Learning in Runtime 5.0

Model experimentation usually takes place on a single-node machine, locally or in the cloud, before computation is scaled out as needed. Migrating from single-node workloads to distributed training on CPU or GPU clusters often requires a full code rewrite, increasing the complexity of moving to distributed training. To accelerate that migration, Databricks has released HorovodRunner. The new feature provides a simple way to scale deep learning training workloads from a single machine to large clusters, reducing overall programming and training time from hours to minutes.
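With HorovodRunner, the migration described above reduces to wrapping the existing single-node training function. A sketch, assuming a Databricks Runtime 5.0 ML cluster where `sparkdl` and Horovod are pre-installed; the training body and the `np=2` worker count are placeholders:

```python
from sparkdl import HorovodRunner  # available in Databricks Runtime 5.0 ML

def train():
    # Runs on each worker; Horovod handles gradient averaging across them.
    import horovod.keras as hvd
    hvd.init()
    # ... build the Keras model, scale the learning rate by hvd.size(),
    # wrap the optimizer with hvd.DistributedOptimizer(), and fit ...

# np=2 distributes the same train() function across two workers,
# so the single-node code needs no structural rewrite to scale out.
hr = HorovodRunner(np=2)
hr.run(train)
```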

To help simplify deep learning further, Databricks also provides native integration with the most popular frameworks including TensorFlow, Keras, and Horovod, as well as a performance edge with the most popular machine learning algorithms from MLlib and GraphFrames. This provides practitioners with a convenient way to get machine learning clusters started in seconds, pre-configured with the latest machine learning frameworks, libraries, and their dependencies.
