Pepperdata Sees Big Data Cloud Wastage
August 7, 2020
"Big Data Performance report was compiled after reviewing comprehensive
data on the applications contained in Pepperdata's largest enterprise
customer clusters, representing nearly 400 petabytes of data on 5000
nodes. This equates to 4.5 million applications running in a 30-day
timeframe. The report provides insights into the enormous compute waste
that occurs with big data applications in the cloud.
When it comes to wastage, failures matter. Job failures cause serious performance degradation and consume significant computational resources. In an unoptimized dataset, Pepperdata sees a wide range of failure rates across clusters: in some clusters, more than 10% of jobs fail, and Spark applications tend to fail more often than MapReduce applications.
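As a rough illustration, a failure-rate breakdown of this kind can be computed directly from a job log. The sketch below assumes a hypothetical record schema (cluster, engine, status) standing in for whatever the monitoring system actually emits:

```python
from collections import defaultdict

# Hypothetical job-log records; a real log would hold millions of rows.
jobs = [
    {"cluster": "c1", "engine": "spark", "status": "FAILED"},
    {"cluster": "c1", "engine": "mapreduce", "status": "SUCCEEDED"},
    {"cluster": "c2", "engine": "spark", "status": "SUCCEEDED"},
    {"cluster": "c2", "engine": "spark", "status": "FAILED"},
]

totals, failures = defaultdict(int), defaultdict(int)
for job in jobs:
    key = (job["cluster"], job["engine"])
    totals[key] += 1
    failures[key] += job["status"] == "FAILED"

# Per-(cluster, engine) failure rate, e.g. Spark vs. MapReduce.
for key in sorted(totals):
    print(f"{key}: {failures[key] / totals[key]:.0%} of jobs failed")
```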
Prior to implementing Spark optimization: across clusters, within a typical week, the median rate of maximum memory utilization is a mere 42.3%. This underutilization reflects one of two states: either not enough jobs are running to fully utilize the cluster's resources, or the jobs themselves are wasting resources.
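To make the metric concrete, here is a minimal sketch of how such a figure is derived: take each cluster's peak memory utilization for the week, then take the median across clusters. The utilization values below are illustrative, not report data:

```python
from statistics import median

# One value per cluster: the maximum memory utilization observed
# at any point during the week (illustrative numbers).
weekly_max_util = [0.38, 0.42, 0.51, 0.40, 0.45, 0.39, 0.61]

print(f"median weekly max utilization: {median(weekly_max_util):.1%}")
# A median near 42% means half the clusters never exceed ~42% memory
# use even at their busiest moment in the week.
```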
Prior to implementing cloud optimization: comparing resources used against resources wasted, the average wastage across 40 large clusters exceeds 60%. This wastage takes an interesting form: typically, about 95% of jobs show little waste, while the major wastage is concentrated in just 5% to 10% of total jobs.
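That concentration is easy to demonstrate with a toy dataset. In the sketch below, the allocated/used figures are invented to show the shape of the distribution, not Pepperdata's actual numbers:

```python
# Each tuple is (resources allocated, resources used) for one job,
# e.g. in memory-hours. 95 jobs are fairly efficient; 5 jobs
# over-provision heavily. Values are invented for illustration.
jobs = [(100, 90)] * 95 + [(2000, 200)] * 5

waste = sorted((alloc - used for alloc, used in jobs), reverse=True)
total_alloc = sum(alloc for alloc, _ in jobs)
total_waste = sum(waste)

worst_5pct = sum(waste[: len(waste) // 20])  # the 5 most wasteful jobs
print(f"overall wastage: {total_waste / total_alloc:.0%}")             # ~51%
print(f"waste from worst 5% of jobs: {worst_5pct / total_waste:.0%}")  # ~90%
```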
This is why optimization is inherently such a needle-in-a-haystack challenge, and why machine learning can be such a help. Studies show that ML-powered statistical models predict task failures with precision of up to 97.4% and recall of up to 96.2%. Applied to Hadoop, the percentage of failed jobs is reduced by up to 45%, with an overhead of less than five minutes.
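A minimal sketch of the approach, assuming a simple decision-tree classifier trained on synthetic per-task metrics; the feature names and data are illustrative assumptions, not the models used in the cited studies:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 5000
# Hypothetical per-task features: memory headroom, GC time share,
# input skew. In this toy data, tasks with very low headroom or
# very high skew are the ones that go on to fail.
X = rng.random((n, 3))
y = ((X[:, 0] < 0.15) | (X[:, 2] > 0.9)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)

# Precision: of the tasks flagged as doomed, how many actually fail.
# Recall: of the tasks that fail, how many were flagged in advance.
print(f"precision: {precision_score(y_te, pred):.1%}")
print(f"recall:    {recall_score(y_te, pred):.1%}")
```

Flagging likely failures before they run to completion is what lets a scheduler kill or re-plan doomed tasks cheaply, which is where the reduction in failed jobs comes from.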
Most enterprises are able to increase task hours by at least 14%, and some by as much as 52%. A quarter of users save a minimum of $400,000 per year; at the high end, the most successful users are projected to save $7.9 million for the year.
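A back-of-the-envelope calculation shows how savings figures of this order arise; the cluster spend and recovery fraction below are hypothetical inputs, not figures from the report:

```python
# Hypothetical inputs for illustration only.
annual_cloud_spend = 2_000_000   # USD/year on big data clusters
wastage = 0.60                   # fraction of spend wasted (per report)
recovered = 0.33                 # fraction of that waste actually reclaimed

savings = annual_cloud_spend * wastage * recovered
print(f"projected annual savings: ${savings:,.0f}")  # ~$396,000
```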
To cut the waste out of IT operations processes and achieve true cloud optimization, enterprises need visibility and continuous tuning. This requires machine learning and a unified performance platform for the analytics stack. Such a setup equips IT operations teams with the cloud tools they need to keep their infrastructure running optimally while minimizing spend.