Apache Spark is a well-established tool in Big Data pipelines, available as open source and in several commercial offerings. With such widespread usage, Spark is also often misused, and poorly configured Spark clusters waste customer money and resources. Current profiling solutions for Spark lack:
Full insights into resource wastage and scope for optimization
Wide availability
Completeness
ZettaProf was built to address users’ requirements for building optimized Spark solutions. Over time, it has evolved into a mature product that novice and expert Spark developers alike can use. Based on the insights offered by ZettaProf, we have been able to provide 10-100x performance improvements to our partner customers. Key capabilities that differentiate ZettaProf from other solutions:
Works with any Spark distribution: Open Source, Databricks, Azure Synapse, AWS EMR, NVIDIA RAPIDS (on cloud or on-prem)
Most comprehensive profiler, with a rich set of features and recommendations
Zero-touch deployment with no changes to existing Spark code or queries
The ZettaProf Dashboard provides a bird’s-eye view of a Spark application: a summary of performance, a report on resource utilization, and recommendations for improvement. Users can deep-dive into fine-grained details (CPU, RAM, disk reads/writes, stage and operator runtimes, etc.) for the full application as well as for individual queries.
ZettaProf sits alongside the Spark UI; clicking the “ZettaProf” tab takes the user to a dashboard like the one in the picture below:
Tabs can be expanded to show additional information about the flagged issues, as in the picture below:
From the Dashboard, the user can deep-dive into a specific issue by clicking its hyperlink, and can also start the Application or Query level analysis. Resource usage, common Spark setup issues, and skew/spill problems are displayed through charts and tables. The user can perform trade-off analysis (more cores versus runtime impact), run query replay to identify slow stages, or identify the critical paths that act as bottlenecks for time-consuming queries.
The foremost concern for any Spark application is whether the allocated resources are being used efficiently. ZettaProf’s Efficiency Analysis feature helps a user explore the following:
CPU and RAM usage across the complete application timeline
Queries that consume the most runtime
Simulated impact of CPU allocation (more or fewer executors) on application performance
Recommended settings for executors, cores, and memory to achieve the best performance
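To make the cores-versus-runtime trade-off concrete, here is a minimal sketch of how such a simulation could work in principle. It is not ZettaProf's implementation: the function names, the greedy scheduling model, and the task durations are all assumptions chosen for illustration. It replays the task durations of a single stage on a given number of cores and reports the resulting runtime and CPU efficiency.

```python
import heapq


def simulate_stage_runtime(task_durations, total_cores):
    """Estimate the wall-clock time of one stage on `total_cores` task slots.

    Assumption: tasks are handed out in order to whichever slot frees up
    first (greedy list scheduling), with no scheduling overhead.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    # Each heap entry is the time at which one slot becomes free.
    slots = [0.0] * min(total_cores, max(len(task_durations), 1))
    heapq.heapify(slots)
    for duration in task_durations:
        free_at = heapq.heappop(slots)
        heapq.heappush(slots, free_at + duration)
    return max(slots)


def cpu_efficiency(task_durations, total_cores, wall_clock):
    """Fraction of the allocated core-seconds actually spent running tasks."""
    if wall_clock <= 0:
        return 0.0
    return sum(task_durations) / (total_cores * wall_clock)


# Hypothetical task durations in seconds; the last task is skewed.
durations = [10, 10, 10, 10, 10, 10, 10, 40]

for cores in (2, 4, 8):
    runtime = simulate_stage_runtime(durations, cores)
    efficiency = cpu_efficiency(durations, cores, runtime)
    print(f"cores={cores} runtime={runtime:.0f}s efficiency={efficiency:.2f}")
```

With these made-up numbers, doubling from 4 to 8 cores only reduces the runtime from 50 to 40 seconds, because the single 40-second task dominates, while efficiency drops from 0.55 to about 0.34. This is the kind of diminishing return a trade-off analysis is meant to expose.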
Application performance can be explored through the following features:
List of queries, stages and operations sortable by runtime
Queries impacted by data and time skew (imbalance)
Volume of disk reads/writes per query
Error reports and analysis
Executor and Driver efficiency
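As an illustration of what a skew check can look like, the sketch below flags stages where one task takes far longer than the typical task. The max-to-median ratio, the threshold of 3.0, and the sample data are assumptions made for this example; the source does not describe how ZettaProf measures skew.

```python
from statistics import median


def skew_ratio(values):
    """Return max / median of per-task measurements (1.0 means balanced)."""
    if not values:
        raise ValueError("need at least one task measurement")
    top = max(values)
    if top == 0:
        return 1.0  # every task measured zero, so nothing is imbalanced
    mid = median(values)
    return float("inf") if mid == 0 else top / mid


def flag_skewed_stages(stage_tasks, threshold=3.0):
    """Return {stage_id: ratio} for stages whose skew ratio exceeds threshold."""
    flagged = {}
    for stage_id, values in stage_tasks.items():
        ratio = skew_ratio(values)
        if ratio > threshold:
            flagged[stage_id] = ratio
    return flagged


# Hypothetical per-task durations in seconds, keyed by stage id.
stage_task_seconds = {
    0: [12, 11, 13, 12],  # balanced
    1: [5, 6, 5, 60],     # one straggler task
}

print(flag_skewed_stages(stage_task_seconds))
```

The same function can be applied to bytes read per task to look at data skew rather than time skew.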
Query performance can be explored through the following features:
List of stages and operations sortable by runtime
Details of data and time skew at the task level
Details of disk reads/writes
Errors specific to the query
Critical path analysis
Slowest path in the query plan
Fastest possible query runtime
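The idea behind critical path analysis can be sketched as a longest-path computation over the stage dependency graph of a query. The example below is a generic illustration, not ZettaProf's algorithm; the stage names, durations, and the `critical_path` helper are hypothetical. The length of the slowest path is also a lower bound on the query runtime, since no amount of extra parallelism can finish the query faster than its longest dependency chain.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(durations, parents):
    """Find the slowest dependency chain in a stage DAG.

    durations: {stage: seconds}
    parents:   {stage: [stages it depends on]}; every listed parent must
               also appear in `durations`.
    Returns (total_seconds, [stages on the slowest path, in order]).
    """
    graph = {stage: parents.get(stage, []) for stage in durations}
    finish = {}       # earliest finish time of each stage
    best_parent = {}  # the parent that finishes last, i.e. the blocker
    for stage in TopologicalSorter(graph).static_order():
        prev = max(graph[stage], key=lambda s: finish[s], default=None)
        best_parent[stage] = prev
        start = finish[prev] if prev is not None else 0
        finish[stage] = start + durations[stage]

    # Walk back from the stage that finishes last.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = best_parent[node]
    return total, path[::-1]


# Hypothetical query: two scans feed a join, which feeds an aggregation.
stage_seconds = {"scan_a": 30, "scan_b": 10, "join": 20, "agg": 5}
stage_parents = {"join": ["scan_a", "scan_b"], "agg": ["join"]}

total, path = critical_path(stage_seconds, stage_parents)
print(total, path)
```

In this made-up query the stages add up to 65 seconds of work, but the slowest chain (scan_a, join, agg) takes 55 seconds, so speeding up scan_b would not shorten the query at all.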
The table below compares ZettaProf with other Spark profilers:
| Feature | ZettaProf | Unravel | Pepperdata | Qubole | Databricks | Azure Synapse |
| --- | --- | --- | --- | --- | --- | --- |
| Integration with Spark UI | Yes | No | No | No | Yes | Yes |
| Available on cloud and on-prem | Yes | Yes | Yes | Yes | Yes | Azure only |
| Support for all Spark distros | Yes | Partial | Partial | Partial | Partial | Azure only |
| Dashboard | Yes | Yes | Yes | No | No | No |
| Resource problems: CPU, memory, disk, network etc. | Yes | Yes | Yes | Yes | Yes | Yes |
| Spark configuration issues | Yes | Yes | Yes | No | Yes | Yes |
| Runtime problems: lack of parallelism, bad join, spill, skew, errors, OOMs etc. | Yes | Yes | Yes | Yes | Yes | Yes |
| Recommendations to fix issues | Yes | Yes | No | Partial | No | Yes |
| Application comparison | Yes | No | Yes | No | No | Yes |
| Query comparison | Yes | No | Yes | No | No | No |
| Query runtime simulation | Yes | No | No | No | No | Yes |
| Slowest/fastest path analysis | Yes | No | No | Yes | Yes | No |
| Runtime simulation with change in cores | Yes | No | No | Yes | No | No |
| Operator level runtime analysis | Yes | No | No | No | Yes | Yes |
| Alerts | No | Yes | Yes | No | No | No |
ZettaProf is currently available under a Limited Engagement Plan. If you would like to try ZettaProf, please contact support@zettabolt.com for a trial version.
Zettabolt Technologies (http://www.zettabolt.com) is focused on building profiling and optimization solutions for Big Data workloads. Large hardware vendors, financial institutions, and e-commerce companies use our solutions to achieve cost savings and performance improvements above and beyond what state-of-the-art technology can provide. Drawing on expertise in CPU, GPU, and FPGA based optimizations, we have built solutions that can realize up to 100x speed-ups on a variety of customer workloads.