REPORT ON RESOURCE UTILIZATION AND RECOMMENDATIONS FOR IMPROVEMENT
Apache Spark is a well-established tool in Big Data pipelines, available as open source and through several commercial offerings. With such widespread usage, Spark is also often misused: poorly configured Spark clusters waste valuable customer dollars and resources. Current profiling solutions for Spark lack: i. completeness, ii. wide availability, iii. full insight into resource wastage and scope for optimization.
ZettaProf was built to address users’ requirements for building optimized Spark solutions. Over time, it has evolved into a mature product that novice and expert Spark developers alike can use. Based on the insights offered by ZettaProf, we have been able to provide 10-100x performance improvements to our partner customers. Key capabilities which differentiate ZettaProf from other solutions:
From the Dashboard, users can deep-dive into a specific issue by clicking on its hyperlink, and can also start Application or Query level analysis. Resource usage, common Spark setup issues and skew/spill problems are displayed through charts and tables. Users can perform trade-off analysis (more cores versus runtime impact), run query replay to identify slow stages, or identify the critical paths that act as bottlenecks in time-consuming queries.
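To make the idea of a critical path concrete, here is a minimal sketch in Python. It is not ZettaProf's implementation. It assumes a query is described by a hypothetical table of stage durations in seconds and a map from each stage to the stages it depends on, and it finds the longest dependency chain, which is the chain that bounds the query's runtime no matter how many cores are added.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(durations, parents):
    """Return (total_seconds, [stage ids]) for the longest dependency chain.

    durations: stage id -> run time in seconds (hypothetical input format)
    parents:   stage id -> list of stage ids that must finish first
    """
    graph = {stage: parents.get(stage, ()) for stage in durations}
    finish = {}  # earliest possible finish time of each stage
    prev = {}    # predecessor on the longest chain leading into each stage

    # static_order() yields every stage after all of its parents.
    for stage in TopologicalSorter(graph).static_order():
        slowest = max(graph[stage], key=lambda p: finish[p], default=None)
        start = finish[slowest] if slowest is not None else 0.0
        finish[stage] = start + durations[stage]
        prev[stage] = slowest

    # Walk back from the stage that finishes last.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = prev[node]
    return total, path[::-1]


# Made-up example: stages 0 and 1 run in parallel and feed stage 2,
# which in turn feeds stage 3.
example_durations = {0: 12.0, 1: 40.0, 2: 8.0, 3: 15.0}
example_parents = {2: [0, 1], 3: [2]}
total_seconds, path = critical_path(example_durations, example_parents)
print(total_seconds, path)
```

In this made-up example, speeding up stage 0 would not shorten the query, because stage 1 is the slower of the two parallel branches; only work on stages 1, 2 or 3 reduces the end-to-end time.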
ZettaProf is currently available under a Limited Engagement Plan. If you would like to try ZettaProf, please contact email@example.com for a trial version.
Zettabolt Technologies (http://www.zettabolt.com) is focused on building profiling and optimization solutions for Big Data workloads. Large hardware vendors, financial institutions and e-Commerce companies use our solutions to achieve cost savings and performance improvements above and beyond what state-of-the-art technology can provide. With expertise in CPU, GPU and FPGA based optimizations, we have built solutions that can realize up to 100x speed-ups on a variety of customer workloads.