New Product Identifies Lines of Code and Stages that Cause Performance Issues Related to CPU, Memory, Garbage Collection, Network and Disk I/O
CUPERTINO, Calif. – May 23, 2017 – Pepperdata, the DevOps for Big Data company, today announced Pepperdata Code Analyzer for Apache Spark, which provides Spark application developers the ability to identify performance issues and connect them to particular blocks of code within an application. Code Analyzer is a new product that follows on the heels of Pepperdata Application Profiler, which provides Hadoop and Spark developers with actionable recommendations for improving job performance.
“One of the most significant challenges in Big Data is achieving optimal performance,” said Ash Munshi, CEO of Pepperdata. “Code Analyzer fills a huge void in application development for Spark, helping developers optimize Spark applications for large-scale production. Developers are now empowered to improve the performance of Spark applications with new information and insight around the code, build, test and release phases.”
The performance metrics from the Spark Web UI have historically been a challenge for developers to understand and contextualize, especially without having granular, time-series data on hand. Developers cannot easily drill down into and understand the problematic sections of an application that require optimization. Further, as Spark clusters typically run many applications in parallel, the Spark Web UI doesn’t inform developers how applications are impacted by other applications running on the cluster.
Pepperdata Code Analyzer allows Spark application developers to precisely measure how cluster resources – including CPU, memory, and network and disk I/O – are consumed by any particular block of application code. Code Analyzer delivers additional insight by combining application information from the Spark engine with granular time-series data for all applications running on a cluster. Dev teams are empowered with the ability to pinpoint the specific segment of their application code responsible for performance issues.
“I develop a lot of complex Spark code to perform ETL on Hadoop clusters. In these complex, large-scale systems, you must be able to understand where the performance bottlenecks are,” said Ian O’Connell, software engineer at Stripe and Pepperdata Technology Advisory Board member. “Pepperdata Code Analyzer for Apache Spark gives developers detailed time-series performance data for things like CPU, JVM memory and I/O usage overlaid against Spark job stages. I’m excited about the direction Pepperdata is moving — letting developers quickly see problems in time-series views and tie them back to their actual Spark application code will be a very useful tool for developers working on production Spark applications.”
Benefits of Code Analyzer include:
● Identify which lines of code and which stages cause performance issues related to CPU, memory, garbage collection, network and disk I/O
● Easily disambiguate resources used during parallel stages
● Understand why run time variations occur for the same application
● Determine whether performance issues are due to the application or other workloads on the cluster
● Reduce the number of performance incidents in production
● Easily communicate detailed performance issues back to developers
“Chartboost is the world’s largest mobile games-only advertising platform, reaching one billion active players around the world every month. Chartboost utilizes Apache Spark on large Amazon EC2 Hadoop clusters for machine learning and ETL workflows,” said Michael McGowan, manager of Data Engineering at Chartboost. “Understanding Spark application performance in these complex environments is always a challenge. As a current user of Pepperdata Hadoop performance management tools, it has been great to work with Pepperdata on the development of Code Analyzer. It will give us comprehensive insight into Spark jobs.”
Pepperdata products and services are designed to accelerate the production use of Big Data applications by ensuring that performance is tightly integrated into the DevOps for Big Data cycle. Code Analyzer is integrated with Pepperdata products to provide an end-to-end DevOps solution, combining overall cluster awareness (monitoring, troubleshooting and alerting) with deep recommendations for improving the performance of individual jobs.
Availability and Pricing
Code Analyzer for Apache Spark will be available June 5 in early access, with general availability expected in Q3 2017. Pepperdata products are delivered to market as a combination of software running on customers’ clusters, on-premises or in the cloud, and as SaaS solutions. For pricing information or to schedule a demo, contact [email protected].
Pepperdata is the DevOps for Big Data company. Leading companies such as Comcast, Philips Wellcentive, and Zillow depend on Pepperdata to manage and improve the performance of Hadoop and Spark. Enterprise developers and operators use Pepperdata products and services to diagnose and solve performance problems in production and increase cluster utilization. The Pepperdata product suite improves communication of performance issues between Dev and Ops, shortens time to production, and increases cluster ROI. Pepperdata products and services work with customer Big Data systems both on-premises and in the cloud.
Founded in 2012, Pepperdata has raised $20M from investors including Citi Ventures, Signia Venture Partners and Wing Venture Capital, and attracted senior engineering talent from Yahoo, Google, Microsoft and Netflix. Pepperdata is headquartered in Cupertino, California.