
Using Spark and Jenkins to Deploy Code into Hadoop Clusters

By: Nodagala Nagarjuna on March 26, 2019

This article illustrates how to deploy an application into a Hadoop cluster using GitLab, Jenkins and Spark.


In this article, we will cover the following:

  • Auto-triggering a Jenkins build when code is committed in Git.
  • Pulling the latest code from the repository through Jenkins.
  • Building, compiling and packaging the “.jar” using the SBT tool.
  • Uploading the .jar to the Nexus repository with a defined version.
  • Pulling the versioned .jar and dependent files from the Nexus repository to the server.
  • Deploying the versioned “.jar” file through a Spark job into the Hadoop cluster.

Getting Started

The following must be performed first:

  1. Download and install Java 1.8.
  2. Install Jenkins.
  3. Integrate GitLab and Jira.
  4. Install SBT.
  5. Create the Hadoop cluster with Hadoop and Spark installed.

End-to-End Steps in Detail

When a developer pushes code to GitLab, Jenkins will trigger a predefined job using webhooks.

The Jenkins job will pull the code from version control using Git; it builds the code and packages it as a .jar file using the build tool SBT. This .jar file can then be deployed into a Hadoop cluster with a Spark command.
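To make the SBT step concrete, a minimal build.sbt sketch for a Spark application is shown below. The project name, organization and version numbers are illustrative, not taken from the article's project; the point is simply where the artifact name, version and Spark dependencies come from when SBT packages the .jar.

// build.sbt -- minimal sketch of an SBT build for a Spark application.
// All names and version numbers below are illustrative.
name := "spark-streaming-app"
organization := "com.example"
version := "1.0.0"
scalaVersion := "2.11.12"

// Spark is already available on the cluster at runtime, so it is marked
// "provided" to keep the packaged .jar small.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "2.4.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "2.4.0" % "provided"
)

// Running "sbt clean compile package" produces
// target/scala-2.11/spark-streaming-app_2.11-1.0.0.jar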

Once the deployment is completed in the Hadoop cluster, the application will start running in the background.

Initially, we can see the currently running application in Hadoop (see Figure 1). It has an application ID and is in a running state.

Figure 1
  • If we click the tracking UI link (ApplicationMaster), it navigates to the Spark streaming page.
Figure 2
  • In Figure 2, we can see that batches are running at a 40-second interval. For this scenario, we will change that 40-second interval in the code (a sketch of where this interval is set follows Figure 4).
  • When a developer makes changes to the code, they commit on their local machine, test the code and then push to the remote repository. Here, I have updated the streaming interval from 40 to 50 seconds and committed the change.
  • If we specify the Jira task number in the commit message, the same message will be added as a comment on the Jira task, as shown below.
Figure 3

In the Jira application, we can see our commit message as comments.

Figure 4
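For reference, the batch interval that was changed from 40 to 50 seconds lives where the StreamingContext is created in the application code. A minimal Scala sketch, with an illustrative object name and a placeholder input source, looks like this:

import scala.collection.mutable
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Minimal sketch of a Spark Streaming entry point; the object name is illustrative.
object StreamingApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("appName")
    // Batch interval: this is the value updated from Seconds(40) to Seconds(50)
    // in the commit described above.
    val ssc = new StreamingContext(conf, Seconds(50))

    // Placeholder input; a real job would read from Kafka, sockets, HDFS, etc.
    val queue = mutable.Queue.empty[RDD[String]]
    val lines = ssc.queueStream(queue)
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}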

GitLab and Jenkins integration process:

  • Install the Git plugin in Jenkins to pull code from the GitLab repository and the SBT plugin to build and package the code as a .jar file.
  • Configure the Jenkins job under source code management and provide the Repository URL along with credentials and branch details.
Figure 5
  • Optional: Set up webhooks to automatically trigger the Jenkins job when a push happens in GitLab.
Figure 6
  • In the “build” section, add the parameters clean compile package in the Actions field, which will build the code and create a .jar file.
Figure 7
  • Once the job has been triggered in Jenkins, monitor the Jenkins console output for progress and any errors.
Figure 8

Optional: If we have configured SonarQube, it will analyze the code and publish the results to the SonarQube dashboard. This lets us see code quality, code coverage, vulnerabilities and bugs so we can improve the code.

  • Optional: To maintain versioning, we can upload our *.jar to Nexus/Artifactory (a build.sbt publish sketch follows the spark-submit command below).
  • Once the .jar has been created, we can keep it in a specific folder and trigger a Spark job. This can be done by running a Spark command from Jenkins. Execute the command below to submit the Spark job:

# spark-submit --name appName --master yarn --deploy-mode cluster --executor-memory 1g --driver-memory 1g --jars $(echo $dependencyJarDir/*.jar | tr ' ' ',') --class com.coe.spotter.DQModule --conf spark.driver.extraJavaOptions=../src/main/resources/log4j-yarn.properties --conf spark.sql.warehouse.dir=./spark-warehouse --conf spark.yarn.submit.waitAppCompletion=false app.Jar
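For the optional Nexus/Artifactory versioning step mentioned above, the publish destination can be declared in build.sbt and invoked from the Jenkins job with sbt publish. A minimal sketch, assuming an illustrative Nexus URL and credentials file, is:

// build.sbt additions for publishing the versioned .jar to Nexus.
// The repository URL and credentials path below are illustrative.
publishTo := {
  val nexus = "https://nexus.example.com/repository/"
  if (isSnapshot.value)
    Some("snapshots" at nexus + "maven-snapshots/")
  else
    Some("releases" at nexus + "maven-releases/")
}
credentials += Credentials(Path.userHome / ".sbt" / ".credentials")

// Running "sbt publish" from the Jenkins job uploads the versioned artifact,
// which can later be pulled back to the edge node for spark-submit.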

 

  • Once deployed, we can see that the application ID number has increased.
Figure 9

The batch interval in the Spark streaming job has been increased to 50 seconds.

Figure 10

— Nodagala Nagarjuna

Filed Under: Application Performance Management/Monitoring, DevOps Practice, DevOps Toolbox Tagged With: application deployment, gitlab, Hadoop Cluster, Jenkins, JIRA

