As we close out 2018, we at DevOps.com wanted to highlight the five most popular articles of the year. Following is the first in our weeklong series of the Best of 2018.
Well-written code is a work of art. Always has been, always will be. A programmer pulls a thought pretty much out of nowhere and transforms it into a working idea that can be used by others. It’s abstract expression made real. Computer programming requires a depth of creativity and discipline of logic that is hard to find elsewhere. Maybe architecture and the theoretical sciences come close, but computer programming stands apart. Computer programming is special and I love it!
Thus, I am always interested to learn a new programming language. Every programming language offers a new way to expand my thinking. It’s always time well spent.
Recently I decided to learn COBOL, for no other reason than the fact that there are a lot of mainframe installations out there and it’s the mainstay language for many of them. Mainframes are critical to the operations of many banks, insurance companies, transportation systems and governmental agencies. Learning COBOL has been on my bucket list for a while. So I took the plunge.
What does the term COBOL stand for? Common Business-Oriented Language
Working with COBOL in a Modern IDE
The first thing I needed in my journey to learn COBOL was an IDE. I am a big supporter of coding in an integrated development environment (IDE). I like being able to write, test and run code all in one place. Also, I find the support features that an IDE provides, such as visual code structure analysis, code completion and inline syntax checking, allow me to program and debug efficiently.
The IDE I found is an open source product, OpenCobolIDE, as shown below in Figure 1.
Figure 1: OpenCobolIDE provides many tools to make COBOL programming easier
OpenCobolIDE allows me to write, compile and run code all in one place, without having to go out to the command line, which is very good because I am the world’s worst typist!
Programming COBOL in the Real World
When it comes time to do real-world, mainframe-based COBOL programming, you will do well to look at the IDEs that IBM provides. The tools are delivered as a set of Eclipse plug-ins designed to work seamlessly with its Z Systems environment. These tools allow COBOL developers working in the mainframe environment to code, debug, unit test and do problem determination.
Having set up my development environment, the next thing to do was design my first project.
Designing My First COBOL Program
Typically when learning a new language I like to start small and work my way up to more complex tasks. I did the typical Hello World program. You write the code to display the string, Hello World, to standard output. Every developer does one. It’s a trivial program, as shown below in Listing 1.
Listing 1: A Hello World program written in COBOL
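For readers who want something to type in, here is a minimal sketch of what a Hello World program in COBOL can look like. It assumes a fixed-format compiler such as GnuCOBOL, and the program name HELLO-WORLD is my own choice for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLO-WORLD.
       PROCEDURE DIVISION.
      * Write the string to standard output, then end the program.
           DISPLAY "Hello World".
           STOP RUN.
```

Even in a program this small, the DIVISION headers start in column 8 and the statements start in column 12.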
While the Hello World program required that I learn about the concept of a DIVISION, which I will discuss in detail below, in the scheme of things, I needed more. I wanted to write a program that would force me to learn the following:
- How to create and use variables.
- How to structure data into a hierarchy.
- How to structure code into encapsulated procedures.
- How to do some basic arithmetic.
- How to accept user input and then do some manipulation around that input.
Thus, I created the program, RESEL-WORLD, as shown in Listing 2, below.
Listing 2: The COBOL program, RESEL-WORLD accepts and manipulates users’ input
The user input and program output is shown below in Figure 2.
Figure 2: The RESEL-WORLD COBOL program asks a user for first name, last name and age, then manipulates the input
The RESEL-WORLD program taught me a lot about COBOL and how to program in the language. In the spirit of giving, I am going to share what I learned from writing the program. However, before I delve into the details, please be advised that the information I’m presenting is but a high-level overview. As with any programming language, it takes about a year of consistent coding to become an entry-level professional. My sincere hope is that the information I provide gives the reader a good sense of the language and enough motivation to want to learn more. As I mentioned earlier, learning COBOL is worth the effort.
COBOL is Code in a Structured Document
The most important thing to understand when learning COBOL is that it is very strict in terms of code layout. The layout rules relate to the use of columns and characters. Also, the format uses a hierarchical outline structure. The following sections describe the details of the layout specification.
The COBOL Column Specification
In COBOL a line of code can be no longer than 80 columns. You can also think of a column as a character position. Columns are segmented into groups, with each group serving a particular purpose. The column groups are as follows:
Columns 1-6 is the group in which a programmer defines a sequence number. A sequence number is similar to a line number.
Column 7 is reserved for special characters. Asterisk (*) starts a line of comment. Hyphen (-) indicates line continuation and slash (/) is form feed.
Columns 8-11 is called Area A. Area A is the group of columns in which you start a DIVISION, PARAGRAPH and SECTION. We’ll talk more about these below in the next section, The COBOL Structural Hierarchy.
Columns 12-72, also known as Area B, is where you write code statements.
Columns 73-80 is reserved for developer use. You can write poetry in there, if you so desire. Just make sure you don’t go past column 80.
Figure 3 below illustrates the column grouping described above.
Figure 3: Each column in a COBOL file is specified to serve a specific purpose
COBOL is a compiled language. When you compile your code, the compiler checks that the code layout adheres to the column grouping specification. If there is a violation, the compiler reports an error.
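To make the column groups concrete, here is a small sketch that uses each of them except columns 73-80. It assumes a fixed-format compiler such as GnuCOBOL; the program name COLUMNS-DEMO, the paragraph name and the sequence numbers are hypothetical.

```cobol
000100* Columns 1-6 hold the sequence number; column 7 holds the *.
000200 IDENTIFICATION DIVISION.
000300 PROGRAM-ID. COLUMNS-DEMO.
000400* Area A (columns 8-11): DIVISION and PARAGRAPH names start here.
000500 PROCEDURE DIVISION.
000600 MAIN-PARAGRAPH.
000700* Area B (columns 12-72): statements start here.
000800     DISPLAY "This statement starts in column 12".
000900     STOP RUN.
```

If you move the DISPLAY statement left so that it starts in Area A, a strict compiler will typically flag the line, which is a quick way to see the layout rules being enforced.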
Understanding Performance in Terms of Compilation and Compilers
COBOL is a compiled language, as are others such as Java, C# and C++. Compilation is the process of taking textual source code and converting it into a binary format that the computer can understand. There’s a lot of variety in the COBOL world when it comes to compilers, particularly when it comes to cost and performance efficiency. For example, how a compiler orchestrates the way numbers are loaded and then computed in memory matters a lot! A good compiler will make code run really fast. IBM has been in the COBOL business for a long time and has a keen understanding about making compilers that are fast and cost-effective. For example, the z14 System compilers improve efficiency by taking advantage of almost 24 new low-level instructions. As a result, computation speeds improve dramatically. As you work more with COBOL, you’ll come to appreciate the power that an industry-leading compiler such as the one for IBM Z Systems brings to the coding experience.
The COBOL Structural Hierarchy
The concept behind the file format of a COBOL program is based on the structure of a document outline, with a single top-level heading followed by subordinate levels. The organizational units that make up the hierarchy are PROGRAM, DIVISION, SECTION, PARAGRAPH, SENTENCE, STATEMENT and CHARACTER. Figure 4 illustrates the hierarchy.
Figure 4: The structural hierarchy of COBOL program entities
PROGRAM is the root level of the COBOL code hierarchy. PROGRAM represents the unit of code that a mainframe job, described in Job Control Language (JCL), loads into memory to run. The program is identified by the PROGRAM-ID paragraph in the IDENTIFICATION DIVISION. The IDENTIFICATION DIVISION is part of the next level of hierarchy that descends from PROGRAM. PROGRAM must contain the IDENTIFICATION DIVISION.
There are other DIVISIONs that can be included. These other DIVISIONs are: ENVIRONMENT DIVISION, DATA DIVISION and PROCEDURE DIVISION, which occur after IDENTIFICATION. Also, these subsequent DIVISIONs must appear in the order defined: ENVIRONMENT, DATA and then PROCEDURE. You can read a detailed description of each DIVISION here. DIVISION names are terminated with a period, for example,
DATA DIVISION.
The next organizational level down from DIVISION is SECTION. (Please see Figure 5.) Each DIVISION will contain SECTIONs that are special to it. For example, the DATA DIVISION can contain a FILE SECTION, a WORKING-STORAGE SECTION and/or a LINKAGE SECTION. You can read the details about DIVISIONs and SECTIONs here.
Figure 5: A COBOL program is structured in a hierarchical manner
A SECTION name is terminated with a period, for example:
WORKING-STORAGE SECTION.
A SECTION contains zero or many PARAGRAPHs. (Typically a SECTION will have at least one PARAGRAPH.) A PARAGRAPH name is terminated with a period, similar to a DIVISION and a SECTION.
A PARAGRAPH contains SENTENCEs or STATEMENTs. A SENTENCE is a group of one or more STATEMENTs terminated by a period. Usually a PARAGRAPH contains one or more STATEMENTs. A STATEMENT is a line of execution. A STATEMENT is made up of CHARACTERs. A CHARACTER can be an alphanumeric symbol or a special character. CHARACTER is at the bottom of the COBOL code format hierarchy.
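The hierarchy is easier to see in a skeleton program. The sketch below assumes a fixed-format compiler such as GnuCOBOL; the names HIERARCHY-DEMO, MAIN-SECTION, FIRST-PARAGRAPH, SECOND-PARAGRAPH and WS-COUNT are hypothetical and exist only to label each level.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HIERARCHY-DEMO.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 WS-COUNT PIC 9(2) VALUE 0.
       PROCEDURE DIVISION.
       MAIN-SECTION SECTION.
       FIRST-PARAGRAPH.
      * One SENTENCE made of two STATEMENTs, ended by a period.
           ADD 1 TO WS-COUNT
           DISPLAY "Count is " WS-COUNT.
       SECOND-PARAGRAPH.
      * A SENTENCE that holds a single STATEMENT.
           STOP RUN.
```

Reading from the top down, the DIVISIONs appear in the required order, the PROCEDURE DIVISION holds one SECTION, that SECTION holds two PARAGRAPHs, and each PARAGRAPH holds one SENTENCE.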
Variables and Data Types
COBOL allows you to declare variables in a variety of data types. Special to COBOL is the concept of a variable level as represented by a level number. A level number defines a variable in terms of being or having a parent variable. (You’ll see more about this in the section, COBOL Supports Hierarchical Data, later on.)
The following statement, which will be declared in the WORKING-STORAGE SECTION of the DATA DIVISION, is the declaration of a variable, WS-QUANTITY. WS-QUANTITY will hold a numeric value.
01 WS-QUANTITY PIC 9(2) VALUE 12.
The declaration above defines a variable, WS-QUANTITY, as a two-digit value with an initial value of 12. Also, the variable is declared to be at level 01, which is the highest level possible. The variable has no parent. The way we know that WS-QUANTITY is a variable for a two-digit value is due to the PIC (Picture) clause. You can think of a PIC clause as a type declaration. Table 1 provides a high-level description of the various PIC clause symbols.
Symbol | Description | Example Declaration | Sample Value |
9 | A numeric value where each occurrence of 9 represents a digit | 99 or 9(2) | 35 |
a | Alphabetic | aaa or a(3) | “Bob” |
x | Alphanumeric | xxxx or x(4) | “R2D2” |
v | Implicit decimal* | v999 or v9(3) | .175 |
s | Sign | s9(2) | -76 |
p | Assumed decimal scaling position* | p9 | .06 |
Table 1: Symbols for a PIC Clause
*COBOL stores only the digits of such a number; the position of the decimal point is implied by the PIC clause and applied when the value is used. For example, a variable with a PIC of v9(2) that holds the digits 56 is treated as .56, while the same digits under p9(2) are treated as .056, because each p stands for an implied zero next to the decimal point. For a more detailed discussion of usage and the PIC clause, go here.
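Here is a short sketch that declares one variable for most of the symbols in Table 1 and displays each of them. It assumes a fixed-format compiler such as GnuCOBOL. Apart from WS-QUANTITY, the variable names are hypothetical, and the exact way signed and decimal values are printed can differ between compilers.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PIC-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Two numeric digits.
       01 WS-QUANTITY   PIC 9(2)    VALUE 12.
      * Three alphabetic characters.
       01 WS-NAME       PIC A(3)    VALUE "Bob".
      * Four alphanumeric characters.
       01 WS-DROID      PIC X(4)    VALUE "R2D2".
      * Signed two-digit number.
       01 WS-BALANCE    PIC S9(2)   VALUE -76.
      * Implied decimal point: one digit before it, three after.
       01 WS-RATE       PIC 9V999   VALUE 0.175.
       PROCEDURE DIVISION.
           DISPLAY "Quantity: " WS-QUANTITY.
           DISPLAY "Name:     " WS-NAME.
           DISPLAY "Droid:    " WS-DROID.
           DISPLAY "Balance:  " WS-BALANCE.
           DISPLAY "Rate:     " WS-RATE.
           STOP RUN.
```

Note that the V in WS-RATE takes up no storage. The variable holds four digits, and the PIC clause tells the compiler where the decimal point belongs.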
COBOL Supports Hierarchical Data
Storing data in a named group of variables makes programming easier. For example, in the C programming language, we can use a struct to name a group of variables, as shown below in Listing 3.
struct User {
    char first_name[10];
    char last_name[10];
    int user_id;
};
Listing 3: A simple structure in C
One of the really nice things about COBOL is that it allows you to organize data according to a named group of values similar to that of a C struct. What’s interesting in a historical sense is that COBOL had this capability well before Kernighan and Ritchie created the C programming language.
The way COBOL accomplishes named data grouping is with a construct called a record. Figure 6 below shows a graphical representation of a record named WS-USER.
Figure 6: The record, WS-USER, contains the subordinate variables WS-FIRST-NAME, WS-LAST-NAME and WS-AGE
The code snippet in Listing 4, which is from the RESEL-WORLD program at the start of this article, shows the COBOL code that declares the record, WS-USER.
Listing 4: The record, WS-USER contains 3 subordinate elements, WS-FIRST-NAME, WS-LAST-NAME, WS-AGE
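A record along these lines might be declared and used as in the sketch below. The record and element names come from the article, but the field sizes, the sample values and the program name RECORD-DEMO are assumptions for illustration, not taken from the RESEL-WORLD program. The sketch assumes a fixed-format compiler such as GnuCOBOL.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RECORD-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Level 01 is the record; level 05 items are its elements.
       01 WS-USER.
           05 WS-FIRST-NAME PIC A(15).
           05 WS-LAST-NAME  PIC A(15).
           05 WS-AGE        PIC 9(3).
       PROCEDURE DIVISION.
      * Fill each element, qualifying it with the record name.
           MOVE "Jane" TO WS-FIRST-NAME OF WS-USER.
           MOVE "Doe" TO WS-LAST-NAME OF WS-USER.
           MOVE 42 TO WS-AGE OF WS-USER.
           DISPLAY WS-FIRST-NAME OF WS-USER.
           DISPLAY WS-LAST-NAME OF WS-USER.
           DISPLAY WS-AGE OF WS-USER.
           STOP RUN.
```

The record WS-USER itself has no PIC clause. Its size and layout are simply the sum of the elements declared beneath it.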
Notice in Listing 4 above that there are two levels of data variables in play: 01 WS-USER and 05 WS-FIRST-NAME, 05 WS-LAST-NAME, 05 WS-AGE. The variable, WS-USER, is a parent to the subordinate, level 05 variables, WS-FIRST-NAME, WS-LAST-NAME and WS-AGE.
Assigning a variable to a numeric level is special to COBOL. The declaration logic is that a record variable (the root of the record) has a level number of 01. Levels 02 to 49 are used to declare subordinate elements. (You can think of an element as a member of the record.)
You use the reserved word OF to access an element of a record. Listing 5 below shows two statements. The first asks a user to input their first name. The second statement takes the value entered and assigns it to the element, WS-FIRST-NAME of the record, WS-USER.
DISPLAY "What is your first name?".
ACCEPT WS-FIRST-NAME OF WS-USER.
Listing 5: Accessing an element of a record in COBOL
It makes sense that records were part of COBOL from the start. COBOL was intended to be a language used for business applications. Businesses have been organizing values according to named groups since before the customer form was created. COBOL was designed to reflect business needs. It’s interesting that the fundamental needs of businesses in terms of logical data structures have been surprisingly consistent over time.
COBOL Has a Natural Language Syntax
Expressing a statement in COBOL is very similar to the way a person speaks naturally. Listing 6 below shows the statement that adds the values of the two variables, WS-AGE-DELTA and WS-AGE (from the record WS-USER), to the variable, WS-NEW-AGE. Because ADD ... TO adds to whatever the target already holds, WS-NEW-AGE ends up holding exactly the sum of the two values only when it starts at zero.
ADD WS-AGE-DELTA WS-AGE OF WS-USER TO WS-NEW-AGE.
Listing 6: Adding two integers, WS-AGE-DELTA and WS-AGE, then storing the sum in the variable, WS-NEW-AGE
Notice that the statement in Listing 6 is very close to saying, “Add WS-AGE-DELTA and WS-AGE from WS-USER to WS-NEW-AGE.”
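One detail worth seeing in running code is the difference between ADD ... TO and ADD ... GIVING. The sketch below assumes a fixed-format compiler such as GnuCOBOL. The variable names come from the article, while the program name ADD-DEMO, the field sizes and the starting values are assumptions chosen to make the difference visible.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ADD-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 WS-USER.
           05 WS-AGE    PIC 9(3) VALUE 40.
       01 WS-AGE-DELTA  PIC 9(3) VALUE 10.
      * Deliberately not zero, so the two forms give different results.
       01 WS-NEW-AGE    PIC 9(3) VALUE 5.
       PROCEDURE DIVISION.
      * TO adds both operands to whatever WS-NEW-AGE already holds.
           ADD WS-AGE-DELTA WS-AGE OF WS-USER TO WS-NEW-AGE.
           DISPLAY "After ADD ... TO:     " WS-NEW-AGE.
      * GIVING replaces WS-NEW-AGE with the sum of the operands.
           ADD WS-AGE-DELTA WS-AGE OF WS-USER GIVING WS-NEW-AGE.
           DISPLAY "After ADD ... GIVING: " WS-NEW-AGE.
           STOP RUN.
```

With these starting values the first DISPLAY shows 055 (10 + 40 + 5) and the second shows 050 (10 + 40). If you want the target to hold only the sum, GIVING is the safer choice.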
What I find pretty astounding is that many modern programming paradigms try to capture the ease of natural language expression that COBOL has had for years. What comes to mind immediately is the use of natural language expressions in the Node.js Chai package. Chai is used in Node.js unit testing to express BDD assertions. Here’s a Chai expression that asserts that a variable, myVar, is a string.
expect(myVar).to.be.a('string');
FAST FACT: Is COBOL a case-sensitive language? No. COBOL will consider a variable named WS-FIRST-NAME to be the same as one named ws-first-name.
COBOL Supports DRY
For a programming language to be really useful, it needs to support DRY. DRY is an acronym for Don’t Repeat Yourself. Most programmers don’t want to write the same code over and over again, nor should they. What you want to do is to encapsulate code into a single area of execution that can be called repeatedly. Being able to program by DRY is common for most languages: BASIC, Java, C#, JavaScript, Python, to name a few and … you guessed it, COBOL.
COBOL supports code segmentation and reuse in a variety of ways. One way is through linkage, calling routines between programs. Another way is to segment code inline under a PARAGRAPH in the PROCEDURE DIVISION and then call that PARAGRAPH. (See Figure 7.)
Figure 7: You can make code reusable in COBOL PARAGRAPHs
The little RESEL-WORLD program I wrote uses the inline PARAGRAPH technique. Before I go into the details of calling code encapsulated in a PARAGRAPH, it’s important that you know that in COBOL the entry point of execution into a program is the first line of code after the declaration of the PROCEDURE DIVISION, as shown below in Listing 7.
Listing 7: COBOL supports reusable code by using PARAGRAPHS to segment execution
Notice the PERFORM statements right after PROCEDURE DIVISION. These four statements are calling PARAGRAPHs. (For those of you familiar with JavaScript, you can think of each PARAGRAPH as a function definition.) Essentially what the code is saying is, “Execute the code at the paragraph named, GET-DATA, then perform the code at CALC-DATA, SHOW-DATA and then FINISH-UP.”
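The pattern can be sketched as follows. The paragraph names, variable names and the first prompt come from the article, but the bodies of the paragraphs, the field sizes, the second prompt and the program name PERFORM-DEMO are my own assumptions, so treat this as an illustration of the technique and not as the RESEL-WORLD program itself. It assumes a fixed-format compiler such as GnuCOBOL.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PERFORM-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 WS-USER.
           05 WS-FIRST-NAME PIC A(15).
           05 WS-AGE        PIC 9(3).
       01 WS-AGE-DELTA      PIC 9(3) VALUE 10.
       01 WS-NEW-AGE        PIC 9(3) VALUE 0.
       PROCEDURE DIVISION.
      * Execution starts here and calls each PARAGRAPH in turn.
           PERFORM GET-DATA.
           PERFORM CALC-DATA.
           PERFORM SHOW-DATA.
           PERFORM FINISH-UP.
       GET-DATA.
           DISPLAY "What is your first name?".
           ACCEPT WS-FIRST-NAME OF WS-USER.
           DISPLAY "How old are you?".
           ACCEPT WS-AGE OF WS-USER.
       CALC-DATA.
           ADD WS-AGE-DELTA WS-AGE OF WS-USER GIVING WS-NEW-AGE.
       SHOW-DATA.
           DISPLAY "Hello " WS-FIRST-NAME OF WS-USER.
           DISPLAY "In " WS-AGE-DELTA " years you will be "
               WS-NEW-AGE.
       FINISH-UP.
      * Ending the run here keeps control from falling through.
           STOP RUN.
```

Because FINISH-UP ends with STOP RUN, execution never falls through from the last PERFORM into the paragraph definitions that follow it.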
Encapsulating code into STATEMENTs within a PARAGRAPH prevents code from becoming spaghetti that is hard to maintain. Also, the encapsulation enforces the sensibility of DRY. Any language worth its salt needs to support some sort of encapsulation. COBOL meets the need with room to spare.
An Expressive Language for Now and the Future
The more I learn about COBOL, the more I like it. The language continues to evolve to meet the needs of our fast-changing times, with revisions as recent as 2014. Since its inception there have been a dozen enhancements to COBOL including a continuing stream of formal standards.
Today’s COBOL supports modern programming paradigms such as object orientation. The IDEs have grown to keep pace with the demands of modern users. And, given the immense installation base out there, there is a lot of money to be made doing COBOL programming. To quote Wikipedia:
In 2006 and 2012, Computerworld surveys found that over 60% of organizations used COBOL (more than C++ and Visual Basic .NET) and that for half of those, COBOL was used for the majority of their internal software.
What’s even more amazing to me is the cleverness of engineering the language has promoted. COBOL developers addressed problems in the past that still vex many today. We modern developers can fall prey to thinking that before we came along, there was no cool stuff. It’s like the young, aspiring guitarist who thinks he has his chops down and can shred to the top of the heap. Then one day his music teacher gives him a 1936 recording of Django Reinhardt playing guitar, and playing with only two functional fingers on his left hand! At that point the young artist realizes that virtuosity transcends era and that creativity has no bounds. This is what it’s like for me as I learn COBOL. It’s a beautiful, expressive language that was cool then and is very cool now. Learning it is making me appreciate how much amazing thinking went on back then and continues to emerge. There’s a lot of rich opportunity at hand to make great code for mainframes using COBOL. It’s only a matter of mastery, creativity and discovery.
Special Thanks
If I have seen further it is by standing on the shoulders of giants.
—Isaac Newton, Letter to Robert Hooke, February 5, 1675
Learning any programming language is hard work. Years ago, when I wanted to learn a new language, I would get a few books on the topic and then hunker down to spend the hours necessary to absorb the required knowledge. If I had the time, I might take a course. Maybe I had an expert friend I could call up when I hit a wall or needed real-world guidance.
We’ve come a long way since that time. The internet is a game-changer. Online tutorials, videos and interactive, digital books make things a whole lot easier than in earlier times. My experience learning COBOL testifies to the value of these modern benefits. But still, as I’ve been learning, I’ve hit more than one wall that required professional guidance and review. Allan Kielstra and Roland Koo at IBM (who also wrote this great article) provided the expertise that made my learning COBOL not only less difficult, but also fun. I am in their debt. Their enthusiasm for COBOL was infectious. Their commitment to technical excellence is inspiring.