Basic Linux Commands: The diff command
What diff does
The diff command compares two files. Its output shows the differences between them, expressed as the changes needed to make the original file match the new one. Many people look at that output and quickly go cross-eyed with confusion. Here’s an example. You have two lists of employees (first names only). The first list looks like:
Anna
Bob
Chris
David
Evan
Fred
George
Hanna
Isobelle
Jack
Kelly
Lana
Misty
Nana
Olivia
Paul
Here is the second list:
Angie
Bob
Betty
Chenica
David
Evan
Fred
George
Hanna
Isobelle
Jackie
Kelly
Lana
Mary
Nana
Olivia
Paul
To find the differences between the files without scanning and comparing them manually, you can run the command diff employee_list1 employee_list2.
The output of that command would look like:
1c1
< Anna
---
> Angie
3c3,4
< Chris
---
> Betty
> Chenica
10c11
< Jack
---
> Jackie
13c14
< Misty
---
> Mary
To many, that output is hard to decipher. It doesn’t have to be that way. Let’s look at how to make it easier to understand.
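One way to get comfortable with the numbered lines in that output, such as 3c3,4, is to spell them out in words. In diff's normal format each of these is a change command: a line range from the first file, a letter (a for add, c for change, d for delete), then a line range from the second file. The following is a minimal Python sketch of that reading, not part of diff itself; the helper name describe_change and the wording of its sentences are assumptions made for illustration.

```python
import re

# A change command is: first-file range, one letter (a, c or d), second-file range.
# A range is either a single line number ("3") or "start,end" ("3,4").
_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def _span(start, end):
    """Format a range as 'line N' or 'lines N-M'."""
    if end is None or end == start:
        return f"line {start}"
    return f"lines {start}-{end}"


def describe_change(command):
    """Spell out one normal-format change command (hypothetical helper)."""
    match = _COMMAND.match(command.strip())
    if match is None:
        raise ValueError(f"not a diff change command: {command!r}")
    from_start, from_end, letter, to_start, to_end = match.groups()
    first = _span(from_start, from_end)
    second = _span(to_start, to_end)
    if letter == "c":
        return f"replace {first} of the first file with {second} of the second file"
    if letter == "a":
        return f"after {first} of the first file, add {second} of the second file"
    # Only "d" is left: the lines exist in the first file but not in the second.
    return f"delete {first} of the first file"


for command in ["1c1", "3c3,4", "10c11"]:
    print(command, "->", describe_change(command))
```

Read this way, 3c3,4 says that line 3 of the first list (Chris) is replaced by lines 3 and 4 of the second list (Betty and Chenica).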
Contextual output
Instead of using the default output format, let’s take a look at what contextual output gives us. To get contextual output with one line of context around each change, you issue the diff command like so:
diff --context=1 employee_list1 employee_list2
Contextual output begins with a two-line header that will look like this:
*** from_file from_file_modification_time
--- to_file to_file_modification_time
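After the header comes the meat of the output: one or more sections, each showing a group of lines from the first file followed by the corresponding lines from the second file. Lines that differ between the files are marked with !, and unmarked lines are unchanged context. Each section will look like this:
***************
*** from_file_range ****
  from_file_line
  from_file_line
--- to_file_range ----
  to_file_line
  to_file_line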
Now let’s look at the actual output from our two employee files.
*** employee_list1 2008-11-20 13:23:26.000000000 -0500
--- employee_list2 2008-11-20 13:24:12.000000000 -0500
***************
*** 1,4 ****
! Anna
  Bob
! Chris
  David
--- 1,5 ----
! Angie
  Bob
! Betty
! Chenica
  David
***************
*** 9,14 ****
  Isobelle
! Jack
  Kelly
  Lana
! Misty
  Nana
--- 10,15 ----
  Isobelle
! Jackie
  Kelly
  Lana
! Mary
  Nana
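If you would like to experiment with this format without creating the two files, Python's standard difflib module can produce context-style output from two in-memory lists. The sketch below is an illustration under stated assumptions, not the diff command itself: it passes the file names as plain labels, omits the modification times, and uses n=1 to mirror the one line of context requested above. For these two lists it groups the changes into the same two sections, but difflib's grouping is not guaranteed to match diff for every input.

```python
import difflib

employee_list1 = [
    "Anna", "Bob", "Chris", "David", "Evan", "Fred", "George", "Hanna",
    "Isobelle", "Jack", "Kelly", "Lana", "Misty", "Nana", "Olivia", "Paul",
]
employee_list2 = [
    "Angie", "Bob", "Betty", "Chenica", "David", "Evan", "Fred", "George",
    "Hanna", "Isobelle", "Jackie", "Kelly", "Lana", "Mary", "Nana", "Olivia",
    "Paul",
]

context_lines = list(
    difflib.context_diff(
        employee_list1,
        employee_list2,
        fromfile="employee_list1",  # label only; no file is read
        tofile="employee_list2",    # label only; no file is read
        n=1,                        # one line of context around each change
        lineterm="",                # the names carry no trailing newlines
    )
)

print("\n".join(context_lines))
```

Changing n to a larger value shows more unchanged names around each difference, which is the same trade-off the context option controls in diff.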
Breaking it down
The output is actually quite simple. Let’s look at the first chunk:
*** 1,4 ****
! Anna
  Bob
! Chris
  David
--- 1,5 ----
! Angie
  Bob
! Betty
! Chenica
  David
To illustrate what this means, let’s compare the original lines of each file with what the output reports for them, starting with the “From File”:
From File
Anna
Bob
Chris
David
From File Output
! Anna
  Bob
! Chris
  David
What the above shows is that the “From File” lines “Anna” and “Chris” are changed in the “To File” (in our case, other names have taken their place). Now let’s look at the “To File”.
To File
Angie
Bob
Betty
Chenica
David
To File Output
! Angie
  Bob
! Betty
! Chenica
  David
This tells us that “Angie”, “Betty”, and “Chenica” are the “To File” lines that take their place; none of these names appears in the “From File”.
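The same breakdown can be done mechanically. The sketch below is a hypothetical helper (the name changed_lines is an assumption, and it handles a single section only) that walks through the first chunk and collects the marked lines on each side. It also accepts the - and + markers, which the context format uses for lines that were only deleted or only added; they do not occur in this example.

```python
first_chunk = """\
*** 1,4 ****
! Anna
  Bob
! Chris
  David
--- 1,5 ----
! Angie
  Bob
! Betty
! Chenica
  David
"""


def changed_lines(chunk_text):
    """Collect the marked lines of one context chunk, per side (hypothetical helper)."""
    sides = {"from": [], "to": []}
    side = None
    for line in chunk_text.splitlines():
        if line.startswith("*** "):
            side = "from"  # "*** 1,4 ****" opens the From File half
        elif line.startswith("--- "):
            side = "to"    # "--- 1,5 ----" opens the To File half
        elif side is not None and line[:2] in ("! ", "- ", "+ "):
            sides[side].append(line[2:])  # drop the two-character marker
    return sides


print(changed_lines(first_chunk))
```

For the first chunk this reports Anna and Chris on the “From File” side and Angie, Betty, and Chenica on the “To File” side, matching the comparison above.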
Final Thoughts
The diff command is incredibly useful. There are many situations in which files must be compared: kernel patches, employee files, word processing files… the possibilities are limited only by your imagination. And diff doesn’t need to be confusing. With the help of contextual output, the results of running diff on two files become much simpler to read. This article should have given you a good foundation for understanding a Linux command that is often thought of as far too complex for the new user.
This post is part of the series: Linux configuring and programming tools
Linux can be a bit overwhelming for a new user. Fortunately, there are plenty of tools to make things easier. This Bright Hub series on Linux commands introduces various tools that aid in configuration and programming.