Basic Linux Commands: The diff command

What diff does

The diff command compares two files line by line. Its output shows the differences between them and describes how to change the first (original) file so that it matches the second (new) file. Many people find that output confusing at first glance. Here’s an example. You have two lists of employees (first names only). The first list looks like this:

Anna

Bob

Chris

David

Evan

Fred

George

Hanna

Isobelle

Jack

Kelly

Lana

Misty

Nana

Olivia

Paul

Here is the second list:

Angie

Bob

Betty

Chenica

David

Evan

Fred

George

Hanna

Isobelle

Jackie

Kelly

Lana

Mary

Nana

Olivia

Paul

To find the differences between the files without scanning and comparing them manually, run the command diff employee_list1 employee_list2.

The output of that command would look like:

1c1

< Anna

---

> Angie

3c3,4

< Chris

---

> Betty

> Chenica

10c11

< Jack

---

> Jackie

13c14

< Misty

---

> Mary

For many readers, that output is hard to decipher. It doesn’t have to be that way. Let’s look at how to make it easier to understand.
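
Before moving on, it helps to know how to read the lines such as 3c3,4 in the output above. Each one is a change command: a line range in the first file, a letter (a for add, c for change, d for delete), and a line range in the second file. Lines starting with < come from the first file and lines starting with > come from the second. The following is a minimal Python sketch that decodes these commands; the function name describe and the wording of its result are illustrative assumptions, not part of diff itself.

```python
import re

# Normal-format change commands look like "3c3,4", "5a6", or "7,9d6".
# The left range refers to the first file, the right range to the second.
COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

ACTIONS = {"a": "add", "c": "change", "d": "delete"}


def describe(command):
    """Turn a change command such as '3c3,4' into a plain-English phrase."""
    match = COMMAND.match(command)
    if match is None:
        raise ValueError(f"not a normal-format change command: {command!r}")
    from_start, from_end, action, to_start, to_end = match.groups()
    from_range = from_start if from_end is None else f"{from_start}-{from_end}"
    to_range = to_start if to_end is None else f"{to_start}-{to_end}"
    return (f"{ACTIONS[action]}: line(s) {from_range} of the first file, "
            f"line(s) {to_range} of the second file")


# Decode the four commands from the employee list example.
for cmd in ["1c1", "3c3,4", "10c11", "13c14"]:
    print(cmd, "->", describe(cmd))
```

Read this way, 3c3,4 says that line 3 of the first file (Chris) was changed into lines 3 through 4 of the second file (Betty and Chenica).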

Contextual output

Instead of using the default output format, let’s look at what contextual output (diff’s “context format”) gives us. To get contextual output with one line of context around each change, issue the diff command like so:

diff --context=1 employee_list1 employee_list2

Contextual output begins with a two-line header that will look like this:

*** from_file from_file_modification_time

--- to_file to_file_modification_time

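After the header comes the meat of the output: one or more chunks (diff calls them hunks). Each chunk shows a range of lines from each file, with the lines that differ marked, which tells you what to change to make the files match. A chunk looks like this: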

***************

*** from_file_range ****

from_file_line

from_file_line

--- to_file_range ----

to_file_line

to_file_line

Now let’s look at the actual output from our two employee files.

*** employee_list1 2008-11-20 13:23:26.000000000 -0500

--- employee_list2 2008-11-20 13:24:12.000000000 -0500

***************

*** 1,4 ****

! Anna

Bob

! Chris

David

--- 1,5 ----

! Angie

Bob

! Betty

! Chenica

David

***************

*** 9,14 ****

Isobelle

! Jack

Kelly

Lana

! Misty

Nana

--- 10,15 ----

Isobelle

! Jackie

Kelly

Lana

! Mary

Nana
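
If you want to experiment with chunk boundaries without creating files, the same style of output can be approximated with Python’s standard difflib module. This is a sketch using difflib.context_diff, not the diff command itself, so details such as the header (no timestamps here) differ; the variable names are assumptions chosen for this example.

```python
import difflib

# The two employee lists from the example, one name per element.
list1 = ["Anna", "Bob", "Chris", "David", "Evan", "Fred", "George", "Hanna",
         "Isobelle", "Jack", "Kelly", "Lana", "Misty", "Nana", "Olivia", "Paul"]
list2 = ["Angie", "Bob", "Betty", "Chenica", "David", "Evan", "Fred", "George",
         "Hanna", "Isobelle", "Jackie", "Kelly", "Lana", "Mary", "Nana",
         "Olivia", "Paul"]

# n=1 asks for one line of context, like --context=1 on the command line.
# lineterm="" is used because the input lines carry no trailing newlines.
context_lines = list(difflib.context_diff(
    list1, list2,
    fromfile="employee_list1", tofile="employee_list2",
    n=1, lineterm=""))

print("\n".join(context_lines))
```

With one line of context, the changes at lines 1 and 3 sit close enough together to share a single chunk, as do the changes at lines 10 and 13, which is why the output has two chunks rather than four.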

Breaking it down

The output is actually quite simple. Let’s look at the first chunk, which covers lines 1 through 4 of the first file and lines 1 through 5 of the second:

*** 1,4 ****

! Anna

Bob

! Chris

David

--- 1,5 ----

! Angie

Bob

! Betty

! Chenica

David

To illustrate what this means, let’s compare the original lines from each file with how they appear in the output:

From File

Anna

Bob

Chris

David

From File Output

! Anna

Bob

! Chris

David

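This shows that the lines marked with “!”, “Anna” and “Chris”, have changed in the “To File” (in our case, they are missing from it). The unmarked lines, “Bob” and “David”, are context: they are the same in both files. Now let’s look at the “To File”.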

To File

Angie

Bob

Betty

Chenica

David

To File Output

! Angie

Bob

! Betty

! Chenica

David

This tells us that “Angie”, “Betty”, and “Chenica” have changed (in our case, they are missing) in the “From File”.
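
The breakdown above can also be sketched in code. The following Python snippet walks through one chunk of contextual output and collects the lines marked with “!” for each file. The function name marked_lines and the dictionary keys are assumptions made for this illustration, and the sketch only handles “!” lines, not the “+” and “-” markers diff uses for pure additions and deletions.

```python
def marked_lines(chunk):
    """Collect the '!' lines of one context-format chunk, per file."""
    sides = {"from": [], "to": []}
    side = None
    for line in chunk.splitlines():
        if line.startswith("*** ") and line.endswith(" ****"):
            side = "from"   # range line for the first file
        elif line.startswith("--- ") and line.endswith(" ----"):
            side = "to"     # range line for the second file
        elif side is not None and line.startswith("! "):
            sides[side].append(line[2:])  # drop the "! " marker
    return sides


# The first chunk from the employee list example.
first_chunk = """\
*** 1,4 ****
! Anna
  Bob
! Chris
  David
--- 1,5 ----
! Angie
  Bob
! Betty
! Chenica
  David
"""

print(marked_lines(first_chunk))
```

For the first chunk this yields Anna and Chris on the “from” side and Angie, Betty, and Chenica on the “to” side, matching the two tables above.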

Final Thoughts

The diff command is incredibly useful. File differences need to be compared in many situations, from kernel patches to employee files to word processing files; the possibilities are limited only by your imagination. And diff doesn’t need to be confusing. With the help of contextual output, the results of running diff on two files become much easier to understand. This article should give you a good foundation for understanding a Linux command that is often thought of as far too complex for the new user.
