Linux: Working with awk and sed

By Emmett Dulaney

This Linux example involves a database of books that includes the ISBN number of each title. In the old days, ISBN numbers were ten digits and included an identifier for the publisher and a unique number for each book. ISBN numbers are now thirteen digits for new books.

Old books (those published before January 1, 2007) have both the old 10-digit number and a new 13-digit number that can be used to identify them. For this example, the existing 10-digit number will stay in the database, and a new field holding the ISBN-13 number will be added to the end of each entry.

To come up with the ISBN-13 number for the existing entries in the database, you start with 978, then use the first 9 digits of the old ISBN number (the tenth digit is the old check digit and is dropped). The thirteenth digit is a new check digit, calculated from those 12 digits by doing the following:

  1. Add all odd-placed digits (the first, the third, the fifth, and so on).

  2. Multiply each even-placed digit by 3 and add the results.

  3. Add the total of Step #2 to the total of Step #1.

  4. Find out what you need to add to round the number up to the nearest 10. This value becomes the thirteenth digit. (If the total is already a multiple of 10, the digit is 0.)

For example, consider the 10-digit ISBN 0743477103. It first becomes 978074347710, and then the steps work out like this:

  1. 9+8+7+3+7+1=35

  2. 7*3=21; 0*3=0; 4*3=12; 4*3=12; 7*3=21; 0*3=0; 21+0+12+12+21+0=66

  3. 66+35=101

  4. 110-101=9. The ISBN-13 thus becomes 9780743477109.
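The four steps can be sketched as a small shell function wrapped around awk. This is only an illustration of the arithmetic, not the solution this article builds; the function name isbn10_to_13 and the variable names are made up for the sketch, and it assumes the input is a well-formed 10-digit ISBN.

```shell
# Hypothetical helper: convert a 10-digit ISBN to its 13-digit form.
isbn10_to_13() {
  printf '%s\n' "$1" | awk '{
    base = "978" substr($0, 1, 9)       # prefix plus the first nine digits
    odd = 0; even = 0
    for (i = 1; i <= 12; i++) {
      d = substr(base, i, 1) + 0        # the digit in position i
      if (i % 2 == 1) odd += d          # step 1: add odd-placed digits
      else            even += d * 3     # step 2: even-placed digits times 3
    }
    total = odd + even                  # step 3
    check = (10 - total % 10) % 10      # step 4; a total already on a 10 gives 0
    print base check
  }'
}

isbn10_to_13 0743477103    # prints 9780743477109
```

Running the function on the Macbeth ISBN reproduces the worked example: the totals are 35 and 66, the sum is 101, and the check digit is 9.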

The beginning database resembles the following:

0743477103:Macbeth:Shakespeare, William
1578518520:The Innovator's Solution:Christensen, Clayton M.
0321349946:(SCTS) Symantec Certified Technical Specialist:Alston, Nik
1587052415:Cisco Network Admission Control, Volume I:Helfrich, Denise
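Each record uses a colon to separate its fields, so awk can split it with the -F: option. The following is a minimal sketch for inspecting that structure; the function name show_fields is made up for the illustration, and it assumes no title or author contains a colon of its own.

```shell
# Hypothetical helper: print the field count and the ISBN (field 1) of each record.
show_fields() {
  awk -F: '{ print NF, $1 }'
}

# Two sample records copied from the listing above.
show_fields <<'EOF'
0743477103:Macbeth:Shakespeare, William
1578518520:The Innovator's Solution:Christensen, Clayton M.
EOF
```

Each line of output starts with 3, which shows that the commas inside the author names do not affect the split and that a newly added field would become field 4.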

You want to change the database so that each line ends with the new number, like this:

0743477103:Macbeth:Shakespeare, William:9780743477109

The example that follows accomplishes this goal. It’s not the prettiest thing ever written, but it walks through the process of tackling this problem, illustrating the use of awk and sed.