Learning by example: how bad code propagates

Studying code is one of the best ways to learn about programming. Experienced programmers use published algorithms and sample code; writing everything from scratch is foolish. Less experienced programmers often start with code they find in a book or on a web site and tinker with it to learn and solve their particular problem. Casual programmers cobble things together from code samples they don’t understand. It’s easy to find amateurish Visual Basic or PHP or Java code to make fun of.

Example code may be hastily written to illustrate a point in the documentation, or to show off a hack, but what an experienced programmer recognizes as a throwaway snippet may look like a valuable coding gem to someone who can’t judge code quality. Bad sample code can take on an unexpected life of its own and propagate from one amateur programmer to another.

I got a copy of Ajax Hacks at OSCON last year. The first example I opened the book to — validating credit card numbers on web forms — was at first sight so bad that I was tempted to submit errata to O’Reilly. The JavaScript code in that example perfectly illustrates how bad code encourages inexperienced programmers to get something to work by pasting in found code without understanding the problem or how the code works.

Web sites that accept credit cards usually check the format and type of the credit card and make sure the card number passes the Luhn checksum algorithm before sending the transaction to the payment gateway. I’ve seen lots of code for checking credit card number formats and applying the Luhn algorithm. The code in Ajax Hacks is typical:

/* Check whether the credit-card entry is null,
   not lengthy enough, or contains any letters.
   Remove any dashes or spaces from the entry,
   then run the Luhn algorithm on the resulting
   number.
*/
function clientsideVerify(ccVal){
    if(ccVal == null || ccVal.length < 13 ||
       ccVal.search(/[a-zA-Z]+/) != -1){
           return false;
    }

    ccVal=remDashSpace(ccVal);
    return (applyLuhn(ccVal) % 10) == 0;
}

function remDashSpace(_number){
    //remove dashes or spaces
    _number = _number.replace(/-/g,"");
    _number = _number.replace(/ /g,"");
    return _number;
}

//http://en.wikipedia.org/wiki/Luhn_formulafunction
applyLuhn(cc){
    //reverse the String
    var rev = reverse(cc);
    //get array of character Strings
    var revArr = rev.split("");
    var total = 0;
    var tmp = 0;
    //add up the numbers
    for(var i = 0; i < revArr.length; i++){
        if((i % 2) > 0){
            tmp = revArr[i]*2;
            tmp= (tmp < 9 ? tmp : (tmp - 9) );
            total += tmp;
        }
       else {
            total += Number(revArr[i]);
       }
    }//end for
    //alert(total);
    return total;
}

...

function reverse(str){
    var sArray = str.split("");
    var newS="";
    for(var i = sArray.length-1; i >= 0; i--){
        newS += sArray[i];
    }
    return newS;
}

At first glance I wondered, why not use Array.reverse() after converting the credit card number from a string to an array? The more I looked at the code the more I realized that the author had implemented the Luhn algorithm more or less verbatim from the Wikipedia article referenced in the comments. There are other problems: the functions to check for non-digits and remove spaces and dashes from the credit card number were not robust (a comma or $ would not be stripped) and demonstrate a poor grasp of regular expressions.

A JavaScript loop can move forward or backward through a string or array, so there’s no reason to actually reverse the string as the Luhn algorithm describes. Doubling and casting out nines for the digits one through nine can be done by looking up constants rather than doing the calculation each time, inside a loop, and I think it’s always a good idea to remove as much code from a loop as you can. A credit card number is either valid or invalid according to the Luhn algorithm, so the function to check a card number should return true or false, not a total that must then be checked somewhere — why make the calling function finish the Luhn check by doing a modulo operation on the total?

Here’s my version of the Luhn algorithm in Javascript:

function ccLuhn(cc)
{
    // digits 0-9 doubled with nines cast out
    var doubled = [0, 2, 4, 6, 8, 1, 3, 5, 7, 9];

    // remove non-digit characters
    cc = cc.replace(/[^\d]/g, '');
    var digits = cc.split('');

    // alternate between summing the digits
    // or the result of doubling the digits and
    // casting out nines (see Luhn description)
    var alt = false;
    var total = 0;
    while (digits.length)
    {
        var d = Number(digits.pop());
        total += (alt ? doubled[d] : d);
        alt = !alt;
    }
    return total % 10 == 0;
}

Users should be allowed to use spaces or dashes when entering a credit card number so my function just strips out any non-numeric characters before calculating the checksum.

The length and type of the credit card number should be checked, too. Some merchants only accept Visa, MasterCard, and Discover. The type of a credit card can be determined from the number, so there’s no reason to ask the user to choose the card type from a menu. Here’s some JavaScript code to determine the card type from the card number. This code also checks the length of the card number.

function ccType(cc)
{
    // regular expressions to match common card types
    // delete or comment out cards not accepted
    // see: www.merriampark.com/anatomycc.htm
    var cardpatterns = {
        'visa'       : /^(4\d{12})|(4\d{15})$/,
        'mastercard' : /^5[1-5]\d{14}$/,
        'discover'   : /^6011\d{12}$/,
        'amex'       : /^3[47]\d{13}$/,
        'diners'     : /^(30[0-5]\d{11})|(3[68]\d{12})$/
    };

    // return type of credit card
    // or 'unknown' if no match
    // remove non-digits
    cc = cc.replace(/[^\d]/g, '');

    for (var type in cardpatterns)
    {
        if (cardpatterns[type].test(cc))
            return type;
    }
    return 'unknown';
}

If a merchant doesn’t accept Diners or American Express those regular expressions can be commented out or deleted from cardpatterns.To check a credit card number pass it to ccType. If the card type is not “unknown” pass the card number to ccLuhn, and if that function returns true the card number is at least valid-looking.

The Ajax Hacks book has other examples that should have been more carefully reviewed and edited. The code samples aren’t even presented in a consistent coding style, which says something about how technical books are not always edited well enough to justify their price. Maybe the code was squeezed to save space on the printed page, but I would prefer better-formatted code in a smaller typeface. The downloadable code from the O’Reilly web site is formatted just as badly as in the book, too.

When writing code examples and snippets take care to write good examples. Experienced programmers will get the gist from throwaway code, but less-experienced programmers may just copy and paste the example without understanding it. Because there are many more people copying and pasting example code than professional programmers writing good code, the bad will tend to push out the good. That’s how things like FormMail.pl spread.

One thought on “Learning by example: how bad code propagates

  1. Pingback: Twitted by imogenezg

Comments are closed.