The 10 Most Common Beginner Programming Mistakes

By Stephen R. Davis

As a beginning C++ programmer, you may never recognize mistakes that you make. All you notice is that it takes a lot longer to write and debug your programs. Even then, once the programs are deployed they still seem to have pesky errors that should have been found during testing.

To save time and trouble, look for the ten most common mistakes that beginning programmers make.

Not following a consistent writing style

Humans have a very limited amount of computing power to work with (the computing power between their ears). They need to maximize its effectiveness when taking on an admittedly difficult task such as programming.

Programmers are not using their carbon computers to maximum effect when they have to wade through programs that

  • Don’t have consistent indentation

  • Don’t use a clear convention for naming things

  • Don’t provide meaningful names for things

  • Don’t have meaningful but concise comments

Once you’ve adopted a clear programming style, it starts to feel as natural as those work jeans you pull on when you’re tackling weekend work around the house. You don’t have to think about it — you know where things are. You know, for example, that if a name is capitalized, then it’s probably the name of a class. If it’s all caps, then it’s a constant of some type. If it’s indented, then it’s within a loop or an if statement. Such conventions allow you to spend more of your precious brain power thinking about the problem you’re trying to solve — and less about the coding details.
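A short sketch of what such conventions might look like in practice follows. The names (Student, MAX_STUDENTS, classIsFull, countHonorStudents) are invented for illustration, and this is only one of many reasonable styles:

```cpp
#include <cstddef>
#include <vector>

// All caps: a constant.
const std::size_t MAX_STUDENTS = 30;

// Capitalized: a class (here, a simple struct).
struct Student
{
    int id;
    double gpa;
};

// Lowercase first letter: a function or a variable.
bool classIsFull(const std::vector<Student>& students)
{
    return students.size() >= MAX_STUDENTS;
}

// Counts the students whose GPA is at or above the cutoff.
std::size_t countHonorStudents(const std::vector<Student>& students, double cutoff)
{
    std::size_t count = 0;
    for (const Student& student : students)
    {
        // Indented again: inside a loop, then inside an if statement.
        if (student.gpa >= cutoff)
        {
            ++count;
        }
    }
    return count;
}
```

Which convention you pick matters less than applying it everywhere, so that a glance at a name or an indent level tells you what you are looking at.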

Writing functions that are too big

People don’t start out with the intent of writing huge functions. They just start writing. This function needs to do this, and then that, and, oh yeah, this other thing over here. Pretty soon you’re up to 500 lines of code or more — and it’s difficult to figure out exactly what-all the function does.

Big functions are hard to debug and maintain for several reasons:

  • It’s hard to understand exactly what they’re trying to do.

  • There are too many interactions.

  • There are too many paths through the code.

A function is too big if it violates any of the following rules:

  • It should be no more than 50 lines in length.

  • It should be explainable in one sentence that doesn’t contain AND or OR.

  • It should contain no more than eight total if statements, switch statements, or looping constructs.

Of course, the 50-line rule is arbitrary; it became popular because that’s what would fit on a single page of computer printout paper. But it’s still about the right size for an upper bound. By the time you exceed that number, you’re getting into “When is this function going to end?” territory.

So what do you do if you find yourself exceeding these limits? You factor the existing function into a number of subfunctions by asking yourself, “What does this function do?” Every time you see an AND or an OR, it’s time to think function. Consider the following example:

“My function gets the name of a file from the keyboard, then opens the file and reads student objects until it gets to the end and then averages their GPAs and displays the results.”

This description suggests the following functions:

  • fstream getFileObject() returns the handle of a file indicated by the user.

  • Student* readStudents(fstream&) reads Student objects from a file.

  • double averageStudentGPAs(Student*) averages the GPAs for a collection of students.

  • void displayGPA(ostream& out, double gpa) displays the student GPA to the output stream.

Each of these functions is easy to understand and could be written in far less than 50 lines. The original function does little more than call these subfunctions to do all the work.
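The factoring described above could be sketched roughly as follows. Several details are assumptions rather than anything the description specifies: the Student fields, the "name GPA" input format, the use of std::vector<Student> in place of Student*, and a generic std::istream& in place of fstream& so that the code can be tried without a real file. The wrapper name reportAverageGPA is invented, and getFileObject() is left out because it depends on how the user is prompted.

```cpp
#include <sstream>   // stream types; std::istringstream can stand in for a file
#include <string>
#include <vector>

// Hypothetical record; the description above does not define Student.
struct Student
{
    std::string name;
    double gpa;
};

// Reads Student objects until extraction fails (normally at end of input).
std::vector<Student> readStudents(std::istream& in)
{
    std::vector<Student> students;
    Student s;
    while (in >> s.name >> s.gpa)
    {
        students.push_back(s);
    }
    return students;
}

// Averages the GPAs; returns 0.0 for an empty collection.
double averageStudentGPAs(const std::vector<Student>& students)
{
    if (students.empty())
    {
        return 0.0;
    }
    double total = 0.0;
    for (const Student& s : students)
    {
        total += s.gpa;
    }
    return total / students.size();
}

// Displays the GPA on the given output stream.
void displayGPA(std::ostream& out, double gpa)
{
    out << "Average GPA: " << gpa << '\n';
}

// The original function now does little more than call the subfunctions.
void reportAverageGPA(std::istream& in, std::ostream& out)
{
    displayGPA(out, averageStudentGPAs(readStudents(in)));
}
```

Each piece can now be tested on its own. For example, readStudents() can be fed a std::istringstream instead of a file.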

Writing code without a plan

Given a problem, beginning programmers are far too quick to start coding without a plan. A plan includes a whole raft of things that the experienced programmer comes to take for granted:

  • Specify the requirements: The programmer needs to understand what the program needs to do.

    This may sound obvious, but in the heat of battle, it’s easy to think about just one aspect of the problem without considering the complete problem. You need to document your program to impart some understanding to your user of what the program is supposed to do. Don’t forget to include the edge cases. This is your opportunity to determine the scope of the program — often the programmer is focused on solving a particular problem while users assume that the program will solve all problems.

  • Design the program: The programmer then needs to sit back and consider, on a very high level, how the program should work.

    This is difficult for beginners because they have no experience to fall back on. The programmer must decide how the program will work internally — including how the database tables (if any) will be laid out.

  • Design the interface: The programmer needs to decide what the user interface will look like and how it will work.

    • Will the program have its own interface or will it be accessed through a browser?

    • What will the user interface look like?

    • How will the user navigate from one window to another?

  • Design the test: Beginners are surprised to learn that the beginning of the project is the best time to think about testing. Determining how you test your program will affect how you lay it out.

Learning a programming language is just the first step toward learning to program.

Global variables

Beginning programmers tend to declare all variables globally. “Why worry about all of this scope nonsense? Why not just declare the variable globally so I can use it when I want to?” Well, problems occur when the variable has a value that you don’t expect.

Did the variable not get initialized — either because you forgot or because the logic flow didn’t go through the initialization code? Or did the variable get initialized incorrectly? Or did the variable get initialized correctly but then reset by some other function to an unexpected value? There’s really no way to tell — unless you execute the program one section at a time while keeping an eagle eye on the variable.

There’s nothing more frustrating than finding that the problem you’ve been chasing all day goes back to a global variable, changing values unexpectedly, after a call to some function that has nothing to do with that variable.
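A minimal sketch of the difference, using invented names (gTotal, addScoreGlobal, printReportHeader, addScore):

```cpp
// Global version: any function can change gTotal behind your back.
int gTotal = 0;

void addScoreGlobal(int score)
{
    gTotal += score;
}

// Looks unrelated to scoring, yet silently resets the running total.
void printReportHeader()
{
    gTotal = 0;   // the kind of side effect that costs a day of debugging
}

// Scoped version: the total is passed in and returned, so every
// change to it is visible at the call site.
int addScore(int total, int score)
{
    return total + score;
}
```

With the scoped version, a function that has nothing to do with the total has no way to reach it, so the list of suspects when the value goes wrong is limited to the code that was explicitly handed the variable.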

Not validating user input

A program must be very careful whenever it accepts external input. First, the program must make sure that user input doesn’t overflow some internal buffer. Second, if the program uses user input to build queries, it must make sure that said input doesn’t contain controls that can leak into the query. Failure to do these basic checks can make your program hackable and a danger to society.

Beyond that risk, however, you need to make sure that what you’re reading is actually what you expect. This is normally done with markers. For example, the following format reads in scores on a student test:

1234 35 2345 37 3456 29 5678 31

The first number is the student ID. The second value is the corresponding score.

The problem is that this format provides very little positional information that the program could use to detect when it is out of sync, let alone to get back into sync. Consider what happens if the input file has even the smallest error:

1234 3 5 2345 37 3456 29 5678 31

Here an extra space was inserted between the ‘3’ and the ‘5’. So now student 1234 will be assigned the score 3 rather than 35. But what about poor student 0005? He now gets a score of 2345, and so on down the line.

The situation is improved if each input is placed on a separate line:

1234 3 5
2345 37
3456 29
5678 31

Now it’s possible to detect an error in the first row. Even if that error went undetected, a well-written program would use the newlines to resync so that only the value for 1234 would be stored incorrectly.
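One way to use the newlines as markers is to read a whole line at a time and then parse each line separately. The sketch below assumes exactly two integers per line; the function name readScores and the choice to reject a malformed line outright (rather than store a partial value) are illustrative decisions, not the only reasonable ones.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Reads one "ID score" record per line. Malformed lines are counted and
// skipped, so a bad line cannot push the following records out of sync.
// Returns the number of rejected lines.
int readScores(std::istream& in, std::map<int, int>& scores)
{
    int rejected = 0;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        int id = 0;
        int score = 0;
        std::string extra;
        // A valid line has exactly two integers and nothing after them.
        if ((fields >> id >> score) && !(fields >> extra))
        {
            scores[id] = score;
        }
        else
        {
            ++rejected;
        }
    }
    return rejected;
}
```

Given the four-line input above, the first line is rejected because of the stray third field, and the remaining three records are stored under the correct IDs.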

Using a stream object without checking for fail

This mistake is easy to make and hard to notice until it bites you.

It’s very easy to write something like the following:

int value;
while(!input.eof())
{
    input >> value;        // may fail and leave the stream in the fail state
    processValue(value);   // runs even if the read above failed
}

This loop is supposed to read values from an input stream and process them until it encounters the end of file. The problem with this loop is that it works fine most of the time. If the program encounters a non-integer in the file, however, the program turns into an infinite loop. This is because once the extractor encounters something it doesn’t understand (say, a character where a number should be), it sets the fail flag in the input object. From that point on, the extractor stubbornly refuses to perform any input, so the stream never reaches the end of file and the loop never exits.

Worse than the refusal to perform I/O when the fail flag is set is the fact that the stream functions don’t complain about it. The program assumes that whatever happens to be in value was just read from the file when, in fact, nothing was read: the value is either left over from a previous read or, since C++11, the zero that the failed extraction stored there.

The following loop is far preferable:

int value;
while(!input.eof())
{
    input >> value;
    // stop on a failed read, whether caused by end of file or by bad data
    if (input.fail())
    {
        break;
    }
    processValue(value);
}

Now the loop exits if the program either reaches End of File or encounters a value that it doesn’t understand.
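A more compact form of the same check, common in C++ code, uses the extraction itself as the loop condition: the stream converts to false once a read fails. The sketch below collects the values in a vector instead of calling processValue(), purely so that it is self-contained; readValues and stoppedOnBadData are invented names.

```cpp
#include <sstream>   // stream types; std::istringstream is handy for trying this out
#include <vector>

// Reads integers until end of file or the first value that is not an integer.
// Because the extraction is the loop condition, a failed read never
// reaches the loop body.
std::vector<int> readValues(std::istream& input)
{
    std::vector<int> values;
    int value;
    while (input >> value)
    {
        values.push_back(value);
    }
    return values;
}

// A rough check that distinguishes a clean end of file from a stop
// caused by bad data in the middle of the input.
bool stoppedOnBadData(const std::istream& input)
{
    return input.fail() && !input.eof();
}
```

After the loop, checking why it stopped lets the program report a corrupt file instead of quietly processing only part of it.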

Mishandling an exception

The exception mechanism is a great tool for handling errors. Like any tool, however, exceptions can be misapplied. The most common error is to catch exceptions that you never intended to. The following code snippet demonstrates the principle:

// delete the file
try
{
    deleteFile(filename);
}
// ignore the error if the file is not present
catch(...)
{
}

The programmer knows that the deleteFile() function throws a FileNotFoundException if the file to be deleted is not present. Rather than catch that exception, however, she catches everything. Most of the time, the exception probably does mean that the file is not there, but it could just as well be completely unrelated. Because the code catches and ignores every exception, the user is never told that the file was not deleted.

The proper code would appear as follows:

// delete the file
try
{
    deleteFile(filename);
}
// ignore the error if the file is not present
catch(const FileNotFoundException&)
{
}

Note: The seeds of this error were laid in the decision to throw an exception if the file to be deleted is not found. One could argue that this is not an exceptional situation but a part of normal processing — and should have resulted in a simple error return.
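To make the difference concrete, here is a self-contained sketch. FileNotFoundException is not a standard C++ type, so the sketch defines a hypothetical one, and deleteFile() is a stand-in that only simulates the two kinds of failure; deleteIfPresent is an invented wrapper name.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type; standard C++ does not define one by this name.
class FileNotFoundException : public std::runtime_error
{
public:
    explicit FileNotFoundException(const std::string& filename)
        : std::runtime_error("file not found: " + filename) {}
};

// Stand-in for deleteFile(): throws FileNotFoundException for a missing
// file and a different exception for any other failure.
void deleteFile(const std::string& filename)
{
    if (filename.empty())
    {
        throw std::invalid_argument("empty filename");
    }
    if (filename == "missing.txt")
    {
        throw FileNotFoundException(filename);
    }
    // A real implementation would remove the file here.
}

// Only the expected exception is swallowed; anything else propagates
// to the caller, who can report it.
void deleteIfPresent(const std::string& filename)
{
    try
    {
        deleteFile(filename);
    }
    catch (const FileNotFoundException&)
    {
        // Nothing to delete, which is fine for this caller.
    }
}
```

Calling deleteIfPresent("missing.txt") returns quietly, while deleteIfPresent("") lets the std::invalid_argument escape instead of hiding it.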

Failing to maintain a program log

A production system needs to record what it’s doing. This is especially true for Internet-accessible systems. Sometimes, “it didn’t work” is about all the debug information that a programmer gets. How in the world can a programmer tell what’s happened? By referring to the logs.

Production systems keep a constant record of what they’re doing and who asked for it to be done. By getting the person’s ID and the approximate time that the failed request was made, a programmer can go back into the logs and find the request. Ideally, the log will tell the programmer the request, whether it worked or not, and if not, why not.

Most of the time, a programmer gets to tell folks what they messed up in their requests. But if they really did stumble into a problem with the program, the log should give the programmer enough information to at least re-create the error in the lab, where the programmer can study it offline, find the problem, and push a fix back out to production.
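As a rough illustration of the kind of record described above, the following sketch writes one line per request. The function name logRequest, the field layout, and the use of UTC timestamps are all assumptions; a production system would more likely use an established logging library.

```cpp
#include <ctime>
#include <sstream>   // stream types; std::ostringstream is handy for trying this out
#include <string>

// Writes one log line: timestamp, who asked, what was asked, and the outcome.
// An empty error string means the request succeeded.
void logRequest(std::ostream& log, std::time_t when, const std::string& userId,
                const std::string& request, const std::string& error)
{
    char stamp[32];
    std::strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", std::gmtime(&when));
    log << stamp << " user=" << userId << " request=" << request;
    if (error.empty())
    {
        log << " status=ok";
    }
    else
    {
        log << " status=failed reason=" << error;
    }
    log << '\n';
}
```

With the user ID and an approximate time in hand, a programmer can search such a log for the matching line and see both the request and the reason it failed.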

Not using a debugger

A debugger gives you the opportunity to step slowly through your code to better understand exactly what it’s doing. This step is critical for a program that’s not working, but it’s just as important for a program that appears to be working fine.

I don’t know the number of times that I’ve stepped through a function that is generating the proper results only to realize that the function is not working the way I want it to. Either it doesn’t handle all of the possible legal input values or — more likely — it doesn’t detect the possible invalid input values.

The debugger gives you more information to work with when you need a more detailed understanding of what your program is doing.

Not backing up your work

“My disk crashed and I’ve lost two weeks of work!” There’s really no excuse for this old-fashioned disaster, although it may take a long time to recover from a disk crash. You may have to reformat a disk, rebuild the operating system, and who knows what else — but recovering the source code you’ve written should be no more difficult than copying files from one system to another.

Inexpensive systems exist that automatically back up the entire disk on a nightly basis. Other systems back up your files over the Internet, almost continuously. These systems are not expensive — especially in comparison to programmer time.

One other thing: Nightly backups can be kept in the same room as the original disk — but at least once per week, backups should be moved to a different physical location to guard against loss due to fire or water damage.

If your development facility has Internet access, you can maintain a copy of your source code and all the documentation in the cloud through a commercial service like Dropbox.