How to Avoid Code Injection in C++ - dummies

By Stephen R. Davis

The first rule of avoiding code injection in C++ programs is never, ever, allow user input to be processed by a general-purpose language interpreter. The common error behind SQL injection is that the program treats user input as if it were always acceptable and inserts it directly into an SQL query, which it then ships off to the database engine for processing.
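To make the danger concrete, here is a minimal sketch of the mistake. The function name buildQueryUnsafely, the orders table, and the order_date column are hypothetical names chosen for illustration; nothing here talks to a real database, it only shows the text that would be sent.

```cpp
#include <string>

// Hypothetical helper that shows the mistake: user input is pasted
// straight into the SQL text with no checking at all.
// The table and column names are assumptions made for this sketch.
std::string buildQueryUnsafely(const std::string& startDate)
{
    return "SELECT * FROM orders WHERE order_date >= '" + startDate + "'";
}
```

Passing 2024/01/01 produces the query the programmer intended. Passing 2024/01/01' OR '1'='1 produces a query ending in order_date >= '2024/01/01' OR '1'='1', so the part typed by the user is now interpreted as SQL rather than as a date.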

As an example, a program that asks the user to type in a date and accepts whatever text comes back could be hacked. The safest and most user-friendly approach would be to provide the user a calendar graphic from which to select the start and end dates. The program would then create a date based on what the user clicked, so no free-form text ever reaches the query.

If this is not possible, then the program should carefully check the input to make sure that it is in the proper format for a date, in this case yyyy/mm/dd: four digits followed by a slash, then two digits, another slash, and finally two more digits. Nothing else should be considered acceptable input.
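One way such a check might look is sketched below. The function name looksLikeDate is a hypothetical name for this example, and the check covers the format only; it does not verify that the month or day is in a sensible range.

```cpp
#include <string>

// Hypothetical helper: returns true only if 'input' is exactly
// four digits, a slash, two digits, a slash, and two digits.
// It checks the shape of the text, not whether the date exists.
bool looksLikeDate(const std::string& input)
{
    if (input.size() != 10)
        return false;

    for (std::string::size_type i = 0; i < input.size(); ++i)
    {
        if (i == 4 || i == 7)
        {
            // positions 4 and 7 must hold the slashes
            if (input[i] != '/')
                return false;
        }
        else if (input[i] < '0' || input[i] > '9')
        {
            // every other position must hold a digit
            return false;
        }
    }
    return true;
}
```

With this sketch, 2024/01/31 is accepted, while 2024-01-31, 2024/1/31, and anything with extra text tacked on the end are rejected because the length or a character position does not match.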

Sometimes you can't be that specific about the format. If you must allow the user to enter flexible text, then you can at least avoid special characters. For example, it's pretty much impossible to do SQL code injection without using either a single or double quote.

You can't insert HTML tags without using a less-than (<) and a greater-than (>) sign. Or you could just take the approach that anything other than plain ASCII letters, digits, and underscores will not be tolerated:

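// check some string 's' to make sure it contains only ASCII letters,
// digits, and underscores (assumes <string>, <iostream>, and using namespace std)
string::size_type off = s.find_first_not_of(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_");
if (off != string::npos)
    cerr << "Error\n";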

This code searches the string s for a character that's not one of the characters A through Z, a through z, 0 through 9, or underscore. If it finds such a character, then the program rejects the input.

If you allow only the Latin characters shown here, your application will not be usable in the many markets whose languages are written in other scripts (Arabic, Chinese, Hebrew, or Russian, to name just a few). You may have to take the opposite approach and just look for the bad characters.
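A rough sketch of that opposite approach follows. The function name containsBadCharacter and the particular set of bad characters (the quotes and angle brackets mentioned earlier) are assumptions for illustration; which characters are actually dangerous depends on where the text ends up.

```cpp
#include <string>

// Hypothetical helper: returns true if 'input' contains any character
// from an assumed list of bad characters (single quote, double quote,
// less-than, and greater-than). Everything else is allowed through.
bool containsBadCharacter(const std::string& input)
{
    static const std::string badCharacters = "'\"<>";
    return input.find_first_of(badCharacters) != std::string::npos;
}
```

Because this version only looks for specific characters, text in other character sets passes through untouched, while input such as O'Brien or <script> is flagged. The trade-off is that the check is only as good as the list: any dangerous character you forget to include gets through.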