How to Control Variable Order in a Dataset

Data Science Essentials For Dummies

The order of variables (columns) in a dataset is usually just a matter of how they were arranged in the source file or the database query that was used to import them. That arrangement may not be convenient for you. If you have many variables, it may be hard to spot the ones you want to see.

Or perhaps a particular order makes sense to you, and you’d like the variables arranged that way.

Data-mining applications often allow you to change the order of variables, but the instructions are rarely easy to find in the menus or help files. These functions are usually buried within tools that serve broader purposes. Look for subtle options like these within data viewers and within dialog boxes for other procedures (especially those for data manipulation and data export):

Drag and drop: Tables that display data or metadata (such as variable names and formats) may be interactive, allowing you to change variable order by dragging and dropping columns.
Up and Down buttons: Buttons labeled with the words Up and Down, or arrows pointing up or down, let you move variables up and down within lists to change order.
Selection order: When you select variables from a list (as you would to use only a few of a dataset’s variables in a particular procedure), the order in which you select them may persist in subsequent operations.
Sort gestures: When viewing variable lists (metadata), you may have sort options that enable you to rearrange variables, perhaps in alphabetical order or by type.
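
If your application offers none of these options, the same ideas (selection order and sorting by name or type) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dataset, the variable names, and the reorder helper are all hypothetical, and each row is assumed to be a dictionary keyed by variable name.

```python
# Hypothetical dataset: each row is a dict keyed by variable name.
rows = [
    {"income": 52000, "age": 34, "region": "West", "churned": "no"},
    {"income": 61000, "age": 45, "region": "East", "churned": "yes"},
]

def reorder(rows, order):
    """Return new rows whose variables follow `order`.

    Variables not named in `order` are dropped (hypothetical helper).
    """
    return [{name: row[name] for name in order} for row in rows]

# Selection order: variables appear in the order in which they are listed.
selected = reorder(rows, ["region", "age"])

# Sort by name: alphabetical order of the variable names.
alphabetical = reorder(rows, sorted(rows[0]))

# Sort by type: group variables by the type of their first value,
# then alphabetically within each type.
first = rows[0]
by_type = reorder(
    rows,
    sorted(first, key=lambda name: (type(first[name]).__name__, name)),
)

print(list(selected[0]))
print(list(alphabetical[0]))
print(list(by_type[0]))
```

The sketch relies on Python dictionaries preserving insertion order (Python 3.7 and later), so building each row in the desired order is enough to fix the variable order.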

You may encounter situations where the arrangement of variables is not purely cosmetic. (For example, the Orange data-mining application expects the dependent variable to be the last variable in the dataset.) If you can’t find a way to reorder variables within your data-mining application, go back to your source data and rearrange the columns with another tool; then reimport the data into your data-mining application.
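
As one possible way to rearrange the source data outside the application, the sketch below uses Python's standard csv module to move a dependent variable to the last column before reimporting. It assumes the source data is a CSV file with a header row and unique column names; the function name move_column_last and the sample columns (churned, age, income) are hypothetical.

```python
import csv
import io

def move_column_last(src, dst, target):
    """Copy CSV data from src to dst with the `target` column moved to the end.

    Hypothetical helper: src and dst are open text-mode file objects.
    """
    reader = csv.DictReader(src)
    if reader.fieldnames is None or target not in reader.fieldnames:
        raise ValueError(f"column {target!r} not found")
    # Keep every other column in its original order, then append the target.
    order = [name for name in reader.fieldnames if name != target] + [target]
    writer = csv.DictWriter(dst, fieldnames=order)
    writer.writeheader()
    for row in reader:
        writer.writerow(row)

# In-memory stand-ins for the source and output files.
source = io.StringIO("churned,age,income\nyes,45,61000\nno,34,52000\n")
result = io.StringIO()
move_column_last(source, result, "churned")
reordered_text = result.getvalue()
print(reordered_text)
```

With real files, open both with newline="" (as the csv module documentation recommends) and pass the file objects in place of the in-memory buffers.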

About the book author:

Meta S. Brown helps organizations use practical data analysis to solve everyday business problems. A hands-on data miner who has tackled projects with up to $900 million at stake, she is a recognized expert in cutting-edge business analytics.