Visual Programming and Data Mining

By Meta S. Brown

Data miners work fast. To match that speed, you need appropriate tools and a few tricks of the trade. Your best data-mining tool is your brain, with a bit of know-how. The second-best tool is a data-mining application with a visual programming interface.

image0.jpg

With visual programming, the steps in your work process are represented by small images that you organize on the screen to create a picture of the flow and logic of your work. Seeing the whole process laid out this way makes it easier to follow what you’re doing across several steps than typed commands (conventional programming) or conventional menus do.

In this example, you can see the work process in the main area of the data-mining application. Around it are menus of recent projects, tools for data-mining functions, a viewer to help you navigate complex processes, and a log. These details vary a little from one product to another.

Look more closely at the process. Although you are just setting out in your quest to be a data miner, you can probably understand a lot of what’s going on just by looking at this diagram, including the following:

  • You can see the CSV Reader. If you’re aware of the .csv (comma-separated values) data format, you probably already know that this step imports data. (And it’s the first step; you need data to do anything else.)

  • Then you see tools clearly labeled by functions like Column Rename and String Manipulation. These are data preparation steps.

  • Tree Learner might be mysterious if you’re new to modeling, but this tool creates a decision tree model from a subset of the data.

  • The final steps apply the model to data that was kept separate for testing, and then evaluate how well the model performs.

    image1.jpg
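
For comparison, here is a rough sketch of the same sequence of steps written as commands in Python, using only the standard library. It is not the code behind any visual tool. The sample data, the column names (cust_name, visits, bought), and every function name are hypothetical, invented for illustration, and the "tree learner" is simplified to a decision tree with a single split.

```python
import csv
import io

# Hypothetical sample data standing in for a .csv file (not from the article).
RAW_CSV = """cust_name,visits,bought
 alice ,1,no
BOB,7,yes
carol,2,no
 Dave,9,yes
erin,8,yes
FRANK,3,no
grace,6,yes
heidi,2,no
"""


def read_csv(text):
    """CSV Reader step: import the rows as dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def rename_column(rows, old, new):
    """Column Rename step: give one column a clearer name."""
    return [{(new if k == old else k): v for k, v in row.items()} for row in rows]


def clean_strings(rows, column):
    """String Manipulation step: trim whitespace and normalize capitalization."""
    return [{**row, column: row[column].strip().title()} for row in rows]


def learn_stump(rows, feature, target):
    """Simplified tree learner: find the best single split on one numeric column."""
    labels = sorted({r[target] for r in rows})
    best = None
    for threshold in sorted({float(r[feature]) for r in rows}):
        for above in labels:
            for below in labels:
                # Count training rows this candidate split classifies correctly.
                correct = sum(
                    (above if float(r[feature]) > threshold else below) == r[target]
                    for r in rows
                )
                if best is None or correct > best[0]:
                    best = (correct, threshold, above, below)
    _, threshold, above, below = best
    return {"feature": feature, "threshold": threshold, "above": above, "below": below}


def predict(model, row):
    """Apply the model to one row."""
    value = float(row[model["feature"]])
    return model["above"] if value > model["threshold"] else model["below"]


def accuracy(model, rows, target):
    """Evaluation step: share of rows where the prediction matches the known answer."""
    return sum(predict(model, r) == r[target] for r in rows) / len(rows)


# The flow, step by step, in the same order as the diagram.
rows = read_csv(RAW_CSV)
rows = rename_column(rows, "cust_name", "customer")
rows = clean_strings(rows, "customer")
train, test = rows[:6], rows[6:]          # keep some data separate for testing
model = learn_stump(train, "visits", "bought")
score = accuracy(model, test, "bought")
print(model, score)
```

Every step in the diagram becomes a function call here, and the order of the calls is the only record of the flow. That is the trade-off the visual interface addresses: the picture shows the same logic at a glance, without requiring you to read the code.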