Phase 5 of the CRISP-DM Process Model: Evaluation

By Meta S. Brown

In the first four phases of the Cross-Industry Standard Process for Data Mining (CRISP-DM) process model, you explored data and found patterns. Now you have to ask: Are the results any good? In the evaluation phase, you assess not just the models you created but also the process that you used to create them and their potential for practical use.

The evaluation phase includes three tasks:

  • Evaluating results

  • Reviewing the process

  • Determining the next steps

Task: Evaluating results

At this stage, you’ll assess the value of your models for meeting the business goals that started the data-mining process. You’ll look for any reasons why the model would not be satisfactory for business use. If possible, you’ll test the model in a practical application, to determine whether it works as well in the workplace as it did in your tests.

Deliverables for this task include two items:

  • Assessment of results (for business goals): Summarize the results with respect to the business success criteria that you established in the business-understanding phase. Explicitly state whether you have reached the business goals defined at the start of the project.

  • Approved models: These include any models that meet the business success criteria.
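The two deliverables above can be sketched in a few lines of Python. This is only an illustration, not part of CRISP-DM itself: the `Criterion` class, the criterion names, the thresholds, and the model results are all hypothetical stand-ins for the business success criteria you set in the business-understanding phase.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One business success criterion (hypothetical structure)."""
    name: str
    threshold: float
    higher_is_better: bool = True

    def is_met(self, observed: float) -> bool:
        # Compare the observed value with the threshold in the right direction.
        if self.higher_is_better:
            return observed >= self.threshold
        return observed <= self.threshold


def assess(criteria, observed):
    """Assessment of results for one model: {criterion name: met?}."""
    return {c.name: c.is_met(observed[c.name]) for c in criteria}


def approved_models(criteria, results):
    """Names of the models that meet every business success criterion."""
    return [
        name for name, observed in results.items()
        if all(assess(criteria, observed).values())
    ]


# Hypothetical criteria and results, for illustration only.
criteria = [
    Criterion("campaign_response_rate", 0.05),
    Criterion("cost_per_contact", 2.00, higher_is_better=False),
]
results = {
    "model_a": {"campaign_response_rate": 0.062, "cost_per_contact": 1.80},
    "model_b": {"campaign_response_rate": 0.071, "cost_per_contact": 2.40},
}

assessment = {name: assess(criteria, obs) for name, obs in results.items()}
approved = approved_models(criteria, results)
```

In this made-up example, `model_b` has the better response rate but misses the cost criterion, so only `model_a` would appear on the list of approved models. A written assessment would still need to explain each result in business terms.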

Task: Reviewing the process

Now that you have explored data and developed models, take time to review your process. This is an opportunity to spot issues that you might have overlooked and flaws in the work that you’ve done, while you still have time to correct them before deployment. Also consider ways that you might improve your process for future projects.

The deliverable for this task is the review of process report. In it, you should outline your review process and findings and highlight any concerns that require immediate attention, such as steps that were overlooked or that should be revisited.

Task: Determining the next steps

The evaluation phase concludes with your recommendations for the next move. The model may be ready to deploy, or you may judge that it would be better to repeat some steps and try to improve it. Your findings may inspire new data-mining projects.

Deliverables for this task include two items:

  • List of possible actions: Describe each alternative action, along with the strongest reasons for and against it.

  • Decision: State the final decision on each possible action, along with the reasoning behind the decision.
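If you keep these deliverables in a structured form, one possible layout is sketched below. The `PossibleAction` class, its field names, and the sample actions are assumptions made for illustration; CRISP-DM does not prescribe any particular format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PossibleAction:
    """One entry in the list of possible actions (hypothetical structure)."""
    description: str
    reason_for: str        # strongest reason for taking the action
    reason_against: str    # strongest reason against taking the action
    accepted: Optional[bool] = None  # None until a decision is recorded
    rationale: str = ""

    def decide(self, accepted: bool, rationale: str) -> None:
        # Record the final decision together with the reasoning behind it.
        self.accepted = accepted
        self.rationale = rationale


def undecided(actions):
    """Descriptions of the actions that still lack a decision."""
    return [a.description for a in actions if a.accepted is None]


# Hypothetical actions, for illustration only.
actions = [
    PossibleAction(
        "Deploy the approved model",
        "It meets the business success criteria",
        "It has not yet been tested in the workplace",
    ),
    PossibleAction(
        "Repeat the modeling phase",
        "A better model may be possible",
        "It delays deployment",
    ),
    PossibleAction(
        "Start a new data-mining project",
        "The findings suggest a new question",
        "It competes for the same resources",
    ),
]

actions[0].decide(True, "Criteria are met; a pilot test is planned first.")
actions[1].decide(False, "The expected improvement does not justify the delay.")

remaining = undecided(actions)
```

Here `remaining` holds the one action that still needs a decision, which makes it easy to check that the decision deliverable covers every item on the list.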