The Rise of Open Data and its Role in Predictive Analytics

By Dr. Anasse Bari, Mohamed Chaouchi, Tommy Jung

Open Data could become a very useful tool for predictive analytics. Bob Lytle, the CEO of, and most recently known as the former CIO of TransUnion Canada, is leading efforts on the use of public information as an alternative and strategic data source for predictive modeling in the financial services and insurance sectors.

Open Data originated from the idea that government data should be free and available for everyone to use. Here are some examples of open data:

The Open Data movement is based on a simple theory, as Bob says: “We as citizens are the government, and therefore have a right to understand and even reuse the information generated by municipal, regional, and federal entities.” However, some data will remain private. Another reality is that open data is dirty data.

Public data is often incomplete, with many missing values. Bob’s team is building a platform that cleans and reduces public data so that it is ready for modelers across multiple business segments to use. The team is also using public data to generate open data-driven predictions that identify the bottom 10 percent of businesses that will likely fail in 2017 and the top 10 percent that are almost sure to grow and expand.
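To make the two ideas above concrete, here is a minimal Python sketch of filling in missing values and then flagging the bottom and top 10 percent of a scored list. It is not the platform Bob’s team is building. The records, the field names, the median fill, and the scores are all hypothetical and chosen only for illustration; in practice the scores would come from a trained predictive model.

```python
from statistics import median

# Hypothetical public records; None marks a missing value.
records = [
    {"business": f"B{i}", "employees": n}
    for i, n in enumerate([5, None, 12, 40, None, 8, 100, 3, 25, 60])
]

# Hypothetical model outputs: a higher score means more likely to grow.
scores = {
    "B0": 0.42, "B1": 0.55, "B2": 0.31, "B3": 0.77, "B4": 0.48,
    "B5": 0.26, "B6": 0.93, "B7": 0.08, "B8": 0.61, "B9": 0.70,
}


def impute_median(rows, field):
    """Return copies of rows with missing `field` values set to the observed median."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = median(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def flag_extremes(score_by_name, fraction=0.10):
    """Return (bottom, top): names with the lowest and highest `fraction` of scores."""
    ranked = sorted(score_by_name, key=score_by_name.get)  # ascending by score
    k = max(1, int(len(ranked) * fraction))  # always flag at least one name
    return ranked[:k], ranked[-k:]


cleaned = impute_median(records, "employees")
bottom, top = flag_extremes(scores)
print("Likely to fail:", bottom)
print("Likely to grow:", top)
```

Median imputation is only one simple way to handle missing values. Real public data sets usually need more careful treatment, such as dropping unusable fields or modeling the missing values.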

Financial institutions can use predictive models built on Open Data to monitor portfolio trends and take action much earlier in the cycle, before adverse risk or churn events occur.