Data Science Programming All-in-One For Dummies
Because data is so valuable and users are sometimes averse to giving it up, vendors constantly find new ways to collect it. One such method comes down to spying. Microsoft, for example, was recently accused (yet again) of spying on Windows 10 users even when those users don’t want their data collected.

Lest you think that Microsoft is solely interested in your computing concerns, think again. The data Microsoft admits to collecting (and there is likely more) is pretty amazing.

Microsoft’s data gathering doesn’t stop with your Windows 10 actions; it also collects data through Cortana, its personal assistant. Mind you, Amazon’s Alexa is accused of doing the same thing. Google, likewise, does the same thing. So, one of the trends among vendors is spying, and it doesn’t stop with Microsoft, nor does it stop with the obvious spying sources.

It might actually be possible to write an entire book on the ways in which people are spying on you, but that would make for a very paranoid book, and there are other new data collection trends to consider. You may have noticed that you get more email from vendors about the products or services they provided you. Everyone wants you to provide free information about your experiences, and that information is collected in one of these forms:

  • Close-ended surveys: A close-ended survey is one in which each question has a fixed set of answers and the respondent checks the ones that apply. The advantage is greater consistency of feedback. The disadvantage is that you can’t learn anything beyond the predefined answers.
  • Open-ended surveys: An open-ended survey is one in which the questions rely on text boxes in which the user enters data manually. In some cases, this form of survey enables you to find new information, but at the cost of consistency, reliability, and cleanliness of the data.
  • One-on-one interviews: An interviewer calls the respondent, or approaches the respondent at a place like the mall, and asks questions directly. When the interviewer is well trained, you obtain consistent data and can also discover new information. However, the quality of this information comes at the cost of paying someone to obtain it.
  • Focus groups: Three or more people meet with an interviewer to discuss a topic (including products). Because the interviewer acts as a moderator, the consistency, reliability, and cleanliness of the data remain high, and the costs are lower than for one-on-one interviews. However, the data now suffers contamination from the interaction between members of the focus group.
  • Direct observation: No conversation occurs in this case; someone monitors the interactions of another party with a product or service and records the responses using a script. However, because you now rely on a third party to interpret someone else’s actions, you have a problem with contamination in the form of bias. In addition, if the subject of the observation is aware of being monitored, the interactions likely won’t reflect reality.
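
The consistency trade-off between close-ended and open-ended surveys can be illustrated with a minimal Python sketch. The responses below are invented for illustration, and the function names (tally_closed, normalize, tally_open) are hypothetical helpers, not part of any library. The sketch assumes that simple lowercasing and punctuation stripping is enough to group similar free-text answers, which real survey data rarely allows.

```python
from collections import Counter

# Hypothetical responses, invented for illustration only.
closed_responses = ["Satisfied", "Neutral", "Satisfied", "Dissatisfied", "Satisfied"]
open_responses = ["Loved it!", "loved it", "  It was OK. ", "too slow", "Too slow!!"]


def tally_closed(responses):
    """Count predefined answers directly; no cleaning is needed."""
    return Counter(responses)


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def tally_open(responses):
    """Count free-text answers after normalizing each one."""
    return Counter(normalize(r) for r in responses)


closed_counts = tally_closed(closed_responses)
open_raw = Counter(open_responses)        # counts before any cleanup
open_clean = tally_open(open_responses)   # counts after cleanup

print(closed_counts)
print(len(open_raw), "distinct raw open-ended answers")
print(len(open_clean), "distinct open-ended answers after cleanup")
```

The close-ended answers tally as they arrive, whereas the open-ended answers need a cleanup pass before they can be counted at all, and any such pass involves judgment calls about which answers mean the same thing.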

These are just a few of the methods seeing greater use in data collection today; they’re the tip of the iceberg. The key takeaway here is that no perfect means exists for collecting some types of data, and all data collection methods require some sort of participation.

About the book authors:

John Mueller has published more than 100 books on technology, data, and programming. John has a website and blog where he writes articles on technology and offers assistance alongside his published books.

Luca Massaron is a data scientist specializing in insurance and finance. A Google Developer Expert in machine learning, he has been involved in quantitative analysis and algorithms since 2000.
