How to Build Your Own Big Data Test Lab - dummies

By Jason Williamson

Here, you’ll find a sample pattern for learning some core big data technologies: which technologies to pick, where to learn them, and how to set up a sandbox to practice in. (A sandbox is an isolated place to run tests and experiments.) This is just one sample pattern for learning core technologies and where to use them.

Suppose you want to do some big data analytics on your personal spending during the past three years to see if there are any predictive indicators or interesting insights you can learn by mashing up personal spending with the weather patterns in your city.

Maybe you want to get a little more interesting and see if there are any correlations with your Facebook activity, status changes, or friends posting travel pictures. Who knows? That’s the point: Try to find some interesting patterns. Right now, you don’t know if any patterns exist.
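To make the idea concrete, here is a minimal sketch of the spending-and-weather mash-up in Python. Everything in it is an assumption for illustration: the sample rows, the column names (date, amount, high_temp_f), and the choice of a simple Pearson correlation as the first thing to look at. Your bank and weather exports will almost certainly look different.

```python
import csv
import io
from collections import defaultdict
from math import sqrt

# Hypothetical sample data; real bank and weather exports will differ.
SPENDING_CSV = """date,amount
2014-01-03,42.50
2014-01-03,12.00
2014-01-04,80.25
2014-01-05,15.75
2014-01-06,120.00
"""

WEATHER_CSV = """date,high_temp_f
2014-01-03,31
2014-01-04,28
2014-01-05,45
2014-01-06,22
"""


def daily_spending(csv_text):
    """Sum all purchases that fall on the same date."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["date"]] += float(row["amount"])
    return dict(totals)


def daily_weather(csv_text):
    """Map each date to its high temperature."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["date"]: float(row["high_temp_f"]) for row in reader}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


spending = daily_spending(SPENDING_CSV)
weather = daily_weather(WEATHER_CSV)

# Only compare dates that appear in both data sets.
shared_dates = sorted(set(spending) & set(weather))
temps = [weather[d] for d in shared_dates]
totals = [spending[d] for d in shared_dates]

r = pearson(temps, totals)
print(f"Days compared: {len(shared_dates)}, correlation: {r:.2f}")
```

A value of r near zero suggests no linear relationship; values near 1 or -1 suggest a strong one. Four days of made-up data prove nothing, of course. The point is the shape of the exercise: load two sources, line them up by date, and look for a pattern.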

To execute this project, you need to create a project notebook. It can be digital (using Evernote, OneNote, or Notepad), or you can just use good old-fashioned paper. Project notebooks serve two very important purposes:

  • They’re pragmatic. Notebooks are the place for you to document ideas, learning plans, technical notes, and results.

  • They’re reviewable. You’ll find huge retrospective value (more on this later) in looking back at your progress.

Take time to journal your experiences — what’s working, what’s frustrating, and what you hope to accomplish while you’re learning.

Step 1: Define your goals

Spend some time writing down your goals. Studies show that people who articulate their goals in writing are much more likely to accomplish them. Be specific. Here are some examples:

  • Get comfortable with basic Python.

  • Get hands-on experience with big data using social media data. Learn how to grab data from Facebook with Python.

  • Learn a visualization tool to combine personal spending with Facebook data.

  • Complete this project within two weeks by working during the evenings and on weekends.

  • Spend less than $100.

Step 2: Take a skills inventory

Spend some time defining what technology skills you need to accomplish the project. Working from your goals, you can figure this out by researching Python, reading forums, and simply trying things. If you don’t know what you don’t know, that’s okay. You’ll hit a bump and figure it out.

For this particular project, you’ll need to know

  • Basic Python.

  • Database skills, such as MySQL, and how to use Excel as a data source.

  • Tableau, a business intelligence and analytics software program.

  • Facebook application programming interfaces (APIs), which are access points that allow two applications — the one you’re building and Facebook — to communicate with each other.

Step 3: Mind the gap

Determine what you don’t know and estimate how much effort it will take to fill in those gaps. Make some notes on where you think you should go for help. You’re just estimating the work effort to learn things. You already know at this point that Python is a gap for you. Do your best to estimate how much effort you think it will be to learn it.
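If it helps, you can turn the estimate into simple arithmetic. The sketch below adds up guessed hours for each skill gap and compares the total with the time available in a two-week, evenings-and-weekends schedule. Every number here is hypothetical; replace the hours per skill and the hours per evening with your own guesses.

```python
# Hypothetical effort estimates, in hours, for each skill gap.
skill_gaps = {
    "Basic Python": 10,
    "MySQL and Excel as a data source": 4,
    "Tableau": 6,
    "Facebook APIs": 5,
}

# Hypothetical schedule: two weeks of evenings and weekends.
WEEKS = 2
WEEKNIGHT_HOURS = 1.5    # hours per weeknight
WEEKEND_DAY_HOURS = 3.0  # hours per weekend day

hours_needed = sum(skill_gaps.values())
hours_available = WEEKS * (5 * WEEKNIGHT_HOURS + 2 * WEEKEND_DAY_HOURS)
fits = hours_needed <= hours_available

# List the biggest gaps first so you know where to look for help.
for skill, hours in sorted(skill_gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{skill}: {hours} h")
print(f"Needed: {hours_needed} h, available: {hours_available} h, fits: {fits}")
```

Write the numbers in your project notebook. You’ll want them again when you look back at whether your effort matched your expectations.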

Step 4: Acquire knowledge

Start executing your basic learning plan and go make it happen. For this project, you’ll start off getting basic Python skills. When you’re comfortable with Python, you’ll start making API calls to Facebook to understand how to access data and status changes. At this point, you may feel ready to do some more interesting work, like grabbing picture posts from specific dates that correlate with large credit card purchases you’ve made.
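Here is a rough sketch of what that last idea might look like in Python. It makes several assumptions you should verify against Facebook’s current documentation: that your posts are available from the Graph API at the /me/posts endpoint, that the response is JSON with a data list, and that each post carries created_time and, for picture posts, full_picture. The sample response, the purchase list, the $500 threshold, and the two-day window are all made up for illustration. The code only builds the request URL and parses a canned response; it doesn’t contact Facebook.

```python
import json
from datetime import datetime
from urllib.parse import urlencode


def build_posts_url(access_token):
    """Build a Graph API URL for your own posts (assumed endpoint and fields)."""
    query = urlencode({
        "fields": "created_time,message,full_picture",
        "access_token": access_token,
    })
    return "https://graph.facebook.com/me/posts?" + query


# Hypothetical response in the shape the Graph API is assumed to return.
SAMPLE_RESPONSE = """
{"data": [
  {"id": "1", "created_time": "2014-03-01T18:30:00+0000",
   "message": "New bike!", "full_picture": "https://example.com/bike.jpg"},
  {"id": "2", "created_time": "2014-03-09T09:00:00+0000",
   "message": "Quiet Sunday"},
  {"id": "3", "created_time": "2014-04-15T20:15:00+0000",
   "full_picture": "https://example.com/beach.jpg"}
]}
"""

# Hypothetical credit card purchases: (date, amount in dollars).
PURCHASES = [("2014-02-28", 640.00), ("2014-03-08", 18.50), ("2014-04-14", 910.00)]


def picture_post_dates(response_text):
    """Return the dates of posts that include a picture."""
    posts = json.loads(response_text)["data"]
    return [
        datetime.strptime(post["created_time"], "%Y-%m-%dT%H:%M:%S%z").date()
        for post in posts
        if "full_picture" in post
    ]


def posts_near_large_purchases(post_dates, purchases, min_amount=500.0, window_days=2):
    """Pair each large purchase with picture posts within window_days of it."""
    matches = []
    for date_text, amount in purchases:
        if amount < min_amount:
            continue  # Skip small purchases.
        purchase_date = datetime.strptime(date_text, "%Y-%m-%d").date()
        for post_date in post_dates:
            if abs((post_date - purchase_date).days) <= window_days:
                matches.append((purchase_date.isoformat(), amount, post_date.isoformat()))
    return matches


matches = posts_near_large_purchases(picture_post_dates(SAMPLE_RESPONSE), PURCHASES)
for purchase_date, amount, post_date in matches:
    print(f"${amount:.2f} on {purchase_date} -> picture post on {post_date}")
```

To work with real data, you would request the URL from build_posts_url with a valid access token and pass the response text to picture_post_dates. Getting that token, and the permissions behind it, is part of what you’ll learn in this step.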

Step 5: Look back

The retrospective is perhaps the most important step in the whole process. Simply put, a retrospective is the exercise of looking back at the endeavor for the purpose of improving future performance. You look not only at what went wrong, but also at what went right, because you want to repeat those successes.

Begin to evaluate whether your endeavor was successful so that you can improve your performance.

Process and outcome both have three levers, each of which impacts success. The three process levers are

  • Tools: Were the tools of learning effective for you?

  • Time: Did you accomplish this project in the time you expected?

  • Effort: Was your level of effort to learn greater or lesser than what you anticipated? This isn’t a measure of how hard something was to learn, but whether the difficulty met your expectations.

Here are some questions you can ask to illuminate this:

  • Were you able to easily find resources?

  • Were those resources easy for you to comprehend?

  • Did you make the time to learn?

  • Did you meet your time goals?

  • Did you effectively use tutorials to learn new skills?

The three outcome levers are

  • Knowledge: Do you now possess the knowledge you set out to learn?

  • Learning objectives: Did you accomplish your learning goals? For example, did you get hands-on experience with Python?

  • Value: Is what you learned relevant to finding a job?

Probing questions for outcome would include

  • Did I learn the programming languages and software that I needed to (for example, Python)?

  • Can I do this again with a different set of data with less effort?

  • If I talked to my boss about this project, would she let me try something at work?

Look at each of the six levers of success and determine how successful you think you were. Doing this type of activity allows you to reflect on your process and outcome with the purpose of learning what went well and what can be improved.

Combining the process result with the outcome result gives four possible states:

  • Success/Success: Both process and outcome were successful.

  • Success/Failure: The process was deemed successful, but the outcome was a failure.

  • Failure/Success: The process was deemed a failure, but the outcome was successful.

  • Failure/Failure: Both process and outcome were failures.
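If you like to keep your retrospective in code alongside the rest of the project, the six levers and four states can be sketched as follows. The lever answers are hypothetical, and so is the rule for rolling three levers up into a single success or failure. This sketch assumes a side succeeds when most of its levers do; you may prefer a stricter rule.

```python
# Hypothetical answers: True means the lever went the way you hoped.
process_levers = {"tools": True, "time": False, "effort": True}
outcome_levers = {"knowledge": True, "learning objectives": True, "value": True}


def side_succeeded(levers):
    """Assumed rule: a side succeeds when most of its levers are True."""
    return sum(levers.values()) * 2 > len(levers)


def retrospective_state(process, outcome):
    """Return one of the four states, written as process/outcome."""
    def label(ok):
        return "Success" if ok else "Failure"

    return f"{label(side_succeeded(process))}/{label(side_succeeded(outcome))}"


state = retrospective_state(process_levers, outcome_levers)
print(state)

# List the levers that fell short so you know what to improve next time.
for name, ok in {**process_levers, **outcome_levers}.items():
    if not ok:
        print(f"Needs work: {name}")
```

Whatever rule you choose, record the individual lever answers in your project notebook as well as the overall state. The levers tell you what to change on the next project.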