Online Communities that are Helpful for Getting a Big Data Job - dummies

By Jason Williamson

Online communities are extremely active, especially among programmers and data analysts, which can be handy if you are seeking a big data job. Since the advent of the Internet, the culture of open collaboration has grown from a simple sharing of ideas to full-blown co-development.

Co-development is more than just sharing ideas on how to solve problems: it’s a community of people who jointly develop software, usually under a collaborative, open-source license.

Online communities are great for people wanting to learn new technologies and concepts. Not only will you be able to get help solving any problems you may have, but you’ll be able to connect with others who have solved similar problems or are on the same journey. You can even use crowdsourcing to co-develop your ideas into a full-blown project, or participate in someone else’s project yourself.

Crowdsourcing (using the ideas of a community to develop a great idea) usually draws on an online community instead of a company’s employees, but it can be used for commercial purposes. My Starbucks Idea is an open community designed to solicit great ideas from customers that Starbucks may turn into a product or service someday. Crowdsourcing has gained in popularity for two main reasons:

  • Companies found that utilizing outsourced talent who didn’t necessarily expect financial compensation was cost-effective.

  • Companies found that by allowing experts from around the world to solve problems, they could get better and more diverse solutions.

If you feel trapped in the tragic cycle of “How do I get experience if no one will give me a chance?”, you can participate today by contributing code, ideas, or testing to a host of open-source big data projects.

This culture of open knowledge sharing is an amazing tool for advancing all technologies, for both commercial and public or free use. Oracle is a great example of this. Oracle claims to be the world’s largest enterprise software company, and the foundation of its massive revenue and profits is the Oracle Database, which is not an open-source platform by any means.

What many people don’t know is that Oracle has been and continues to be a key contributor to and tester of core Linux libraries and functions like libstdc++ and CRFS. If the community can collaborate to move technology forward, both enterprise and the public benefit.

Can anyone test this code? The answer is yes. With open-source software, the source code is published along with the software, so anyone can inspect and test it for bugs or gaps. The more eyes on it, the better it becomes.

There are several types of online communities to check out:

  • Boards and forums: If you’ve been programming or using business intelligence tools for more than two years, you’re likely familiar with online forums, the most common and easiest-to-access communities. Forums are where people can post questions, code, and errors on specific topics, and the community of other readers can respond. Sometimes a forum is moderated by a leader, and sometimes solely by other readers.

    The conversation is archived so that future users can explore problems and solutions. When you start to query the Internet with your code errors or problems, you tend to migrate back to the communities that are most active and post helpful responses quickly.

  • Internet relay chat (IRC): IRC servers are simply chat servers that transmit text messages back and forth among the users connected to a channel. Although IRC usage has declined during the past several years, there are still more than 500,000 active IRC channels. They’re a great way to get connected with a community of users in real time. For Hadoop, the IRC channel is #hadoop.

  • Open-source development communities: These are hosted communities on the Internet organized around some sort of open-source project. Here someone (either a single individual or a group) posts source code for some software application, and then other people contribute to the coding and testing of that application.

    End-users of these applications are freely able to download the source code and can do with it what they want, within the confines of the open-source license. This is a wonderful way to get quickly plugged into a community as a project contributor or a tester. You may even want to offer up your own project to the community at some point.

    Here are some open-source development communities worth checking out:

    • GitHub: An online repository-hosting service that facilitates collaboration among programmers. Projects can be either public or private.

    • Google Developers: An online repository for projects based on Google applications.

    • SourceForge: An online repository for storing open and free software projects.

  • Software foundations: Software foundations are usually formal nonprofit organizations that started as simple open-source projects that matured over time because of widespread adoption. Classic examples of these are the following:

    • PHP: Home to one of the predominant web programming languages.

    • The Apache Software Foundation: Hosts all the Apache open-source projects, including the Hadoop framework.

    • Python: Python is a widely adopted scripting language. Many big data projects are implemented using Python.

    Membership is open to the public but is tightly controlled by the organizations. Foundation projects still have the same characteristics as smaller open-source projects found on GitHub, for example, but with a greater degree of support, documentation, and active discussion.
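
To make the IRC item in the list above a little more concrete, here is a minimal Python sketch of the plain-text lines a chat client exchanges with an IRC server. It only builds and parses the text; it does not connect to anything, because the article doesn’t name a server for #hadoop. The function names, the nickname, and the sample message are hypothetical and chosen purely for illustration.

```python
def irc_join_lines(nick, channel, realname="Big data learner"):
    """Return the raw text lines a client sends to register and join a channel.

    The nick, channel, and realname values are illustrative assumptions.
    """
    if not channel.startswith("#"):
        channel = "#" + channel
    return [
        f"NICK {nick}\r\n",                    # choose a nickname
        f"USER {nick} 0 * :{realname}\r\n",    # register the user
        f"JOIN {channel}\r\n",                 # join the channel
    ]


def parse_privmsg(line):
    """Split ':nick!user@host PRIVMSG target :text' into (nick, target, text).

    Returns None for any line that is not a PRIVMSG with a sender prefix.
    """
    line = line.rstrip("\r\n")
    if not line.startswith(":"):
        return None
    prefix, _, rest = line[1:].partition(" ")
    command, _, params = rest.partition(" ")
    if command != "PRIVMSG":
        return None
    target, _, text = params.partition(" :")
    nick = prefix.split("!", 1)[0]
    return nick, target, text


if __name__ == "__main__":
    for raw in irc_join_lines("learner", "hadoop"):
        print(repr(raw))
    # A made-up incoming chat line, as a server would relay it
    print(parse_privmsg(":alice!a@host PRIVMSG #hadoop :hello there\r\n"))
```

In practice you would use an existing IRC client rather than write your own; the sketch is only meant to show that the messages going back and forth are ordinary lines of text.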