Sqoop 2.0 Preview

By Dirk deRoos

With all the success surrounding Sqoop 1.x upon its graduation from the Apache incubator, Sqoop has momentum! So, as you might expect, Sqoop 2.0 is in the works with exciting new features on the way. You can see that Sqoop 1.99.3 is downloadable, complete with documentation.

You’re probably wondering how many 1.99.x releases will be available before the big 2.0 hits. Well, most crystal balls only work part-time, so the answer is “nobody knows yet.”

Here is a preview of Sqoop 2.0 features. However, you know the drill: The situation can change leading up to the 2.0 release. The figure illustrates (documented) design plans for Sqoop 2.0.


As you can see, the big change in the works is that Sqoop 2.0 will have a separate server, which is good news for a number of reasons. First, you won’t have to do so much work. The Sqoop connector and JDBC driver will be installed once by the system administrator for your cluster instead of once per Sqoop client.

If less installation work doesn’t win you over, maybe you’ll like the next benefit: Sqoop 2.0 will be more secure! With a Sqoop server as part of the architecture, sensitive operations such as connecting to the database servers have to happen only on the Sqoop server, and you’ll have role-based access control.

Additionally, Sqoop clients can use Sqoop from anywhere on the network (thanks to the new REST interface), and they will enjoy a new graphical user interface (GUI). You’ll agree that the command-line options are necessary and powerful for scripting purposes, but everyone likes a cool GUI from time to time. Sqoop requires many command-line options, which can be error-prone without a GUI to guide you.
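To make the "from anywhere on the network" idea concrete, here is a minimal Python sketch of a client asking a Sqoop server for its version over REST. Treat the details as assumptions rather than gospel: the host name is hypothetical, and the port (12000), the web application name (sqoop), and the version resource follow the defaults described for the 1.99.x releases, any of which can change before 2.0 ships. The helper function names are made up for this example.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def server_url(host, port=12000, webapp="sqoop"):
    """Build the base URL of a Sqoop server.

    Assumption: 12000 and "sqoop" mirror the 1.99.x defaults; your
    administrator may have configured something different.
    """
    return f"http://{host}:{port}/{webapp}/"


def version_url(base_url):
    """Return the URL of the (assumed) version resource on the server."""
    return urljoin(base_url, "version")


def fetch_version(base_url, timeout=10):
    """Send a GET request to the version resource and parse the JSON reply.

    The shape of the returned document is not assumed here; the parsed
    JSON is handed back as-is for the caller to inspect.
    """
    with urlopen(version_url(base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, calling fetch_version(server_url("sqoop.example.com")) from any machine that can reach it would return whatever version information the server reports. Notice that the client needs no connector or JDBC driver of its own; those stay on the server.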

You probably noticed MapReduce (instead of just map tasks) proudly displayed in the figure, which suggests that Sqoop 2.0 jobs will be able to use reduce tasks as well. Until 2.0 arrives, enjoy Sqoop 1.x and start experimenting with 1.99.x.