[important]This blog article was inspired by a comment from a reader, Tony, who asked about the significance of Hadoop and which specific skills he should learn.[/important]
Hi Tony, Hadoop permits the storage and processing of large volumes of data on cheap commodity hardware. As everything becomes more digital in the information age, the high volume, high velocity, and high variety of data being created need to be stored in a way that’s relatively cheap, fast, and easy. Hadoop is open source software which acts like the “operating system” for HDFS, its distributed file system. This infrastructure layer makes it feasible for folks to analyze and explore large volumes of data with MapReduce and related programming environments for various business or social purposes, such as understanding customer behavior or finding a cure for rare diseases.
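To give a flavor of the MapReduce model mentioned above, here is a minimal sketch in plain Python. It is not actual Hadoop code, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. On a real cluster, Hadoop would run the map and reduce steps in parallel across many machines and handle the grouping step itself.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, standing in for what the framework does
    between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop data"]))
```

The point of splitting the work this way is that each map call only needs one line and each reduce call only needs one word's values, so the pieces can be spread over many cheap machines.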
The need to process all this data is generating demand for people who can work with it. Gartner predicts 4.4 million IT jobs globally will be created by 2015 to handle big data.
As for which big data skills to learn during this time of transition, choose based not just on what pays the most but also on which areas you have a passion for. I see 3 broad groups of data specialists.
- Current system administrators can learn some Java skills as well as cloud services management skills to start working with Hadoop installation and operations.
- Current DBAs and ETL data architects can learn Apache Pig and related technologies to develop, operate, and optimize the massive data flows going into the Hadoop system.
- Current BI analysts and data analysts can learn SQL, Hive, R, and Python to wrangle, analyze, and visualize the data collected within Hadoop.
Tony, which of these 3 areas would you be most interested in?
Ultimately, in my experience, people who do work they are excited and passionate about go farther and faster, thanks to their self-motivation, than people who do something they dislike because they feel they have to for other reasons. You are awesome for already taking initiative in your career by doing your research, including visiting my blog.
This current wave of “big data” offers tremendous opportunities. The deluge of big data is likely to persist. Tools to handle big data will eventually become mainstream and commonplace, at which point almost everyone will be working with big data. However, enterprising folks can still get ahead of the mainstream today by investing in skills and career development. I realize this may sound like hyperbole, but it is the historical pattern we have seen in how technology gets adopted and in the resulting shifts in the workforce (e.g. printing press, radio, television, computers, internet).
Tony, I wish you and my other readers the best as you transform yourselves by learning Hadoop or other big data technologies. I hope that my blog can help you along on your journey. Feel free to ask other questions any time.