Where can you find a thriving community of thousands of Hadoop and Big Data experts and fans, exchanging best practices and discussing the latest trends? On the LinkedIn social network, you will find these experts active in discussion groups. You will also find recruiters lurking in these groups, looking for the next hot talent, both Hadoop wizards and data scientists, to scoop up for their companies. [see disclaimer]
HadoopWizard.com’s Top 10 List of LinkedIn Groups for Hadoop (January 2013)
See below for HadoopWizard.com’s January 2013 list of top LinkedIn discussion groups for Hadoop. The Hadoop-specific groups listed below have the most members and the most discussion activity. (The group names and descriptions were provided by the group owners, and they vary a lot!) Many groups on LinkedIn have either low membership or low activity, but the ones below are the best groups for you to consider joining. There are also other groups which discuss Big Data and statistical analysis in general, rather than Hadoop specifically, but you’ll need to wait for Part 2 of my series on professional communities.
1. Hadoop Users
This group is the original and most established group for Hadoop users on LinkedIn.
Description: A group for Hadoop users.
Activity: Active, 47 discussions this month
Members Count: 24,236 members
Created: October 7, 2008
2. Distributed Computing & Applications Professional: Hadoop, MapReduce, NoSQL, MongoDB, VLDB, Big Data
This group covers all sorts of big data architecture technologies.
Description: A group for scalable computing and distributed computing and applications professionals located throughout the world. Hadoop, MapReduce, NoSQL, Hive, Slide, Shale, Cloud Computing, Distributed File System, Google File System, High Performance Computing, Utility Computing, Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3), Amazon Elastic MapReduce, HDFS filesystem database, Nutch, HBase, Hypertable, MapReduce, Apache Pig, Apache Mahout, Machine Learning, VLDB, Data Mining, Cassandra, CouchDB, MongoDB, node.js, Python, Groovy, Ruby on Rails, RoR, Scala, Clojure, JVM, NetKernel, Resource Oriented Computing, ROC, neo4j, redis, node.js, Heroku, AppFog, Nodejitsu
Activity: Active, 22 discussions this month
Members Count: 4,640 members
Created: October 14, 2009
3. Cloudera Hadoop Users
This group is a subgroup of the Hadoop Users group above, focusing on Cloudera’s commercial enterprise distribution of Hadoop. The description below comes from the group.
Description: Cloudera (www.cloudera.com) is the leading provider of Apache Hadoop-based software and services and works with customers in financial services, web, retail, telecommunications, government and others industries. Cloudera’s Distribution for Apache Hadoop and Cloudera Enterprise, help organizations profit from all of their information.
Activity: Very Active, 117 discussions this month
Members Count: 4,033 members
Created: June 26, 2009
4. HADOOP Professionals MapReduce Hive HBase Mahout HDFS Cassandra Apache Big Data Cloud Computing IT
This group has very active discussions on Hadoop and related technologies. Please note that this group’s membership is weighted toward members in Pakistan (20%) and members in the Finance job function (18%).
Description: Scalable distributed parallel computation Cloudera MapR GreenPlum Hortonworks database structured storage warehouse infrastructure machine learning Sensor mining library language execution framework for high performance file system Developer Lucene Nutch Canada Seattle Boston San Francisco New York
Activity: Very Active, 192 discussions this month
Members Count: 4,834 members
Created: January 5, 2011
Notes: 20% of this group’s members are in Pakistan, and 18% are in the Finance job function.
5. Hadoop Hive
This group discusses the use of Hive, whose high-level, SQL-like query language runs on top of MapReduce and makes it easier to analyze data stored in Hadoop.
Description: Hive is a data warehouse infrastructure built on top of Hadoop that provides tools to enable easy data summarization, adhoc querying and analysis of large datasets data stored in Hadoop files. It provides a mechanism to put structure on this data and it also provides a simple query language called Hive QL which is based on SQL and which enables users familiar with SQL to query this data. At the same time, this language also allows traditional map/reduce programmers to be able to plug in their custom mappers and reducers to do more sophisticated analysis which may not be supported by the built-in capabilities of the language.
Activity: Very Active, 70 discussions this month
Members Count: 3,339 members
Created: June 15, 2009
Notes: It is a subproject of Hadoop
6. Big Data, Hadoop & Cloud Computing
This group has a very active community, discussing all aspects of Hadoop’s role in Big Data.
Description: Place to discuss about Big Data Projects, Hadoop, Cloud computing, and its issues and solutions. Its a public group, anyone can join and share their ideas.
Activity: Very Active, 224 discussions this month
Members Count: 3,339 members
Created: July 6, 2011
7. Hadoop India
This group is a regional one, focused on members in India who use Hadoop. The most represented area in this group is Bengaluru, India (29%).
Description: Group for Hadoop India Users
Activity: Very Active, 199 discussions this month
Members Count: 3,185 members
Created: September 2, 2009
8. Big Data Analytics and Hadoop
This moderately active group discusses Hadoop and Big Data. Note that at least 57% of the members of this group are in India, with 35% coming from Hyderabad.
Description: Big Data Analytics and Hadoop group enables knowledge sharing, networking and increasing the awareness. We strive hard to create a strong community and identify and share exciting happenings in the field. We encourage practitioners to use the group for announcements, soliciting opinions from their members.
Activity: Active, 30 discussions this month
Members Count: 2,708 members
Created: June 14, 2012
9. Big Data Solutions Software Infrastructure Applications Consultants Experts Hadoop Analytics
This group is a subgroup of HADOOP Professionals, which is group #4 above. Members of this group are primarily in the Middle East, with 30% from Pakistan and 18% from the United Arab Emirates.
Description: Information Technology Predictive Modeling Business Intelligence BI Decision Support Text Mining Virtualization Statistics Apps Developer Software Enterprise Mobile Web Machine Oracle Database SAAS Linux Java API SSD Consultant Cloudera MapR Greenplum San Francisco New York DC Seattle Boston Canada
Activity: Very Active, 82 discussions this month
Members Count: 2,855 members
Created: January 17, 2011
10. BIG DATA Professionals Architects Scientists Analytics Experts Developers Cloud Computing NoSQL BI
This is an extremely active group, with over 900 new discussions per month. Remember, you can set your group email preferences to a summary digest to limit the number of emails that you receive from this group.
Description: Information Technology Predictive Modeling Business Intelligence BI Decision Support Text Mining Machine Virtualization Statistics Apps Developer Software Enterprise Mobile Web Oracle Database SAAS Linux Java API SSD Consultant Cloudera MapR Greenplum San Francisco New York DC Seattle Boston Canada.
Activity: Very Active, 937 discussions this month
Members Count: 14,828 members
Created: September 1, 2008
You can join LinkedIn discussion groups to learn about the latest happenings in the world of Hadoop, as well as to engage in discussions with other professionals online. When you contribute positively to the discussions and help the community, you get the additional benefit of being recognized as a Hadoop expert by recruiters and others, which will open up more opportunities for you.
What other groups on LinkedIn have you found to be helpful? You can share your response below.
Please stay tuned for Part 2 of my series on professional communities for Hadoop and Big Data analytics.
Image credit: LinkedIn