Written by Umesh Kakkad
on November 22, 2013

Various components of the Hadoop ecosystem

1. Hadoop Distributed File System (HDFS): Clustered, redundant file system for Hadoop

2. HBASE: Column-oriented database scaling to billions of rows

3. MAPREDUCE: Parallel computation on a cluster of servers

4. MAHOUT: Library of machine learning and data mining algorithms

5. HIVE: Data warehouse with SQL-like access

6. PIG: High-level programming language for Hadoop

7. HCATALOG: Schema and data type sharing across PIG, HIVE and MAPREDUCE

8. SQOOP: Bulk transfer of data between relational databases and Hadoop

9. FLUME: Collection and import of log and event data

10. WHIRR: Cloud-agnostic deployment of clusters

11. OOZIE: Orchestration and workflow management

12. AMBARI: Deployment, configuration and monitoring

13. ZOOKEEPER: Configuration management and coordination
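
To make the MAPREDUCE entry in the list above a little more concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps, using word counting as the example. It is plain Python and does not use Hadoop at all. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made up for illustration; on a real cluster the framework runs the map and reduce steps in parallel across servers and handles the grouping between them.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence (hypothetical helper)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    # Illustrative input only; not taken from any real data set.
    sample = ["Hive runs on Hadoop", "Pig runs on Hadoop"]
    print(word_count(sample))
```

Because each line is mapped independently and each key is reduced independently, both steps can be spread over many machines, which is what makes the model suitable for a cluster.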
