安哥网络 posted on 2014-11-22 01:57:20

Summary of the subprojects currently included in Hadoop

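The Hadoop project now contains many subprojects. Some were split out of the original Hadoop project, and others evolved on top of Hadoop. This post simply quotes the Hadoop documentation's descriptions of those subprojects, for reference.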

The project includes these subprojects:

[*]Hadoop Common: The common utilities that support the other Hadoop subprojects.
[*]Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
[*]Hadoop MapReduce: A software framework for distributed processing of large data sets on compute clusters.

Other Hadoop-related projects at Apache include:

[*]Avro™: A data serialization system.
[*]Cassandra™: A scalable multi-master database with no single points of failure.
[*]Chukwa™: A data collection system for managing large distributed systems.
[*]HBase™: A scalable, distributed database that supports structured data storage for large tables.
[*]Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
[*]Mahout™: A scalable machine learning and data mining library.
[*]Pig™: A high-level data-flow language and execution framework for parallel computation.
[*]ZooKeeper™: A high-performance coordination service for distributed applications.

Source: http://blog.csdn.net/derekjiang/article/details/6834657
