Distributed data processing platform that offers the following core capabilities:

  • YARN - cluster resource manager (manages resources, nodes, and applications)
  • HDFS - distributed storage (NameNode, DataNodes)
  • Map/Reduce - distributed computing
    • programming model - the technique (map and reduce phases over key/value pairs)
    • programming framework - the API/service used to write and run such jobs
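
To make the "programming model" bullet concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps, using word count as the example. It is plain Python and does not use the framework's actual API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster the framework would run the map and reduce steps in parallel across nodes and do the shuffle over the network.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence inside one process (illustration only)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))  # {'a': 2, 'b': 2, 'c': 1}
```

The point of the split is that `map_phase` and `reduce_phase` each look at one line or one key at a time, which is what lets a framework distribute them across many machines.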