Big Data processing using Apache Spark and Hadoop

Author(s): Koichi Shirahata and Satoshi Matsuoka
Source: Big Data and Software Defined Networks, 2018
Publication date March 2018

In this chapter, we give an overview of Big Data processing and of how Big Data is processed with Apache Hadoop and Spark, mostly on distributed computing platforms.

Chapter Contents:

  • 6.1 Introduction
  • 6.2 Big Data processing
      • 6.2.1 Big Data processing models
      • 6.2.2 Big Data processing implementations
      • 6.2.3 MapReduce-based Big Data processing implementations
      • 6.2.4 Computing platforms for Big Data processing
  • 6.3 Apache Hadoop
      • 6.3.1 Overview of Hadoop
      • 6.3.2 Hadoop MapReduce
      • 6.3.3 Hadoop distributed file system
      • 6.3.4 YARN
      • 6.3.5 Hadoop libraries
      • 6.3.6 Research activities on Hadoop
  • 6.4 Apache Spark
      • 6.4.1 Overview of Spark
      • 6.4.2 Resilient distributed dataset
      • 6.4.3 Spark libraries
      • 6.4.4 Using both Spark and Hadoop cooperatively
      • 6.4.5 Research activities on Spark
  • 6.5 Open issues and challenges
      • 6.5.1 Storage
      • 6.5.2 Computation
      • 6.5.3 Network
      • 6.5.4 Data analysis
  • 6.6 Summary
  • References

Inspec keywords: Big Data; parallel processing

Other keywords: Hadoop; Apache Spark; distributed computing platforms; Big Data processing

Subjects: Parallel software; Data handling techniques
