Benchmarking big data recommendation algorithms using Hadoop or Apache Spark

Book: Big Data Recommender Systems - Volume 1: Algorithms, Architectures, Big Data, Security and Trust


Recommender (or recommendation) systems have gained popularity in recent years, and big data is the driving force behind them. They have changed the way websites communicate with users by providing recommendations based on a user's history, such as purchases and searches. Recommendation systems are used in a variety of areas, including movies, music, research articles and social tags; examples are Facebook's “People you may know,” Netflix's “Because you watched” and YouTube's “Recommend for you.” These systems usually produce a list of recommendations in one of two ways: collaborative filtering (CF) or content-based (CB) filtering. CF is based on a model of prior user behavior, which can be constructed from a single user's actions or from the actions of other users with similar behavior, while CB filtering builds a recommendation from the user's own behavior, for example historical browsing information. A hybrid approach combines the two models. Designing such systems requires computing function values at several thousand points and is therefore computationally expensive. Parallel computation is needed to speed up the search for an acceptable solution to recommend, and this search can be carried out through nature-inspired computation. Many factors are essential when designing accurate recommendation algorithms, among them diversity, recommender persistence, privacy, user demographics, trust and labeling. A recommendation system cannot do its job without data, and big data supplies the required volume of user data, such as past purchase history and browsing history [1,2]. In fact, an efficient recommendation system requires big data. Hadoop is presented as the best solution: it is a platform used to store, generate, manage and distribute big data across many large server nodes [3-5]. Hadoop offers the Hadoop distributed file system (HDFS), which distributes the data across the nodes of a cluster and supports parallel operations on it. This chapter explores big data issues, specifically in Hadoop and HDFS.
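
To make the collaborative filtering idea in the abstract concrete, here is a minimal single-machine sketch of user-based CF in plain Python. It is not the chapter's method: the toy `ratings` data, the function names `cosine` and `recommend`, and the choice of cosine similarity over co-rated items are all assumptions made for illustration. A Hadoop or Spark version would distribute the same similarity and scoring steps across a cluster.

```python
from math import sqrt

# Hypothetical toy data (user -> {item: rating}); not taken from the chapter.
ratings = {
    "alice": {"m1": 5.0, "m2": 3.0, "m3": 4.0},
    "bob":   {"m1": 4.0, "m2": 3.0, "m4": 5.0},
    "carol": {"m2": 1.0, "m3": 2.0, "m4": 2.0},
}

def cosine(a, b):
    """Cosine similarity restricted to the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by the similarity-weighted
    average of the ratings given by other users."""
    seen = ratings[user]
    scores, weights = {}, {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, their)
        if sim <= 0:
            continue  # ignore users with no overlap or no similarity
        for item, rating in their.items():
            if item in seen:
                continue
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [(item, score) for score, item in ranked[:top_n]]

if __name__ == "__main__":
    print(recommend("alice", ratings))
```

With this toy data the only item "alice" has not rated is "m4", so it is the single recommendation, with a predicted rating that falls between the ratings given by the other two users. The pairwise similarity loop is the part that grows quadratically with the number of users, which is one reason such workloads are typically parallelized.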

Chapter Contents:

  • 3.1 Introduction
  • 3.2 Big data
  • 3.2.1 Hadoop
  • 3.2.2 Presenting the MapReduce model
  • 3.2.3 Hadoop input/output
  • 3.2.4 Apache Ambari and Ambari architecture
  • 3.2.5 Future of Hadoop
  • 3.2.6 How Hadoop works in social networking
  • 3.3 Apache Spark
  • 3.4 Recommender systems
  • 3.4.1 Design of recommender systems
  • 3.4.2 Collaborative recommendation and collaborative filtering
  • 3.4.3 Reducing the sparsity problem
  • 3.4.4 Content-based recommendation
  • 3.4.5 Visualization of recommendation
  • 3.4.6 Hybrid recommendation approaches
  • 3.5 Systems based on nature-inspired algorithms
  • 3.6 Benchmarking: big data benchmarking
  • 3.7 Summary
  • References

Inspec keywords: data analysis; distributed databases; collaborative filtering; recommender systems; Big Data; social networking (online)

Other keywords: content-based filtering; Hadoop distributed file system; recommender system; YouTube “Recommend for you”; historical browsing information; efficient recommendation system; Apache Spark; collaborative filtering; big data recommendation algorithms; Facebook “People you may know”; recommender persistence; Netflix “Because you watched”

Subjects: Data handling techniques; Distributed databases; Information networks; Information retrieval techniques
