Big Data and Software Defined Networks
Big Data analytics and Software Defined Networking (SDN) are helping to manage data and to exploit the extraordinary increase in computer processing power provided by Cloud Data Centres (CDCs). SDN helps CDCs run their services more efficiently by enabling managers to configure, manage, secure, and optimize network resources very quickly. Big Data analytics, in turn, has entered CDCs to harness their massive computing power and deduce information that was never reachable by conventional methods. Big Data and Software Defined Networks investigates areas where Big Data and SDN can help each other deliver more efficient services. SDN can help Big Data applications overcome one of their major challenges: message passing among cooperative nodes. Through proper bandwidth allocation and prioritization, critical surges of Big Data flows can be better handled to effectively reduce their impact on CDCs. Big Data, in turn, can help SDN controllers better analyze collected network information and make more efficient decisions about the allocation of resources to different network flows.
Inspec keywords: bandwidth allocation; software defined networking; information analysis; resource allocation; Big Data; message passing
Other keywords: collected network information analysis; network flows; bandwidth allocation; SDN controllers; software defined networks; cooperative nodes; message passing; resource allocation; Big Data
Subjects: Computer networks and techniques; General and management topics; General electrical engineering topics; Distributed systems software; Data handling techniques; Computer communications
- Book DOI: 10.1049/PBPC015E
- Chapter DOI: 10.1049/PBPC015E
- ISBN: 9781785613043
- e-ISBN: 9781785613050
- Page count: 504
- Format: PDF
Front Matter
p. (1)
Part I. Introduction
1 Introduction to SDN
p. 3–25 (23)
In today's networking world, traditional networks fall badly short of the capabilities that are needed. SDN separates the data and control planes, increases the granularity of data control, and simplifies both the network equipment and its integration with management systems and OSS/BSS. This automates and accelerates the development of new services.
2 SDN implementations and protocols
p. 27–48 (22)
This chapter begins by explaining the main SDN concepts, with a focus on the SDN controller. It presents the most important aspects to consider when moving from a traditional network to an SDN network. We present an in-depth analysis of the most commonly used modern SDN controllers and analyse the main features, capabilities and requirements of one of them. OpenFlow is the leading standard in the market for managing forwarding-plane devices such as routers or switches. While there are other standards with the same aim, OpenFlow has secured a position in the market and has expanded rapidly. Therefore, an analysis is presented of different OpenFlow-compatible devices for the implementation of an SDN network. This study encompasses both software and hardware solutions, along with the scope of implementation or use of these devices. The chapter ends by presenting a description of OpenFlow protocol alternatives, a more detailed description of OpenFlow and its components, and other well-known southbound protocols involved in the management and configuration of the devices.
3 SDN components and OpenFlow
p. 49–67 (19)
Today's Internet suffers from ever-increasing challenges in scalability, mobility, and security, which call for deep innovations in network protocols and infrastructures. However, the distributed control mechanism, especially the bundling of the control plane and the data plane within network devices, sharply restricts such evolution. In response, software-defined networking (SDN), an emerging networking paradigm, proposes to decouple the control and data planes, producing logically centralized controllers, simple yet efficient forwarding devices, and the potential to program network functionality. This chapter presents a short yet comprehensive overview of SDN components and the OpenFlow protocol on the basis of both classic and recent literature. The topics range from fundamental building blocks, layered architectures, and novel control mechanisms to the design principles and efforts behind OpenFlow switches.
4 SDN for cloud data centres
p. 69–89 (21)
In this chapter, we provide a technical overview of cloud DCs and the evolution of their network infrastructure, and discuss how SDN has emerged as a prominent technology for configuring and managing large-scale complex networks in this space. After comparing and contrasting the most common DC network topologies (such as canonical and fat tree, BCube, DCell, etc.), we discuss the main challenges that SDN can help address thanks to, among other things, the fast and flexible deployment of advanced services it can facilitate, its inherent programmability, and its suitability for supporting measurement-based resource provisioning. We subsequently describe the benefits of using SDN for DC network configuration and management and briefly outline some prominent SDN deployments over large-scale DCs. We discuss the potential of SDN to play the role of the central nervous system for the converged management of server and network resources over single-administrative DC environments and, finally, we highlight promising open issues for future research and development in this area.
5 Introduction to big data
p. 91–114 (24)
The amount of data generated during the last few years has been unprecedented. This is due not only to the prevalence of online social networks and the ubiquitous devices connected to the Internet but also to advances in technology across other fields, for instance, whole genome sequencing. Hence, it is fair to say that we are living in the era of big data. Big data refers to large datasets or data flows that have outpaced our capability to store and process them and that cannot be analyzed by traditional means. In the presence of these challenges, traditional platforms fail to deliver the expected performance, and thus new systems for storing and processing large-scale data are urgently needed. In this chapter, we explore some of the new technology trends for handling big data.
6 Big Data processing using Apache Spark and Hadoop
p. 115–138 (24)
In this chapter, we give an overview of what Big Data processing is and how Big Data is processed using Apache Hadoop and Spark, mostly on distributed computing platforms.
7 Big Data stream processing
p. 139–158 (20)
Since the beginning of the twenty-first century, research interest has been growing in a new model of streamlined data processing, driven by data volumes in today's market so large that storing and processing them in the traditional way is impossible. Data stream processing (DSP) is a computational paradigm that enables the real-time processing of continuous data streams instead of maintaining the static relationship among them. In this model, a large volume of raw data tuples enters the ecosystem in a rapid, continuous, streaming manner. Such a set of streams is unbounded in size, while data arrival and data processing both happen online.
8 Big Data in cloud data centers
p. 159–182 (24)
Big Data refers to a massive volume of data that cannot be processed by conventional data processing tools and technologies. In recent years, the sources of data production have grown noticeably and now include high-end streaming devices, wireless sensor networks, satellites, and wearable Internet of Things devices. These sources generate massive amounts of data continuously. Nowadays, Big Data analytics plays a significant role in various environments, including business monitoring, healthcare applications, production development, research and development, share market prediction, business process, industrial applications, social network analysis, weather analysis and environmental monitoring. A data center is a facility composed of networked computers and storage that businesses or other organizations use to process, analyze, store and distribute huge volumes of data. In recent years, cloud data centers have been used to store and process Big Data. This chapter reviews various architectures for storing and processing Big Data in cloud data centers. In addition, it describes the challenges and applications of Big Data analytics in cloud data centers.
Part II. How SDN helps Big Data
9 SDN helps volume in Big Data
p. 185–205 (21)
Both Big Data and SDN are described in detail in previous chapters. This chapter investigates how the SDN architecture can leverage its unique features to mitigate the challenges of Big Data volume. Accordingly, first, we provide an overview of Big Data volume and its effects on the underlying network, and mention some potential SDN solutions to the corresponding challenges. Second, we elaborate on the network-monitoring, traffic-engineering, and fault-tolerant mechanisms that we believe may help to address the challenges of Big Data volume. Finally, the chapter concludes with some open issues.
10 SDN helps velocity in Big Data
p. 207–227 (21)
Currently, improving the performance of Big Data in general, and velocity in particular, is challenging because of the inefficiency of current network management and the lack of coordination between the application layer and the network layer, coordination that would allow better scheduling decisions and thereby improve Big Data velocity performance. In this chapter, we discuss the role of the recently emerged software defined networking (SDN) technology in helping the velocity dimension of Big Data. We start the chapter with a brief introduction to Big Data velocity, its characteristics, and the different modes of Big Data processing, followed by a brief explanation of how SDN can overcome the challenges of Big Data velocity. In the second part of the chapter, we describe in detail some proposed solutions that have applied SDN to improve Big Data performance in terms of shortened processing time in different Big Data processing frameworks, ranging from batch-oriented, MapReduce-based frameworks to real-time and stream-processing frameworks such as Spark and Storm. Finally, we conclude the chapter with a discussion of some open issues.
11 SDN helps value in Big Data
p. 229–251 (23)
In this chapter, we investigate the ways in which software-defined networking (SDN) [2-5] facilitates the creation of value in Big Data [6-8]. We use the term value inclusively: it refers to the monetary value that an organization could additionally generate from Big Data, as well as the extraction of knowledge, best practices, and transfer of knowledge resulting from Big Data. In order to cover the broad spectrum of ways in which SDN helps generate extra value from Big Data, the discussion focuses on four deployment scenarios spanning two dimensions: the infrastructure setting and type (i.e., centralized/decentralized, public/private) and the nature of the data (i.e., at rest/streamed and private/public).
12 SDN helps other Vs in Big Data
p. 253–273 (21)
Big Data is defined by a set of attributes or adjectives collectively known as the Vs of Big Data. Among these Vs, we discussed how Software-Defined Networking (SDN) helps Big Data achieve volume, velocity, and value in the previous chapters. Variety, volatility, validity, veracity, and visibility can be considered the “other Vs” that define Big Data. In this chapter, we will look into these other Vs in Big Data, and how SDN can be leveraged to achieve them. We will further discuss how SDN-based Big Data solutions are designed, and how SDN controllers are extended and exploited to create network, middleware, and system architectures for Big Data, focusing on these attributes.
13 SDN helps Big Data to optimize storage
p. 275–295 (21)
Distributed key-value stores have become the sine qua non for supporting today's large-scale web services. The extreme latency and throughput requirements of modern web applications are driving the use of distributed in-memory object caches. Similarly, the use of persistent object stores has been growing rapidly, as they combine key advantages such as HTTP-based RESTful APIs, high availability, and elasticity with a pay-as-you-go pricing model that allows applications to scale as needed. Consequently, there is an urgent need to optimize the emerging software defined cloud data centers so that they support such applications efficiently at scale. In this chapter, we discuss different techniques for optimizing Big Data processing and data management using key-value stores and software defined networks in virtualized cloud data centers. Specifically, we explore two key questions. (1) How do cloud services users, i.e., tenants, get the most bang-for-the-buck with a distributed in-memory key-value store deployment in a shared multitenant environment? (2) How do tenants enhance a cloud object store's capabilities through fine-grained resource management to effectively meet their SLAs while maximizing resource efficiency? Moreover, we present the state of the art in this domain and provide a brief analysis of desirable features. We then demonstrate through experiments the impact of an SDN-based Big Data storage management solution on performance and overall resource efficiency. Finally, we discuss open issues in SDN-based Big Data I/O stacks and future directions.
14 SDN helps Big Data to optimize access to data
p. 297–317 (21)
This chapter introduces the state of the art in the emerging area of combining high performance computing (HPC) with Big Data analysis. To understand the new area, the chapter first surveys the existing approaches to integrating HPC with Big Data. Next, the chapter introduces several optimization solutions that focus on minimizing both the data transfer time from computation-intensive applications to analysis-intensive applications and the end-to-end time-to-solution. The solutions use Software Defined Networking (SDN) to adaptively exploit both high-speed interconnect networks and high-performance parallel file systems to optimize application performance. A computational framework called DataBroker is designed and developed to enable a tight integration of HPC with data analysis. Multiple types of experiments have been conducted to show different performance issues in both message passing and parallel file systems and to verify the effectiveness of the proposed research approaches.
15 SDN helps Big Data to become fault tolerant
p. 319–336 (18)
SDN networks have many advantages as fault-tolerant Big Data infrastructures, such as programmability and a global network view, which help monitor and control network behavior adaptively and efficiently. This chapter studies a number of requirements for providing fault tolerance in the networks on which Big Data applications run. First, we study the key requirements for fault tolerance; the network topology design is crucial to providing resiliency against node or link failure. Second, we present the principal concepts of fault tolerance and elaborate on reactive and proactive methods as two common approaches to dealing with failures in networks. Third, we explain the fault-tolerant mechanisms in the SDN architecture and their advantages. We then review a number of studies that leverage SDN to provide fault tolerance. Finally, the chapter concludes by introducing open issues and challenges in providing a fault-tolerant network with the SDN architecture.
Part III. How Big Data helps SDN
16 How Big Data helps SDN with data protection and privacy
p. 339–351 (13)
This chapter will discuss Big Data (BD) as a tool in software-defined networking (SDN) from the perspective of information privacy and data protection. First, it will discuss how BD and SDN are connected and expected to provide better services. Then, the chapter will describe the core of data protection and privacy requirements in Europe, followed by a discussion about the implications for BD use in SDN. The chapter will conclude with recommendations and privacy design considerations for BD in SDN.
17 Big Data helps SDN to detect intrusions and secure data flows
p. 353–373 (21)
In this chapter, we examine the security risks of SDN with the consideration of intrusions and abnormal data flows. Specifically, we discuss how SDN brings unique risks and threats to network service providers and customers. Then, we discuss the potential of integrating Big Data analytics into SDN for security enhancement and provide some examples to end this chapter.
18 Big Data helps SDN to manage traffic
p. 375–388 (14)
Traffic management plays a crucial role in achieving high-performance networking with optimal resource utilization. However, efficient and effective traffic management could be very challenging in large-scale dynamic networking environments. Software-defined networking (SDN) together with Big Data analytics offers a promising approach to addressing this challenging problem. We first provide an overview of the general process of network traffic management, in both conventional Internet Protocol (IP)-based networks and the emerging SDN networks. Then, we present an architectural framework of Big Data-based traffic management in SDN. We discuss some possible Big Data analytics applications for data analysis and decision-making in SDN for traffic management. We also identify some open issues and challenges that must be addressed for applying Big Data analytics techniques in SDN traffic management, which offer possible topics for future research and technology development.
19 Big Data helps SDN to optimize its controllers
p. 389–407 (19)
In this chapter, we first discuss the basic features and recent issues of the SDN control plane, notably the controller element. Then, we present feasible ideas for addressing SDN controller-related problems using Big Data analytics techniques. Accordingly, we propose that Big Data can help various aspects of the SDN controller to address scalability and resiliency issues. Furthermore, we propose six applicable scenarios for optimizing the SDN controller using Big Data analytics: (i) controller scale-up/out against network traffic concentration, (ii) controller scale-in for reduced energy usage, (iii) backup controller placement for fault tolerance and high availability, (iv) creating backup paths to improve fault tolerance, (v) controller placement for low latency between controllers and switches, and (vi) flow rule aggregation to reduce the SDN controller's traffic. Although real-world practices for optimizing SDN controllers using Big Data are absent from the literature, we expect the scenarios highlighted in this chapter to be highly applicable to optimizing the SDN controller in the future.
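As a rough illustration of scenario (v), controller placement can be read as choosing the sites that minimise the worst switch-to-controller latency. The sketch below is a brute-force toy under that assumption, not the chapter's method; the function name `place_controllers`, the node names, and the latency figures are all hypothetical.

```python
from itertools import combinations

def place_controllers(latency, k):
    """Pick k controller sites minimising the worst switch-to-controller latency.

    latency[a][b] is an assumed, illustrative latency between nodes a and b.
    """
    nodes = sorted(latency)
    best_sites, best_worst = None, float("inf")
    for sites in combinations(nodes, k):
        # Each switch uses its nearest controller; record the worst such latency.
        worst = max(min(latency[n][s] for s in sites) for n in nodes)
        if worst < best_worst:
            best_sites, best_worst = sites, worst
    return best_sites, best_worst

# Hypothetical four-node network with made-up latencies in milliseconds.
latency_ms = {
    "a": {"a": 0, "b": 2, "c": 7, "d": 9},
    "b": {"a": 2, "b": 0, "c": 5, "d": 8},
    "c": {"a": 7, "b": 5, "c": 0, "d": 3},
    "d": {"a": 9, "b": 8, "c": 3, "d": 0},
}
sites, worst = place_controllers(latency_ms, 2)
```

Exhaustive search is only practical for very small networks; at scale, the latency table would have to come from measured data and the search from a heuristic, which is where analytics over collected network information could plausibly help.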
20 Big Data helps SDN to verify integrity of control/data planes
p. 409–431 (23)
In this chapter, we apply Big Data analytics from a graph-computing perspective to help traffic engineering in SDN networks. Specifically, we propose a high-speed top-K shortest paths (KSP) algorithm to calculate routes, develop several efficient schemes for routing error detection, and present a novel edge-set-based graph processing engine to deal with large-scale graph data from SDN. Compared to existing solutions, the experiments show that our proposed KSP algorithm brings a 3-6× speedup, and our graph processing engine achieves a 3-16× speedup.
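For readers unfamiliar with the KSP problem itself, the following is a minimal sketch of what "top-K shortest paths" computes. It is a naive best-first enumeration of loop-free paths and is not the chapter's high-speed algorithm; the function name `k_shortest_paths`, the switch names, and the link costs are hypothetical.

```python
import heapq

def k_shortest_paths(graph, source, target, k):
    """Return up to k loop-free paths from source to target, cheapest first.

    graph maps each node to a dict {neighbour: non-negative link cost}.
    """
    # Each heap entry is (cost so far, path so far); heapq pops the cheapest.
    heap = [(0, [source])]
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in path:  # keep paths loop-free
                heapq.heappush(heap, (cost + weight, path + [neighbour]))
    return found

# Hypothetical four-switch topology with illustrative link costs.
topology = {
    "s1": {"s2": 1, "s3": 4},
    "s2": {"s3": 1, "s4": 5},
    "s3": {"s4": 1},
    "s4": {},
}
routes = k_shortest_paths(topology, "s1", "s4", 3)
```

Because every partial path is kept on the heap, this approach can grow exponentially on dense graphs, which is the kind of cost a purpose-built KSP algorithm aims to avoid.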
21 Big Data helps SDN to improve application specific quality of service
p. 433–455 (23)
This chapter first provides an outline of the current results in the domains of: (1) QoS/QoE CaM for real-time multimedia services that is supported by SDN, and (2) Big Data analytics and methods that are used for QoS/QoE CaM. Then, three specific use case scenarios with respect to video streaming services are presented, so as to illustrate the expected benefits of incorporating Big Data analytics into SDN-based CaM for the purposes of improving or optimizing QoS/QoE. In the end, we describe our vision and a high-level view of an SDN-based architecture for QoS/QoE CaM that is enriched with Big Data analytics' functional blocks and summarize corresponding challenges.
Back Matter
p. (1)