Data Fusion in Wireless Sensor Networks: A statistical signal processing perspective
2: Department of Advanced Analytics and Machine Learning, Kongsberg Digital AS Norway, City, Norway
The role of data fusion has been expanding in recent years through the incorporation of pervasive applications, where the physical infrastructure is coupled with information and communication technologies, such as wireless sensor networks for the internet of things (IoT), e-health and Industry 4.0. In this edited reference, the authors provide advanced tools for the design, analysis and implementation of inference algorithms in wireless sensor networks. The book is directed at the sensing, signal processing, and ICT research communities. The contents will be of particular use to researchers (from academia and industry) and practitioners working in wireless sensor networks, IoT, e-health and Industry 4.0 applications who wish to understand the basics of inference problems. It will also be of interest to professionals, and graduate and PhD students who wish to understand the fundamental concepts of inference algorithms based on intelligent and energy-efficient protocols.
Inspec keywords: sensor fusion; wireless sensor networks; Internet of Things; telecommunication power management
Other keywords: wireless sensor networks; fusion center; data fusion; statistical signal processing; centralized architectures; decentralized architectures; low-power wireless communications; energy harvesting; Internet-of-Things
Subjects: Signal processing and detection; Textbooks; General and management topics; Wireless sensor networks; Telecommunication systems (energy utilisation); Education and training; Sensor fusion
- Book DOI: 10.1049/PBCE117E
- Chapter DOI: 10.1049/PBCE117E
- ISBN: 9781785615849
- e-ISBN: 9781785615856
- Page count: 336
- Format: PDF
Front Matter
p. (1)
Part I Sensing model uncertainty
1 Generalized score-tests for decision fusion with sensing model uncertainty
p. 3–23 (21)
This chapter investigates distributed detection of a phenomenon of interest (POI) via decision fusion in wireless sensor networks (WSNs). The decisions are collected by a fusion center (FC), which is in charge of performing a more accurate global decision. So as to account for a realistic scenario, it is assumed that the POI presents a signature with limited spatial extent, and its exact location and emitted amplitude (or energy) are not known. More specifically, when the POI is present, the sensors observe a signal with an attenuation depending on the distance between the sensor and the (unknown) target position, embedded in Gaussian noise. The unavailability of a completely specified model defeats the applicability of the well-known (optimal) likelihood-ratio (LR) test (LRT). As a consequence, in the general case, the FC is usually in charge of solving a composite hypothesis test and the generalized LRT (GLRT) is commonly employed. Unfortunately, in these scenarios, its complexity is typically high. Accordingly, the present chapter discusses the development of generalized score tests as alternatives with reduced computational complexity. After a brief recall of the GLRT for the considered problems, fusion rules corresponding to generalized versions of well-known score tests are introduced, based on Davies' framework, since the resulting problems include nuisance parameters only under the POI-present hypothesis. The focus is on two relevant signal models, i.e., the cases of random and deterministic unknown signals, leading to one-sided and two-sided testing, respectively. Finally, a convincing (semi-theoretical) rationale for threshold-optimization is presented and analyzed.
2 Compressed distributed detection and estimation
p. 25–56 (32)
Detection and estimation are two fundamental tasks that are performed by distributed sensor networks. It is a challenging problem to design efficient protocols and algorithms to perform these tasks while taking the inherently scarce network resources, such as limited node power and limited communication bandwidth, into account. Although there is quite a rich literature related to energy efficiency in distributed sensor networks, there is still much ongoing research on how to optimize power and communication bandwidth in processing high-dimensional (multimodal) data generated at emerging sensors with high fidelity and resolution. Recent advances in compressive sensing (CS) have led to novel ways of thinking about energy-efficient signal processing in sensor networks. CS is well motivated for distributed sensor network applications since data compression at the sensors prior to transmission to the fusion center (FC) is vital to minimize the energy and communication requirements. The universal and agnostic nature of the CS measurement scheme is promising in acquiring compressed data for a variety of inference tasks. In many distributed sensor network applications, sparsity is a common characteristic that can be observed in various forms. While CS theory focuses largely on sparse signal reconstruction, further research beyond the standard CS framework is needed to understand its applicability in solving a variety of inference problems. In this book chapter, our goal is to provide an up-to-date review of CS-based detection and estimation as applicable to sensor networks. After an introduction, we provide a brief overview of the theory of CS. Then, we discuss CS-based detection and parameter estimation problems considering different signal and noise models. The impact of compression via CS on detection and estimation is quantified in terms of different performance metrics.
3 Heterogeneous sensor data fusion by deep learning
p. 57–77 (21)
Heterogeneous sensor data fusion for decision-making is a challenging field that has gathered significant interest in recent years. In agriculture, for example, environmental conditions such as temperature, illuminance and humidity can be correlated with plant growth data, so that appropriate actions may be taken to maximize crop yield. In this chapter, we will provide an overview of heterogeneous sensor data fusion, including the background, basic deep learning techniques, and how these techniques can be used for sensor data fusion tasks. We will close this chapter with a detailed case study.
Part II Reporting channel uncertainty
4 Energy-efficient clustering and collision-aware distributed detection/estimation in random-access-based WSNs
p. 81–110 (30)
In this chapter, we focus exclusively on our work on gathering data from a large number of sources in the stadium. In the theoretical work reported here, we are particularly interested in minimizing the time and energy required to gather data from many different sensor nodes. There is a parallel effort, reported elsewhere, to take the results of our theoretical research and turn them into real systems that we deploy in the stadium and study when they are used during the sporting events. Our current efforts include:
- Gathering video and data from the game to share with fans in the stands via web applications that enable on-demand access to multimedia content, including video clips of plays, visualization of game events, and game/player stats.
- Developing wireless sensor networks to monitor structural vibrations of the stadium and audio of the crowd.
- An extreme emitter density test bed for RF spectrum sensing and cross-layer localization of wireless devices.
- Analytical models and associated algorithms for collecting, processing, and communicating very large amounts of data for detection, estimation, and other tasks.
5 Channel-aware decision fusion in MIMO wireless sensor networks
p. 111–129 (19)
This chapter deals with a distributed version of the binary hypothesis test, which formalizes the case in which a wireless sensor network (WSN) is used for detecting a binary event, and a fusion center (FC) with multiple antennas collects the information for a robust decision. The presence of multiple antennas at both transmit and receive sides resembles a multiple-input-multiple-output (MIMO) system and allows for utilization of array-processing techniques providing spectral efficiency and fading mitigation. The problem is here referred to as MIMO decision fusion. Coherent decision fusion, i.e., the case when instantaneous channel state information (CSI) is available at the FC, is studied first: “Decode-and-Fuse” and “Decode-then-Fuse” approaches are introduced and compared. Next, noncoherent decision fusion, i.e., the case when only statistical CSI is available at the FC, is analyzed: the focus is on the energy test and related optimality characteristics.
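As a rough illustration of the noncoherent case, an energy test compares the total received energy across the FC antennas with a threshold. The sketch below is a minimal example under stated assumptions, not the chapter's own formulation: the function names `energy_statistic` and `energy_test` are hypothetical, the received vector is assumed to hold one complex sample per antenna, and how the threshold is chosen is left open.

```python
def energy_statistic(received):
    """Sum of squared magnitudes of the complex samples at the FC antennas."""
    return sum(abs(sample) ** 2 for sample in received)


def energy_test(received, threshold):
    """Decide 'event present' (True) when the received energy exceeds the threshold.

    The threshold is an assumed design parameter, e.g. one set to meet
    a target false-alarm rate.
    """
    return energy_statistic(received) > threshold


# Two antennas; the energy is |3+4j|^2 + |1j|^2.
samples = [3 + 4j, 1j]
decision = energy_test(samples, threshold=10.0)
```

The statistic needs no instantaneous channel knowledge, which is why it suits the case where only statistical CSI is available.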
6 Channel-aware detection and estimation in the massive MIMO regime
p. 131–151 (21)
This chapter investigates channel-aware distributed detection (viz. binary hypothesis testing, HT) and estimation (EST) over a “virtual” and “massive” multiple-input-multiple-output (MIMO) channel at the fusion center (FC), underlining analogies and differences with uplink communication in a multiuser (massive) MIMO setup. The considered scenario takes into account channel estimation and inhomogeneous large-scale fading between the sensors and the FC. In the former case, the aim is the development of (widely) linear fusion rules, as opposed to the unsuitable (optimum) log-likelihood ratio (LLR). In the latter case, the aim is the power allocation design for decentralized estimation of a correlated random source vector with amplify-and-forward sensors and an FC adopting a minimum mean square error (MMSE) approach. In both cases, the well-known favorable propagation condition achieved in massive MIMO is exploited. In the HT problem, this greatly simplifies the development of suboptimal rules, whereas for the EST problem it yields an asymptotic MSE approximation, which is then used with convex optimization techniques to solve the optimal sensor power allocation problem efficiently.
Part III Distributed inference over graphs
7 Decentralized detection via running consensus
p. 155–174 (20)
Consensus by sensor gossip, which ensures information retrieval from any subset of sensors at an arbitrary instant of time, is a popular paradigm for modern sensor networks designed for inference purposes. In realistic applications, the network continuously senses the surrounding environment, while consensus among its nodes is simultaneously enforced. The basic consensus equation is coherently modified, allowing the sensor state update to include both the consensus contribution of neighboring nodes and new measurements. This new paradigm is often referred to as running consensus. We review the state of the art of running consensus techniques and discuss their applications with special emphasis on detection problems. The running consensus, when compared with the ideal centralized detection statistic, is affected by an error term that, under suitable conditions, can be negligible. We study such conditions exploiting the theory of locally optimum statistics. We prove the asymptotic equivalence of the running consensus with the ideal centralized system, in terms of detection capabilities. Specifically, such asymptotic optimality is demonstrated for two cases: the fixed sample size (FSS) test and the sequential test.
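To make the idea of a state update that mixes neighbor contributions with fresh measurements concrete, the sketch below shows one commonly used form of a running-consensus recursion. It is an illustrative example and not necessarily the chapter's exact equation. The ring topology, the mixing weights, the Gaussian measurements, and all function names are assumptions introduced here.

```python
import random


def ring_weights(k, self_weight=0.5):
    """Symmetric, doubly stochastic mixing weights for a ring of k nodes (assumed topology)."""
    side = (1.0 - self_weight) / 2.0
    w = [[0.0] * k for _ in range(k)]
    for i in range(k):
        w[i][i] += self_weight
        w[i][(i - 1) % k] += side
        w[i][(i + 1) % k] += side
    return w


def running_consensus_step(states, measurements, weights, n):
    """One update at time n (1-indexed).

    Each node first folds its new measurement into a running average,
    then the nodes mix these values through the consensus weights.
    """
    k = len(states)
    local = [((n - 1) / n) * states[i] + measurements[i] / n for i in range(k)]
    return [sum(weights[i][j] * local[j] for j in range(k)) for i in range(k)]


def simulate(k=6, steps=200, mean=1.0, seed=0):
    """Run the recursion on synthetic Gaussian data and return (states, centralized average)."""
    rng = random.Random(seed)
    w = ring_weights(k)
    states = [0.0] * k
    total = 0.0
    for n in range(1, steps + 1):
        y = [rng.gauss(mean, 1.0) for _ in range(k)]
        total += sum(y)
        states = running_consensus_step(states, y, w, n)
    return states, total / (k * steps)


states, centralized = simulate()
```

With doubly stochastic weights, the network-wide mean of the states equals the centralized running average at every step. Each individual state differs from it by an error term that shrinks as more measurements arrive.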
8 Distributed recursive testing of composite hypothesis in multi-agent networks
p. 175–200 (26)
This chapter considers the problem of recursive composite hypothesis testing in a network of sparsely connected agents. In classical centralized composite hypothesis testing, procedures such as the generalized likelihood ratio test (GLRT), i.e., the detection procedure which uses the underlying parameter estimate based on all the collected samples as a plug-in estimate, may exhibit poor performance until a reasonably accurate parameter estimate (typically the maximum likelihood estimate of the underlying parameter/state) is obtained. Usually, in setups that employ the classical (centralized) GLRT, the data-collection phase precedes the parameter estimation and detection statistic update phase, thus rendering the testing an essentially offline batch procedure. The motivation for studying distributed recursive online detection algorithms, in contrast to offline batch-processing-based detection algorithms, is that in most multi-agent networked scenarios, which are typically energy constrained, the priority is to obtain reasonable inference performance while expending fewer resources.
9 Expectation–maximisation based distributed estimation in sensor networks
p. 201–230 (30)
Estimating the unknown parameters of a statistical model based on the observations collected by a sensor network is an important problem with application in multiple fields. In this setting, distributed processing, by which computations are carried out within the network in order to avoid raw data transmission to a fusion centre, is a desirable feature resulting in improved robustness and energy savings. In the presence of incomplete data, the expectation-maximisation (EM) algorithm is a popular means to iteratively compute the maximum likelihood (ML) estimate. It has found application in diverse fields such as computational biology, anomaly detection, speech segmentation, reinforcement learning, and motion estimation, among others. In this chapter we will review the formulation of the centralised EM estimation algorithm as a starting point and then discuss distributed versions well suited for implementation in sensor networks. The first class of these distributed versions requires specialised routing through the network in terms of a linear or circular path visiting all nodes, whereas the second class does away with this requirement by using the concept of network consensus to diffuse information through the network. Our focus will be on a relevant sensor network application, in which the parameter of a linear model is to be estimated in the presence of an unknown number of randomly malfunctioning sensors.
Part IV Cross-layer issues
10 Distributed estimation in energy harvesting wireless sensor networks
p. 233–259 (27)
In this chapter, distributed estimation is examined for energy-harvesting wireless sensor networks (WSNs), where the energy available at the sensors is converted entirely from ambient sources. In this application, each sensor takes a local measurement of the common parameter of interest and forwards it to the fusion center, where the final estimate is performed. Due to the randomness of the energy arrival, the transmission energy and status of the energy-harvesting sensors are unknown and, thus, the final maximum likelihood estimate at the fusion center can be computed using the expectation-maximization (EM) algorithm. Furthermore, by taking into consideration the spatial heterogeneity of the energy arrival, the sensor deployment problem is also examined for the purpose of reconstructing the entire random field. Numerical simulations are provided to demonstrate the effectiveness of the proposed schemes.
11 Secure estimation in wireless sensor networks in the presence of an eavesdropper
p. 261–290 (30)
In this chapter, we investigate the performance of distributed estimation schemes in wireless sensor networks (WSNs), in the presence of an eavesdropper. The sensors transmit observations to the fusion center (FC), which at the same time are overheard by the eavesdropper. Both the FC and the eavesdropper reconstruct a minimum mean squared error (MMSE) estimate of the physical quantity observed. We address the problem of transmit power allocation for system performance optimization subject to a total average power constraint on the sensor(s), and a security/secrecy constraint on the eavesdropper. We introduce a notion of security in estimation that aims to keep the eavesdropper's mean squared error (MSE) or distortion above a certain level. We study this notion in (1) an expected sense, (2) a short-term sense and (3) a probabilistic sense via the concept of secrecy outage. Various system scenarios will be considered, such as multiple sensors, multiple transmit antennas and full and partial channel state information (CSI).
12 Robust fusion of unreliable data sources using error-correcting output codes
p. 291–311 (21)
The emergence of the big and dirty data era demands new distributed learning and inference solutions to tackle the problem of inference with corrupted data. The central goal of this chapter is to discuss the presence of corrupted data in the context of distributed inference networks (DINs) and discuss coding-theoretic strategies to ensure reliable inference performance in several practical scenarios. It discusses a generalization of the classical Byzantine Generals problem in the context of distributed inference to different topologies. Over the last three decades, the research community has extensively studied the impact of imperfect transmission channels or sensor faults on distributed inference systems. However, corrupted (Byzantine) data models, considered in this chapter, are philosophically different from the imperfect channels or faulty sensor cases. Byzantines are intentional and intelligent and therefore can optimize over the data corruption parameters. While learning their behavior and actively countering them is a viable approach, this chapter presents a new paradigm of mitigation strategies that use coding-theoretic results. The general approach of error-correcting output codes (ECOC) for data fusion is presented and its applicability to several practical inference problems dealing with unreliable data, including Byzantines, is shown. This approach is then shown to be applicable to a wider range of inference problems such as classification using crowdsourced data.
13 Conclusions and future perspectives
p. 313–315 (3)
This edited book has dealt with data fusion in wireless sensor networks (WSNs) from a statistical signal-processing perspective. The effective use of data fusion in sensor networks is not new and has had extensive application to surveillance, security, traffic control, health care, environmental and industrial monitoring in the last decades. However, the rising paradigms of Internet-of-Things (IoT) and cyber-physical systems are fueling the huge, steadily increasing and pervasive (i.e., virtually in every spot) presence of wireless-connected network-enabled devices, in most cases inexpensive sensors, in everyday life (including most workplaces). On one hand, this represents an unprecedented opportunity toward the accomplishment of the innovative concepts of Smart Cities, Smart Factories, Industry 4.0 and Society 5.0. On the other hand, the stringent bandwidth and energy limitations imposed by the network, coupled with sensor-fabrication (e.g., size and cost) constraints, make the design of inference systems, able to fuse sensors' information to obtain high-level information of a certain phenomenon of interest, very challenging. This picture is worsened by the obvious necessity of dealing with measurements drawn from heterogeneous (multimodal) sensors, which provide information diversity and enable the collection of different kinds of data (in big volumes) about the surrounding scenario. The aim of this book was to provide a systematic overview of the relevant issues and corresponding methodological solutions to some of the key milestones within this (wide) research field. These include the sensing and reporting phases, the architectural choice of the WSN (with the corresponding implications) and some selected cross-layer aspects, imposed by both the context and novel technological advancements.
Back Matter
p. (1)