Fixing Data Anomalies with Prediction Based Algorithm in Wireless Sensor Networks


📝 Original Info

  • Title: Fixing Data Anomalies with Prediction Based Algorithm in Wireless Sensor Networks
  • ArXiv ID: 1111.3334
  • Date: 2011-11-15
  • Authors: Author information is not provided in the source.

📝 Abstract

Data inconsistencies are present in the data collected over a large wireless sensor network (WSN), usually deployed for some kind of monitoring application. Before passing this data to WSN applications for decision making, it is necessary to ensure that the data received are clean and accurate. In this paper, we use a statistical tool to examine the past data of a given sensor node and fit a highly sophisticated prediction model, i.e., ARIMA; the model then corrects the data using the forecast value if any data anomaly exists. Another scheme is also proposed for detecting data anomalies at the sink among the aggregated data received from a particular sensor node. The effectiveness of our methods is validated with data collected over a real WSN application consisting of Crossbow IRIS Motes \cite{Crossbow:2009}.

💡 Deep Analysis

Figure 1

📄 Full Content

Wireless Sensor Networks (WSNs) are formed by a large number of autonomous units called sensor nodes. Each sensor node is capable of sampling data, processing it, and sending it through a radio transmitter. In this respect, each sensor node is independent in its sampling and sending mechanisms and in its values. This independent operation of the sensor nodes gives rise to the notion of independent data transmission to the base station, so the aggregated data at the sink are independent. The base station is a processing center, also called the sink node or simply the sink.

WSNs are extensively used in natural environment monitoring and inventory management. Many specific applications have been developed to monitor very delicate processes, including nuclear reactor control, habitat monitoring, object tracking, mine monitoring, fire detection, and wildlife monitoring. Depending on the application and user requirements, sensor nodes report data to the sink in either synchronous or asynchronous mode. Usually, sensor nodes sense data in a fixed, time-indexed manner and transmit the data to the sink periodically.

The WSN-based applications mentioned above use aggregated data to perform a certain task and give meaningful outputs to the network or to the user. The aggregated data from the WSN may be affected by anomalies in the WSN. Anomaly detection is possible when the aggregated data at the sink do not follow a certain pattern [2]. Anomalous data patterns can arise in either of two cases. Case 1: unreliability of the wireless sensor network. Case 2: occurrence of unusual phenomena in the monitored region. In case 1, the unreliability of the network gives rise to faulty sensors, and the faults occur due to hardware malfunction, sampling errors, transmission loss, etc. Detecting data anomalies in both cases is very important for any type of monitoring application. One important objective of a WSN application is to detect the occurrence of unusual phenomena in the monitored region and to take the necessary action. Another objective is to make appropriate decisions based on the aggregated data at the sink in spite of the unreliability of the wireless sensor network. Hence, it becomes crucial to correct the data before passing it to the applications. Otherwise, the anomalous data produced by the unreliability of the wireless network will have a great impact on making appropriate decisions.

In this paper we attempt to exploit the behavior of a single node over a considerable period of time to correct its data if any anomaly is present. Specifically, we fit a statistical model to a single node because all nodes transmit data independently of each other, and it is quite clear that we may not know the spatial information beforehand. We validate our model with data gathered over a real WSN, based on the IRIS platform [1], for a considerable period of time.

In this paper we present an appropriate statistical model, i.e., ARIMA(p, d, q), using the data of a real WSN application consisting of Crossbow IRIS Motes. We propose Algorithm 1 (To find suitable ARIMA model and Forecast), which corrects the anomalous data at the sink for each sensor node with ARIMA forecast values at any point in time. The forecast values are also used in Algorithm 2 (Anomaly Detection) to detect anomalous data of a sensor node with a 95% confidence interval. The algorithms applied to each node depend solely on the data stream transmitted by that particular sensor node. Because the algorithms use only the past data of an individual node, they do not depend on the state of other nodes in the network. We also do not consider contextual and temporal relationships among the nodes when computing the forecast value. If needed, however, contextual or temporal relationships can be used to further smooth our results, as suggested by [3].
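The detect-and-correct idea described above can be illustrated with a minimal sketch. This is not the paper's Algorithm 1 or Algorithm 2: for brevity it substitutes a least-squares AR(1) model for the full ARIMA(p, d, q) fit, and it assumes roughly Gaussian residuals so that 1.96 standard deviations approximate a 95% interval. The function names `fit_ar1` and `check_and_correct` and the sample readings are hypothetical.

```python
import math

def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t].

    Simplified stand-in for an ARIMA fit (assumption, not the paper's model).
    Returns (c, phi, sigma), where sigma is the residual standard deviation.
    """
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    var_prev = sum((a - mean_prev) ** 2 for a in x_prev)
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    phi = cov / var_prev if var_prev > 0 else 0.0
    c = mean_next - phi * mean_prev
    residuals = [b - (c + phi * a) for a, b in zip(x_prev, x_next)]
    sigma = math.sqrt(sum(r * r for r in residuals) / max(n - 2, 1))
    return c, phi, sigma

def check_and_correct(history, reading, z=1.96):
    """Flag `reading` if it falls outside the forecast interval.

    Uses only the node's own past data. An anomalous reading is replaced
    by the one-step forecast; a normal reading is passed through unchanged.
    """
    c, phi, sigma = fit_ar1(history)
    forecast = c + phi * history[-1]
    lower, upper = forecast - z * sigma, forecast + z * sigma
    is_anomaly = not (lower <= reading <= upper)
    corrected = forecast if is_anomaly else reading
    return corrected, is_anomaly, (lower, upper)

# Hypothetical temperature readings from a single node, then a spike.
history = [20.0, 20.4, 20.1, 20.5, 20.2, 20.6, 20.3, 20.5, 20.2, 20.4]
corrected, is_anomaly, interval = check_and_correct(history, 35.0)
print(is_anomaly, round(corrected, 2), interval)
```

Because the function takes only one node's history, it can run at the sink independently for every node, in line with the per-node design described above.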

The proposed work has the following advantages compared to earlier work. Our anomaly correction algorithm needs data only from the particular node we want to study. The proposed algorithm can also be used for fault tolerance: if a few nodes fail to sense data due to a transient fault at a particular instant, we can still produce data by processing their old data. Our method relies on ARIMA, a highly sophisticated method; in statistics, time series models of this kind are known to represent more complex processes than other models. Once the preliminary condition of stationarity is satisfied, we can use them to represent complex series. Finally, all our data processing is done at the sink, which is supposed to have sufficient power and computational capability for fitting the statistical models and for detecting and correcting anomalies in the data of any sensor node.
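The stationarity condition mentioned above is commonly met by differencing the series, which is the role of the d parameter in ARIMA(p, d, q). The following is a small sketch of that step only; the helper names `difference` and `undifference_forecast` are hypothetical, and the source does not state how the authors test for or achieve stationarity.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y[t] = x[t] - x[t-1]."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def undifference_forecast(last_value, diff_forecast):
    """Map a one-step forecast of the once-differenced series back to the original scale."""
    return last_value + diff_forecast

# A series with a growing trend: the second difference is constant.
readings = [1, 3, 6, 10]
print(difference(readings, d=1))  # [2, 3, 4]
print(difference(readings, d=2))  # [1, 1]
```

A forecast made on the differenced series has to be mapped back to the original scale before it can replace an anomalous reading, which is what `undifference_forecast` does for the d = 1 case.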

Statistical modeling is used in the literature for the purpose of data gathering with fewer transmissions, anomaly detection in the gathered

Reference

This content is AI-processed based on open access ArXiv data.
