Histograms to Quantify Dataset Shift for Spectrum Data Analytics

Cloud/software-based wireless resource controllers have recently been proposed to exploit radio frequency (RF) data analytics for network control, configuration, and management. The efficiency of such a controller depends on tracking the right metrics in real time (analytics) and making realistic predictions (deep learning). This becomes particularly critical because radio environments are generally dynamic, and the collected datasets may exhibit a shift in distribution over time and/or space. When a trained model is deployed at the controller without accounting for dataset shift, a large number of prediction errors may occur. This paper quantifies dataset shift in real wireless physical-layer data using a statistical distance measure, the earth mover’s distance (EMD). An FPGA processes the in-phase and quadrature (IQ) samples in real time to extract useful information, such as histograms of wireless channel utilization (CU). We have prototyped the data processing modules on a Xilinx System on Chip (SoC) board using the Vivado, Vivado HLS, SDK, and MATLAB tools. The histograms are sent as low-overhead analytics to the resource controller server, where they are processed to evaluate dataset shift. The presented results provide insight into dataset shift in real wireless CU data collected over multiple weeks at the University of Oulu using the implemented modules on SoC devices. The results can be used to design approaches that prevent failures due to dataset shift in deep learning models for wireless networks.
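
As an illustration of how EMD can compare two CU histograms, the following is a minimal sketch and not the implementation used in this work. It assumes both histograms share the same equal-width bins, in which case the one-dimensional EMD reduces to the summed absolute difference between the two cumulative distributions. The function names, the bin layout, and the sample counts are hypothetical.

```python
from itertools import accumulate


def normalize(hist):
    """Scale raw bin counts so that they sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    return [h / total for h in hist]


def emd_1d(hist_p, hist_q, bin_width=1.0):
    """EMD between two 1-D histograms defined on identical equal-width bins.

    Assumption: bins are aligned and equally spaced, so the distance is
    bin_width times the sum of absolute differences of the two CDFs.
    """
    if len(hist_p) != len(hist_q):
        raise ValueError("histograms must share the same bins")
    cdf_p = accumulate(normalize(hist_p))
    cdf_q = accumulate(normalize(hist_q))
    return bin_width * sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


# Hypothetical CU histograms (counts per utilization bin, low to high).
reference = [40, 30, 20, 10, 0]   # mostly idle channel
later = [0, 10, 20, 30, 40]       # mostly busy channel

print(emd_1d(reference, reference))  # identical histograms: no shift
print(emd_1d(reference, later))      # larger value: stronger shift
```

A controller could, for example, compare each newly received histogram against the histogram of the training period and flag the model for retraining when the distance exceeds a chosen threshold; the threshold itself would have to be set empirically.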