Privacy preserving sentiment analysis on multiple edge data streams with Apache NiFi

Sentiment analysis, also known as opinion mining, plays an important role in both private and public sector Business Intelligence (BI), where it is used to improve public and customer experience. Nevertheless, de-identified sentiment scores from public social media posts can compromise individual privacy because they are vulnerable to record linkage attacks. Established privacy-preserving methods such as k-anonymity, l-diversity and t-closeness are offline models designed exclusively for data at rest. Recently, a number of online anonymization algorithms (CASTLE, SKY, SWAF) have been proposed to meet the functional requirements of streaming applications, but without open-source implementations. In this paper, we present a reusable Apache NiFi dataflow that buffers tweets from multiple edge devices and performs anonymized sentiment analysis in real time, using randomization. The solution can be easily adapted to suit different scenarios, enabling researchers to deploy custom anonymization algorithms.
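As a rough illustration of the buffer-then-randomize idea described in the abstract, the following Python sketch collects scored tweets from several edge devices and releases them only in full batches, with identifiers dropped, scores perturbed and order shuffled. It is not the paper's NiFi dataflow: the names ScoredTweet, StreamBuffer and release_randomized, the uniform noise model, and the default noise width are all assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class ScoredTweet:
    device_id: str  # hypothetical edge-device identifier
    user_id: str    # identifier that must not leave the buffer
    score: float    # sentiment score, assumed to lie in [-1.0, 1.0]


def release_randomized(batch, noise=0.2, rng=None):
    """Return perturbed scores for a batch, without any identifiers."""
    if rng is None:
        rng = random.Random()
    released = []
    for tweet in batch:
        # Assumed noise model: uniform perturbation, clipped to the score range.
        noisy = tweet.score + rng.uniform(-noise, noise)
        released.append(max(-1.0, min(1.0, noisy)))
    # Shuffle so that output order cannot be matched to arrival order.
    rng.shuffle(released)
    return released


class StreamBuffer:
    """Holds tweets from all devices until `size` items have arrived."""

    def __init__(self, size, noise=0.2, rng=None):
        self.size = size
        self.noise = noise
        self.rng = rng if rng is not None else random.Random()
        self._items = []

    def push(self, tweet):
        """Add one tweet; return a randomized batch once the buffer is full."""
        self._items.append(tweet)
        if len(self._items) < self.size:
            return None
        batch, self._items = self._items, []
        return release_randomized(batch, self.noise, self.rng)


if __name__ == "__main__":
    buffer = StreamBuffer(size=3)
    for tweet in [
        ScoredTweet("edge-1", "user-a", 0.8),
        ScoredTweet("edge-2", "user-b", -0.4),
        ScoredTweet("edge-1", "user-c", 0.1),
    ]:
        batch = buffer.push(tweet)
        if batch is not None:
            print(batch)
```

A larger buffer size or noise width would trade timeliness and accuracy for stronger protection against linkage; a custom anonymization algorithm could replace release_randomized without changing the buffering logic.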

Authors:
Pandya Abhinay, Kostakos Panos, Mehmood Hassan, Cortes Marta, Gilman Ekaterina, Oussalah Mourad, Pirttikangas Susanna

Publication type:
A4 Article in conference proceedings

Place of publication:
2019 European Intelligence and Security Informatics Conference (EISIC)

Keywords:
anonymization, Apache NiFi, IoT privacy, sentiment analysis, social media

Published:

Full citation:
A. Pandya et al., “Privacy preserving sentiment analysis on multiple edge data streams with Apache NiFi,” 2019 European Intelligence and Security Informatics Conference (EISIC), Oulu, Finland, 2019, pp. 130-133, doi: 10.1109/EISIC49498.2019.9108851

DOI:
https://doi.org/10.1109/EISIC49498.2019.9108851

Read the publication here:
http://urn.fi/urn:nbn:fi-fe2020061644570