Sentiment analysis, also known as opinion mining, plays a major role in Business Intelligence (BI) in both the private and public sectors, where it is used to improve public and customer experience. Nevertheless, de-identified sentiment scores from public social media posts can compromise individual privacy because they are vulnerable to record linkage attacks. Established privacy-preserving methods such as k-anonymity, l-diversity and t-closeness are offline models designed exclusively for data at rest. Recently, a number of online anonymization algorithms (CASTLE, SKY, SWAF) have been proposed to meet the functional requirements of streaming applications, but without open-source implementations. In this paper, we present a reusable Apache NiFi dataflow that buffers tweets from multiple edge devices and performs anonymized sentiment analysis in real time, using randomization. The solution can be easily adapted to suit different scenarios, enabling researchers to deploy custom anonymization algorithms.
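The abstract does not specify which randomization scheme the dataflow applies. As a purely illustrative sketch, the following Python generator buffers incoming sentiment scores and releases each buffer with additive uniform noise. The function name, the score range of [-1, 1], the buffer size and the noise scale are all assumptions made for this example and are not taken from the paper or from Apache NiFi.

```python
import random
from typing import Iterable, Iterator, List, Optional


def randomize_scores(
    scores: Iterable[float],
    buffer_size: int = 4,        # assumed window size, not from the paper
    noise_scale: float = 0.2,    # assumed noise bound, not from the paper
    seed: Optional[int] = None,
) -> Iterator[float]:
    """Buffer sentiment scores and release them with additive uniform noise.

    Scores are assumed to lie in [-1, 1]; perturbed values are clipped
    back into that range before release.
    """
    rng = random.Random(seed)
    buffer: List[float] = []

    def release(batch: List[float]) -> List[float]:
        # Perturb every score in the batch, then clip to the valid range.
        return [
            max(-1.0, min(1.0, s + rng.uniform(-noise_scale, noise_scale)))
            for s in batch
        ]

    for score in scores:
        buffer.append(score)
        if len(buffer) == buffer_size:
            yield from release(buffer)
            buffer = []

    # Flush whatever is left when the stream ends.
    if buffer:
        yield from release(buffer)


if __name__ == "__main__":
    stream = [0.9, -0.5, 0.1, 0.0, 0.7]
    for original, noisy in zip(stream, randomize_scores(stream, seed=1)):
        print(f"{original:+.2f} -> {noisy:+.2f}")
```

Additive noise is only one possible form of randomization. In a dataflow of the kind described, a step like this would sit between the buffering stage and the output of the sentiment scores, and could be swapped for a different anonymization algorithm.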
Pandya Abhinay, Kostakos Panos, Mehmood Hassan, Cortes Marta, Gilman Ekaterina, Oussalah Mourad, Pirttikangas Susanna
A4 Article in conference proceedings
Place of publication:
2019 European Intelligence and Security Informatics Conference (EISIC)
A. Pandya et al., “Privacy preserving sentiment analysis on multiple edge data streams with Apache NiFi,” 2019 European Intelligence and Security Informatics Conference (EISIC), Oulu, Finland, 2019, pp. 130-133, doi: 10.1109/EISIC49498.2019.9108851