Implementing Big Data Lake for Heterogeneous Data Sources

Modern connected cities are increasingly leveraging advances in ICT to improve their services and the quality of life of their inhabitants. Data generated from different sources, such as environmental sensors, social networking platforms, and traffic counters, are harnessed to achieve these goals. However, collecting, integrating, and analyzing all the heterogeneous data sources available from the cities is a challenge. This paper proposes a data lake approach, built on Big Data technologies, to gather all the data together for further analysis. The platform described here enables data collection, storage, and integration, as well as further analysis and visualization of the results. This solution is the first attempt to integrate a diverse set of data sources from four pilot cities as part of the CUTLER project (Coastal urban development through the lenses of resiliency). The design and implementation details, as well as usage scenarios, are presented in this paper.

Authors:
Mehmood Hassan, Gilman Ekaterina, Cortes Marta, Kostakos Panos, Byrne Andrew, Valta Katerina, Tekes Stavros, Riekki Jukka

Publication type:
A4 Article in conference proceedings

Place of publication:
IEEE 35th International Conference on Data Engineering Workshops (ICDEW)

Keywords:
Big Data, Data Analysis, data lake, smart city

Published:
1 July 2019

Full citation:
H. Mehmood et al., “Implementing Big Data Lake for Heterogeneous Data Sources,” 2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW), Macao, Macao, 2019, pp. 37-44. doi: 10.1109/ICDEW.2019.00-37

DOI:
https://doi.org/10.1109/ICDEW.2019.00-37

Read the publication here:
http://urn.fi/urn:nbn:fi-fe2019082024798