An Outlook to Declarative Languages for Big Streaming Data
The Big Data movement proposes data streaming systems to tame velocity and to enable reactive decision making. However, approaching such systems is still too complex due to the paradigm shift they require, i.e., moving from scalable batch processing to continuous data analysis and pattern detection. Recently, declarative languages have come to play a crucial role in fostering the adoption of stream processing solutions and the paradigm shift it entails. In particular, several key players have introduced SQL extensions for stream processing. In this tutorial, we give an overview of the various declarative querying interfaces for big streaming data. To this end, we discuss how the different Big Stream Processing Engines (BigSPE) interpret, execute, and optimize continuous queries expressed with SQL-like languages such as KSQL, Flink-SQL, and Spark SQL. Finally, we present the open research challenges in the domain.
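To make the shift from batch processing to continuous querying concrete, here is a minimal sketch, in plain Python rather than in any of the engines named above, of what a windowed continuous query computes. It mimics a SQL-like statement that counts events per key over tumbling windows and emits a result each time a window closes. The event schema, the `tumbling_count` function, and the 10-second window size are illustrative assumptions and are not part of KSQL, Flink-SQL, or Spark SQL.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, key); hypothetical schema


def tumbling_count(events: Iterable[Event], size: int) -> Iterator[Tuple[int, dict]]:
    """Count events per key over tumbling windows of `size` seconds.

    Assumes events arrive in timestamp order. Yields (window_start, counts)
    as soon as an event belonging to a later window arrives, and flushes the
    last open window when the input ends.
    """
    current_start = None
    counts: Counter = Counter()
    for ts, key in events:
        start = ts - ts % size  # start of the window this event falls into
        if current_start is None:
            current_start = start
        elif start != current_start:
            yield current_start, dict(counts)  # window closed: emit its result
            current_start, counts = start, Counter()
        counts[key] += 1
    if current_start is not None:
        yield current_start, dict(counts)


# Hypothetical click stream: (timestamp, page)
stream = [(1, "home"), (4, "cart"), (9, "home"), (12, "home"), (25, "cart")]
results = list(tumbling_count(stream, size=10))
for window_start, per_key in results:
    print(window_start, per_key)
```

Unlike a batch job, the generator produces each window's result as the stream advances instead of waiting for the full input. The sketch deliberately omits distribution, fault tolerance, and out-of-order events, which a production stream processing engine has to handle.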
Riccardo Tommasini is an assistant professor at the Data Systems Group, University of Tartu, Estonia. Riccardo did his PhD at the Department of Electronics and Information of the Politecnico di Milano. His thesis, titled “Velocity on the Web”, investigates the velocity aspects that concern the Web environment. His research interests span Stream Processing, Knowledge Graphs, Logics and Programming Languages. Riccardo’s tutorial activities comprise Stream Reasoning Tutorials at ISWC 2017, ICWE 2018, ESWC 2019, TheWebConf 2019, and DEBS 2019.
Sherif Sakr is the Head of the Data Systems Group at the Institute of Computer Science, University of Tartu, Estonia. He received his PhD degree in Computer and Information Science from Konstanz University, Germany in 2007. He is currently the Editor-in-Chief of the Springer Encyclopedia of Big Data Technologies. His research interests include data and information management, big data processing systems, big data analytics and data science. Prof. Sakr has published more than 150 research papers in international journals and conferences. He has delivered several tutorials at various conferences and schools, including WWW’12, IC2E’14, CAiSE’14, the EDBT Summer School 2015, the 2nd ScaDS International Summer School on Big Data 2016, the 3rd Keystone Training School on Keyword search in Big Linked Data 2017, DEBS 2019, and ISWC 2019.
Emanuele Della Valle is an Associate Professor at the Department of Electronics and Information of the Politecnico di Milano. His research interests cover Big Data, Stream Processing, Semantic technologies, Data Science, Web Information Retrieval, and Service Oriented Architectures. His work in the Stream Reasoning research field has been applied to analysing Social Media, Mobile Telecom and IoT data streams in collaboration with Telecom Italia, IBM, Siemens, Oracle, Indra, and Statoil. Emanuele presented several Stream Reasoning related tutorials at SemTech 2011, ESWC 2011, ISWC 2013, ESWC 2014, ISWC 2014, ISWC 2015, ISWC 2016, DEBS 2016, ISWC 2017 and KR 2018.
Hojjat Jafarpour is a Software Engineer and the creator of KSQL at Confluent. Before joining Confluent, he worked at NEC Labs, Informatica, Quantcast and Tidemark on various big data management projects. Hojjat earned his PhD in computer science from UC Irvine, where he worked on scalable stream processing and publish/subscribe systems.
preprint open proceedings video
bibtex
@inproceedings{DBLP:conf/edbt/0001SVJ20,
author = {Riccardo Tommasini and
Sherif Sakr and
Emanuele Della Valle and
Hojjat Jafarpour},
editor = {Angela Bonifati and
Yongluan Zhou and
Marcos Antonio Vaz Salles and
Alexander B{\"{o}}hm and
Dan Olteanu and
George H. L. Fletcher and
Arijit Khan and
Bin Yang},
title = {Declarative Languages for Big Streaming Data},
booktitle = {Proceedings of the 23rd International Conference on Extending Database
Technology, {EDBT} 2020, Copenhagen, Denmark, March 30 - April 02,
2020},
pages = {643--646},
publisher = {OpenProceedings.org},
year = {2020},
url = {https://doi.org/10.5441/002/edbt.2020.84},
doi = {10.5441/002/edbt.2020.84},
timestamp = {Wed, 25 Mar 2020 15:46:06 +0100},
biburl = {https://dblp.org/rec/conf/edbt/0001SVJ20.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}