Abstract
Emerging applications, such as those running on Internet of Things (IoT) devices, produce constant data streams that need to be processed in real time. Distributed stream processing systems (DSPs), with geographically distributed cluster networks interconnected via wide area network (WAN) links, have recently gained interest for handling these applications. However, these applications have stringent requirements, such as low latency and high bandwidth, that must be guaranteed to ensure quality of service (QoS). These requirements raise fundamental resource management and scheduling challenges for DSPs. In this paper, we formulate the placement of worker nodes on a geo-distributed DSPs cluster network as a multi-criteria decision-making problem and propose an additive weighting-based approach to solve it. The proposed solution finds a trade-off among different network parameters and allows tasks to be executed according to the desired performance metrics. We evaluate the proposed approach using the Yahoo! streaming benchmark on a testbed and compare it against the mechanisms deployed in Apache Spark, Apache Storm, and Apache Flink. The results of the evaluation show that our approach improves the performance of Spark by up to 2.2x-7.2x, Storm by up to 1.2x-3.4x, and Flink by up to 1.4x-3.3x compared to other approaches, which makes it suitable for use in practical environments.
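The additive weighting idea mentioned in the abstract can be illustrated with a minimal sketch of generic Simple Additive Weighting (SAW). This is not the paper's implementation: the criteria names (`latency_ms`, `bandwidth_mbps`), the weights, the sample nodes, and the function names are all hypothetical, and the normalization shown is the textbook SAW form, which assumes strictly positive metric values.

```python
from typing import Dict, List

# Hypothetical criteria, chosen for illustration only.
BENEFIT_CRITERIA = {"bandwidth_mbps"}  # higher is better
COST_CRITERIA = {"latency_ms"}         # lower is better

Metrics = Dict[str, float]


def saw_scores(nodes: Dict[str, Metrics], weights: Dict[str, float]) -> Dict[str, float]:
    """Return a weighted sum of normalized criteria for each candidate node."""
    scores: Dict[str, float] = {}
    for name, metrics in nodes.items():
        total = 0.0
        for criterion, weight in weights.items():
            column = [m[criterion] for m in nodes.values()]
            if criterion in BENEFIT_CRITERIA:
                # Benefit criterion: scale by the best (largest) value.
                normalized = metrics[criterion] / max(column)
            else:
                # Cost criterion: scale the best (smallest) value by this one.
                normalized = min(column) / metrics[criterion]
            total += weight * normalized
        scores[name] = total
    return scores


def rank_nodes(nodes: Dict[str, Metrics], weights: Dict[str, float]) -> List[str]:
    """Order candidate nodes from highest to lowest SAW score."""
    scores = saw_scores(nodes, weights)
    return sorted(scores, key=scores.get, reverse=True)


# Assumed sample data: three candidate worker nodes and illustrative weights.
sample_nodes = {
    "node_a": {"latency_ms": 10.0, "bandwidth_mbps": 100.0},
    "node_b": {"latency_ms": 40.0, "bandwidth_mbps": 200.0},
    "node_c": {"latency_ms": 20.0, "bandwidth_mbps": 50.0},
}
sample_weights = {"latency_ms": 0.6, "bandwidth_mbps": 0.4}

if __name__ == "__main__":
    print(rank_nodes(sample_nodes, sample_weights))
```

With these assumed weights, latency dominates the score, so the low-latency node ranks first even though another node offers more bandwidth. Changing the weights shifts the trade-off, which is the property the abstract attributes to the weighting-based approach.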
Original language | English |
---|---|
Title of host publication | Proceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021 |
Editors | Laurent Lefevre, Stacy Patterson, Young Choon Lee, Haiying Shen, Shashikant Ilager, Mohammad Goudarzi, Adel N. Toosi, Rajkumar Buyya |
Publisher | IEEE/LEOS |
Pages | 820-827 |
Number of pages | 8 |
ISBN (Electronic) | 9781728195865 |
ISBN (Print) | 978-1-7281-9587-2 |
Publication status | Published - 13 May 2021 |
Event | 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid) - Melbourne, Australia. Duration: 10 May 2021 → 13 May 2021 |
Conference
Conference | 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid) |
---|---|
Period | 10/05/21 → 13/05/21 |
Keywords
- Wide area networks
- Additives
- Storms
- Bandwidth
- Quality of service
- Benchmark testing
- Topology
- Geo-distributed analytics
- worker node placement
- Stream processing
- Simple Additive Weighting
- IoT