Abstract
Apache Spark is one of the most commonly used
frameworks for Big Data processing. Research on its built-in
streaming dynamic resource allocation feature has shown
that large data load fluctuations, for instance in website traffic,
have a negative impact on automatic scaling. Research has
also indicated that the lack of data load prediction, which
aims to identify the expected data load increase during
peak hours/days, is the root cause of this issue.
Hence, this paper proposes an enhanced solution, namely KORDI
(Knowledge-based Orchestrated Resource DIstribution), which aims
to optimise the allocation of Spark resources for Streaming
applications in real time using a SARIMAX model.
The experimental evaluation shows that the proposed solution
provides a cost reduction of 38% without affecting stability.
Original language | English |
---|---|
Title of host publication | 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS-23) |
Pages | 1-3 |
Number of pages | 3 |
Publication status | Accepted/In press - 2023 |
Event | 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) - Raleigh, North Carolina, United States. Duration: 23 Apr 2023 → 25 Apr 2023. https://ispass.org/ispass2023/ |
Conference
Conference | 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) |
---|---|
Country/Territory | United States |
City | Raleigh |
Period | 23/04/23 → 25/04/23 |
Internet address | https://ispass.org/ispass2023/ |
Fingerprint
Dive into the research topics of 'KORD-I: A Framework for Real-Time Performance and Cost Optimization of Apache Spark Streaming'. Together they form a unique fingerprint.
Projects
OZR2749: International Joint Research Group - VUB_UPatras International Joint Research Group on ICT (JICT)
Deligiannis, N., Jansen, B., Schelkens, P. & Athanassios, S.
12/02/15 → 11/02/27
Project: Fundamental