# Resource Allocation for Collaborative and Distributed Edge Computing Systems

## Abstract

- [**Abstract**] (250~300 words)
- [**Intro**]: With the rapid development of edge computing systems and containerization technologies, more and more Internet service providers have deployed their application servers onto multi-access edge computing (MEC) systems. However, since the computing and networking resources of edge systems are limited, it is important to develop a strategy that optimizes the utilization of both.
- [**Prob**]: The problem of collaborative task offloading and resource allocation is studied in order to maximize users' quality of experience (QoE), which we measure by the average delay experienced by all users in the system. The considered problem is formulated as a multi-server multi-allocation (MSMA) problem that jointly optimizes request offloading decisions and computing resource allocation across the overall system.
- [**Proposal**]: The optimization of propagation and processing delay formulated in the considered problem can be reduced to a knapsack problem, which is known to be NP-complete. To overcome this drawback, we propose to decompose the original problem into a resource allocation (RA) problem with a fixed task offloading decision and a task offloading (TO) problem that optimizes the optimal-value function corresponding to the RA problem.
- [**Solution**]: We propose a greedy algorithm for the TO problem that achieves an optimal solution in polynomial time, and we address the RA problem with a novel heuristic algorithm.
- [**Result**]: Simulation results show that the proposed design outperforms the baselines and significantly improves users' QoE over traditional approaches.
- [**Index Terms**] - Edge Computing, Computational Offloading, Multi-Server Resource Allocation, Distributed System.

## Introduction

The emergence of 6G networks is revolutionizing communication with unprecedented data speeds and advanced capabilities, propelling innovation in connectivity. This progress has significantly impacted the electric vehicle (EV) industry, fostering enhanced connectivity and intelligent infrastructure for smart and efficient electric transportation systems. Nevertheless, in light of constrained resources, vehicles may suffer a noticeable performance decline when processing these applications locally. In this context, IEEE P1935 Edge Computing (EC) [1], which aims to provide computing resources at the network edge, has become a promising solution for EV use cases, offering real-time processing capabilities at the network's edge for improved data analysis and swift decision-making [2]. Through the implementation of Vehicular Edge Computing (VEC), computational tasks within the system can be intelligently offloaded to nearby Edge Computing Servers (ECS). This strategic offloading enhances processing efficiency, leading to a notable reduction in response time and end-to-end delay compared to conventional cloud-based approaches. Consequently, this approach not only optimizes system performance but also elevates users' Quality of Experience (QoE). The convergence of task offloading (TO) and resource allocation (RA) has emerged as a prominent area of interest within academia and industry alike. Various studies [3], [4], [5] have been dedicated to exploring task-offloading strategies centered around remote servers. However, these investigations primarily tackled the issue of resource inadequacy from the client-side vantage point.
Notably absent from these works is the consideration of the inherent challenge posed by the uneven geographical distribution of service requests, which leads to load imbalances among edge servers situated in different clusters. Recognizing this gap in the existing literature is crucial for a comprehensive understanding of the dynamics between task offloading, resource allocation, and the geographical distribution of service demands in the realm of edge computing. Addressing this challenge, collaborative edge-edge cooperation models have been introduced in [6], [7]. Within these models, a task offloaded to a high-load edge server has the flexibility to be further delegated to an adjacent low-load edge server. This strategic approach aims to enhance the overall performance of the Vehicular Edge Computing (VEC) system through effective load balancing. A distinctive feature of our VEC scenario lies in the consideration of multiple network areas with varying inter-cluster propagation speeds. The overarching objective is to maximize users' Quality of Experience (QoE) by minimizing the average delay they experience.

This paper makes notable contributions in the following key aspects:

1. A comprehensive Quality of Experience (QoE) model is introduced to optimize average latency and end-to-end delay within the system. This model is crafted to encompass both round-trip propagation delay and processing delay, providing a holistic framework for performance enhancement.
2. In addressing the intricate challenge of joint task offloading and resource allocation, a task offloading policy is devised. This policy, aligned with the IEEE P1935 standard, adopts a pragmatic approach by breaking down large-scale problems into a set of manageable low-dimensional subproblems. This strategic design facilitates efficient problem-solving for enhanced system performance.
3. The proposed framework is rigorously validated through practical implementation and experimentation. Leveraging Kubernetes clusters, simulations are conducted based on real-world traffic datasets and a sample application service. These experiments serve to further affirm the effectiveness and applicability of the proposed strategies in real-world scenarios.

The paper's subsequent sections are structured as follows: Section II introduces and explains the proposed system model, offering context to our work. Section III delves into the intricacies of the algorithm design, while simulations and evaluations are detailed in Section IV, providing practical insights. The paper concludes in Section V, summarizing key findings and outlining potential directions for future research.

## System Model

This section begins by introducing the collaborative cluster-cluster system architecture. Subsequently, it delves into the detailed processes of TO and RA within the VEC system. Finally, it outlines the QoE model considered in the context of this study.

1. **Network Model**

![Paper_Scenario](https://hackmd.io/_uploads/S1xNB6mq6.png)
Fig.1: The topology of the proposed VEC network scenario.

As illustrated in Fig.1, our study revolves around a hierarchical VEC network encompassing several edge clusters that span various network areas. Within this framework, compute nodes, representing the fundamental processing elements of the edge network, can take on various processor types. Control nodes assume the responsibility of managing the virtualized infrastructure and resources of the compute nodes within their designated edge areas. At the helm of this structure, the Orchestrator is designed to maintain a global perspective of the entire edge network. It plays a pivotal role in making the resource allocation decisions associated with each cluster, aiming to optimize the QoE across the entire system. Within our system, key components include an EC Orchestrator (ECO), a network of Edge Clusters denoted as $C=\{1, 2, ..., c\}$, and a set of Edge Applications represented by $A=\{1, 2, ..., a\}$. Additionally, diverse resource types, denoted collectively as $R=\{1, 2, ..., r\}$, are available across the entire system. The system further accommodates requests from all vehicles in the area of the $i$th cluster seeking the $k$th application's service, with $Req_{ik}$ serving as a metric for the number of requests soliciting the $k$th application's service within the coverage of Edge Cluster $i$. Notably, within each cluster's coverage area, the requested services exhibit variations in processing delay and data size, and the inter-propagation speeds differ across clusters. These entities are summarized in the short sketch below.
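To fix the notation used throughout this section, the following is a minimal, illustrative container for the system state; the class name, field names, and array layout are our own assumptions rather than part of the paper's implementation.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class EdgeSystem:
    """Hypothetical state of the VEC system, with array shapes for c clusters,
    a applications, and r resource types."""
    Req: np.ndarray  # (c, a)    Req_ik: requests for app k arriving at cluster i
    R:   np.ndarray  # (c, a, r) R_jkl: resource of type l allocated to app k in cluster j
    cap: np.ndarray  # (c, r)    Res'_jr: resource capacity of cluster j
    e:   np.ndarray  # (c, c)    e_ij: unit propagation delay between clusters i and j
    d:   np.ndarray  # (a,)      d_k: average request data size of application k
```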
2. **Computing & Communication Model**

* **Computing Model:** In acknowledging the influence of allocated computing resources on the performance of a deployed machine/container, we draw insights from recent advancements, such as those presented in [8]. Various methods, including linear regression and Support Vector Machine (SVM) models, have been proposed to assess the intricate relationship between allocated computing resources and the overall performance of a Docker container. In light of this, we conceptualize the task computing delay as a function of the allocated resources, denoted by $\eta_k$. Specifically, $\eta_k$ represents the function capturing the average processing delay of application $k$ under diverse allocations of computing resources. Within the framework of processing delay functions, we make the simplifying assumption that these values will be truthfully provided by the service provider of the application. However, it is essential to recognize that the availability and distribution of computing resources differ across each edge cluster. Consequently, we impose the following constraint to ensure a realistic depiction of the system's dynamics:
$$C_1: \sum _{k\in A} Res_{jkr}\le Res_{jr}', \quad j\in C, r\in R;$$
which implies that the aggregate resource allocation for all applications within a given cluster $j$ must not surpass the resources available to that cluster.

* **Communication Model:** The computational tasks received within cluster $i$ can be dynamically distributed among other edge clusters, necessitating consideration of the associated task transmission delay. Similar to the work in [9], we introduce the term "unit propagation delay," denoted as $e_{ij}$ between clusters $i$ and $j$, which can be readily obtained through periodic measurements. Additionally, we assume that the average data size $d_k$ of a deployed application is provided by its service provider. Consequently, we model the average transmission delay for a single request of application $k$ being forwarded from cluster $i$ to cluster $j$ as follows:
$$t_{ijk}=e_{ij}\cdot d_k, \quad i\in C, j\in C, k\in A;$$
This expression denotes the average transmission delay for a request of application $k$ forwarded from cluster $i$ to cluster $j$. Both models are illustrated in the sketch below.
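As a concrete illustration of the two models, the following sketch (under the `EdgeSystem` layout assumed earlier; the function names are ours) transcribes the transmission-delay formula and the constraint $C_1$:

```python
import numpy as np

def transmission_delay(e: np.ndarray, d: np.ndarray, i: int, j: int, k: int) -> float:
    """t_ijk = e_ij * d_k: average delay for one request of application k
    forwarded from cluster i to cluster j."""
    return float(e[i, j] * d[k])

def satisfies_C1(R: np.ndarray, cap: np.ndarray) -> bool:
    """Constraint C1: for every cluster j and resource type r, the allocation
    summed over all applications k must not exceed the capacity Res'_jr.
    R has shape (clusters, apps, resource_types); cap has shape (clusters, resource_types)."""
    return bool(np.all(R.sum(axis=1) <= cap))
```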
3. **QoE Model**

There are several approaches to defining the QoE and QoS of a network system. Unlike the recent works in [10], our objective is to enhance the QoE for users in the system, focusing specifically on the latency experienced by each user. The overarching concept involves identifying an adequately representative average delay for the entire system. Consequently, we calculate the average delay by considering all clusters, aiming to capture a holistic perspective of the system's performance.
$$D=avg([D_i])=\frac{\sum _{i\in C}(D_i\cdot \sum _{k\in A}Req_{ik})}{\sum _{i\in C}\sum _{k\in A}Req_{ik}};$$
where $D$ represents the average delay in the overall system, $D_i$ is the average delay in cluster $i$, and $Req_{ik}$ is the number of requests for application $k$ in cluster $i$. In the aforementioned context, we introduced the term "average delay" for a single cluster without a precise definition. The average cluster delay is derived by taking a weighted average over the forwarded request amounts $N_{ijk}$ and their experienced delay, which in our case is the sum of the processing delay and the propagation delay. The detailed definition of the average delay for cluster $i$ is specified below:
$$D_i=\frac{\sum _{j\in C}(T_{ij}+P_{ij})}{\sum _{k\in A}Req_{ik}}, \quad i\in C;$$
where the variable $T_{ij}$ is the transmission delay between cluster $i$ and cluster $j$ multiplied by the quantity of requests forwarded from cluster $i$ to cluster $j$. On the other hand, $P_{ij}$ is the product of the processing delay and the number of requests received in cluster $i$ that are handled in cluster $j$. The propagation term, $T_{ij}$, is determined by multiplying the average data size of the requests with the unit propagation delay between clusters $i$ and $j$. Our objective in this context is to calculate the total propagation delay for all requests forwarded from cluster $i$. Consequently, this relationship can be expressed by incorporating the summation over the forwarded quantities.
$$T_{ij}=e_{ij}\cdot \sum _{k\in A}(N_{ijk}\cdot d_k), \quad i\in C, j\in C;$$
In contrast, the processing term, $P_{ij}$, can be acquired by utilizing the performance function $\eta _k$ provided by each service provider. This function takes the resource allocation and the number of concurrently served requests as inputs and outputs a processing delay. Similarly, we introduce the summation term to calculate the overall case, encompassing the total processing delay for requests forwarded from cluster $i$.
$$P_{ij}=\sum _{k\in A}(\eta _{jk}\cdot N_{ijk}), \quad i\in C, j\in C;$$
At this juncture, we have deconstructed the problem of optimizing the overall effective delay, $D$, into two distinct subproblems: cluster-level processing and propagation delay, respectively. These relationships are articulated through the crucial term $N_{ijk}$, which holds significance and will be elaborated upon in the subsequent section. A direct transcription of these delay formulas is given below.
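The following sketch evaluates the QoE model end to end for a given forwarding decision $N_{ijk}$; the function name and argument layout are our assumptions, matching the `EdgeSystem` shapes above.

```python
import numpy as np

def average_system_delay(N: np.ndarray, e: np.ndarray, d: np.ndarray,
                         eta: np.ndarray, Req: np.ndarray) -> float:
    """Compute D from N[i, j, k] (requests of app k received in cluster i
    and served in cluster j), using the formulas for T_ij, P_ij, and D_i."""
    T = e * np.einsum('ijk,k->ij', N, d)        # T_ij = e_ij * sum_k N_ijk * d_k
    P = np.einsum('jk,ijk->ij', eta, N)         # P_ij = sum_k eta_jk * N_ijk
    per_cluster_reqs = Req.sum(axis=1)          # sum_k Req_ik for each cluster i
    D_i = (T + P).sum(axis=1) / per_cluster_reqs
    # D: average of D_i weighted by each cluster's request volume.
    return float((D_i * per_cluster_reqs).sum() / per_cluster_reqs.sum())
```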
## Proposed Method

In this section, we commence by presenting the concept of redistributing requests within the system. Subsequently, we mathematically formulate the forwarding of requests under a fixed forwarding policy. Finally, we integrate and address the overall problem with our proposed heuristic algorithm.

1. **Re-distribution of clusters' requests**

We first introduce the redistribution of request density through a load-balancing method. The fundamental concept is straightforward: clusters with a higher allocation of resources should handle a greater volume of requests. Consequently, the redistributed requests reflect the quantity that each cluster should effectively manage. The detailed formulation is presented below.
$$Req_{jk}'=\frac{|R_{jk}|}{\sum _{n\in C}|R_{nk}|}\sum _{n\in C}Req_{nk};$$
$$|R_{jk}|=\sqrt{\sum _{l\in R}R_{jkl}^2};$$
Here, $Req_{jk}'$ symbolizes the redistributed quantity of requests for application $k$ that ought to be managed in cluster $j$. The term $|R_{jk}|$ denotes the amount of resources allocated in cluster $j$ to application $k$. Given the availability of multiple resource types for allocation, we formulate this term using the concept of vector magnitude. In addition, the notation $\sum _{n\in C}Req_{nk}$ simply denotes the total requests for application $k$ across the entire system.

2. **Requests Forwarding Formulation**

Having obtained the redistributed requests, we can iteratively compute the forwarding amounts. To derive a sub-optimal forwarding strategy, we adopt a greedy approach that prioritizes cluster pairs with lower unit propagation delay (i.e., higher inter-propagation speed) in the iteration process. In this approach, we first compute the self-offloaded part, where tasks are processed at the local cluster without forwarding, given the absence of propagation delay between any cluster and itself.
$$N_{iik}=\begin{cases} Req_{ik}', & Req_{ik}\ge Req_{ik}';\\ Req_{ik}, & Req_{ik}< Req_{ik}'; \end{cases}$$
Afterward, we compute the remaining terms in ascending order of unit propagation delay.
$$N_{ijk}=\begin{cases} 0, & Req_{ik}'+\sum\limits_{\beta \in prevC_j}N_{i\beta k}\ge Req_{ik} \;\text{ or }\; Req_{jk}'\le Req_{jk}+\sum\limits_{\alpha\in prevC_i}N_{\alpha jk};\\ \min\Big[Req_{jk}'-Req_{jk}-\sum\limits_{\alpha \in prevC_i}N_{\alpha jk},\; Req_{ik}-Req_{ik}'-\sum\limits_{\beta \in prevC_j}N_{i\beta k}\Big], & \text{otherwise;} \end{cases}$$

- $prevC_n$: the set of clusters already iterated for cluster $n$.

The formulated relation considers two conditions: the cluster forwarding requests must still have excess requests, and the cluster receiving them must still have available request slots. The function enters the second case only if both conditions hold simultaneously. In this second case, where the value is not necessarily zero, it equals the smaller of the receiving cluster's remaining request slots and the forwarding cluster's remaining excess requests. The detailed algorithm for the formulation of the above mathematical relations is shown below:
```
Algorithm 1: Greedy Requests Forwarding Procedure.
==================================================
Input: Requests Distribution, Re-distributed Requests, Inter-propagation Speed
Output: Requests Forwarding Strategy
==================================================
1. foreach i == j do
2.     foreach k in A do
3.         Compute the number of requests that do not require forwarding, namely Niik.
4.         Record the computed cases in prevC.
5. Sort the cluster pairs in ascending order of unit propagation delay.
6. foreach i != j do
7.     foreach k in A do
8.         Compute the number of requests that should be forwarded to the other clusters, namely Nijk.
9. Return the computed requests forwarding strategy N.
```
It is important to note that the formulated strategy above is not rigid; rather, it is a dynamic function: it takes a resource allocation strategy as its input and produces a greedy forwarding policy under our assumptions, as sketched below.
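The following is an illustrative implementation of the redistribution step and of Algorithm 1, under our reading that cluster pairs are processed from the fastest link (lowest $e_{ij}$) to the slowest; the running `excess`/`slack` counters stand in for the $prevC$ summations, and all names are assumptions.

```python
import numpy as np

def redistribute(Req: np.ndarray, R: np.ndarray) -> np.ndarray:
    """Req'_jk = (|R_jk| / sum_n |R_nk|) * sum_n Req_nk, where |R_jk| is the
    vector magnitude of cluster j's allocation to app k over resource types."""
    mag = np.sqrt((R ** 2).sum(axis=2))              # |R_jk|, shape (clusters, apps)
    return mag / mag.sum(axis=0) * Req.sum(axis=0)   # broadcast per application

def greedy_forwarding(Req: np.ndarray, Req_p: np.ndarray, e: np.ndarray) -> np.ndarray:
    """Algorithm 1: compute N[i, j, k] from Req, the redistributed Req', and e."""
    C, A = Req.shape
    N = np.zeros((C, C, A))
    for i in range(C):                               # self-offloaded part N_iik
        for k in range(A):
            N[i, i, k] = min(Req[i, k], Req_p[i, k])
    excess = np.maximum(Req - Req_p, 0.0)            # requests a cluster cannot serve itself
    slack = np.maximum(Req_p - Req, 0.0)             # spare quota at each cluster
    pairs = sorted(((i, j) for i in range(C) for j in range(C) if i != j),
                   key=lambda p: e[p[0], p[1]])      # fastest links first
    for i, j in pairs:
        for k in range(A):
            amount = min(excess[i, k], slack[j, k])  # second case of N_ijk
            N[i, j, k] = amount
            excess[i, k] -= amount
            slack[j, k] -= amount
    return N
```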
3. **Utilization of Sigmoid Filter**

In our discussion thus far, we have not addressed the resource constraints on each cluster, nor incorporated them into the mathematical relations defined above. While numerous mathematical approaches can solve optimization problems, the challenge lies in discarding invalid solutions that violate the resource constraints. To address this, we introduce a filtering mechanism simulated by a sigmoid function. A sigmoid function acts as a smooth approximation of a unit step function: by adjusting the coefficient $a$ to a higher value, the sigmoid function approaches a unit step. In this way, we can incorporate the sigmoid function into our model to enforce the resource constraints. The advantage of using a sigmoid function as our filter lies in its ease of differentiation: unlike a unit step function, a sigmoid function has no discontinuous points, making it computationally straightforward.
$$\sigma (ax) = \frac{1}{1+e^{-ax}}$$
To formulate the system delay equation under the resource constraints, we simply add a penalty term to the delay model, as shown below:
$$D^*=D+\Omega \sum _{\gamma _i \in Constraints} \sigma(\gamma _i)$$
where $D^*$ represents the average system delay under the resource constraints, and $\Omega$ is a hyperparameter used to simulate an enormous delay when a resource constraint is surpassed. Each constraint is rewritten as a term $\gamma _i$ that is positive when violated, so that $\sigma(\gamma _i)\approx 1$ for violated constraints and $\approx 0$ otherwise.

4. **PSO Algorithm**

The formulated problem is represented as a function that takes the resource allocation and requests distribution as inputs and yields the average system delay as an output. With the mathematical definition in place, numerous methods can be employed to solve this optimization problem. To streamline the calculation, we propose a heuristic solution that leverages a Particle Swarm Optimizer (PSO). The formulated average delay model serves as the fitness function within the PSO solver. In this context, each particle in the PSO solver corresponds to a potential resource allocation strategy. By utilizing the system delay model, we can easily translate these resource allocation strategies into average system delays, thereby optimizing the overall system QoE. A compact sketch is given below.
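As an illustration, the following is a minimal, self-contained PSO sketch with the sigmoid-penalized fitness $D^*$; `system_delay`, the parameter values, and all names are our own placeholders rather than the paper's implementation. In practice, `system_delay` could compose the earlier sketches: redistribute, then greedily forward, then evaluate the average system delay.

```python
import numpy as np

def sigmoid(x, a=50.0):
    """Steep sigmoid sigma(a*x), approximating a unit step at x = 0."""
    return 1.0 / (1.0 + np.exp(-a * x))

def fitness(alloc_flat, shape, res_cap, system_delay, omega=1e6):
    """D* = D + Omega * sum_i sigma(gamma_i), with gamma_i > 0 iff C1 is violated."""
    alloc = alloc_flat.reshape(shape)            # (clusters, apps, resource_types)
    violation = alloc.sum(axis=1) - res_cap      # per cluster and resource type
    return system_delay(alloc) + omega * sigmoid(violation).sum()

def pso(shape, res_cap, system_delay, n_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO; each particle is a flattened allocation."""
    rng = np.random.default_rng(seed)
    dim = int(np.prod(shape))
    x = rng.uniform(0, res_cap.max(), (n_particles, dim))    # positions
    v = np.zeros_like(x)                                     # velocities
    pbest = x.copy()
    pbest_val = np.array([fitness(p, shape, res_cap, system_delay) for p in x])
    g = pbest[pbest_val.argmin()].copy()                     # global best
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, 0, None)                          # allocations stay non-negative
        vals = np.array([fitness(p, shape, res_cap, system_delay) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = pbest[pbest_val.argmin()].copy()
    return g.reshape(shape)
```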
## Simulation Results

In this section, we provide an overview of the simulation environment, encompassing system parameters, methodology, and the dataset employed. Subsequently, we present and compare our proposed solution with alternative approaches against a set of general benchmarks for a comprehensive evaluation.

1. **Simulation Setup**

To comprehensively evaluate the performance of our proposed scheme, we conducted extensive simulations across various scenarios. The varied parameters include the variance of the requests distributed among the clusters and the amount of cumulative requests distributed among clusters. The simulation parameters were carefully chosen in accordance with prior research, particularly the studies referenced in [9] and [11]. Results from each scenario were averaged over 150 simulations to ensure robustness. In our default settings, we employed 3 clusters, each equipped with 2 resource types: CPUs randomly ranging from 5 to 15, and memory randomly ranging from 4GB to 10GB. Additionally, the default data size of applications ranged randomly from 1MB to 5MB, while inter-cluster propagation speeds ranged from 1MB/s to 20MB/s. Finally, the processing efficiencies of deployed applications were set to the following default values:
$$\eta _{jk}=\frac{Req_{jk}'}{k_1\cdot CPU_{jk}}+\frac{Req_{jk}'}{k_2\cdot Memory_{jk}};$$
Here, $k_1$ and $k_2$ represent the weighting coefficients of the resource allocation. It is important to note that this performance model can be adjusted as needed to align with the characteristics of the deployed applications in real-world scenarios. For the sake of efficiency in our simulations, we have opted for a simplified model, understanding that it provides a foundational framework for our evaluation.

2. **Performance Evaluation**

This section delves into an assessment of the proposed framework's performance, focusing on the average system delay. To gauge the quality of resource allocation decisions, we consider the following benchmarks:

1. Local Computing Only (LCO): the non-collaborative edge system where no cluster-cluster forwarding is allowed within the framework.
2. Game-based Task Offloading (GTO): a non-collaborative framework where each cluster makes rational decisions solely for its own benefit. This scheme follows the idea embodied in [9].

#### Average Delay With Different Task Amount

![Result1](https://hackmd.io/_uploads/ByebluA96.png)
Fig.2: Average delay with the increase of task amount.

In Figure 2, we compare the average delay optimized by our proposed algorithm with the two reference schemes as the number of requests in the overall system increases. Across all schemes, there is a notable trend of increasing average delay with rising request volume. Particularly striking is the consistently higher average delay observed in the LCO scheme compared to the other schemes. This disparity primarily stems from the non-uniform resource and request distribution among clusters, which leaves certain clusters less proficient at local task processing and consequently exacerbates the overall delay of the system. Conversely, our proposed collaborative scheme stands out as the most effective: it optimizes resource utilization while balancing load across clusters, ultimately minimizing the overall average delay in the system.

#### Average Delay With Different Variance Of Requests Distribution

![Result2](https://hackmd.io/_uploads/H1OZldC5a.png)
Fig.3: Average delay with the increase of variance of requests distribution.

In Figure 3, we increase the variance of the requests distribution to create a less uniform distribution among the clusters, while keeping the other parameters at their default values. We observe that the average system delay increases for all schemes except our proposed scheme as the variance of the requests distribution rises. This is because a growing variance in the requests distribution amplifies the processing delay for the larger portion of requests landing in the more heavily loaded clusters, while only marginally reducing delays for requests in less burdened clusters. Our proposed approach, which prioritizes collaborative task offloading, effectively addresses this challenge and consistently outperforms the other schemes.

## Conclusion

- [**Overview & contribution**]
- "In this paper, we proposed ...."

## References

About 15 references.

1. T.-Y. Chen, Y. Chiang, J.-H. Wu, H.-T. Chen, C.-C. Chen and H.-Y. Wei, "IEEE P1935 Edge/Fog Manageability and Orchestration: Standard and Usage Example," 2023 IEEE International Conference on Edge Computing and Communications (EDGE), Chicago, IL, USA, 2023, pp. 96-103.
Wei, "IEEE P1935 Edge/Fog Manageability and Orchestration: Standard and Usage Example," 2023 IEEE International Conference on Edge Computing and Communications (EDGE), Chicago, IL, USA, 2023, pp. 96-103. 2. Y. Chiang et al., "Management and Orchestration of Edge Computing for IoT: A Comprehensive Survey," in IEEE Internet of Things Journal, vol. 10, no. 16, pp. 14307-14331, 15 Aug.15, 2023. 3. M. Najm, M. Patra, and V. Tamarapalli, “Cost-and-delay aware dynamic resource allocation in federated vehicular clouds,” IEEE Trans. Veh. Tech- nol., vol. 70, no. 6, pp. 6159–6171, Jun. 2021. 4. Y. Chiang, C. -H. Hsu, G. -H. Chen and H. -Y. Wei, "Deep Q-Learning-Based Dynamic Network Slicing and Task Offloading in Edge Network," in IEEE Transactions on Network and Service Management, vol. 20, no. 1, pp. 369-384, March 2023. 5. L. H. Phuc, M. Kundroo, D. -H. Park, S. Kim and T. Kim, "Node-Based Horizontal Pod Autoscaler in KubeEdge-Based Edge Computing Infrastructure," in IEEE Access, vol. 10, pp. 134417-134426, 2022. 6. L. Liu, J. Feng, X. Mu, Q. Pei, D. Lan and M. Xiao, "Asynchronous Deep Reinforcement Learning for Collaborative Task Computing and On-Demand Resource Allocation in Vehicular Edge Computing," in IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 12, pp. 15513-15526, Dec. 2023. 7. T. X. Tran and D. Pompili, "Joint Task Offloading and Resource Allocation for Multi-Server Mobile-Edge Computing Networks," in IEEE Transactions on Vehicular Technology, vol. 68, no. 1, pp. 856-868, Jan. 2019. 8. K. Ye, Y. Kou, C. Lu, Y. Wang and C. -Z. Xu, "Modeling Application Performance in Docker Containers Using Machine Learning Techniques," 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS), Singapore, 2018. 9. W. Fan et al., "Game-Based Task Offloading and Resource Allocation for Vehicular Edge Computing With Edge-Edge Cooperation," in IEEE Transactions on Vehicular Technology, vol. 72, no. 6, pp. 7857-7870, June 2023. 10. Y. -J. Ku, P. -H. Chiang and S. Dey, "Real-Time QoS Optimization for Vehicular Edge Computing With Off-Grid Roadside Units," in IEEE Transactions on Vehicular Technology, vol. 69, no. 10, pp. 11975-11991, Oct. 2020. 11. W. Fan et al., "Joint Task Offloading and Resource Allocation for Vehicular Edge Computing Based on V2I and V2V Modes," in IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 4, pp. 4277-4292, April 2023. https://www.sciencedirect.com/science/article/pii/S1383762121001570 - https://ieeexplore.ieee.org/document/9687301 - https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8533343 - https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9775565