OPTIMIZATION OF GARBAGE REMOVAL WITHIN A TERRITORIAL COMMUNITY

This paper proposes an algorithm for optimizing the garbage collection route in a local community (or a separate settlement). The study was conducted for one garbage truck. To maximize the efficiency of the algorithm, it has been assumed that the points at which a garbage truck discharges collected waste could be arranged along the way between the proposed clusters of garbage collection points. The optimality of the built routes has been proven, taking into consideration the above assumptions. The study's results could be used by territorial community authorities to reduce budget expenditures on the collection and disposal of waste. The reported solutions could significantly shorten the garbage collection time, which would improve the environmental and aesthetic situation within the study area. The new algorithm makes it possible to display the results in both quantitative and qualitative forms. An improved k-means algorithm with a maximum cluster size was selected for clustering. Each cluster was built on the basis of the maximum tonnage of the garbage truck. That is, the size of a cluster is determined by the maximum amount of waste that a garbage truck can remove in one run. The traveling salesman problem was applied to find the shortest path that calls at all the garbage collection points of one cluster and to establish the optimal path between all the clusters formed for a territorial community.


Introduction
Approaches to garbage collection and disposal are highly diverse and vary from country to country. To use the associated terminology consistently, they are generalized here into two types. The first covers "the task of waste collection and disposal when collection points are located in fixed places". The second covers "the issue of waste collection and disposal when collection points are located from house to house".
Garbage handling covers the production and collection of waste, its processing and transportation, including the development of transportation routes. Handling these issues poorly, in particular the routing of transportation, increases the financial costs borne by the state.
Due to the growth of the global population and the increase in household waste, the interest of governments and researchers in the "garbage problem" is increasing every year. Waste management faces a wide spectrum of questions of a diverse nature, namely the creation of waste in production, its collection, transportation, and disposal, the minimization of harmful effects on the environment, as well as recycling. This variety of tasks creates different areas of interest for researchers. Some of them focus on the environmental side of the problem, others on the financial aspects. For example, the routing of transportation involves large costs in the form of capital, labor, variable operating costs, etc. Each of these concerns is becoming more relevant and more pressing every year.
Models for optimizing the above issues are termed garbage collection and disposal tasks. These tasks are an extension of the already known vehicle routing problems. The task of garbage collection and disposal can be represented as the task to route a waste collecting vehicle calling at certain groups of nodes (clusters) and disposing of the collected materials at certain specified points (garbage disposal points). An example of clustering is shown in Fig. 1.
Like the vehicle routing problem, this task assumes vehicles with a fixed load. Moreover, such tasks pursue the same goals as the problems of vehicle routing, in particular, minimizing the total distance and/or travel time, minimizing the fullness of the vehicle container. The difference is that the problems of waste collection and disposal allow every vehicle to make multiple visits to the points where the materials are disposed of. In the context of such a problem, landfills are points of garbage disposal. Another difference is that before returning to the points of departure, these vehicles must discharge the collected waste.
The significant acceleration of globalization and the deterioration of the environmental situation significantly raise all the costs associated with garbage disposal. That forces the leadership of local communities or settlements to pay more attention to the issue of optimal garbage collection.

Literature review and problem statement
Optimization of waste collection is a very important and relevant topic in the world. Improper waste management, such as inefficient transportation or routing, could lead to higher waste collection costs. In general, the problem of waste collection in different countries is formulated in two ways depending on the type of urban environment, namely: "arc problem" and "problem with a node". When such a task is stated as an arc problem, waste accumulates along a street and the vehicle must call at the street to collect it. When formulated as a problem with the node, waste accumulates at garbage collection points, and the vehicle collects it in such designated places. In this case, one solves the problem of vehicle routing. Paper [1] considers a two-purpose model of garbage collection in rural areas during the planned period. In addition, the arc problem in the territories of rural areas in Spain is considered. However, the issue for the node problem remained unresolved.
Study [2] gives a mini-review of the latest approaches and their application in the collection and transportation of waste. Several meta-heuristic algorithms are considered, such as ant colony optimization, simulated annealing, genetic algorithms, large neighborhood search, greedy randomized adaptive search procedures, and others.
Work [3] reports a solution to the problem of optimizing the collection and disposal of garbage that accounts for breaks taken by the drivers of waste collection vehicles. An algorithm for solving the routing problem with time windows was built. The time window considered is a lunch break for drivers. The results are based on data from a Danish waste recycling company.
In [4], an algorithm for solving the problem of optimizing garbage collection is proposed but the result contains improvements either relative to distance or relative to the number of vehicles. A solution that simultaneously optimizes both the distance and the number of vehicles is proposed in [5].
Paper [6] proposed a hybrid algorithm that combines chaotic particle swarm optimization with ArcGIS. The proposed method was experimentally tested on a real dataset from Danang. It collected a larger total amount of waste than the corresponding methods but lost to others because of a longer distance traveled and more working time.
The use of ArcGIS to optimize garbage collection was also suggested in [7], where an extension for ArcGIS network analysis was proposed. That approach was used by three local authorities in Ghana. Owing to this option, the weekly distance on the road decreased by 81.27 km, the total number of vehicles by 4.79 %, and the journey time by 853.59 minutes.
In [8], a genetic algorithm and the nearest neighbor algorithm were applied. The results showed a significant reduction in the total travel distance compared to the previous situation. The distance traveled by trucks decreased by an average of 66.42 %.
Given the increase in waste generation and all the issues associated with its effective disposal, there is a growing need to improve existing optimization approaches for waste collection routing problems (WCRP) and to devise new approaches and algorithms.

The aim and objectives of the study
The aim of this work is to optimize the path of garbage collection by one garbage truck within the local community. This could reduce financial costs related to waste collection and disposal.
To accomplish the aim, the following tasks have been set:
- to build an algorithm for optimizing the route of garbage collection;
- to develop software based on the proposed algorithm and to verify the method.

The study materials and methods
The focus of this study is to optimize the routing of garbage collection within a local community under certain conditions. The first condition is that the task is considered for one vehicle (garbage truck). The next condition implies that landfills could be arranged along the way between proposed clusters of garbage collection points. And the final condition assumes that each transport vehicle has a limited fixed load. The approach to garbage collection is in line with the approach according to which "garbage collection takes place at fixed points within a local community".
During the study, we have applied an improved k-means method algorithm for a problem of volume clustering [9]. The purpose of this method is to assign each point from the set to the corresponding (nearest) cluster, based on the centers of clusters called centroids (the "mean" point within a cluster).
A simple k-means algorithm involves the following steps: 1) select the number of clusters k; 2) randomly generate k clusters and define their centroids, or randomly specify k points that correspond to the centroids of clusters; 3) assign each point in the set to the closest centroid (and, accordingly, to its cluster); 4) recalculate the cluster centroids; 5) repeat steps 3 and 4 until some convergence criterion is met or reassignment no longer changes the clusters.
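The five steps listed above can be sketched in plain Python as follows. This is a minimal illustration, not the software described in this paper: the function name `kmeans`, the fixed random seed, and the iteration cap `max_iter` are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 2: take k distinct points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Step 3: assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 5: stop when no assignment changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels


points = [(0, 0), (1, 0), (100, 0), (101, 0)]
centroids, labels = kmeans(points, k=2)
print(sorted(centroids), labels)
```

Note that this plain version ignores the capacities of the collection points, which is exactly the limitation discussed next.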
However, since each garbage collection point (GCP) can have a different capacity, it is more logical to form clusters in such a way that GCPs with greater priority (that is, with a larger capacity) are located closer to the centroid. If one ignores this, the clustering may be suboptimal since more clusters than necessary may be created. That is, if GCPs with a smaller capacity are assigned first, the GCPs that remain are those with larger capacities, whose waste may no longer fit in the garbage truck, so another cluster needs to be formed. That is why an improved k-means algorithm was used to eliminate this problem.
In addition, in the course of this study, we shall employ the traveling salesman problem. It can be solved by different methods. In general, they can be divided into two types: exact methods and heuristic methods. Exact methods find a guaranteed optimal path but under the assumption that the time available for the search is unlimited. Heuristic methods make it possible to find good solutions when the search duration is limited. To find the optimal path between the members of one cluster (waste collection points) and between the clusters themselves, we shall use an exact method of discrete optimization, which is termed the branch and bound method [10,11]. Discrete optimization methods are efficiently used to solve problems of large size; they produce optimal or approximate solutions.
The Python programming language and the NumPy library for numerical calculations were used to write the software. The OpenCV library was applied to visualize the data. The computer hardware involved in the experiments had an Intel(R) Core(TM) i7 processor (8th generation, 4 cores) and 16 GB of RAM.

1. Algorithm for optimizing garbage collection within a local community
The first step of the algorithm considers the construction of clusters using an improved k-means algorithm for the volume clustering problem.
Suppose that the problem considers a territorial community that has $n$, $n \in \mathbb{N}$, garbage collection points (GCPs) with a known distribution (that is, for each such point, we have its coordinates $(x, y) \in \mathbb{R}^2$), and that

$$n = \sum_{j=1}^{k} n_j,$$

where $k$ is the number of clusters, $n$ is the total number of GCPs, $n_j$ is the number of GCPs in one cluster $j$.
Let only one garbage truck be available for garbage collection, with a known tonnage $C$; also known are the capacities $d_1, \ldots, d_n$ of the individual GCPs.
Let $X = (x_{ij})$ be a binary matrix such that

$$x_{ij} = \begin{cases} 1, & \text{if the } i\text{-th GCP belongs to cluster } j, \\ 0, & \text{otherwise.} \end{cases}$$

The purpose of the problem is to find such $X$ that minimizes

$$\sum_{i=1}^{n} \sum_{j=1}^{k} cost_{ij}\, x_{ij}, \tag{1}$$

where $cost_{ij}$ is the Euclidean distance calculated from the formula

$$cost_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}, \tag{2}$$

where $(x_i, y_i)$, $(x_j, y_j)$ are the coordinates of the $i$-th and $j$-th GCPs, under the following conditions:

$$\sum_{j=1}^{k} x_{ij} = 1, \quad i = 1, \ldots, n, \tag{3}$$

$$\sum_{i=1}^{n} d_i x_{ij} \le C, \quad j = 1, \ldots, k. \tag{4}$$

Condition (3) states that each GCP is tied to only one cluster, and condition (4) states that no more than $C$ tons of garbage can be collected from a single cluster.
The improved k-means algorithm for a volume clustering problem [9] includes the following steps:
1) calculate the number of clusters based on the capacities of the garbage collection points $d_i$ ($i = 1, \ldots, n$; $n$ is the number of collection points) and the tonnage of the vehicle (cluster capacity, $C$):

$$k = \left\lceil \frac{\sum_{i=1}^{n} d_i}{C} \right\rceil; \tag{5}$$

2) select the initial $k$ centroids by sorting the GCPs in descending order of capacity ($d_1 > d_2 > \ldots > d_{n-1} > d_n$). Let this be the list $D$. Then the first $k$ points of $D$ become the centroids;
3) assign GCPs to clusters. Determine the Euclidean distances from each point to all $k$ centroids. Group all points to the nearest centroid $j$. To find the appropriate centroid $j$ for a collection point, we calculate the priority value as

$$P_i = \frac{d_i}{cost_{ij}}. \tag{6}$$

This priority determines which GCP has the highest priority to be assigned to centroid $j$. The selected point is assigned based on constraint (3). If the constraint is not met, the GCP is assigned to the next nearest centroid based on (6) and (4);
4) calculate the centroids. The centroid $(X_j, Y_j)$ of each cluster is calculated from the cluster members. Let $(x_1, y_1), (x_2, y_2), \ldots, (x_{n_j}, y_{n_j})$ be the coordinates of the members of cluster $j$. Then

$$c_j = (X_j, Y_j) = \left( \frac{1}{n_j} \sum_{m=1}^{n_j} x_m, \ \frac{1}{n_j} \sum_{m=1}^{n_j} y_m \right), \tag{7}$$

where $c_j$ represents the $j$-th centroid; $n_j$ is the number of GCPs in cluster $j$; $m$ is the summation index;
5) check the convergence criterion. Repeat the procedure while there are changes in the formed clusters.
The found clusters are denoted as $K_i$, $i = 1, \ldots, k$; $k \in \mathbb{N}$ is the number of clusters.
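A simplified sketch of the capacity-aware clustering described above is given below. It is an illustration under stated assumptions rather than the exact procedure of [9]: the function name `capacitated_kmeans` is hypothetical, the priority is assumed to be the capacity of a GCP divided by the distance to its nearest centroid, points are served in descending order of that priority, and a point that fits in no cluster raises an error instead of opening a new cluster.

```python
import math


def capacitated_kmeans(points, demands, capacity, max_iter=50):
    """Cluster 2-D points so that no cluster's total demand exceeds capacity."""
    n = len(points)
    # Step 1: number of clusters from the total demand and the truck tonnage.
    k = math.ceil(sum(demands) / capacity)
    # Step 2: the k points with the largest demand are the initial centroids.
    by_demand = sorted(range(n), key=lambda i: demands[i], reverse=True)
    centroids = [points[i] for i in by_demand[:k]]
    labels = None

    def priority(i):
        # Assumed priority: demand divided by distance to the nearest centroid.
        cost = min(math.dist(points[i], c) for c in centroids)
        return math.inf if cost == 0 else demands[i] / cost

    for _ in range(max_iter):
        load = [0] * k
        new_labels = [None] * n
        # Step 3: serve high-priority points first; each takes the nearest
        # centroid that still has room, otherwise the next nearest one.
        for i in sorted(range(n), key=priority, reverse=True):
            nearest_first = sorted(
                range(k), key=lambda j: math.dist(points[i], centroids[j])
            )
            for j in nearest_first:
                if load[j] + demands[i] <= capacity:
                    new_labels[i] = j
                    load[j] += demands[i]
                    break
            else:
                raise ValueError("no cluster has room left; k must be increased")
        # Step 5: stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its members.
        for j in range(k):
            members = [points[i] for i in range(n) if labels[i] == j]
            if members:
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centroids


points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
demands = [5, 4, 3, 5, 2]  # tons per GCP, illustrative values only
labels, _ = capacitated_kmeans(points, demands, capacity=10)
print(labels)
```

In this toy input the point (2, 0) lies closer to the first group, but the first cluster already holds 9 of the 10 tons allowed, so the point is assigned to the second cluster.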
The next step involves constructing the shortest path between the clusters using the traveling salesman problem. This step determines the order in which garbage is removed from each formed set of points (cluster by cluster, one cluster = one truckload). To solve the traveling salesman problem, we shall use the branch and bound method [10]. To describe the stages of this method, the following lemma and statements are required.
Lemma 1. Suppose we have found the optimal Hamiltonian cycle for a matrix of distances A. If we subtract a number from all elements of a row or column of the matrix and solve the problem again with the new matrix, the optimal Hamiltonian cycle would not change, and its length would decrease by this number.
Statement 1. Adding the edge (i, j) to the Hamiltonian cycle in advance reduces the dimensionality of the distance matrix by striking out the i-th row and j-th column from this matrix. It should be noted that if the edge (i, j) can be attached to an earlier chosen edge (a, b), the edge (j, a) would close a non-Hamiltonian cycle, so the edge (j, a) must be excluded immediately.
Statement 2. Excluding the edge (i, j) from the Hamiltonian cycle in advance makes it possible to perform an additional reduction of the matrix and improve the lower bound. Such exclusion is carried out by replacing the corresponding element of the distance matrix with ∞.
At the first stage, the set $R$ of all possible Hamiltonian cycles and a lower bound on the length of these cycles, denoted $\varphi_R$, are determined. Next, the set $R$ is refined in order to find a Hamiltonian cycle whose lower bound improves on $\varphi_R$.
At the second stage, the set $R$ is split by a selected edge (i, j) into two subsets, the cycles that contain the edge (i, j) and the cycles that do not, and a lower bound is calculated for each subset. Based on the resulting lower bounds, the subset with the lowest value of φ is selected. The selected subset is further split in the same way until the dimensionality of the distance matrix A is 2×2. The third stage builds a tree of the subsets obtained by splitting the initial distance matrix A (the vertices of the tree are the lower bounds calculated for each subset). This tree makes it possible to isolate the current Hamiltonian cycle. If such a tree has cut-off branches, one needs to check that the lower bounds of those branches are not less than the length of the current Hamiltonian cycle. If the check gives a negative result (that is, one or more lower bounds of the cut-off branches are less than the length of the current Hamiltonian cycle), new Hamiltonian cycles are sought in those branches by the same splitting approach. The desired Hamiltonian cycle is the shortest of the cycles obtained.
The connection between these stages is as follows. Applying Lemma 1 first to the rows of the known distance matrix $A = (a_{ij})$ with the minimum elements of the rows, $\alpha_i = \min_j a_{ij}$, and then to the columns of the resulting matrix with the minimum elements of the columns, $\beta_j = \min_i (a_{ij} - \alpha_i)$, we obtain a completely reduced matrix that contains at least one zero in each row and column. Moreover, the elements of such a matrix are non-negative. The numbers $\alpha_i$, $\beta_j$ are termed reduction coefficients. It also follows from Lemma 1 that the length $L$ of a Hamiltonian cycle of the unreduced distance matrix can be represented as

$$L = L_1 + \gamma, \tag{8}$$

where $L_1$ is the length of the same Hamiltonian cycle in the reduced distance matrix; $\gamma$ is defined as

$$\gamma = \sum_{i} \alpha_i + \sum_{j} \beta_j.$$

Obviously, $L_1 \ge 0$. Given this and (8), it is possible to define the lower bound on the lengths of the set of Hamiltonian cycles as

$$\varphi_R = \gamma.$$

The choice of the edge (i, j) for the second stage is determined on the basis of Statements 1 and 2: for some zero element $a_{ij} = 0$ of the resulting reduced matrix, we notionally replace it with ∞ and determine

$$\gamma(i, j) = \min_{l \ne j} a_{il} + \min_{l \ne i} a_{lj}.$$

Similarly, $\gamma(i, j)$ is calculated for all zero elements of the reduced matrix. The lower bound grows the most when the edge with the maximum value of $\gamma(i, j)$ is excluded, so the role of the edge (i, j) belongs to such an edge (r, s) that

$$\gamma(r, s) = \max_{a_{ij} = 0} \gamma(i, j).$$

With the same approach to selecting the splitting edge, we continue to split the distance matrix A until its dimensionality decreases to 2×2. After reducing the resulting matrix to the specified size, its zero elements indicate the edges that must be added to the cycle in order to derive the optimal Hamiltonian cycle. The third stage makes it possible to check for other candidate Hamiltonian cycles and, if they exist, to choose the one that has the smallest length. This terminates the algorithm of the branch and bound method.
To apply the above branch and bound method in the second step of the garbage collection algorithm, which finds the shortest path between the clusters, we select the cluster that is closest to the driver as the initial cluster.
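The reduction of the distance matrix and the choice of the splitting edge can be sketched as follows. This is only the bounding and edge-selection part of the branch and bound method, not a complete solver; the function names `reduce_matrix` and `branching_edge` and the 4×4 example matrix are assumptions made for illustration.

```python
import math

INF = math.inf


def reduce_matrix(a):
    """Return (reduced copy of a, gamma = sum of subtracted row/column minima)."""
    m = [row[:] for row in a]
    n = len(m)
    gamma = 0
    for i in range(n):  # row coefficients alpha_i
        alpha = min(m[i])
        if alpha != INF:
            gamma += alpha
            m[i] = [v - alpha for v in m[i]]
    for j in range(n):  # column coefficients beta_j
        beta = min(m[i][j] for i in range(n))
        if beta != INF:
            gamma += beta
            for i in range(n):
                m[i][j] -= beta
    return m, gamma


def branching_edge(m):
    """Zero cell (r, s) whose exclusion raises the lower bound the most."""
    n = len(m)
    best, best_penalty = None, -1
    for i in range(n):
        for j in range(n):
            if m[i][j] == 0:
                # Smallest remaining entries in row i and column j.
                row_min = min(m[i][l] for l in range(n) if l != j)
                col_min = min(m[l][j] for l in range(n) if l != i)
                if row_min + col_min > best_penalty:
                    best, best_penalty = (i, j), row_min + col_min
    return best, best_penalty


A = [
    [INF, 10, 15, 20],
    [10, INF, 35, 25],
    [15, 35, INF, 30],
    [20, 25, 30, INF],
]
reduced, gamma = reduce_matrix(A)
print(gamma)                    # lower bound on the length of any cycle
print(branching_edge(reduced))  # splitting edge and its penalty
```

For this matrix the row minima sum to 55 and the column minima of the row-reduced matrix sum to 15, so the lower bound is 70, while the shortest cycle (0, 1, 3, 2, 0) has length 80.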
Having found the shortest path between clusters, we proceed to the third step, in which we first find the first and last point within each cluster. These would be the points that are the closest points between the clusters of the shortest path.
Let the order of the clusters along the shortest path be $K_1, K_2, \ldots, K_k$. The starting point in $K_1$ is the point closest to the driver. Next, we select the point from $K_1$ that is the closest to some point from the next cluster $K_2$; it becomes the last point in $K_1$, and that nearest point of $K_2$ becomes the first point in $K_2$. Similarly, we find such points for each cluster. Having the first and last points, we build the shortest path in each cluster, using the traveling salesman problem and the branch and bound method in order to call at all the garbage collection points. Since the starting point and the final point have already been set, and there is no need to return to the starting point, certain modifications are necessary before applying the branch and bound method. As proposed in [12], we add a fictitious node that is connected only to the initial and final nodes, by edges of weight 0. Since the solution of the traveling salesman problem must pass through the fictitious node, the result must contain the sequence "starting point - fictitious node - end point" (there is no other way to reach the fictitious node). Thus, the problem has (n+1) points. After solving the problem, we simply remove the fictitious point; what remains is the Hamiltonian path of minimum length, that is, the shortest path that does not return to the starting point.
Since our clusters were built on the basis of the capacity of a garbage truck C, then, to build the clusters optimally and logically, we assume that there is a landfill between the clusters where the garbage truck can be emptied.
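The fictitious-node modification can be illustrated with a small sketch. For brevity it enumerates all tours by brute force instead of using the branch and bound method, so it is suitable only for very small clusters; the function name `shortest_path_fixed_ends` is hypothetical.

```python
import itertools
import math


def shortest_path_fixed_ends(points, start, end):
    """Shortest path through all points from index start to index end."""
    n = len(points)
    dummy = n  # index of the fictitious node

    def weight(a, b):
        if dummy in (a, b):
            other = b if a == dummy else a
            # The fictitious node is reachable only from start and end.
            return 0.0 if other in (start, end) else math.inf
        return math.dist(points[a], points[b])

    best_len, best_tour = math.inf, None
    # Brute force stands in for branch and bound: every closed tour
    # over n + 1 nodes that begins at the fictitious node.
    for perm in itertools.permutations(range(n)):
        tour = (dummy,) + perm
        length = sum(
            weight(tour[i], tour[(i + 1) % (n + 1)]) for i in range(n + 1)
        )
        if length < best_len:
            best_len, best_tour = length, tour
    # Removing the fictitious node leaves an open path between the fixed ends.
    path = list(best_tour[1:])
    if path[0] != start:
        path.reverse()
    return path, best_len


points = [(0, 0), (3, 0), (1, 0), (2, 0)]
print(shortest_path_fixed_ends(points, start=0, end=1))
```

Because the two edges incident to the fictitious node have zero weight, the length of the closed tour equals the length of the open path that remains after the node is removed.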

2. Software implementation of the algorithm for optimizing garbage collection within a local community
The following is the pseudocode for the first step of the algorithm, namely: to cluster garbage collection points with an improved k-means algorithm for the volume clustering problem.

Pseudocode:
Calculate the number of clusters k (step 1 of the algorithm)
Select the first k centroids from list D
Initialize the binary matrix X with zeros
until converged
    for each r_i ∈ R
        until r_i is assigned
            Calculate the Euclidean distance using (2) to each of the k clusters and arrange the distances in ascending order.
            Assign the nearest centroid of r_i as m.
            Group all unassigned points as G with the nearest centroid m.
            Calculate the priority value for r_i ∈ R using (6).
            Assign r_i ∈ R to the nearest centroid based on the priority value without violating constraint (3).
            Renew x_ij.
            if r_i is not assigned, then
                select the next nearest centroid
            end if
        end until
    end for each
    Calculate a new centroid from the formed clusters using (7).
end until
The second stage in writing the software is the logic for constructing the shortest path between the found clusters $K_i$, $i \in [1, k]$, where k is the number of clusters formed. To do this, the branch and bound algorithm is used to solve the traveling salesman problem. The initial cluster is the cluster that is closest to the driver. The pseudocode of the algorithm is given below.

Pseudocode:
Define the distance matrix A_l for each cluster K_l, l ∈ [1, k]
for each K_l
    until the optimal path is found
        until the reduced matrix has a dimensionality of 2×2
            Reduce the distance matrix of K_l, l ∈ [1, k]
        end until
    end until
end for each
The third step is to write the part of the code that constructs the shortest path inside each cluster $K_i$, $i \in [1, k]$, where k is the number of clusters formed. First, we find the first and last points, which are the closest points between consecutive clusters of the shortest path. The starting point in $K_1$ is the point closest to the driver. The last point in $K_1$ is the point from $K_1$ that is the closest to some point from the next cluster $K_2$. That point of $K_2$ is the first in cluster $K_2$. In the same way, we find such points for each cluster. Next, we add a fictitious node in each cluster that is connected to the start and end nodes of this cluster by edges of weight 0. After that, the algorithm for the traveling salesman problem is applied. The pseudocode is built similarly to the second step. Finally, we remove all fictitious nodes that have been added.
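The selection of the first and last points of each cluster can be sketched as below. The function name `boundary_points` is hypothetical, and two details are assumptions not stated in the text: the entry point of a cluster is not reused as its exit point when the cluster has more than one point, and the last point of the final cluster is left open (`None`).

```python
import math


def boundary_points(clusters, driver):
    """Return a (first, last) pair of points for each cluster in route order."""
    # The route starts at the point of the first cluster closest to the driver.
    first = min(clusters[0], key=lambda p: math.dist(p, driver))
    result = []
    for current, following in zip(clusters, clusters[1:]):
        # Assumption: the entry point is not reused as the exit point
        # unless the cluster consists of a single point.
        exits = [p for p in current if p != first] or [first]
        # Closest pair between this cluster and the next one.
        last, next_first = min(
            ((p, q) for p in exits for q in following),
            key=lambda pq: math.dist(*pq),
        )
        result.append((first, last))
        first = next_first
    # Assumption: the exit of the final cluster is left unconstrained.
    result.append((first, None))
    return result


clusters = [
    [(0, 0), (1, 0), (2, 0)],
    [(5, 0), (6, 0)],
    [(6, 3), (9, 3)],
]
print(boundary_points(clusters, driver=(-1, 0)))
```

Each returned pair can then serve as the fixed start and end of the path that is built inside the corresponding cluster.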
To verify the method, data representing the location of garbage collection points in a small territorial community, consisting mainly of rural areas, are considered. The tonnage of the garbage truck is 10 tons. The capacities of the garbage collection points and their coordinates are given in Table 1.
Using the built algorithm, we obtain a result that is visualized in Fig. 2. That is, first, all garbage collection points are divided into the optimal number of clusters that are highlighted in red. Next, the optimal route between clusters was found, which is indicated by green dotted arrows. After that, the first and last points in each cluster were found and fictitious points were added that were eventually deleted. Next, the optimal route is built inside each cluster. The proposed location of landfills is any permitted location along the route, represented by green dotted arrows.
The length of the shortest path for one garbage truck obtained with the given algorithm is 23 km. The garbage truck must be unloaded 3 times.

Discussion of results of studying the optimization of garbage collection within a territorial community
The proposed algorithm makes it easier for local communities to make financial decisions when solving the tasks of garbage collection and disposal within a certain territory. This is possible due to the integration into the study of modern methods of machine learning, specifically unsupervised learning, one of which is the k-means clustering method. In particular, to optimize the construction of clusters, namely their number, within a given territory, an improved k-means method is used, which takes into account the prioritization of garbage collection points by capacity. The result of clustering for a territorial community based on the data given in Table 1 is shown in Fig. 2 in red. The volume of each cluster does not exceed the capacity of a garbage truck (10 tons) due to condition (4). By applying the traveling salesman problem and the branch and bound method, the optimality of the path between clusters (the second step of the algorithm) was achieved. Schematically, this is shown in Fig. 2 by green dotted arrows. In the third step, the traveling salesman problem and the branch and bound method are also used to find the shortest way when calling at all garbage collection points in each cluster. Schematically, this is shown in Fig. 2 by blue arrows. The length of the entire path was 23 km. Optimizing a path using the traveling salesman problem can lead to a decrease in fuel costs, the number of required vehicles, the total duration of the route, etc.
A feature of the second step is that the initial cluster is the one that is closest to the driver. A feature of the third step is the choice of the first and last points in each cluster, which makes the path optimal not only within one cluster but along the route in general. It should be noted that the object of study can be any settlement, not necessarily a local community.
However, there are a number of limitations related to weather conditions, driver's condition, road condition, etc. Those conditions are difficult to influence when solving problems of optimizing garbage collection. Therefore, these restrictions were not considered.
The disadvantage of the proposed method, which is at the same time a prospect for its further development, is the assumption that landfills are already located, or can be built, along the route, since certain sanitary standards must be met. Here, one could consider optimizing the amount of waste that is dumped in one landfill on the way from cluster to cluster.
In addition, the level of filling of a garbage container may vary. The algorithm can be improved once containers are equipped with devices that determine the level of filling.
Fig. 2. The result of clustering and the shortest waste collection path

Conclusions
1. In order to reduce financial costs related to waste collection and disposal, an algorithm for optimizing garbage collection, which contains several stages, has been implemented. At the first stage, we applied an improved k-means algorithm for GCP clustering. The second stage uses the traveling salesman problem and the branch and bound method to find the optimal path between the clusters. In the third stage, the optimal path is built within each cluster, based on a special selection of the first and last points. With the help of the built algorithm, it was possible to minimize several important values. One of these values is the length of the route traveled by a garbage truck. The reduction of this value lowers the costs of operating the relevant vehicle. There is also a decrease in the total time spent on one route. This also affects the optimization of the garbage collection process on the principle of "nodes".
2. The software developed makes it possible to find optimal routes for any dataset that represents a certain territorial community or settlement using a combination of several tools. These tools are an improved k-means algorithm for clustering data, the branch and bound method to find optimal paths within a particular area, and an approach to combining the resulting optimal parts of the path into a single one.