Development of a Dummy Guided Formulation and Exact Solution Method for TSP

The traveling salesman problem (TSP) is the problem in which a salesman starts from an origin node and returns to it in such a way that every node in the network is visited exactly once and the total distance travelled is minimized. An efficient algorithm for the TSP is believed not to exist: the TSP is classified as NP-hard, and an efficient exact algorithm for it would imply P = NP. This paper presents a dummy guided formulation for the TSP. To do this, all sub-tours in a TSP network are eliminated using the minimum number of constraints possible. Since a minimum of three nodes is required to form a sub-tour, the TSP network is partitioned by means of vertical and horizontal lines in such a way that there are no more than three nodes between either the vertical lines or the horizontal lines. In this paper, the set of all nodes between any pair of vertical or horizontal lines is called a block. Dummy nodes are used to connect one block to the next. The reconstructed network is then used to formulate the TSP as an integer linear programming problem (ILP). With branching related algorithms, there is no guarantee that the number of sub-problems will not explode to unmanageable levels. Heuristics or approximation algorithms, which are sometimes used to make quick decisions for practical TSP models, have serious economic drawbacks: the difference in cost between the exact solution and the approximate one is very large for practical problems such as delivering household letters using a single vehicle in Beijing, Tokyo, Washington, etc. The TSP model has many industrial applications such as drilling of printed circuit boards (PCBs), overhauling of gas turbine engines, X-ray crystallography, computer wiring, order picking in warehouses, vehicle routing, mask plotting in PCB production, etc.


Introduction
Network reconstruction is not a new idea. It was used in [1] to solve the traveling salesman problem: a minimal spanning tree was used to detect sub-tours, and the TSP network diagram was reconstructed in such a way that the sub-tours were eliminated. The weakness of that approach is that a sub-tour may be missed or go undetected. The approach proposed here ensures that all sub-tours are eliminated. The fact that a minimum of three nodes is needed to form a sub-tour is used to eliminate all sub-tours when reconstructing the TSP network model. Vertical and horizontal lines are used to partition the TSP network so that there are no more than three nodes between either the vertical lines or the horizontal lines. In this paper, the set of all nodes between any pair of vertical or horizontal lines is called a block. Dummy nodes are used to connect one block to the next. The reconstructed network is then used to reformulate the TSP as an integer linear programming problem (ILP), which is then solved efficiently by interior point algorithms to give an exact solution.
The traveling salesman problem is the problem in which a salesman starts from an origin node and returns to it in such a way that every node in the network is visited exactly once and the total distance travelled is minimized. It is assumed in this paper that all nodes have at least two arcs coming out of them, as shown in Fig. 1. Fig. 1 gives an example of a TSP network model. This TSP network model can have any number of nodes. The TSP model has many industrial applications such as drilling of printed circuit boards (PCBs), overhauling of gas turbine engines, X-ray crystallography, computer wiring, order picking in warehouses, vehicle routing, mask plotting in PCB production, etc.

Literature review and problem statement
Partitioning or clustering is not a new idea for the traveling salesman problem. Clustering was used in [2] to solve the clustered generalized traveling salesman problem (CGTSP). The weakness of such a clustering approach is that it gives near optimal solutions and not the exact solution. The 2-Opt heuristic is a simple algorithm for finding a good approximate solution to the traveling salesman problem [3]. In that paper, it was proved that for the metric TSP with n cities, the approximation ratio of the 2-Opt heuristic is $\sqrt{n/2}$ and that this bound is tight. Again, this heuristic gives a near optimal solution and not the exact solution. The paper [4] presents an approach to improve the Miller-Tucker-Zemlin (MTZ) model for the asymmetric traveling salesman problem (ATSP). This is a 2020 publication, and it shows that the hunt for an exact algorithm is ongoing. The paper [5] is on the generalized traveling salesman problem with time windows (GTSPTW). The TSP in this case is partitioned into clusters whereby each cluster has only one depot. The algorithm proposed in [5] is aimed at finding a minimum cost tour starting and ending at the depot, such that each cluster is visited exactly once and time constraints are not violated. This algorithm takes a long time to solve large TSP instances. In [6], an ant colony optimization (ACO) algorithm is used to solve the TSP. The main weakness of this algorithm is that it is not exact, and it is very difficult to know how far the solution obtained is from the exact one. Some exact and approximation methods for the TSP are compared in [7]. Heuristics are fast in obtaining a near optimal solution to the TSP, whereas exact methods obtain the optimal solution only after unreasonably long running times, so one cannot make quick decisions with these exact methods. For large cities such as Beijing or Tokyo, the difference in cost between the exact and the approximate solution is very large.
The TSP model has many industrial applications and is NP-hard, making it very difficult to solve. There is a need for an efficient method that can handle very large TSPs. In this paper, an efficient exact method that incorporates interior point approaches is proposed. Interior point algorithms can handle very large practical problems.

The aim and objectives of the study
The aim of the study is to develop a dummy guided formulation for the traveling salesman problem. To achieve this aim, the following objectives have been accomplished:
- to partition the TSP network problem into blocks by means of vertical and horizontal lines;
- to construct dummies so as to connect neighboring blocks;
- to formulate the dummy reconstructed TSP as an ILP;
- to provide a numerical illustration.

1. Standard constraints
Suppose we are given any node r with k arcs emanating from it, as shown in Fig. 2. The standard constraint is very easy to formulate and can be written as (1). If a TSP does not have sub-tours, then the optimal solution of the relaxed problem will be integer, as presented in Section 4.2.
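As a minimal sketch of how such standard constraints can be assembled, the following Python builds one constraint row per node. It assumes an undirected network, one binary variable per arc, and a right-hand side of 2 (the salesman enters and leaves each node once). The function name `degree_constraints` and the small example network are hypothetical and are not taken from the paper.

```python
def degree_constraints(nodes, arcs):
    """Return one 0/1 coefficient row per node.

    Row r has a 1 in the position of every arc that touches node r.
    """
    rows = {}
    for r in nodes:
        rows[r] = [1 if r in arc else 0 for arc in arcs]
    return rows


# Hypothetical 4-node network given as undirected arcs.
nodes = [1, 2, 3, 4]
arcs = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
rows = degree_constraints(nodes, arcs)

# The tour 1-2-3-4-1 uses the first four arcs and skips arc (1, 3).
tour = [1, 1, 1, 1, 0]

# Left-hand side of each node's constraint; a valid tour gives 2 everywhere.
lhs = {r: sum(a * x for a, x in zip(rows[r], tour)) for r in nodes}
```

For the tour shown, every entry of `lhs` equals 2, so all standard constraints of this assumed form are satisfied.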

2. Theorem
Let the coefficient matrix be A if the TSP is made up of standard constraints only. The matrix A is totally unimodular if it satisfies the following five conditions: a) all entries of A are 0, 1 or -1; b) the rows of A can be partitioned into two disjoint sets $S_1$ and $S_2$; c) every column of A contains at most two nonzero entries; d) if any column of A contains two nonzero entries of the same sign, then one is in a row of $S_1$ and the other in a row of $S_2$; e) if any column of A contains two nonzero entries of opposite sign, then they are both in rows of $S_1$ or both in rows of $S_2$.
The theorem is from [8], where it is discussed in more detail.
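The five conditions can be checked mechanically once a candidate row partition is given. The sketch below is one possible check, assuming the matrix is stored as a list of rows and the partition is supplied as two sets of row indices. The function name `satisfies_conditions` and both example matrices are hypothetical.

```python
def satisfies_conditions(A, S1, S2):
    """Check conditions (a)-(e) for matrix A and the row partition S1, S2."""
    n_rows = len(A)
    # (b) S1 and S2 must be disjoint and together cover every row index.
    if set(S1) & set(S2) or set(S1) | set(S2) != set(range(n_rows)):
        return False
    for j in range(len(A[0])):
        column = [(i, A[i][j]) for i in range(n_rows) if A[i][j] != 0]
        # (a) entries are 0, 1 or -1; (c) at most two nonzeros per column.
        if any(v not in (1, -1) for _, v in column) or len(column) > 2:
            return False
        if len(column) == 2:
            (i1, v1), (i2, v2) = column
            same_set = (i1 in S1) == (i2 in S1)
            # (d) same sign needs different sets; (e) opposite sign needs the same set.
            if (v1 == v2) == same_set:
                return False
    return True


# Node-arc rows of a 4-node cycle 0-1-2-3-0 (columns are the four arcs).
even_cycle = [[1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]]

# Node-arc rows of a triangle 0-1-2-0.
triangle = [[1, 0, 1],
            [1, 1, 0],
            [0, 1, 1]]

even_ok = satisfies_conditions(even_cycle, {0, 2}, {1, 3})
triangle_ok = satisfies_conditions(triangle, {0}, {1, 2})
```

The even cycle passes with alternating nodes placed in different sets, whereas the triangle fails for this and for every other partition of its three rows.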

3. Existence of sub-tours
Unfortunately, standard constraints on their own may result in sub-tours. So there is a need for a way to detect and eliminate sub-tours. The existence of sub-tours is illustrated in Fig. 3.

Fig. 3. Existence of sub-tours
The existence of sub-tours makes the traveling salesman problem appear to be very difficult to solve.
Examples of sub-tours are presented in Fig. 3 and there are four of them.
In practical problems, there can be any number of sub-tours in one TSP network model.
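To illustrate how sub-tours show up in a solution that satisfies only the standard constraints, the following sketch splits a set of selected arcs into its separate cycles. It assumes that every node has exactly two selected arcs. The function name `find_cycles` and the 6-node example are hypothetical.

```python
def find_cycles(nodes, chosen_arcs):
    """Split chosen arcs, in which every node has degree 2, into cycles."""
    neighbours = {v: [] for v in nodes}
    for a, b in chosen_arcs:
        neighbours[a].append(b)
        neighbours[b].append(a)

    unvisited, cycles = set(nodes), []
    while unvisited:
        start = min(unvisited)
        cycle, prev, cur = [start], None, start
        unvisited.discard(start)
        while True:
            # Step to the neighbour we did not just come from.
            a, b = neighbours[cur]
            nxt = b if a == prev else a
            if nxt == start:
                break
            cycle.append(nxt)
            unvisited.discard(nxt)
            prev, cur = cur, nxt
        cycles.append(cycle)
    return cycles


nodes = [1, 2, 3, 4, 5, 6]

# Every node has degree 2, yet the arcs form two separate sub-tours.
with_subtours = find_cycles(nodes, [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4)])

# A single cycle through all nodes is a valid tour.
single_tour = find_cycles(nodes, [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)])
```

A solution is a valid tour only when the returned list contains a single cycle that covers all nodes.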

Vertical and horizontal parts
For a sub-tour to form, a minimum of three nodes is required, and this fact can be used to partition a TSP network diagram.
Thus, vertical and horizontal lines can be drawn such that there are no more than three nodes between any two neighboring vertical lines or any two neighboring horizontal lines.
These lines cross arcs and do not pass through any node.
The lines do not have to be exactly vertical or exactly horizontal.
All lines that are vertical or near vertical are treated as vertical.
Similarly, all lines that are horizontal or near horizontal are treated as horizontal in this paper.
Any TSP network diagram can be made to face any direction.

Vertical line.
Vertical lines represented by γ s are presented in Fig. 4.

Fig. 4. Vertical and horizontal lines
Note that the distance between any two nearest vertical lines is not more than 3 nodes.

Horizontal lines.
Examples of horizontal lines (γ s ) are presented in Fig. 5. Note that the distance between any two nearest horizontal lines is not more than 3 nodes.
Note that there are no more than 3 nodes between the lines when going horizontally and no more than 3 nodes when going vertically.

5. Dummy nodes
In this paper, we define a block as the set of all nodes between any closest pair of vertical or horizontal lines. To make sure sub-tours are eliminated, all the blocks need to be connected. This can be done by introducing dummy nodes that connect these blocks. A dummy node is an additional, artificial node created to eliminate sub-tours. A dummy node connects the boundary or frontline nodes of any two neighboring blocks. A dummy is added as an additional node, and all the original nodes remain in place. Fig. 6 illustrates blocks and boundary or frontline nodes. In practical problems, there can be any number (i) of nodes on one side of the boundary ($W_1, W_2, \dots, W_i$) and any number (j) of nodes on the other side of the boundary ($Y_1, Y_2, \dots, Y_j$), as shown in Fig. 7.
The frontline nodes after adding a dummy are shown in Fig. 8.

Fig. 8. Frontline nodes (general case+dummies)
Here $D_i^{r}$ is the dummy from the horizontal line $\gamma_r$. In this case the line is horizontal; when the line $\gamma_r$ is vertical, $D_i^{r\nu}$ is used, denoting a dummy from the vertical line r.
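As a rough sketch of how frontline nodes could be identified, the code below assumes that a frontline node is an endpoint of an arc crossed by the boundary line. This is one reading of the definition above and is not stated in exactly these terms in the paper. The function name `frontline_nodes`, the block labels and the example network are hypothetical.

```python
def frontline_nodes(arcs, block_of, left, right):
    """Return the frontline nodes on each side of the boundary between two blocks.

    A node counts as frontline here if it is an endpoint of an arc whose
    two ends lie in the two different blocks `left` and `right`.
    """
    side_left, side_right = set(), set()
    for a, b in arcs:
        if {block_of[a], block_of[b]} == {left, right}:
            for v in (a, b):
                if block_of[v] == left:
                    side_left.add(v)
                else:
                    side_right.add(v)
    return sorted(side_left), sorted(side_right)


# Hypothetical network with two neighboring blocks "A" and "B".
block_of = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
arcs = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (5, 6)]

# Arcs (2, 4) and (3, 5) cross the boundary between the two blocks.
w_side, y_side = frontline_nodes(arcs, block_of, "A", "B")
```

Under this assumption, a dummy node would then be placed on the boundary between the nodes in `w_side` and those in `y_side`.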

6. Law of conservation of intermodal distance
The total distance (optimal distance) does not change when dummies are introduced. Because of this law of conservation, the equalities (3)-(10) are valid and are used in this paper. They are written, in turn, for the arcs coming from $V_1$, the arcs coming from $V_2$, …, the arcs coming from $V_i$, the dummy $D_1$ and the dummy $D_2$.
Equality constraint (10) is the constraint that makes the TSP formulation very difficult. With the $u_k$ variable, the coefficient matrix is no longer unimodular. In this case, $2u_r$ is used instead of just 2 because it is not certain whether the traveling salesman will use this dummy bridge or not. These equalities are formulated for every vertical or horizontal line.

2. Efficient exact solution of the general linear binary problem
In [9], it is shown that the general linear binary problem can be solved in polynomial time by interior point approaches. The linear binary form must first be transformed into a convex quadratic problem. We introduce slacks such that (12) is satisfied.
To transform this into a convex quadratic problem, let: … where the two weights, indexed 1 and 2, are very large compared to any of the coefficients in the objective function.
Many values of these two weights make this work. In this paper, we select: … Since this is a quadratic minimization problem, the objective function will be minimal when: … i.e. …
Similarly, the two quantities are equal if $s_{ij} = 0$ or $s_{ij} = 1$.

Convexity of f(X).
Since: … This is because the variables in this case assume only binary values.

3. Numerical illustration
Solve the problem given in Fig. 9 using the dummy guided formulation. The distances are in kilometres.
Step 1. Using vertical and horizontal lines to partition the problem gives Fig. 10, where $\gamma_1$ and $\gamma_2$ are the vertical and horizontal lines, respectively.
Step 2. For $\gamma_1$, the frontline nodes are 1, 4, 5 on one side and 6, 7, 10, 12, 15, 18 on the other side. For $\gamma_2$, the frontline nodes are 5, 9, 10 on one side and 12, 13, 14, 15, 18 on the other side. There are three dummies for the vertical line $\gamma_1$ and another three for the horizontal line $\gamma_2$. The dummies are presented in Fig. 11 and Fig. 12.
For boundary $\gamma_1$: … such that: … Solving by the interior point algorithm gives the optimal solutions shown in Fig. 13 and Fig. 14.

Fig. 14. Alternate optimal solution
The optimal route is given in blue and note that there are no sub-tours. An alternate solution is given in Fig. 14.

Discussion of the solution method for TSP
The TSP network model has been partitioned in such a way that there are no more than 3 nodes in either direction so as to avoid sub-tours as given in Fig. 10. Three blocks are used to partition the TSP network diagram.
From the vertical line, 3 dummy nodes are used to force the traveling salesman to move from one block to the next one and another 3 dummy nodes are constructed from the horizontal line. In this illustration, a total of 6 extra nodes are required to avoid sub-tours.
From the numerical illustration, it can be noted that the number of variables is increased by approximately half, i.e. 20/36 = 55.56 %, which is a weakness. Similarly, the number of nodes is increased by approximately a third, i.e. 6/17 = 35.29 %. There are no computational results comparing the proposed method with other approaches, which is a limitation of the study. The numerical illustration is presented in Section 5.2.
The proposed approach provides a formulation for the TSP in which no sub-tours will occur. Interior point algorithms can be used to solve the formulated ILP efficiently, as presented in Section 5.1. Branching related algorithms such as branch and cut, branch and price, or branch, cut and price offer no guarantee that the number of sub-problems required to verify optimality will not explode to unmanageable levels.
Even though very powerful computers are now available, the use of brute force in solving the TSP model is still not possible for large practical problems. Brute force for TSP models requires n! sub-problems in the worst case to verify optimality, i.e. 17! sub-problems for this numerical illustration, since the number of nodes in the numerical illustration given in Section 5.2 is 17.
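The size of this number can be made concrete with a short sketch. The code below evaluates 17! and runs a plain brute-force enumeration on a hypothetical 4-node distance matrix. Fixing the start node means the loop inspects (n-1)! orderings, which grows at the same factorial rate. The function name `brute_force_tour` and the distances are illustrative only and are not the data of Fig. 9.

```python
import math
from itertools import permutations

# Number of orderings of 17 nodes.
orderings_17 = math.factorial(17)  # 355687428096000


def brute_force_tour(dist):
    """Enumerate every tour that starts at node 0 and return the shortest one."""
    n = len(dist)
    best_len, best_tour = math.inf, None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        # Sum consecutive legs, including the return leg to node 0.
        length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour


# Hypothetical symmetric distances between 4 nodes.
dist = [[0, 1, 4, 2],
        [1, 0, 1, 5],
        [4, 1, 0, 1],
        [2, 5, 1, 0]]
best_len, best_tour = brute_force_tour(dist)
```

The 4-node instance needs only 6 orderings, but the same loop on 17 nodes would need 16! of them, which is why enumeration does not scale.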
Interior point algorithms provide a guarantee to solve optimization problems in polynomial time and display an early convergence in merely a few iterations, which is almost independent of the problem size [11]. Interior point algorithms are incorporated into this approach and can handle large problems implying that large TSP can now be solved efficiently.

Conclusions
1. The TSP network diagram was partitioned into blocks by means of vertical and horizontal lines. The TSP network diagram was partitioned into 3 blocks. A single horizontal line and a single vertical line were used in the partitioning process. The number of blocks depends on the size of the TSP network diagram or number of nodes. The bigger the number of nodes the bigger the number of blocks.
2. Dummies were used to connect neighboring blocks. Dummies come from horizontal and vertical lines. Dummies are artificial nodes introduced to eliminate sub-tours. Dummy nodes are there to connect blocks and do not change the total distance travelled. This is because dummy arcs have a length of zero.
3. The dummy constructed TSP was formulated as an ILP. The strength of this formulation is that it can be solved efficiently to give the exact optimal solution. Interior point algorithms are incorporated in the proposed algorithms. These interior point algorithms have the strength that they can solve large problems.
4. A full numerical illustration was presented in the paper. The partitioning process using vertical and horizontal lines, together with the construction of dummy nodes and arcs, was explained. In addition, the dummy reconstructed TSP network was formulated as a linear integer model and solved efficiently using an interior point algorithm to give the exact solution.