Solve the Traveling Salesman Problem (TSP) by Simulated Annealing (SA)

1.     Objective

The objective of this work is developing a generic simulated annealing (SA) algorithm to solve the traveling salesman problem (TSP), which is find the shortest route through N given cities such that every city is visited exactly once.

2.     Concepts and Programming

Simulated annealing algorithm, first described by Scott Kirkpatric [2], has its name and inspiration come from annealing in metallurgy, a technique involving heating and slowly cooling of a material to increase physical strength and/or reduce defects. When the temperature is hot, the atoms of the material piece gain high energy and wander randomly. The slow cooling gives them more chances of finding configurations with lower internal energy than the initial one [3]. In term of an optimization algorithm, each step of the SA algorithm replaces the current solution by a random "neighbor" solution, chosen with a probability that depends both on the difference between the corresponding function values (energy) and also on a global parameter T (called the temperature). The temperature is gradually decreased during the process. At the beginning,  T is large, the "upnhill" moves (go to higher energy states) are more likely accepted than when T is go to zero at the end of the process. This allows the SA algorithm move out of the local minimum at the early states.
Table 1: Pseudo code of SA algorithm [2]
s ← s0;                       // Initial state, energy.
e ← E(s)                
sbest ← s;        // "best" solution
ebest ← e 
k ← 0
while k < kmax and e > emax    
     // Trail move to ‘ neighbor’     
     snew ← neighbor(s)       
     enew ← E(snew)
     // Accept uphill move if ‘feel lucky’
      if P(e, enew, temp(k/kmax)) > random() then  
          s ← snew;
          e ← enew;
    // Tracking the best-ever    
     if enew < ebest then         
           sbest ← snew;
           ebest ← enew             
     k ← k + 1                                    
return sbest                                   

There are different schemes to accept the “uphill” move as well as to reduce the temperature, such as Fast Cauchy, Geometric, or Boltzmann’s methods as given in [1]- lecture 3, page 5.The pseudo code for generic SA algorithm is given in [2] as Table 1.
Actually, the "pure" SA algorithm as described in [1] does not keep track of the best solution found so far: it does not use the variables sbest and ebest, it lacks the second if inside the loop, and, at the end, it returns the current state s instead of sbest. While remembering the best state is a standard technique in optimization that can be used in any meta-heuristic, it does not have an analogy with physical annealing — since a physical system can "store" a single state only.
In TSP problem:
- Each state “s” is represented by a combinatory of N cities’ indexes.
- The energy E(s) is a function to compute the total travel distance of visiting N cities following the route or the state s.
- The initial sate or route s is selected randomly.
- The “neighbor” of s is implemented by swapping 2 random cities of the route s. There are 2 ways to swap: swap the two vertices or swap the two edges as described in [1]-lecture 3, page 17, and are both implemented in my code (See Code listing section)
- The probability to accept a move is P = exp(- DE/kT) = exp(-(enew-e)/kT)
- Update temperature T = αT, where 0<α<1

3.     Some Results

SA process

The convergence of SA

GUI (Graphic User Interface) written in Matlab
Aravind Seshadri already posted his code for solving TSP by SA in Mathworks website
Main thing is the speed. I am inspired by his work, but after downloaded, I found that his program could take "forever" for 101 cities. Could not sit and wait, (not a patient person) so I wrote this program. Most of the time, it just needs less than 10 seconds for 101 cities (including time for plot). For 500 cities, it may take 1 minutes. 
Next time, I will upload the code and talk about using SA for solving N-Queens problem.    


1.      Gary Yen. ECEN 5773 Intelligent Systems- Fall 2011 lecture notes. Oklahoma State University, 2011
2.      Kirkpatrick, S., C. D. Gelatt Jr., M. P. Vecchi, "Optimization by Simulated Annealing",Science, 220, 4598, 671-680, 1983. 
6. TSP formation and database http://www.tsp.gatech.edu/


Pair-wise Weighted Euclidean distance between 2 sets of vectors

In many tasks, we wish compute the pair-wise distance between two sets of vectors (in high p-dimension space). For example, in clustering problem we want to compute the distance between each point in the given set of N points (or vectors), represent by a matrix X of size N-by-p, to every other points. The simplest implementation in Matlab may look similar to this:

DD = zeros(N,N);
for i =1:N
    for k=1:N
         DD(i,k) = sum((X(k,:)-X(i,:)).^2);

(It actually compute the SQUARE of the distance, but in many case it is enough, e.g for just want to know which is the nearest point to a given point; otherwise we can easy say DD=sqrt(DD). Compute the square distance is significantly faster)
However, as you know, using FOR loop in Matlab is so slooow. People said "Avoid FOR loop as much as possible".
By the way, the square Euclidean distance between two vector a and b is: d = sum((a-b).^2); and the weighted version is simply: d = sum(wts.*(a-b).^2).
The inner for-loop in the above code can be easily eliminated, such as:

DD = zeros(N,N);
for i =1:N
   % inner for-loop of above code equivalent to:
    DD(i,:) = sum((X - repmat(X(i,:),N,1)).^2);

But we still need the outer for-loop. How about get rid all of it? YES, it is possible. We just need a small trick:
(a-b)^2= a^2+b^2-2ab
Yes it is easy for 2 vectors, but applying for two matrix A and B need quite an arrangement.
Check out the following code:


function DD = pdist2(A,B,Wts)
% Find pair-wise SQUARE EUCLIDEAN distance
% or 'Weighted square euclidean' distance
% between each point in A and B
% For 2 vector a, b
% Euclidean distance= d = sum((a-b).^2)
% Weighted version  = d = sum(wts.*(a-b).^2)
% ------------------------------
% Input:
%   A= m_by_p, m points in p-dimension
%   B= n_by_p, n points in p-dimension
%   Wts = 1_by_p, defaut = [1 1 ...]
% Results:
%   DD= m_by_n
% ------------------------------
% Facts: (a-b)^2= a^2+b^2-2ab
% ------------------------------
% trungd@okstate.edu, Feb 2011
% ------------------------------
% Ideal from: Piotr Dollar.  [pdollar-at-caltech.edu]

[m,p1] = size(A); [n,p2] = size(B);
if(p1~=p2) % check size
    error('Must have: ncol(X)=ncol(Y)=length(Wts)')
if nargin < 3 % standard euclidean
    AA = sum(A.*A,2);  % column m_by_1
    BB = sum(B.*B,2)'; % row 1_by_n
    DD = AA(:,ones(1,n)) + BB(ones(1,m),:) - 2*A*B';
    if p1 ~=length(Wts)
        error('Must have: ncol(X)=ncol(Y)=length(Wts)')
    sW = sqrt(Wts(:)'); % make sure row, square of Wts
    A = sW(ones(1,m),:).*A;
    B = sW(ones(1,n),:).*B; % modify A,B
    % Process the same as standard Euclidean
    AA = sum(A.*A,2);  
    BB = sum(B.*B,2)'; 
    DD = AA(:,ones(1,n)) + BB(ones(1,m),:) - 2*A*B';    


It runs fast, just for the record my laptop is average, not supper:

>> A = rand(1000,100);
>> B = rand(2000,100);
>> W = rand(1,100);
>> tic, D = pdist2(A,B,W);toc
Elapsed time is 0.101665 seconds.

As said, the idea is from Piotr Dollar.  [pdollar-at-caltech.edu], I modified to handle the weighted euclidean case (which I meet pretty offen).
Other type of distance such as Cosine, L1, Mahalanobis, ... can be handled, but some of them are not that handy.
Last thing, pdist2() is available in Statistic Toolbox of Matlab that can deal with much more general distance. If you have that toolbox, forget about the code here :)