TSP & NP-Completeness Visualizer

Traveling Salesman Problem — Complete Teaching Guide

The Traveling Salesperson Problem (TSP) is one of the most famous and extensively studied problems in theoretical computer science and operations research. It falls under the category of combinatorial optimization.

The Core Concept

The problem asks a simple question: Given a list of cities and the distances between every pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?

Key Components

To define a TSP, you generally need:

A Set of Cities: The nodes or vertices in a network.
Distances (Costs): The weight of the edges connecting the cities (can represent distance, time, or fuel cost).
The Hamiltonian Cycle: The mathematical term for a tour that visits every vertex exactly once and returns to the start.

The Key Concept: Minimizing "Wasted" Motion

Completeness: You must visit every single point on your list.
Efficiency: You must do so using the least amount of "cost" (which usually means distance, time, or money).

The Certificate (Verification)

What is a Certificate? In complexity theory, a certificate is the "proof" or "evidence" that allows a computer to quickly verify if a specific solution is correct. For the TSP, the certificate is simply an ordered list of the cities.

Example: [City A, City C, City E, City B, City D, City A].

The Verifier Check (Polynomial Time):

Completeness: Scans the list to ensure every city is visited exactly once.
Distance: Sums up the weights of the edges between each city in the list.
Comparison: Checks if the Total Sum <= your target budget B.

This verification takes O(N) time, which proves that TSP belongs to the class NP.

Real-Life Example: Logistics and "Last Mile" Delivery

Imagine a courier for a company like DHL, FedEx, or Aramex starting their day at a warehouse in a busy area like Dubai’s Al Quoz.

The Problem: The driver has 50 packages to deliver across the city—some in Dubai Marina, some in Deira, and others in Mirdif.
The Cost: Every extra kilometer driven is money lost on fuel, vehicle wear-and-tear, and driver wages. Missing a "delivery window" adds a time constraint.
The Solution: Companies use TSP-solving algorithms to calculate the exact sequence of turns. Saving just 1% of the distance results in millions of dollars saved annually.

Algorithmic Heuristics: Why Greedy is Often Poor

The Nearest-Neighbor (NN) algorithm often produces poor tours because it is a greedy heuristic. It makes choices based entirely on immediate, local convenience without considering long-term, global consequences.

Example: The Expanding Number Line

Imagine 5 cities at mile markers: 0, 1, -2, 4, and -8.

NN Route (Start at 0): 0 to 1 to -2 to 4 to -8 and finally back to 0.
Total Distance: 1 + 3 + 6 + 12 + 8 = 30.

Optimal Route: 0 to -2 to -8 to 1 to 4 and finally back to 0.
Total Distance: 2 + 6 + 9 + 3 + 4 = 24.

In this simple 5-city example, the greedy route is 25% longer simply because it lacked foresight.

How 2-Opt Iteratively Improves the Route

To fix the chaotic routes generated by greedy algorithms, we use 2-opt (2-optimization). It is a local search algorithm designed specifically to untangle crossed paths.

The Logic: In basic Euclidean geometry, a path that crosses itself is always mathematically longer than a path that does not. 2-opt systematically eliminates these crosses.
The Iterative Swap Process:
1. Select & Delete: The algorithm scans the tour, picks two random edges, and temporarily deletes them.
2. Reconnect: It reconnects the two disconnected curves in the only other valid way (which requires reversing the direction of the cities in the middle section).
3. Evaluate: If the new route is shorter (usually because a crossing was untangled), the swap is permanent. Otherwise, it is discarded.
4. Local Optimum: The process repeats until a full pass reveals zero improvements. The tour is now considered a "Local Optimum."

Formal Proof of NP-Completeness

To prove TSP is NP-Complete, we must demonstrate two things: 1. It is in NP (Verified by the Certificate above), and 2. It is NP-Hard (Proven by Reduction).

NP-Hardness: Reduction from Hamiltonian Cycle

Why Hamiltonian Cycle?

When proving a problem is NP-Hard, you want to find a known NP-Complete problem that is structurally as close to your target problem as possible to make the translation (the gadget) easy.

The Hamiltonian Cycle and the Traveling Salesperson Problem are essentially computational twins:

Hamiltonian Cycle asks: "Does there exist a continuous loop that visits every city exactly once?" (It only cares about the existence of the roads).
TSP (Decision) asks: "Does there exist a continuous loop that visits every city exactly once, and costs less than budget B?" (It cares about the weight of the roads).

Because TSP is just HC with a budget attached, HC is the most natural, direct path for a reduction. You don't need complex, mind-bending gadgets (like you would if you reduced from a logic puzzle like 3-SAT).

The Setup

The HC Instance (Input): You are given a standard graph G with N vertices. Question: Does there exist a cycle that visits every vertex exactly once?

The TSP Instance (Output): You create a complete graph G' with N vertices and edge weights, and a target budget X. Question: Is there a tour with a total distance <= X?

The Transformation Steps

Step 1: Copy the Vertices - Take all N vertices from your original graph G and copy them to your new graph G'.

Step 2: Make it a "Complete" Graph - In G', draw an edge connecting every single pair of vertices.

Step 3: Assign the Edge Weights - Weight = 1 if an edge existed in the original graph G. Weight = 2 if the edge did not exist in G (a "penalty" for using roads that weren't in the original problem).

Step 4: Set the Target Budget (X) - Set your maximum allowed distance to exactly equal the number of vertices: X = N.

Proof of Forward Direction

Step 1: Assume Source is a YES-Instance - If the original graph G has a Hamiltonian Cycle, then every edge in that cycle exists in G.

Step 2: Trace Path in TSP Instance - Travel the exact same sequence in G'. It is a valid TSP tour because G' is complete.

Step 3: Calculate Cost - Each of the N edges corresponds to an original edge in G, so each has a weight of 1. Total Cost = N * 1 = N. Since Cost is exactly equal to the budget X, the TSP instance is a YES-instance.

Proof of Backward Direction

Step 1: Assume TSP is a YES-Instance - If G' has a tour with total distance <= N. The tour connects N vertices, so it uses N edges.

Step 2: Cost Equation - Let k be the number of "penalty" edges (weight 2). The remaining (N-k) edges must have a weight of 1.
Total Cost = (k * 2) + ((N - k) * 1) = 2k + N - k = N + k.

Step 3: Apply Constraint - Since we assume Cost <= N, then N + k <= N, which implies k <= 0. Since k cannot be negative, it must be exactly 0.

Step 4: Conclusion - Because k = 0, the tour uses zero penalty edges. All N edges in the tour have weight 1, meaning they all exist in the original graph G. This forms a valid Hamiltonian Cycle in G.

Worked Example: Step-by-Step Reduction

1. Hamiltonian Cycle (Graph G)

Road C → A is missing.
Result: NO Hamiltonian Cycle.

2. TSP Translation (Graph G')

Target Budget X = 3.
Tour Cost (1+1+2) = 4.
Result: 4 > 3 (NO TSP tour).

Insight: Notice how the "missing" road in the HC problem was mathematically converted into a "penalty weight" in the TSP problem. This forces the TSP instance to return the exact same answer (NO) as the original HC instance, proving the problems are computationally linked.

The Construction Algorithm Breakdown

Step 1: Copy the Vertices - The computer looks at N vertices in G and creates N cities in G'.
Complexity: O(N)

Step 2 & 3: Create Complete Graph & Assign Weights - The computer draws an edge between every pair of vertices. In a graph with N vertices, there are exactly N(N-1)/2 possible pairs.
Complexity: O(N^2)

Step 4: Set the Budget Constraint - Assigns X = N.
Complexity: O(1)

The Final Tally: The total time is O(N) + O(N^2) + O(1), which simplifies to O(N^2). Since N^2 is a polynomial, the transformation is efficient.

Proof Justification: Why This Proves NP-Hardness

1. The Baseline: We already know as a mathematical fact that the Hamiltonian Cycle is NP-Complete. This means it is among the hardest problems in the class NP, and no fast (polynomial-time) algorithm is believed to exist for it.

2. The Translation (HC ≤_p TSP): We have proven that we can take any instance of the Hamiltonian Cycle and translate it into a TSP instance in polynomial time (≤_p).

3. The Implication: Because the translation is fast, if we ever discovered a fast algorithm to solve the TSP, we could run our translation and then use that TSP solver to instantly solve the Hamiltonian Cycle.

4. The Conclusion: Since we know the Hamiltonian Cycle is NP-Complete (and therefore universally accepted as "hard"), the fact that we can use TSP to solve it dictates that TSP must be at least as hard as the Hamiltonian Cycle.

Therefore, by the transitive property of computational complexity: because HC is NP-Complete, the successful reduction HC ≤_p TSP formally proves that TSP is NP-Hard.

Theoretical Performance & Approximation

1. Definition of Approximation Ratio (ρ)

When evaluating non-exact algorithms (heuristics) for NP-Hard problems, we measure their worst-case performance using the Approximation Ratio. It is defined as the maximum ratio between the cost of the tour produced by the algorithm (C) and the cost of the true optimal tour (C_OPT):

ρ = max ( C / C_OPT )

An algorithm with an approximation ratio of 1.5, for example, is guaranteed to never produce a route more than 50% longer than the mathematically perfect route.

2. The Nearest-Neighbor Limitation

While the Nearest-Neighbor (NN) algorithm is computationally fast (O(N²)), it is mathematically fragile. Crucially, Nearest-Neighbor does not have a constant approximation ratio for the general metric TSP. Instead, its performance grows worse as the number of cities (N) increases:

ρ_NN = Θ(log N)

This means there is no strict upper ceiling on how poorly the algorithm can perform. As the map grows larger, the worst-case penalty incurred by NN's greedy, short-term choices gets unboundedly worse compared to the optimal route.

3. Theoretical Benchmarks

To guarantee a constant approximation ratio in metric TSP, more sophisticated algorithms are required:

Minimum Spanning Tree (MST) Heuristic: Guarantees ρ = 2 (the route will never be worse than double the optimal).
Christofides Algorithm: The long-standing theoretical gold standard, guaranteeing ρ = 1.5 by elegantly combining an MST with perfect matching for odd-degree vertices.

Complexity Growth Curve

This chart visualizes why Brute Force becomes impossible even for small groups of cities. While Nearest Neighbor and 2-Opt scale smoothly, the Factorial complexity of Brute Force "explodes" almost vertically.

Brute Force O(n!)

Nearest Neighbor O(n²)

2-Opt O(n²)

Citation: D. Jungnickel, “A hard problem: the TSP,” in Algorithms and computation in mathematics, 1999, pp. 423–469. doi: 10.1007/978-3-662-03822-2_14.