a short take on genetic algorithms.
It’s 2022. Our smartphones have the computing power of PCs from 10 years ago or supercomputers of the ’90s. Regarding such a fast pace of growth in calculating capabilities of modern computers, it’s easy to assume that today’s most powerful machines can solve any mathematical problem, right? Well, unfortunately, it’s not that easy.
Let’s consider a very popular Travelling Salesman Problem (TSP). Imagine a guy working as a deliveryman who has to distribute some packages, but his pert boss told him that he’s allowed to visit every sport only once. Additionally, the company is short on fuel budget, so he has to figure out the shortest route. The solution isn’t complicated to come up with. All he has to do is list all possible ways and choose the shortest one. Easy, right? Actually, that’s a bit more complicated.
The method is correct; however, the number of possible tracks is the issue. If he was to deliver only two parcels, there is only one route, because it makes no difference which house he’s chosen to visit first, the distance would be the same. Three parcels give three varying paths. That’s easy to solve! Five gives sixty. Still a piece of cake, but wow, that increased quickly. And that is the source of all evil. The number of routes is based on a factorial of the number of spots to visit. And the factorial function is not to be messed around with. 10 parcels give 1 814 400 various distances, 20 give 1018 and let’s hope it’s not some busy time like Christmas. 60 parcels mean he has to choose the shortest distance among 4×1081 different possibilities. Such a big number is extremely hard to comprehend, so let’s compare it to something. The number of water drops in all the Earth’s oceans? No, it’s way more. The number of all organisms living on our planet? Nah, not even close. Distance between us and some galaxy far far away expressed in nanometers? Miss! 4×1081 is about as much as all the atoms of our observable universe combined. Or, considering more humble calculations, even way more.
So how do we solve such problems? I mean, we have a method to do it, right? Sure we do! And it’s inspired by nature itself. The theory of evolution, to be exact. Let’s give a shout-out to the most crucial figures. Charles Darwin described the fact that organisms adapt to their environment through natural selection, Gregor Mendel discovered the genome, and Hugo de Vries gave us the idea of mutation. Their work combined was used to develop the most versatile method of optimization – evolutionary algorithm, with the genetic algorithm being the most famous instance.
Let’s discuss the key fields of every genetic algorithm. The solution we are looking for is encoded as a list of numbers like our features are encoded in our genome. We have to prepare a function to determine which of those numbers gives a better answer, and we call it an objective. Finally, our solutions are subject to selection, crossover, and mutation. Okay, having the base knowledge covered, let’s dive deeper into our TSP example.
Our solution’s genome will be a list of consecutive spots to deliver the parcel, so it can look like [8, 4, 1, 9, 7, 2, 3, 6, 5] meaning “Visit house 8, then house 4, etc.”. Knowing the distances between all targets, we are able to calculate the total distance (our objective is to minimize it). Now, let’s create a pool of randomly generated answers and place them in our virtual population.
We have some answers which are more fitting than others. In nature, animals have their ways of choosing the strongest members of their bunch, and we have to mimic it to create a virtual selection. There are plenty of different ways to perform a selection, but the implementation is not important right now. What matters is that we end up with exactly half of our population, which is now going to be parents of the next generation.
The next generation is created by crossing two randomly chosen parents and repeating it until the population size is restored to its original form. It can look like this:
Parent 1: 5 1 2 8 3 4 9 7 6 Child 1: 5 1 2 8 3 7 6 9 4
Parent 2: 3 1 5 2 8 7 6 9 4 Child 2: 3 1 5 2 8 4 9 7 6
When we have our new generation born, the last thing to do is to mutate its individuals. However, the possibility of mutating each member has to be small to not spoil our existing results. Mutation can be carried out in plenty of different ways as well, but let’s present it as follows:
Before: 5 1 2 8 3 7 6 9 4 After: 5 9 2 8 3 7 6 1 4
Running such a process over and over eventually ends with the population of very well-fitted solutions to the given problem. However, there is one drawback to this method. Although it returns many optimized answers, genetic algorithms do not guarantee to find the best solution possible. It works heuristically, which means it focuses on working decently enough only with a certain chance of giving the perfect answer. Does it mean that the method is just theoretical and has no use in the real world? Absolutely not! In reality, we don’t really need the perfect solutions possible. We need decent ones, and we want them quick. Go back to our poor delivery guy. It doesn’t matter if his route will be 5 km longer than it could. What matters is that he was given that route in just several seconds. That’s why the genetic algorithm is such a powerful tool. And being based on the basic laws of nature makes it only more beautiful. Parent