A small note on GAFT optimization

A small note on GAFT optimization

Column

❈PytLab, columnist of the Python Chinese community. Mainly engaged in the application of scientific computing and high-performance computing, the main languages ​​are Python, C, C++. Familiar with numerical algorithms (optimization methods, Monte Carlo algorithms, etc.) and parallelization algorithms (multithreaded and multi-process parallelization such as MPI, OpenMP) and python optimization methods, and often use C++ to write extensions to python.

Know the column: The daily life of chemical dog code bricks

blog: http://pytlab.org

github: https://github.com/PytLab

Preface

Some time ago, I have been using the genetic algorithm framework written by myself to test the effect of the algorithm in optimizing the parameters of the force field, but the running efficiency is very slow, because the fitness function needs to call the force field program many times to calculate the energy, but it is still slower than I expected I also failed to profiling and optimize the program in time. Until the holiday, there was an issue on github that used gaft for SVM parameter optimization for children’s shoes. It said that a large number of fitness functions would be called during the gaft optimization process. This made me profiling gaft to find the program bottleneck during the National Day holiday And targeted optimization.

This article will record the process of profiling and optimization of gaft and the effect of optimization.

The text of GAFT performance analysis (Profiling)

Regarding how to perform performance analysis on Python programs, generate analysis reports and visualize analysis reports, I gave a detailed introduction in the previous blog "Python Optimization Step 1: Performance Analysis Practice", here I will analyze it directly.

In order to analyze different functions in gaft, with the help of Python's built-in cProfile and pstats modules, I wrote a decorator to facilitate analysis and generate different analysis statistics files.

I will briefly explain the decorator with parameters above. The two parameters filename and sortby of the decorator constructor do_profile respectively specify the file name of the analysis result report and the sorting method of the statistical result. It will decorate the function that needs performance analysis, and then generate a result report in the current directory after the function is run. For example, if I need to analyze the main loop of genetic algorithm iteration in gaft, I need:

At the same time, for convenience, I also need an environment variable PROFILING to start the analysis:

Analysis result

Here, in order to facilitate the viewing of the mutual calling relationship of functions, I used pyprof2calltree and then used QCacheGrind on Mac to visualize the analysis results:

Convert the Python profiling file and directly call QCacheGrind to easily view and analyze related information.

It can be seen from the call relationship graph that the min, max, mean and other functions of the initial version of gaft call best_indv and worst_indv multiple times to call the fitness function to compare each other, and usually the user-defined fitness functions are It is generally time-consuming to call external programs additionally. Therefore, the efficiency of gaft must be improved by optimizing the number of calls to fitness by best_indv and worst_indv.

Optimize the GAFT function return value cache

As you can see from the best_indv I wrote before, I use fitness as the key to get the maximum value. Python's built-in max function will internally call fitness to compare with each other to get the maximum value. At this time, an extra call is made to fitness. Because in the genetic algorithm, the individuals in the population of each generation will not change. We only need to call fitnessn times at the beginning of each iteration (n is the population size), and adaptation is needed again in each generation. Obtain the degree value directly. This requires us to lazily evaluate the individuals in the population, that is, to cache all fitness values. I have also used this kind of operation when optimizing my own catalytic kinetics program, which is called function return value caching.

But in gaft, this kind of caching is a bit more troublesome, because caching is not a cache that can be used all the time, it will need to recalculate the fitness of all individuals in the population and then cache again as conditions change.

Conditions that need to be met simultaneously to recalculate fitness values

1. All individuals in the population have not undergone any changes (if they have changed, the fitness value must be recalculated). 2. The fitness value of the existing cache (if it is the first time, then the fitness value of all individuals must be calculated once). 3. The fitness function for calculating the fitness value has not changed compared with the previous one (if the calculated fitness function is changed, of course the fitness value needs to be re-estimated).

Function return value cache descriptor

For this, I wrote a decorator to cache the return value of the function:

Dynamically monitor population changes

Well, above, we can cache the return value of the function through the descriptor, but once the population does not meet the above three conditions, the fitness value needs to be recalculated, then how do we monitor the change of the population?

I added a flag _updated to GAPopulation to mark whether the population has changed, and then our task is to try to update this flag in other places that can affect the population.

How can I update this tag more Pythonic?

The so-called population change means that the list of individuals in the population has changed. I put all the individuals in the population in a list. I need to monitor whether this list has changed in order to update the flag. What are the specific changes?

1. Whether the entire list has changed (assignment operation) 2. Whether the elements in the list have changed (assignment operations to the elements in the list, append and extend operations of the list, etc.)

Well, how do we implement it?

  1. For the first type, since the assignment operator cannot be overloaded in Python, we can handle it through the set of descriptors :

2. For the second case, we need to override the corresponding method of Python's List type

But, even if the interface of the list is rewritten, how to update the variables in the population? At this time, you need to use closures. Define the derived class of list in the GAPopulation constructor init , and instantiate it immediately. At this time, the derived class can obtain the population object, so the GAPopulation constructor can write:

Optimization effect

Through the optimization of the code above, let's see how effective our optimization is, and use the analysis descriptor to analyze the situation of the GAEngine.run running generation population, where the population size is 10. The following figure shows the comparison of the analysis report generated by cProfile:

It can be seen that the optimized running time of the first generation population is shortened to nearly 1/7 of the original! The optimization effect is still very obvious. Then take a look at the call relationship diagram: the number of calls to energy_fitness has dropped from 3807 to 621!

summary

This paper records a profiling and optimization process of the genetic algorithm framework GAFT. By caching the value, the number of calls of the fitness function is greatly reduced. In terms of time, the efficiency of running a generation population has increased by about 7 times.

Reference: https://cloud.tencent.com/developer/article/1033861 Genetic Algorithm Framework GAFT Optimization Notes-Cloud + Community-Tencent Cloud