Next Permutation in C

At one point, I was writing a Frequent Subgraph Mining program for an Independent Study in Data Mining. Frequent Subgraph Mining is a very interesting problem because it is built primarily on the Subgraph Isomorphism problem which has some very interesting mathematical properties and challenges to consider. If you like you can read more about it here: https://en.wikipedia.org/wiki/Subgraph_isomorphism_problem

I was mostly focused on making the task of computing all possible subgraphs of a given graph and classiffying them into isomorphism classes (in which all graphs were isomorphic to each other) and counting the number of graphs in that isomorphism class. The problem was very interesting, however, my first implementation did not bother with a couple of things that would've made my algorithm a lot faster.

An excellent example of this was when I needed to generate all possible permutations of a set of numbers. This was done by a brute-force process that sat there for a good couple of minutes and generated all possible permutations of a set of numbers and stored them into a matrix. This matrix had to be allocated, used, and deallocated which also consumed time. Furthermore, most of the time it would find a valid permutation and it would never actually read through the whole matrix. It is safe to say that the way I had it implemented was a complete waste of time and resources that one could scarcely afford when working on an NP-Hard problem.

Eventually I came to the conclusion that this situation needed to be fixed and I was wondering about it out-loud when a friend of mine pointed out that C++ has a function called next_permutation() which can be found here: http://www.cplusplus.com/reference/algorithm/next_permutation/ that, when given a range of elements, will return the next permutation of those elements using lexicographic order.

I immediately determined that I wanted to figure out how this function worked by myself and implement it without looking it up and so I embarked on a rather idiotic exercise in the limits of my own brainpower which is described below in detail for those who care or need to re-implement this function.

Lexicographic Ordering

Despite the scary-sounding words, lexicographic ordering is quite intuitive. It simply means that the words are ordered alphabetically/numerically. For example, if we wanted to take three letters "a", "b", and "c" and produce all possible permutations of those letters, the lexicographic ordering of those permutations would look like this:

"abc", "acb", "bac", "bca", "cab", "cba"

Please note that effectively this means that the permutations are ordered alphabetically. This actually assigns a very coherent and mathematically predictable order to the task of computing all possible permutations. For those who know that it is, this order is effectively the ordering that would be generated if this task was performed by a DFS (which it was in my original implementation described above).

Why Next-Permutation?

The advantage of a function like next_permutation() in this case is that I can use it to generate all possible permutations by simply calling it repeatedly on the same array of elements. It would effectively generate each permutation independently of the others and would do so in linear time relative to the number of elements per permutation. It would also save tonnes of time and memory since it would be an in-place algorithm and it would stop as soon as I found a valid permutation which was very valuable in my case thereby not wasting time considering permutations that I did not need.

How it Went Down

What followed my decision to implement this function myself could only be described as the most painful 7 hours of my programming career. This was easily the most annoying 44 lines of code that I have written up to this point. I had never been so hopelessly frustrated and at the same time so determined to actually complete the pointless task of implementing a function that people have done decades before.

Part of the reason for this frustration was that this problem is kind of unique in the sense that the way I approached it did not really fit it. Specifically, I started by looking at the patterns associtated with generating the lexicographically next permutation and discovered quickly that there were many patterns that could be used to generate all possible permutations of three elements, but some of those did not work for four elements, and even fewer worked for five. It seemed tantilizingly easy at first, and when I found an issue at a higher level I would just try another approach that would work for small numbers of elements but would eventually fall apart at larger quantities. I kept being on the verge of a functioning algorithm but never quite getting there.

Thankfully, eventually I figured it out (if someone actually finds a mistake in this please email me because that would be terribly embarassing, but this algorithm did work for me well enough for very large sized graphs so I believe it to be correct).

Algorithm Description

Pre-Requisite Functions

The algorithm first requires several very simple functions to do its job. These are: a comparison function for sorting the elements, a swap function to swap two elements inside of an array, and a function that finds the index of the minimum element inside of the array. These functions can be seen below and should be fairly self explanatory given a sufficient knowledge of C.

int cmpfunc (const void * a, const void * b)
{
   return ( *(int*)a - *(int*)b );
}

This function is a very standard comparator function used to sort arrays of integers. It simply returns a positive/negative value indicating which number is bigger than the other, and zero if they are equal.

void swap(unsigned int* a, unsigned int x, unsigned int y)
{
    unsigned int t = a[x];
    a[x] = a[y];
    a[y] = t;
}

A similarly simple function, here, given an array, the function takes the first element, stores it in a temporary variable, then replaces the first element with the second in the array, and finally places the contents of the temporary variable into the second element.

unsigned int idx_of_min_from(unsigned int* a, unsigned int s)
{
    unsigned int i = 0; 
    unsigned int m = 0xffffffff; 
    unsigned int mi = 0;
    for (i = 0; i < s; i++)
    {
        if (a[i] > a[0] && a[i] < m)
        {
            m = a[i];
            mi = i;
        }
    }
    return mi;
}

This function is only marginally more complicated than the ones above. It creates a minimum value that has the maximum possible value (all ones) from the start and then scans through the array looking for a value that is smaller. Every time it finds one it updates the minimum value and its index and then once its done it returns the index.

The Intuition

The intution behind the algorithm is as follows:

  1. Loop over each pair of values in the array from the back (i.e. the largest index) to the front (the smallest index) and check if any element is ever smaller than the one that follows it.

  2. If you can find an element that is smaller than the following one:

    1. Get the index of the minimum value in the array.

    2. Swap the current value (the element that is smaller than the following one) with this minimum value.

    3. Sort a part of the array after the new location of the minimum value using the comparison function.

    4. Return one on success.

  3. If this never occurs (i.e. all elements are in order from largest to smallest) then you are at the final permutation and you can return zero.

Yes, its embarassing how long it took me to figure out this exact pattern, but whatever, I can proudly say that learned something that day and figured it out on my own =D

The Implementation

int next_permutation(unsigned int* a, unsigned int s)
{
    int i = s - 2;
    int j;
    for (i = s - 2; i > -1; i--)
    {
        if (a[i] < a[i+1])
        {
            j = idx_of_min_from(a + i, s - i) + i;
            swap(a,i,j);
            qsort(a + i + 1, s - i - 1, sizeof(unsigned int), cmpfunc);
            return 1;
        }
    }
    return 0;
}

This particular implementation has a couple of notable things about it. Specifically, it is made to work with unsigned integers only since that is what I was using for my overall project. This is fine because the same exact method can be applied to letters and numbers so long as you can perform the comparison between them correctly. This should be fairly easy in C since integers can be treated the same way as characters in this scenario, but I cannot vouch for other languages. Similarly, one can change the algorithm to use a different comparison to reverse the order of the permutation if it is required that you go from the lexicographically last to first.

The Final Code

int cmpfunc (const void * a, const void * b)
{
   return ( *(int*)a - *(int*)b );
}

void swap(unsigned int* a, unsigned int x, unsigned int y)
{
    unsigned int t = a[x];
    a[x] = a[y];
    a[y] = t;
}

unsigned int idx_of_min_from(unsigned int* a, unsigned int s)
{
    unsigned int i = 0; 
    unsigned int m = 0xffffffff; 
    unsigned int mi = 0;
    for (i = 0; i < s; i++)
    {
        if (a[i] > a[0] && a[i] < m)
        {
            m = a[i];
            mi = i;
        }
    }
    return mi;
}

int next_permutation(unsigned int* a, unsigned int s)
{
    int i = s - 2;
    int j;
    for (i = s - 2; i > -1; i--)
    {
        if (a[i] < a[i+1])
        {
            j = idx_of_min_from(a + i, s - i) + i;
            swap(a,i,j);
            qsort(a + i + 1, s - i - 1, sizeof(unsigned int), cmpfunc);
            return 1;
        }
    }
    return 0;
}

Once this is written, one can use the function in a while loop like so:

...
while(next_permutation(array, array_size))
{
    ...
}
...

The fact that the function returns a zero on the last possible permutation means that the while loop will evaluate the conditional to false and stop running. Otherwise the function will return one and everything will continue.

Conclusion

That's all there is to it folks! as always please feel free to email me if you have any comments or questions and do not hesitate to let me know that there is a mistake in my algorithm (again, embarassing, but I would rather know about it). I hope that you found this page helpful and that the functions mentioned here are useful.