Over the last few weeks we've focused on efficiency, and rightfully so. This week in lecture, Danny demonstrated how properly crafted code can mean the difference between a thousand years' worth of calculation and one second. The Fibonacci sequence, with its many recursive calls, served as a great example of how repeated work can grow exponentially, effectively making code useless. Brute force may get results eventually, but there are often much more clever and effective ways to accomplish things.
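To make the lecture's point concrete, here's a minimal sketch (mine, not Danny's actual code) contrasting the naive recursive Fibonacci with a memoized version. The naive version redoes the same subproblems exponentially many times; caching each result once makes it fast:

```python
from functools import lru_cache

def fib_naive(n):
    # Exponential time: fib_naive(n - 2) is recomputed inside
    # fib_naive(n - 1), and so on, over and over.
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # Same recursion, but each value is computed once and cached,
    # so the run-time is linear in n.
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

print(fib_memo(50))   # instant; fib_naive(50) would run for ages
```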
This brings us to the topic of sorting algorithms. Who can forget the wonderful demonstration our lecturer gave using coffee cups to show how various sorts operate? If nothing else, it entertained me by showing how long and tedious some sorts are compared to others. My favorite sort has to be bogosort by far. I'd say quantum bogosort, but that's just silly and I'd rather not mess with quantum randomization.
Why is bogosort my favorite? Because it's simply the most ridiculously terrible sort, with an average run-time of O(n × n!). It's the procrastinator of sorting algorithms. It'll sort your list eventually...
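For the curious, the whole "algorithm" fits in a few lines — a sketch of the usual shuffle-until-sorted formulation:

```python
import random

def is_sorted(lst):
    # True when every adjacent pair is in non-decreasing order.
    return all(lst[i] <= lst[i + 1] for i in range(len(lst) - 1))

def bogosort(lst):
    # Shuffle at random until the list happens to be sorted:
    # O(n * n!) on average, unbounded in the worst case.
    lst = list(lst)
    while not is_sorted(lst):
        random.shuffle(lst)
    return lst
```

Only try this on very short lists, for obvious reasons.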
However, on a more serious note, I'd like to discuss a few common sorting algorithms, starting with quicksort.
Quicksort is a recursive comparison sort. It operates by choosing a pivot (any element can serve, though from my understanding the median of the first, middle, and last elements is ideal; always picking the last element, for example, would prove very inefficient on an already sorted list) and then comparing every value in the array to the pivot value. Values less than or equal to the pivot are stored in one array, and values greater than the pivot in another. The algorithm is then applied to those two lists, and an array of the form [quicksort(lesserlist), pivotvalue, quicksort(greaterlist)] is returned.
(Animation: quicksort. Source: http://en.wikipedia.org/wiki/File:Sorting_quicksort_anim.gif)
The animation above does a good job of illustrating this. A pivot value is chosen (the last value, in this case), and each element in the list, and then in each subsequent sublist, is compared to that list's pivot value. Afterwards the pivot is moved between the two lists, and after enough recursive calls the list is sorted. It's an efficient sort averaging O(n log n), though in a rare worst-case scenario it can reach O(n²). There are variations such as in-place quicksort that reduce memory usage and can improve efficiency further, though quicksort is not typically a stable sort.
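The [quicksort(lesserlist), pivotvalue, quicksort(greaterlist)] description above translates almost word for word into Python. This is the simple out-of-place version, not the in-place variation:

```python
def quicksort(lst):
    # Base case: a list of zero or one elements is already sorted.
    if len(lst) <= 1:
        return lst
    pivot = lst[-1]  # last element as pivot, as in the animation
    lesser = [x for x in lst[:-1] if x <= pivot]
    greater = [x for x in lst[:-1] if x > pivot]
    # Sort each side, then stitch the pieces back together.
    return quicksort(lesser) + [pivot] + quicksort(greater)
```

Note how the last-element pivot choice makes an already sorted input the worst case: one partition is always empty, so the recursion depth becomes n.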
Merge sort is another example of a comparison sort. It works by dividing an unsorted list into sublists of one element each, since a list of one element is already sorted. Then each sublist is merged with the next, leaving several sorted sublists of two elements. This process repeats with each set of sublists until only one sorted list remains. The animation below presents the process in an easy-to-understand manner.
(Animation: merge sort. Source: http://en.wikipedia.org/wiki/File:Merge-sort-example-300px.gif)
The last algorithm I want to mention is insertion sort. This extremely simple algorithm builds the sorted list as it receives elements. Each new element is compared against the already sorted elements, one by one, until it finds its proper position or reaches the end, and it is inserted there. This sort is fine for small lists, but its run-time grows quadratically with larger lists, averaging O(n²), which makes it very inefficient in that regard. At least it's a stable, in-place sort that uses very little memory.
(Animation: insertion sort. Source: http://en.wikipedia.org/wiki/File:Insertion-sort-example-300px.gif)
What is the most efficient sort, then? At first glance it would appear to be quicksort, though there's always that rare worst-case run-time and some extra memory usage. Then maybe merge sort? It has good run-time but poor memory usage. Definitely not insertion sort, except in the case of small or partially sorted lists. Each algorithm really does have advantages and disadvantages, so an educated choice has to be made about which to use. Sorts can grow in complexity as well, sometimes leading to an improved run-time, but other times to more headaches. There also exist sorts that are combinations of two or more basic sorts; Timsort, the built-in sort in Python, is an example of this. In reality, new sorts are still being created even today, and implementation really does depend on context to achieve the greatest efficiency.
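In practice, that educated choice has usually been made for us: Python's `sorted()` and `list.sort()` use Timsort, a merge sort/insertion sort hybrid that exploits runs of already-ordered data, and it's stable, so equal keys keep their original order:

```python
# Python's built-in sorted() runs Timsort under the hood.
pairs = [("banana", 2), ("apple", 1), ("banana", 1)]

# Sort by the word only; the two "banana" entries stay in
# their original relative order because Timsort is stable.
print(sorted(pairs, key=lambda p: p[0]))
```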