Sunday, December 1, 2013

Algorithms and Efficiency

The last few weeks we've focused on efficiency, and rightfully so. This week in lecture, Danny demonstrated how properly crafted code can mean the difference between 1000 years' worth of calculation and one second of it. The Fibonacci sequence, with its many recursive calls, served as a great example of how repeated work can grow exponentially, effectively rendering code useless. Brute force may get results eventually, but there are often far more clever and effective ways to accomplish the same thing.
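Just to illustrate that difference for myself, here's a quick sketch in Python contrasting the naive recursive Fibonacci with a memoized version (my own toy code, not the exact example from lecture):

```python
from functools import lru_cache

def fib_naive(n):
    """Exponential time: recomputes the same subproblems over and over."""
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Linear time: each value of n is computed only once and cached."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

print(fib_memo(100))  # returns instantly; fib_naive(100) would take ages
```

Same function, same answer, wildly different run-times, purely because the naive version repeats work the memoized one remembers.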
This brings us to the topic of sorting algorithms. Who can forget the wonderful demonstration our lecturer gave using coffee cups to show how various sorts operate? If nothing else, it entertained me by showing how long and tedious some sorts are in comparison to others. My favorite sort has to be bogosort by far. I'd say quantum bogosort, but that's just silly and I'd rather not mess with quantum randomization.
Why is bogosort my favorite? Because it's simply the most ridiculously terrible sort, with an average run-time of O(n × n!). It's the procrastinator of sorting algorithms. It'll sort your list eventually...
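For fun, here's roughly what bogosort looks like in Python (my own sketch): shuffle the list at random until it happens to come out sorted.

```python
import random

def is_sorted(lst):
    """Check that each element is no bigger than the one after it."""
    return all(lst[i] <= lst[i + 1] for i in range(len(lst) - 1))

def bogosort(lst):
    """Shuffle until sorted: average O(n * n!) -- do not use on real data."""
    while not is_sorted(lst):
        random.shuffle(lst)
    return lst

print(bogosort([3, 1, 2]))  # [1, 2, 3] ... eventually
```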
However, on a more serious note, I'd like to discuss a few common sorting algorithms, starting with quicksort.
Quicksort is a recursive comparison sort. It operates by choosing a pivot (the pivot may be any element; from my understanding, the median of the first, middle, and last elements is ideal, while always picking the last element, for example, proves very inefficient on an already sorted list) and then comparing every value in the array to the pivot value. Values less than or equal to the pivot are stored in one list, and values greater than the pivot in another. The algorithm is then applied recursively to those two lists, and a list of the form [quicksort(lesserlist), pivotvalue, quicksort(greaterlist)] is returned.
Source: http://en.wikipedia.org/wiki/File:Sorting_quicksort_anim.gif
The animation above does a good job of illustrating this. A pivot value is chosen, the last value in this case, and each element in the list, and then in each subsequent sublist, is compared to that list's pivot. Afterwards the pivot is placed between the two lists, and after enough recursive calls the list is sorted. It's an efficient sort, averaging O(n log n), though in a rare worst case it can reach O(n²). There are variations, such as in-place quicksort, that reduce memory usage even further, though quicksort is generally not a stable sort.
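The description above translates almost directly into Python. Here's a minimal sketch (not in-place, last element as pivot like in the animation, purely for illustration):

```python
def quicksort(lst):
    """Recursive quicksort: partition around a pivot, sort the halves."""
    if len(lst) <= 1:
        return lst  # a list of 0 or 1 elements is already sorted
    pivot = lst[-1]
    lesser = [x for x in lst[:-1] if x <= pivot]
    greater = [x for x in lst[:-1] if x > pivot]
    # [quicksort(lesserlist), pivotvalue, quicksort(greaterlist)]
    return quicksort(lesser) + [pivot] + quicksort(greater)

print(quicksort([5, 2, 9, 1, 5]))  # [1, 2, 5, 5, 9]
```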
Merge sort is another example of a comparison sort. It works by dividing an unsorted list into sorted sublists of one element each, since a list of one element is already sorted. Then each sublist is compared with the next sublist and merged, leaving several sorted sublists of two elements. This process repeats with each set of sublists until only one sorted list remains. The animation below presents this process in an easy-to-understand manner.
Source: http://en.wikipedia.org/wiki/File:Merge-sort-example-300px.gif
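A rough Python sketch of merge sort (my own version; it's top-down recursive, so it performs the same merges the animation shows, just driven by recursion rather than explicit passes):

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One side is exhausted; the leftovers of the other are already sorted.
    return result + left[i:] + right[j:]

def mergesort(lst):
    """Split in half, sort each half recursively, then merge them."""
    if len(lst) <= 1:
        return lst
    mid = len(lst) // 2
    return merge(mergesort(lst[:mid]), mergesort(lst[mid:]))

print(mergesort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```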
The last algorithm I want to mention is insertion sort. This is an extremely simple algorithm that builds the sorted list as it goes. The second element is compared to the first and swapped in front of it if it's smaller. This continues, with each new element compared against the already-sorted elements until it finds one it's greater than or equal to, or reaches the front of the list. This sort is fine for small lists, but its run-time grows quadratically with list size, averaging O(n²), making it very inefficient for large lists. At least it's a stable sort, works in place, and uses very little memory.
Source: http://en.wikipedia.org/wiki/File:Insertion-sort-example-300px.gif
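A minimal in-place sketch of insertion sort in Python (my own toy version):

```python
def insertion_sort(lst):
    """In-place, stable insertion sort: O(n^2) average, fast if nearly sorted."""
    for i in range(1, len(lst)):
        key = lst[i]
        j = i - 1
        # Shift larger sorted elements one slot right to make room for key.
        while j >= 0 and lst[j] > key:
            lst[j + 1] = lst[j]
            j -= 1
        lst[j + 1] = key
    return lst

print(insertion_sort([4, 3, 1, 2]))  # [1, 2, 3, 4]
```

Notice the inner loop barely runs on an already-sorted list, which is why it does well on partially sorted input.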
What is the most efficient sort, then? At first glance it would appear to be quicksort, though there's always that rare worst-case run-time and the extra memory usage. Then maybe merge sort? It has good run-time but bad memory usage. Definitely not insertion sort, except in the case of partially sorted and small lists. It really seems each algorithm has advantages and disadvantages, so an educated choice has to be made on which to use. Sorts can grow in complexity as well, sometimes leading to improved run-time, but other times to more headaches. Then there exist sorts that are combinations of simpler sorts; Timsort, the built-in sort in Python, is an example, blending merge sort and insertion sort. In reality, new sorts are still being created even today, and the best implementation really does depend on context.

Thursday, November 21, 2013

Coding Woes, and Comment Etiquette

Comments: some may argue they're just as important as the code itself. When I first learned to code, it was mainly little projects done only for myself. I would automate little things or solve problems for Project Euler. Comments were for myself, and the simplicity of the code usually meant I didn't do much commenting. Ah, that habit came back to bite me today.
Some comments on my A2 submission.
AH! A whole 3 points gone because of sparse commenting on my part. My code worked fine, though! Isn't that what matters? No; comments are important because, in a professional or even an academic environment, the code you write isn't just for you. It's for everyone who needs to work on it later, whether to adjust it or maintain it. It seems like a small nuisance at first, but it's a habit that is an absolute must for a competent programmer. Success requires communication, something that fast, uncommented code doesn't always accomplish. So I'll try to stick to the habit, and hopefully so will everyone else, so we can all enjoy good communication in the future.
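For instance, here's a toy example (my own invented function, not from A2) of the kind of docstring and inline comment I should have been writing all along:

```python
def average(numbers):
    """Return the arithmetic mean of a non-empty list of numbers.

    >>> average([1, 2, 3])
    2.0
    """
    # Sum first and divide once, rather than dividing each element.
    return sum(numbers) / len(numbers)
```

A one-line docstring with an example takes seconds to write and saves the next reader (including future me) from reverse-engineering the code.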

Monday, November 11, 2013

Linked Lists and Memory

I was recently made aware of one of my classmates' blogs, quite excellent in my opinion. Besides being impressed with his choice to self-host and code it all on his own, he brought up something that I feel may not be getting enough attention: pointers and the allocation of memory. I realize many of us know about it, and it may even be covered in 108 for all I know, but it should be something everyone knows. My classmate made this point with linked lists, though it applies to all data structures. Each value is simply a block of memory which points to another block of memory. Together, values and pointers form all data structures; languages such as Python just have them built in and made easy for us. A deep understanding may not be immediately necessary with all the processing power available to us, though it's surely a must in more constrained environments.
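A bare-bones Python sketch of the idea (my own illustration): each node is just a value plus a reference to the next node, and walking the list means following those references.

```python
class Node:
    """One block of memory: a value plus a pointer to the next block."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

# Build the chain 1 -> 2 -> 3 by hand, then walk the pointers.
head = Node(1, Node(2, Node(3)))
values = []
node = head
while node is not None:
    values.append(node.value)
    node = node.next
print(values)  # [1, 2, 3]
```

Python's built-in list hides all of this, but under the hood it's still values and references all the way down.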

Thursday, October 31, 2013

Regular Expressions!

Regular expressions! Joy! Let me just take a look at them for a second... Okay, let me take a look at them a bit longer...
Well, it's safe to say they're a new concept to me, or at least using them is, though I don't seem to be alone in that. Reading up on them, they really are extremely useful. They're used for searches, in spell check, in anything that requires parsing, really. I really should know more about them. And that's where A2 comes in.
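As a small taste of the searching they make possible, a sketch using Python's re module (with a deliberately simplified email pattern of my own; real address grammar is much messier):

```python
import re

text = "Contact us at help@example.com or sales@example.org."
# Crude pattern: word-ish characters, an @, a domain, a dot, a suffix.
emails = re.findall(r"[\w.+-]+@[\w-]+\.\w+", text)
print(emails)  # ['help@example.com', 'sales@example.org']
```

One short pattern replaces what would otherwise be a loop full of string-slicing logic.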

Monday, October 21, 2013

This week we continued with recursion by exploring self-defined abstract data types, this time binary trees. Traversing them turns out to be a more complex example of recursion than anything we've used previously. The exercise for the week proved difficult at first; it left me feeling dumb to have worked for hours on what amounted to only a few lines of code, trying to understand how to properly implement the traversals.
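For my own reference, here's a minimal sketch of one traversal, inorder, on a toy tree (my own simplified node class, not the exact code from the exercise):

```python
class BTNode:
    """A binary tree node with optional left and right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def inorder(node):
    """Visit the left subtree, then the node itself, then the right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)

#       2
#      / \
#     1   3
tree = BTNode(2, BTNode(1), BTNode(3))
print(inorder(tree))  # [1, 2, 3]
```

It really is only a few lines in the end; the hours went into convincing myself the recursion bottoms out correctly on the empty children.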

Sunday, October 13, 2013

    This week saw the rapidly approaching deadline for assignment one, and even though I hadn't started as early as I could have, I'm glad I gave myself a week to tackle the assignment. Seeing people still desperately rushing to finish on the night it was due just served to convince me I was doing an OK job of time management. The peace of mind of not having to touch my code the day it's due is so worth it.
    The assignment came together without major problems for my group and me, but personally, recursion presented itself as the largest hurdle at first, because after all, it's an unnatural way for humans to think. That's one of the reasons we employ computers to do it for us. What seemed to make things easiest, at least until I get better used to thinking recursively, was to chart out the computational operations to better visualize the process. In this way I was able to understand recursion as just structured repetition toward my goal.
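As an example of the kind of charting I mean, here's a toy recursive function (my own invented example, not from the assignment) with its calls written out by hand:

```python
def nested_sum(lst):
    """Sum a possibly nested list of numbers recursively."""
    total = 0
    for item in lst:
        if isinstance(item, list):
            total += nested_sum(item)  # recurse into the sublist
        else:
            total += item
    return total

# Charting the calls by hand:
# nested_sum([1, [2, 3], 4])
#   -> 1 + nested_sum([2, 3]) + 4
#   -> 1 + (2 + 3) + 4
#   -> 10
print(nested_sum([1, [2, 3], 4]))  # 10
```

Writing out the expansion like that is what finally made the "structured repetition" click for me.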
    Object-Oriented Programming (OOP) is one type of programming; the other that comes to mind is procedural. However, OOP is really the only type I know, since Java and Python are both OOP languages. It seems to be an effective paradigm, offering the ability to encapsulate code, and making programs more stable in that errors occur in specific classes or methods rather than globally, which in turn makes them easier to fix. It may not necessarily be the most efficient approach for the machine, but the sacrifice seems worth it to make code more manageable and understandable for developers other than the one who originally wrote it.
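A small sketch of what I mean by encapsulation (a toy class of my own): the caller pushes and pops without needing to know, or touch, the list doing the storing.

```python
class Stack:
    """A toy stack illustrating encapsulation."""
    def __init__(self):
        self._items = []  # leading underscore: internal detail, hands off

    def push(self, item):
        """Add an item to the top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the top item; error if the stack is empty."""
        if not self._items:
            raise IndexError("pop from empty Stack")
        return self._items.pop()

s = Stack()
s.push(1)
s.push(2)
print(s.pop())  # 2
```

If a bug shows up in how items are stored, it lives inside this one class rather than being smeared across the whole program.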
    The week as a whole seemed focused on the upcoming term test and the assignment more than anything else, though the lull in which to study and work was much appreciated.