Sunday 5 April 2015

Week 12: Run time complexity

CSC148 has come to a glorious conclusion! But before the curtain closes, we still have one final very important topic to discuss. All roads lead to this topic. So without further ado.

“Run time complexity”, what is it? and why does it matter?

In class we were introduced to different types of functions. What separates the best function from the worst is the time it takes to accomplish some task. Whether the task is sorting elements in an array, finding the factorial of a prime number or otherwise, a well written function should provide an output in the fastest time possible. For this reason, the run time of a function is a hot topic around these parts.

Once the run time complexity of a function is established, we place it into one of several different categories. As you study the running time and data size tradeoffs in the graph below, you’ll notice that some of the least efficient functions would fall into the n^2 category, while some of the best would fall into the constant time category.


How to figure out a functions run time


So how do we go about figuring out which category a function belongs to? This is done by counting each time a function’s line is executed, under the worst case condition for some number of inputs n.

Let’s get a better sense of this counting method with a simple example.

Counting example - search_lst function


Our function:

def search_lst(lst, obj):
“”” Return index of obj if it exists in lst, otherwise return -1 “””

  1. i = 0
  2. while i < len(lst):
  3.   if lst[i] == obj:
  4.       return i
  5.   i = i + 1
  6. return -1

Let’s assume that the object obj we’re trying to find in an arbitrarily long list lst does not exist in lst.  In other words, this is our worst case. The while loop will iterate over every index of lst but it won’t find the desired obj.

Now that we’ve established the worst case scenario, we can begin thinking about how to count each line of code in search_lst. The length of lst is some arbitrary length that we’ll refer to as n. The table below enumerates the amount of times each line of code is executed.

Code line
Code
Times executed
1
i = 0
1
2
while i < len(lst):
n + 1
3
if lst[i] == obj:
n
5
i = i + 1
n
6
return -1
1

While most of the times executed counts are self explanatory, the one that confuses most people (including me) is the count for line 2. Line 2 of the while loop is executed n + 1 times because n’s the number of indices of lst and 1 additional time, that extra time being the loop guard, which makes sure to escape the loop once the inequality is no longer true (i greater than len(lst)).

Looking at the times executed column we can formulate a run time complexity for search_lst by using some basic arithmetic. We see that n appears 3 times and the integer 1 appears 3 times, so this gives us 3n + 3, which is the linear run time of search_lst, hallelujah! It’s that simple.

To conclude then, we’ve discovered that search_lst belongs in the linear category of time complexity. The time it takes to execute search_lst in the worst case is directly proportional to the size of the input (in our case the length of lst). Could the run time be improved for this function? Could it become constant time? perhaps, but this is where we’ll leave this for now. It’s been a pleasure!

No comments:

Post a Comment