Saturday, March 29, 2014

Week 11: Sorting and Efficiency

    Over the last two weeks, we learned about sorting and efficiency. For efficiency, the main focus is running time complexity. To compare different algorithms, we need to agree on a standard of measurement. For sorting algorithms, we use the length of the list as the measure of input size, and we also have to decide what counts as one 'step'. For example, in the insertion sort algorithm, we may count one assignment statement as one 'step', or one comparison as one 'step'. The running time we derive for an algorithm varies with these choices: we may get a running time of 5n^2 + 3 under one standard and n^2 + log n under another. That is why big-oh notation is so useful: it provides an asymptotic upper bound for an algorithm. In the previous example, both 5n^2 + 3 and n^2 + log n belong to O(n^2), which significantly reduces the trouble of comparing two algorithms' running times. Below I will write down some of the sorting algorithms I have learned.
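    As a concrete illustration (my own sketch, not lecture code), here is insertion sort with comments marking the two possible choices of 'step':

def insertion_sort(lst):
    """Sort lst in place using insertion sort."""
    for i in range(1, len(lst)):
        value = lst[i]
        j = i
        # counting comparisons: each evaluation of this condition is one 'step'
        while j > 0 and lst[j - 1] > value:
            # counting assignments: each of these is one 'step' instead
            lst[j] = lst[j - 1]
            j -= 1
        lst[j] = value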
    In CSC108, I learned three O(n^2) algorithms: bubble sort, insertion sort and selection sort. These two weeks I learned some more efficient sorting algorithms: quicksort (O(n log n)), merge sort (O(n log n)), counting sort (O(n), but it has preconditions) and Timsort (O(n log n), the algorithm behind Python's built-in sort). For more detail about these algorithms, refer to these links: http://interactivepython.org/courselib/static/pythonds/SortSearch/sorting.html
http://en.wikipedia.org/wiki/Timsort
http://en.wikipedia.org/wiki/Counting_sort
    The common characteristic of bubble sort, insertion sort and selection sort is that they work through the items one at a time, so intuitively it makes sense that they share the same asymptotic upper bound. Specifically, bubble sort repeatedly compares neighbouring pairs and swaps them if necessary, insertion sort repeatedly inserts the next element into the already-sorted part, and selection sort repeatedly picks the smallest remaining element and appends it to the result. Quicksort, merge sort and Timsort are all 'divide and conquer' algorithms: basically, they split the list into two halves and recurse until they reach the base case. As with binary search, repeatedly halving a list gives about log n levels of recursion, and each level does O(n) work, so on a list of length n the running time is roughly n log n.
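    To make the 'divide and conquer' pattern concrete, here is a minimal merge sort sketch (my own code, not from lecture): each call splits the list in half, recurses, and merges the two sorted halves in linear time.

def merge_sort(lst):
    """Return a new sorted list with the items of lst."""
    if len(lst) <= 1:                 # base case: already sorted
        return lst
    mid = len(lst) // 2
    left = merge_sort(lst[:mid])      # divide and recurse on each half
    right = merge_sort(lst[mid:])
    merged = []                       # conquer: merge in O(n)
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged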
    Through a classmate's SLOG (http://lbg940131.blogspot.ca/), I found a good website for learning sorting algorithms: http://bigocheatsheet.com/. I strongly recommend both; the SLOG's analysis in particular is sound and in-depth.

Sunday, March 23, 2014

Week 9: Binary Search Tree Data Structure

   
    In this week's lecture and lab, we concentrated on the binary search tree data structure. A binary search tree is a binary tree in which every value in a node's left subtree is less than that node's value, which in turn is less than every value in its right subtree. There are also no duplicate values in the tree. With these properties, searching becomes more efficient.
    To find a value in a binary search tree, we compare it with the root value. If it is greater than the root value, we search the right subtree recursively; if it is less, we search the left subtree recursively, with appropriate base cases.
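    In code, a search might look like this (a sketch with my own node class; the class we used in lecture may name its attributes differently):

class BSTNode:
    """A binary search tree node (attribute names are my own choice)."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def find(node, data):
    """Return True iff data is in the BST rooted at node."""
    if node is None:                  # base case: empty tree
        return False
    if data == node.data:
        return True
    if data > node.data:              # data can only be in the right subtree
        return find(node.right, data)
    return find(node.left, data)      # otherwise, only the left subtree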
  
 
    Insertion is similar: we compare the new value with the root value. If the root is None, we create a new binary search tree node. If the value is greater than the root, we recurse into the right subtree; if it is less, we recurse into the left subtree. Eventually we reach an external position where the new node can be attached to the tree.
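    A sketch of insertion, reusing the BSTNode class from the search example above (again my own code, not the lecture's):

def insert(node, data):
    """Insert data into the BST rooted at node and return the new root."""
    if node is None:                  # reached an external position
        return BSTNode(data)
    if data > node.data:
        node.right = insert(node.right, data)
    elif data < node.data:
        node.left = insert(node.left, data)
    # if data == node.data, do nothing: a BST has no duplicates
    return node

# usage: root = insert(root, 5)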
 
 
    Deletion is a little more complicated. There are roughly three cases: deleting a leaf, deleting a node with only one child, and deleting a node with two children.
    Algorithm for _delete (a code sketch follows the steps):
       1. If this node is None, return that
       2. If data is less than node.data, delete it from left child and return this node
       3. If data is more than node.data, delete it from right child and return this node
       4. If node with data has fewer than two children, and you know one is None, return the other one
       5. If node with data has two non-None children, replace data with the largest value in
           the left subtree, then delete that value from the left subtree, and return this node
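    Here is how those steps might look in code, again using my own BSTNode class from above (a sketch, not the official solution):

def _delete(node, data):
    """Delete data from the BST rooted at node and return the new root."""
    if node is None:                       # step 1: data is not in the tree
        return None
    if data < node.data:                   # step 2: delete from the left child
        node.left = _delete(node.left, data)
        return node
    if data > node.data:                   # step 3: delete from the right child
        node.right = _delete(node.right, data)
        return node
    if node.left is None:                  # step 4: at most one non-None child
        return node.right
    if node.right is None:
        return node.left
    largest = node.left                    # step 5: find the largest value in
    while largest.right is not None:       # the left subtree...
        largest = largest.right
    node.data = largest.data               # ...copy it into this node...
    node.left = _delete(node.left, largest.data)   # ...and delete it there
    return node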
 
 
    In this week's lab, we wrote a function count_less for binary search trees in several different ways; both recursion and looping work well here.
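    One recursive version might look like this (my own sketch, assuming the BSTNode class from above; it uses the ordering property to skip right subtrees that cannot contain smaller items):

def count_less(node, value):
    """Return how many items in the BST rooted at node are less than value."""
    if node is None:
        return 0
    if node.data >= value:
        # everything in the right subtree is even larger, so skip it
        return count_less(node.left, value)
    # node.data < value: count it and check both subtrees
    return 1 + count_less(node.left, value) + count_less(node.right, value)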
    Wikipedia has more information about binary search trees that I can explore further.



Sunday, March 9, 2014

Week 8: Linked List Data Structure

   
    In these two weeks' lectures, we learned another data structure: the linked list. Linked lists are made up of nodes, where each node contains an object and a reference to a linked list. So the definition of a linked list is recursive: it is either empty, or a node that contains an object and a reference to another linked list. And if a node's reference points to itself, we get an infinite list.
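    A minimal node class shows the recursive definition (class and attribute names are my own choice):

class LLNode:
    """One node of a linked list: an object plus a reference to the rest."""
    def __init__(self, value, rest=None):
        self.value = value
        self.rest = rest              # another LLNode, or None for empty

lst = LLNode(1, LLNode(2, LLNode(3)))   # the linked list 1 -> 2 -> 3
loop = LLNode(0)
loop.rest = loop                        # a self-reference: an 'infinite' list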
    We use linked lists to implement some abstract data types. In these two weeks' labs, we implemented the ADT queue and stack using linked lists, instead of using Python's built-in list type as we did weeks ago. We found it hard to implement the queue's enqueue and dequeue methods without list-like helper methods. The basic approach is to store two attributes: a reference to the end where new elements are added, and a reference to the end where the oldest element is removed. The stack is easier to implement because its methods, push and pop, both deal with the most recent item.
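    A sketch of that two-attribute approach for the queue, reusing the LLNode class above (method and attribute names are my own):

class Queue:
    """A FIFO queue that tracks both ends of its linked list."""
    def __init__(self):
        self.front = None             # where the oldest element is removed
        self.back = None              # where new elements are added

    def enqueue(self, value):
        node = LLNode(value)
        if self.back is None:         # empty queue: both ends are the new node
            self.front = self.back = node
        else:
            self.back.rest = node     # link it after the current back
            self.back = node

    def dequeue(self):
        value = self.front.value      # assumes the queue is not empty
        self.front = self.front.rest
        if self.front is None:        # the queue just became empty
            self.back = None
        return value

Both operations are O(1), because neither end of the list ever has to be searched for.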
    Linked list methods: Python's list type has many methods, so we can write corresponding methods in the class LinkedList. Recursion is used heavily in writing these methods to deal with indexes. In this process, I also learned the equivalent ways to call some of them: __contains__, __len__, __getitem__, __setitem__ and __delitem__ each correspond to built-in syntax in Python (the in operator, len, indexing and so on; see the lab handout for more details). In addition, when I talked to a TA, we noticed that methods like prepend and decapitate raise the question of what happens to memory that no longer has any reference to it. For example, when we decapitate a linked list, the rest becomes the new head and rest.rest becomes the new rest, so the node that held the old head is left without a reference and becomes 'garbage'. When I googled this, I found it is handled by 'garbage collection', which has both advantages and disadvantages.
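    To illustrate the decapitate discussion, here is a sketch in the head-and-rest style that paragraph describes (my own code; it assumes the list has at least two items):

class LinkedList:
    """A linked list with a head value and a rest (another LinkedList or None)."""
    def __init__(self, head, rest=None):
        self.head = head
        self.rest = rest

    def decapitate(self):
        """Remove the first item by copying the second node's fields up."""
        old_rest = self.rest
        self.head = old_rest.head    # the rest's head becomes the new head
        self.rest = old_rest.rest    # the rest's rest becomes the new rest
        # old_rest now has no reference from this list; in CPython the
        # garbage collector (reference counting) reclaims that memory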

    Some useful reading materials:
http://openbookproject.net/thinkcs/python/english3e/linked_lists.html
http://en.wikipedia.org/wiki/Linked_list
http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)

Sunday, March 2, 2014

Week 7: Recursion

    We have been working with recursion since the fourth week of this course, in lectures, labs, exercises and assignments. Here I will write down some uses of recursion, including some examples I encountered during my study. Even now, I cannot confidently say I have mastered all the ideas of recursion, but I have at least part of them. Writing them down will remind me in the future.
    The basic idea of recursion is that we define a base case for a function and then build up the answer step by step from it. To write a recursive function, the very first thing is to think about the base case. Sometimes the base case is not written as a separate branch, but more often we write it explicitly. Next is the recursive step: we simply assume the function already works correctly on smaller inputs, and then use those results to build the full answer.
    Here are some examples.
    The first example is from the lecture, and it was my first encounter with the real power of recursion. It returns the depth of a nested list. The problem seems difficult to tackle, but if we assume the function is already done, things get much simpler. For the base case, when L is not a list, it should return 0. Then we use a list comprehension and the built-in function max to get the maximum depth of the elements of L, and add one to get the result. It is not hard when we look back.
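    The original code screenshot did not survive in this post, so here is a reconstruction from the description above (the lecture version may differ slightly; the extra [0] keeps max from failing on an empty list):

def depth(L):
    """Return the nesting depth of L, or 0 if L is not a list."""
    if not isinstance(L, list):       # base case: not a list at all
        return 0
    # assume depth already works on every element, then add one level
    return 1 + max([depth(x) for x in L] + [0])

# depth([1, [2, [3, 4]], 5]) returns 3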
    The next example is a function that gets the height of a Tree, a data structure we learned recently. Using recursion to deal with Tree problems is quite helpful.
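    Again the screenshot is missing, so this is my own sketch of such a function, assuming a Tree class with a value and a list of child subtrees, and counting a single node as height 1:

class Tree:
    """A tree node holding a value and a list of child subtrees."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children if children else []

def height(t):
    """Return the height of Tree t; a single node has height 1."""
    # assume height already works on every child, then add one level
    return 1 + max([height(c) for c in t.children] + [0])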
    In addition, Assignment 1 uses a lot of recursion, and there are more examples in the reading material at http://openbookproject.net/thinkcs/python/english3e/recursion.html. They may be helpful in the future.
    In week 7's lab, we used recursion to write some methods of the linked list ADT.