HW 5

1 Theory

  1. 5.1 (from text)
  2. 5.2 (from text)
  3. 5.3 (from text)
  4. 5.4 (from text)
  5. See the programming portion below. Report the behavior you observed for the make_heap algorithm.
  6. One way to sort numbers is to build a heap and then pop the numbers off the heap. Since each element is the least element left on the heap when you pop it, the stream of numbers popped off must be in sorted order. Analyze the time and space complexity of this method of sorting numbers. (Don't forget to include the time it takes to build the heap!) How does this compare to merge sort (O(n lg n) time and O(n) space)?
  7. (Sample exam question, but too long for a real midterm.) In a C program, you have a data structure with satellite data defined as follows:
    struct my_data {
       float height;
       char *name;
       double computed_value;
    };

    Because this is satellite data, you must access it through some other data structure. (That is, you don't have an array of just these satellite data.) Your most convenient access is through iterator functions: you can call first() to get a pointer to the first my_data structure and next() to get a pointer to the next structure. Both functions return a pointer to a my_data structure or else NULL if no more are available. Due to other processes happening at the same time, you are not guaranteed that two iterations will return the data in the same order, but each time you iterate through the data you are guaranteed to get each item exactly once.

    The height and name have already been filled in. You want to run an O(n lg n) time and O(n) space algorithm A on that satellite data to compute computed_value. The algorithm A needs only the height; it ignores the name.

    Sadly, A is very difficult to implement, but you have access to an implementation, impressive_A(), of A that expects an array of floats and returns an array of doubles. You may assume that impressive_A() is O(n).

    Describe an algorithm (pseudo-code and high-level description, not C) that uses first() and next(), the above structure definition, and the function impressive_A() to fill in the computed_value members. Analyze the time and space complexity of your algorithm, including the fact that your algorithm must call impressive_A().

    In summary, you must copy the height data to an array, run impressive_A(), and then copy the doubles back to the satellite data. (Hint: You will need an auxiliary data structure to remember the correspondence between the satellite data you read on your first iteration and the array indices, so that you can write the computed_value results back on a second iteration.) Analyze the time and space complexity of your solution in terms of O(impressive_A()). If you need to "invent" functions whose implementation you don't know, give the best time bounds you can for those invented functions.

  8. (More realistic sample exam question.) Design a queue using only two stacks. What is the time complexity for insert()? For remove()? What is the space complexity?
  9. (Sample exam question, but too long for a real midterm.) The client-server file synchronization program rsync compares two versions of a file (typically on different machines) and updates one copy to be identical to the other, hopefully only transferring the part that is different. The section of the algorithm that looks for differences works roughly like this:

    The server computes a checksum for each disjoint 512-byte block of its copy of the file: b(1, 512), b(513, 1024), b(1025, 1536), etc. It sends this short list of checksums to the client.

    The client computes every 512-byte checksum for its copy of the file: b(1, 512), b(2, 513), b(3, 514), b(4, 515), etc. The client's list of checksums is generally longer than the file itself!

    The client now looks for 512-byte blocks in the server's copy that might be present in the local copy. The idea is that if a checksum matches, the blocks might be identical, but if the checksums don't match, the blocks are surely not identical.

    Design an algorithm that, given the above two lists of checksums, produces a list of blocks on the server for which no block on the client can possibly match. (We are not asking you to describe how rsync actually works, only to describe an algorithm that could work.)
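
To make the two checksum lists in problem 9 concrete, here is a minimal C sketch of how they could be produced. It illustrates the setup only, not the matching algorithm the problem asks for. The checksum itself (toy_checksum, a byte sum modulo 2^16) and the function names are assumptions made up for this sketch; they are not what rsync uses. Offsets are 0-based here, whereas the problem statement counts bytes from 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy checksum (an assumption for illustration): sum of the bytes mod 2^16. */
static uint16_t toy_checksum(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return (uint16_t)(sum & 0xFFFFu);
}

/* Server side: one checksum per disjoint block of `block` bytes.
 * A trailing partial block is ignored in this sketch.
 * Returns the number of checksums written to out. */
static size_t disjoint_checksums(const unsigned char *file, size_t n,
                                 size_t block, uint16_t *out)
{
    assert(block > 0);
    size_t count = 0;
    for (size_t start = 0; start + block <= n; start += block)
        out[count++] = toy_checksum(file + start, block);
    return count;
}

/* Client side: one checksum for every window of `block` bytes,
 * starting at offsets 0, 1, 2, ...
 * Returns the number of checksums written to out (n - block + 1 when
 * n >= block, otherwise 0). */
static size_t sliding_checksums(const unsigned char *file, size_t n,
                                size_t block, uint16_t *out)
{
    assert(block > 0);
    size_t count = 0;
    for (size_t start = 0; start + block <= n; start++)
        out[count++] = toy_checksum(file + start, block);
    return count;
}
```

For a 12-byte file and a block size of 4, the server list has 3 entries and the client list has 9. The client list holds one checksum for almost every byte offset, which is why it can take more space than the file itself once each checksum is wider than one byte.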

2 Implementation

This is to be done on the CS machines (tux.cs.drexel.edu, or one of the lab machines in UCross 151) and submitted using submit_cli.

  1. (Makefile target: prob1) Find an upper bound for the make_heap() algorithm.

    Implement the make_heap algorithm described in class: given an array of elements, view it as a binary tree. Start at the last interior node and downheap it. Then downheap the previous interior node, and continue in this fashion up to the root.

    What you are doing is building little heaps from the bottom up, then merging them, so that, by the time you've downheaped the root, you have a heap with n elements.

    Create trees of various sizes, time your make_heap function (for at least 10 values of n), and report your answer in the written homework above.

    The run target in your makefile should run these experiments and output a two-column table of n and T(n).
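
The bottom-up construction described above can be sketched in C as follows. This is one possible shape under stated assumptions, not the required implementation: it assumes a min-heap of ints stored in a 0-based array, and the helper names downheap, is_heap, and time_make_heap are made up for this sketch. Timing uses the standard clock() function, which measures CPU time and may be too coarse for small n.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Sift a[i] down until neither child is smaller (min-heap, 0-based). */
static void downheap(int *a, size_t n, size_t i)
{
    for (;;) {
        size_t left = 2 * i + 1, right = left + 1, smallest = i;
        if (left < n && a[left] < a[smallest])
            smallest = left;
        if (right < n && a[right] < a[smallest])
            smallest = right;
        if (smallest == i)
            return;
        int tmp = a[i];
        a[i] = a[smallest];
        a[smallest] = tmp;
        i = smallest;
    }
}

/* Downheap every interior node, from the last one (index n/2 - 1)
 * back up to the root (index 0). */
static void make_heap(int *a, size_t n)
{
    for (size_t i = n / 2; i-- > 0; )
        downheap(a, n, i);
}

/* Check the heap property: no parent is larger than its child. */
static int is_heap(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[(i - 1) / 2] > a[i])
            return 0;
    return 1;
}

/* Time one make_heap call on n pseudo-random ints (n > 0 assumed).
 * Returns CPU seconds, or -1.0 if the allocation fails. */
static double time_make_heap(size_t n)
{
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return -1.0;
    for (size_t i = 0; i < n; i++)
        a[i] = rand();
    clock_t start = clock();
    make_heap(a, n);
    clock_t end = clock();
    assert(is_heap(a, n));
    free(a);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A driver could call time_make_heap() for ten or more values of n and print each n with its measured time on one line, giving the two-column table. Averaging several runs per n is one way to smooth out timer noise.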