Okay, just follow along. Questions will be denoted with a
**Q**.

- Make some appropriate subdirectory for this lab and go there
- Copy ~kschmidt/public_html/CS265/Labs/Performance to your lab directory

- big-o.pdf and big-o.ppt - a quick lecture on Big-O (an upper bound, not necessarily tight)
- recurrenceRelations-substitution.pdf and recurrenceRelations-substitution.ppt - solving recurrence relations
- findingTheta.html - an example of dividing T(n) by f(n) to discover f(n) (using Maple)

We're going to use `time` to gather runtime data about our
quicksort, on strings.

We will sort inputs of 10,000, 20,000, ... , 100,000 words, and graph our results.

- Compile your sort:

  `gcc -o sort1 sort1.c quicksort.c`
- Find (or create) input files, `ins.`*n*, where *n* is each value 10,000, 20,000, etc.
- For *each* input file, `ins.`*n*`0000`, get the time to execute:

  `time ./sort1 <` *file* `> /dev/null`

  I would use the sum of the reported **user** and **sys** times.
- We want to find good (useful, descriptive) upper- and lower-bounds for T(*n*). (Our goal is to find a Theta equivalence, but the data is likely not sufficient for that.)
  - Put your points (*n*, T(*n*)) in a table
  - Did the values T(*n*) increase with *n*? Then T(*n*) is bounded below by a constant. That is, T(*n*) ∈ ω(1)
- We wish to find f(*n*), such that T(*n*) ∈ Θ( f(*n*) ). Again, we likely can't. So, we'll find the best (highest) lower bound we can, and the lowest upper bound we can.
  - Guess at f(*n*). E.g., start at f(*n*) = *n*
  - Add a column to your table, T(*n*)/f(*n*)
  - If the values in that column still increase with *n*, then f(*n*) is a lower bound for T(*n*). We say T(*n*) ∈ Ω(f(*n*))
  - If the values in that column are decreasing, then f(*n*) is an upper bound for T(*n*). We say T(*n*) ∈ O(f(*n*))
- We try to refine our upper and lower bounds, as long as we're comfortable w/what the data is saying
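The column test described above is mechanical enough to code up. Here's a minimal sketch in C, assuming your points are already sitting in two arrays. The names `ratio_trend`, `f_linear`, and `f_cubic` are made up for illustration; they are not part of the lab files.

```c
#include <assert.h>
#include <stddef.h>

/* A candidate bound f(n), e.g. n or n^3 (hypothetical helper type). */
typedef double (*growth_fn)(double n);

double f_linear(double n) { return n; }
double f_cubic(double n)  { return n * n * n; }

/*
 * Look down the column T(n)/f(n) for the given points.
 * Returns +1 if every ratio is larger than the one before it,
 *         -1 if every ratio is smaller than the one before it,
 *          0 if the column is flat or mixed (no clear trend).
 */
int ratio_trend(const double *n, const double *t, size_t count, growth_fn f)
{
    assert(count >= 2);   /* a trend needs at least two points */

    int rising = 1, falling = 1;
    for (size_t i = 1; i < count; i++) {
        double prev = t[i - 1] / f(n[i - 1]);
        double cur  = t[i] / f(n[i]);
        if (cur <= prev) rising = 0;
        if (cur >= prev) falling = 0;
    }
    if (rising)  return 1;
    if (falling) return -1;
    return 0;
}
```

A return of +1 for f(*n*) = *n* suggests T(*n*) ∈ Ω(*n*); a return of -1 for f(*n*) = *n*^{3} suggests T(*n*) ∈ O(*n*^{3}). With only ten noisy data points, treat either as evidence, not proof.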

E.g.:

| n | T(n) | T(n) / f(n), f(n) = n | T(n) / f(n), f(n) = n^{3} | T(n) / f(n), f(n) = n^{2} |
|---|---|---|---|---|
| 10 | | | | |
| 20 | | | | |
| 30 | | | | |
| 40 | | | | |
| 50 | | | | |
| 60 | | | | |

Clearly T(*n*) is increasing. So, it is bounded below (but not
tightly) by a constant. We say T(*n*) ∈ ω( 1 ).

**Note:** Big-O is an upper bound, which may or may not be tight. Little-o is a *loose* upper bound, which is *not* tight. Similarly, Big-Omega is a lower bound, which may or may not be tight, and little-omega, ω, is a *loose* lower bound, *not* tight.

When we divide T(*n*) by f(*n*) = *n*, we see these values apparently still increasing to ∞.
So, T(*n*) is bounded below by a line. We say T(*n*) ∈ Ω(*n*). In other words, T(*n*)
grows at least as fast as a line.

So, we choose f(*n*) = *n*^{3}, and we see that T(*n*)/*n*^{3} is probably
decreasing to zero (we need to look around a bit more to be sure). This means that *n*^{3} is an
*upper* bound. I.e., T(*n*) grows no faster than *n*^{3}. So, T(*n*) ∈
O(*n*^{3}).

We now choose f(*n*) = *n*^{2}. Well, I should've made
this example a little more interesting. We can surmise that T(*n*) is,
in fact, 5.2 *n*^{2}. The important point is that
T(*n*) / *n*^{2} neither grows without bound nor shrinks to zero; it settles at a constant. So, T(*n*) ∈ Θ( *n*^{2} ).
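To see the three columns behave as described, here's a small C sketch that builds the rows from the surmised model T(*n*) = 5.2 *n*^{2}. These are computed values, not measurements, and `model_T`, `format_row`, and `print_table` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed model taken from the text: T(n) = 5.2 * n^2. NOT measured data. */
double model_T(double n) { return 5.2 * n * n; }

/* Format one row: n, T(n), T(n)/n, T(n)/n^3, T(n)/n^2.
 * Returns the value snprintf returns (characters needed). */
int format_row(char *buf, size_t size, double n)
{
    assert(buf != NULL && size > 0);
    double t = model_T(n);
    return snprintf(buf, size, "%4.0f %8.0f %8.1f %8.4f %8.2f",
                    n, t, t / n, t / (n * n * n), t / (n * n));
}

/* Print rows for n = 10, 20, ..., 60, the same n values as the example table. */
void print_table(FILE *out)
{
    char row[80];
    fprintf(out, "   n     T(n)   T(n)/n T(n)/n^3 T(n)/n^2\n");
    for (int n = 10; n <= 60; n += 10) {
        format_row(row, sizeof row, (double)n);
        fprintf(out, "%s\n", row);
    }
}
```

Calling `print_table(stdout)` from `main` shows the pattern: under this model the T(*n*)/*n* column keeps rising (52, 104, ...), the T(*n*)/*n*^{3} column keeps falling (0.52, 0.26, ...), and the T(*n*)/*n*^{2} column sits at 5.2.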

**Q 1** Supply your chart (that is, just the tabular
data: *n*, T(*n*), and T(*n*)/f(*n*) for various choices of
f(*n*)), *not* a graph, in your gradesheet, along with your conclusions.

Scale each column as convenient. That is, remove leading (misleading) and trailing zeroes. If you use scientific
notation, do **not** change exponents. Make it easy on the eyeballs.

And get the columns to line up nicely. (No tabs.) Limit yourself to 120 columns.

Guys, keep conclusions succinct. I don't need a bed-time story, nor color commentary. I don't need your procedure. Just give me your thoughts for each column, as a simple statement, using Big-Oh notation, then summarize, the best you can say. Succinctly. Using Big-Oh notation.

C has a `clock()` function in its standard library (declared in `<time.h>`). Other languages have
similar ideas. See
sortr.c for an example.

It gives us a little better granularity, about what we're timing (we can skip overhead, etc.). Essentially, there
is a clock that starts at 0 when your program starts, and should only tick while your program is executing (as opposed
to being sliced out). Each call to `clock()` just grabs that time. So, you grab a start and end time, take the
difference.

`clock()` returns the # of ticks, which is system-dependent, both the value and the granularity. Dividing by
`CLOCKS_PER_SEC` will give you the time, in seconds. But, for graphing/evaluation purposes, ticks is fine. Don't
divide away significant digits.
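Here's a minimal sketch of the start/end pattern, assuming an `int` array and the library `qsort` as a stand-in for your own sort. This is not the contents of sortr.c; `time_sort`, `ticks_to_seconds`, and `compare_ints` are names invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Convert ticks to seconds. The cast to double comes first, so the
 * fractional part is not lost to integer division. */
double ticks_to_seconds(clock_t ticks)
{
    return (double)ticks / CLOCKS_PER_SEC;
}

/* Comparison callback for qsort: ascending ints. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Time only the sort itself. Anything done before the first clock()
 * call (reading input, allocating) is left out of the measurement.
 * Returns the elapsed processor time in ticks. */
clock_t time_sort(int *data, size_t count)
{
    assert(data != NULL);

    clock_t start = clock();
    qsort(data, count, sizeof data[0], compare_ints);
    clock_t end = clock();

    return end - start;
}
```

Record the tick count returned by `time_sort` in your table, and convert with `ticks_to_seconds` only if you actually want seconds.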

For another approach at the same problem, we're going to count the # of swaps performed, rather than measure raw time.

We'll still be working w/the same set of inputs: 10,000, 20,000, ... , 100,000 words, and graph our results.

- Compile your sort for gprof to use:

  `gcc -p -o sort1 sort1.c quicksort.c`
- For *each* input file, `in.`*n*`0000`, get the number of swaps:
  - Run the program:

    `sort1 <` *file* `> /dev/null`
  - Notice the file in your directory, `gmon.out`
  - Run `gprof` on this data:

    `gprof sort1 gmon.out`
  - The report is in a file called **gprof.out**. There is a description of the data in the report
  - Find the # of times `swap()` was called
- Play the same game as above. Find good (useful) upper and lower bounds.

**Q 2** Supply your chart (just the tabular
data, no graphs), and your conclusions in your lab sheet.