Python Starter Kit


This page serves as a foundation for what you should feel comfortable with when writing code for assignments in the CS260 Data Structures class.




Navigation

Numbers and Strings
Lists
Dictionaries
Conditionals
Loops
Functions
Scoping
Input/Output
Notes
Source Code


Interactive Python

Python can be invoked from the command-line by simply typing python at the prompt.
There are several advantages to running in an interactive environment, especially when learning new material.

Here are a couple of example sessions:


Numbers and Strings

i686:0:/home/jesse > python
Python 2.3.3 (#1, May 7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
In Python, a variable can hold a value of any type and can be reassigned to a value of a different type at any time, without any declaration.
>>> 1+3
4
>>> 'a' + 'b'
'ab'
>>> a = 1
>>> b = 3
>>> a + b
4
>>> a = 'a'
>>> b = 'b'
>>> a + b
'ab'
>>>
>>>
Memory allocation for strings is done for you, which is nice.
But strings are immutable: you cannot assign to an index of a string.
>>> animal = 'an asp'
>>> animal[5]
'p'
>>> animal[5] = 's'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object doesn't support item assignment
>>>
>>>
Using * with a string and an integer n is perfectly valid.
It simply creates n copies of the string and concatenates them into a single string.
>>> a = 'word'
>>> b = a*3
>>> b
'wordwordword'
>>>
>>>
String methods such as split, find, and rstrip do the work for you, with no explicit loops on your part.
>>> a = 'first.second.third'
>>> a.split('.')
['first', 'second', 'third']
>>>
i686:0:/home/jesse >
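
The string behavior shown in this session can be collected into one short script. This is only a sketch, and it assumes a Python 3 interpreter (where print is a function) rather than the 2.3.3 shown above; the variable names are made up for illustration.

```python
# Assumes Python 3; all names here are illustrative only.
animal = 'an asp'

# Strings are immutable, so build a new string instead of assigning to an index.
fixed = animal[:3] + 'ant'

# Repetition and the common string methods all return new objects.
echo = 'word' * 3
parts = 'first.second.third'.split('.')
position = 'first.second.third'.find('second')   # index where the match starts
trimmed = 'line of text\n'.rstrip('\n')          # trailing newline removed

print(fixed, echo, parts, position, trimmed)
```

None of these calls modifies the original string; each one hands back a new value.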

Lists

i686:0:/home/jesse > python
Python 2.3.3 (#1, May 7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
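A list is a container that holds any number of items.
A list can hold any data type.
A list can hold other lists.
Calling dir() on a list object (or any other object for that matter) will provide you with a list of operations on that object.
Note that naming a variable list, as these sessions do, hides the built-in list type for the rest of the session. It is harmless here, but worth avoiding in real code.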
>>> list = []
>>> list
[]
>>> dir(list)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__str__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
>>>
>>>
Another useful feature of the language is the __doc__ string.
Most methods in the language have a __doc__ string. To find information regarding an operation listed from dir(), simply print the __doc__ string for that method.
Note: You can get a slightly nicer output from the __doc__ value by typing help([object]).
More on __doc__ strings in the functions section.
>>> print list.append.__doc__
L.append(object) -- append object to end
>>> print list.extend.__doc__
L.extend(iterable) -- extend list by appending elements from the iterable
>>>
>>>
As with strings, all memory allocation is handled by the language, and creating lists is just as easy. As a side note, the len() function can be called on lists, dictionaries and strings to get an integer representing the length. Similarly, the str() function can be called on just about any data type to get a string representation of that type (useful for output).
>>> list = ['a', 'simple', 'list']
>>> len(list)
3
>>> str(list)
"['a', 'simple', 'list']"
>>> list[0]
'a'
>>> list[-1]
'list'
>>>
>>>
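append() and extend() work differently on lists, and it is important that you see the difference: append() adds its argument as a single element, while extend() adds each item of its argument separately. You may also use shorthand notations such as +=. Experiment and see what you like.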
>>> s = 'name'
>>> list.append(s)
>>> list
['a', 'simple', 'list', 'name']
>>> list.extend(s)
>>> list
['a', 'simple', 'list', 'name', 'n', 'a', 'm', 'e']
>>>
>>>
Slicing a list means taking the part of the list from one index to another.
The notation for slicing a list is [n:m], where the items at indices n up to (but not including) m are copied into a new list.
Python allows negative indexing. -1 represents the last item in the list.
>>> list = ['a', 'simple', 'list']
>>> list
['a', 'simple', 'list']
>>> list[0:1]
['a']
>>> list[0:2]
['a', 'simple']
>>> list[1:2]
['simple']
>>> list[:]
['a', 'simple', 'list']
>>> list[0:-1]
['a', 'simple']
>>>
>>>
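Be careful how you use =. Assignment does not copy a list; it only gives the same list a second name, so you might end up altering data that you didn't intend to. Taking a full slice with [:] makes a copy instead, as the second example shows.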
>>> dup = list
>>> dup
['a', 'simple', 'list']
>>> dup.remove('simple')
>>> dup
['a', 'list']
>>> list
['a', 'list']
>>>
>>>
>>> list = ['a', 'new', 'list']
>>> dup = list[:]
>>> dup
['a', 'new', 'list']
>>> dup.remove('new')
>>> dup
['a', 'list']
>>> list
['a', 'new', 'list']
>>>
i686:0:/home/jesse >
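
The differences between append(), extend(), +=, aliasing, and slice copies are easy to mix up, so here they are side by side. This is a sketch that assumes Python 3, and it uses the name items rather than list so the built-in type is not hidden.

```python
# Assumes Python 3; 'items' and the other names are illustrative only.
items = ['a', 'simple', 'list']

appended = items[:]            # slice copy, so items itself is untouched
appended.append('name')        # adds the string as one element

extended = items[:]
extended.extend('name')        # adds each character as its own element

plus_equals = items[:]
plus_equals += ['name']        # like extend() with a one-element list

alias = items                  # a second name for the same list
snapshot = items[:]            # an independent (shallow) copy
alias.remove('simple')         # changes items too, but not snapshot

print(items, snapshot)
```

The slice copy is shallow: nested lists inside it would still be shared with the original.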

Dictionaries

i686:0:/home/jesse > python
Python 2.3.3 (#1, May 7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Like lists, dictionaries hold arbitrary data types, only they are stored as key:value pairs.
Like lists and strings, all memory management is handled for you.
>>> dict = {}
>>> dict
{}
>>> dir(dict)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__str__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']
>>>
>>>
>>> print dict.items.__doc__
D.items() -> list of D's (key, value) pairs, as 2-tuples
>>>
>>>
Any value can be stored, and any hashable (immutable) object can be used as a key. For instance, True can be used as a key, but a list cannot.
Dictionaries are indexed by their keys.
>>> dict = {True:'Boolean', 'letters':'String', 42:'Integer', 'int':21}
>>> dict[True]
'Boolean'
>>> dict['letters']
'String'
>>> dict[42]
'Integer'
>>> dict['int']
21
Unless an integer is a key in the dictionary, trying to index the object with it will result in an error.
>>> dict[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 0
>>>
>>>
>>> dict.items()
[('int', 21), (True, 'Boolean'), (42, 'Integer'), ('letters', 'String')]
>>> dict.pop(True)
'Boolean'
>>> dict
{'int': 21, 42: 'Integer', 'letters': 'String'}
>>>
>>>
Again, the = operator should be used with caution, as a duplicate dictionary is not created. Any operations on the resulting name happen to the original dictionary as well. Use copy() when you need an independent duplicate.
>>> dup = dict
>>> dup
{'int': 21, 42: 'Integer', 'letters': 'String'}
>>> dup.pop('int')
21
>>> dup
{42: 'Integer', 'letters': 'String'}
>>> dict
{42: 'Integer', 'letters': 'String'}
>>>
>>>
>>> dup = dict.copy()
>>> dup
{42: 'Integer', 'letters': 'String'}
>>> dup.pop(42)
'Integer'
>>> dup
{'letters': 'String'}
>>> dict
{42: 'Integer', 'letters': 'String'}
>>>
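
As a recap of the dictionary session, the sketch below shows a missing key, the get() method as a gentler alternative, a key that is rejected because it is not hashable, and the difference between a second name and a copy. It assumes Python 3, and the name types is used instead of dict so the built-in is not hidden.

```python
# Assumes Python 3; names are illustrative only.
types = {True: 'Boolean', 'letters': 'String', 42: 'Integer'}

# Indexing with a missing key raises KeyError ...
try:
    types[0]
    missing_key = None
except KeyError as err:
    missing_key = err.args[0]

# ... while get() returns a default value instead.
fallback = types.get(0, 'no such key')

# Keys must be hashable, so a list cannot be used as a key.
try:
    types[['a', 'list']] = 'List'
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

alias = types                  # same dictionary, second name
snapshot = types.copy()        # independent (shallow) copy
alias.pop(42)                  # removes the entry from types as well

print(sorted(types.values()), sorted(snapshot.values()))
```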

Conditionals

i686:0:/home/jesse > python
Python 2.3.3 (#1, May  7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> tests = {0:0, 1:'0', 2:'', 3:False, 4:True, 5:None, 6:'False'}
>>> len(tests)
7
>>>
>>>
More on defining (def) functions and loops later, but just examine the output from the function here. See the dictionaries section if you do not understand the tests assignment above.
>>> def check_it(val):
...     if val:
...             print 'Passed'
...     else:
...             print 'Failed'
...
>>>
>>>
>>> for i in range(len(tests)):
...     check_it(tests[i])
...
Failed
Passed
Failed
Failed
Passed
Failed
Passed
>>>
i686:0:/home/jesse >
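
The pattern in the output above is that 0, the empty string, False, and None count as false, while any non-empty string (even '0' or 'False') counts as true. A compact way to check this, sketched here for Python 3 with illustrative names:

```python
# Assumes Python 3; names are illustrative only.
tests = [0, '0', '', False, True, None, 'False']

# The same truth test that check_it() performs, collected into a list.
results = ['Passed' if value else 'Failed' for value in tests]

# Empty containers are false as well; non-empty ones are true.
containers = [bool([]), bool({}), bool([0])]

for value, result in zip(tests, results):
    print(repr(value), result)
```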

Loops

i686:0:/home/jesse > python
Python 2.3.3 (#1, May 7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> list = [['one', 'fish'], ['two','fish'], ['red','fish'], ['blue','fish']]
>>> len(list)
4
>>> len(list[0])
2
>>>
>>>
range(n) starts counting at zero.
So this loop will start at 0 and loop up to, but not include, 4.
>>> for i in range(4):
...     print i
...
0
1
2
3
>>>
>>>
You can use a loop to iterate over a list (or dictionary).
>>> for i in range(len(list)):
...     print list[i]
...
['one', 'fish']
['two', 'fish']
['red', 'fish']
['blue', 'fish']
>>>
>>>
But why would you, when you can just iterate over the list directly?
>>> for i in list:
...     print i
...
['one', 'fish']
['two', 'fish']
['red', 'fish']
['blue', 'fish']
>>>
>>>
As the above list is actually a list of lists, trying to print each element would involve iterating over each sub-list as well.
This is getting kinda messy - it's starting to look like C.
>>> for i in list:
...     for j in i:
...             print j
...
one
fish
two
fish
red
fish
blue
fish
>>>
>>>
Python provides a much more elegant way of doing such things.
Notice the [] surrounding the argument to join()? That is a list comprehension: it builds a new list by evaluating the expression once for each pass of the for loop inside it.
The funny '%s %s' is a format string, much like C's printf format strings.
The (a, b) are the arguments to the format string.
And we are assigning a, b for each element in list, which just so happens to be a two-element list.
>>> print '\n'.join(['%s %s' % (a,b) for a, b in list])
one fish
two fish
red fish
blue fish
>>>
i686:0:/home/jesse >
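
To see what the one-line join() is doing, it may help to write the same thing as an explicit loop first. The sketch below assumes Python 3 and uses the name fish instead of list; enumerate() and items() are shown as the usual ways to get an index or a key alongside each value.

```python
# Assumes Python 3; names are illustrative only.
fish = [['one', 'fish'], ['two', 'fish'], ['red', 'fish'], ['blue', 'fish']]

# Long form: unpack each two-element sub-list and format it.
lines = []
for a, b in fish:
    lines.append('%s %s' % (a, b))
explicit = '\n'.join(lines)

# Short form: the list comprehension builds the same list in one expression.
compact = '\n'.join(['%s %s' % (a, b) for a, b in fish])

# enumerate() supplies the index without range(len(...)).
numbered = ['%d: %s' % (i, pair[0]) for i, pair in enumerate(fish)]

# items() yields (key, value) pairs when looping over a dictionary.
counts = {'one': 1, 'two': 2}
pairs = sorted(counts.items())

print(compact)
```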

Functions

i686:0:/home/jesse > python
Python 2.3.3 (#1, May 7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Do yourself a favor and always document your code.
>>> def foo():
...     """
...     This is a multi-line comment that should be at the beginning of
...     each function. It details what the function does. All functions
...     in the standard library have this __doc__ string to describe the
...     functionality of the, well, function ;)
...     To read this string at run-time, simply type:
...             print foo.__doc__
...     """
...
>>>
>>>
>>> print foo.__doc__
 
        This is a multi-line comment that should be at the beginning of
        each function. It details what the function does. All functions
        in the standard library have this __doc__ string to describe the
        functionality of the, well, function ;)
        To read this string at run-time, simply type:
           print foo.__doc__

>>>
>>>
Later, when you've forgotten what you wrote the stinkin thing for, you can always look it up and save yourself the struggle.
Remember that you can also use help(foo) to get at the same information.
>>> list = []
>>> dir(list)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__str__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
>>>
>>>
>>> print list.__add__.__doc__
x.__add__(y) <==> x+y
>>>
>>>
>>> s = 'string'
>>> dir(s)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__str__', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'replace', 'rfind', 'rindex', 'rjust', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
>>>
>>>
>>> print s.rstrip.__doc__
S.rstrip([chars]) -> string or unicode
 
Return a copy of the string S with trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.
If chars is unicode, S will be converted to unicode before stripping
>>>
>>>
Define a function by using the def keyword.
As block structure in Python is determined by indentation, you will have to indent the body of the function.
>>> def foo(val):
...     print str(val)
...
>>> a = False
>>> b = 'string'
>>> c = 43.78
>>> d = None
>>> foo(a)
False
>>> foo(b)
string
>>> foo(c)
43.78
>>> foo(d)
None
>>>
>>>
Passing a mutable, indexable object such as a list looks no different in the function definition, but assigning to one of its elements inside the function changes the caller's object.
>>> def foo(list):
...     list[2] = 'changed'
...
>>> l = ['original']*3
>>> l
['original', 'original', 'original']
>>> foo(l)
>>> l
['original', 'original', 'changed']
>>>
>>>
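You will see examples of scoping later on, but examine how the following code acts. Assigning a new list to the parameter name only rebinds the local name; the caller's list is left alone.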
>>> def foo(list):
...     list = ['changed']*3
...
>>> l = ['original']*3
>>> l
['original', 'original', 'original']
>>> foo(l)
>>> l
['original', 'original', 'original']
>>>
>>>
Returning items from a function is just as simple as anything else. We can do this with the return keyword.
>>> def foo(str):
...     return str.split(':')
...
>>> s = 'one:two:three:four'
>>> foo(s)
['one', 'two', 'three', 'four']
>>> val = foo(s)
>>> val
['one', 'two', 'three', 'four']
>>>
>>>
The syntax is flexible.
You can unpack a returned list straight into several variables.
>>> a, b, c, d = foo(s)
>>> a
'one'
>>> b
'two'
>>> c
'three'
>>> d
'four'
But be careful: the number of names on the left must match the number of items in the list.
>>> a, b, c, d, e = foo(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: unpack list of wrong size
>>>
>>>
Remember that strange (a, b) for a, b in list in the loops section? Now you can create your own version of that very same behavior. Doing it from the inside will help you better understand what is happening.
>>> def foo():
...     return [[1,2],[2,4],[4,8]]
...
>>> for x, y in foo():
...     print str(x) + ' --> ' + str(y)
...
1 --> 2
2 --> 4
4 --> 8
>>>
i686:0:/home/jesse >
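
The two list-passing examples above differ only in whether the function changes the object it was given or points its local name at a new object. The sketch below puts them next to each other, along with a guarded unpacking. It assumes Python 3, and the function names are hypothetical.

```python
# Assumes Python 3; function and variable names are illustrative only.
def mutate(values):
    """Change an element in place; the caller sees the change."""
    values[2] = 'changed'

def rebind(values):
    """Rebind the local name; the caller's list is not affected."""
    values = ['changed'] * 3
    return values

def split_fields(text):
    """Return the colon-separated fields of text as a list."""
    return text.split(':')

first = ['original'] * 3
mutate(first)

second = ['original'] * 3
returned = rebind(second)

a, b, c, d = split_fields('one:two:three:four')

# Unpacking into the wrong number of names raises ValueError.
try:
    a, b, c, d, e = split_fields('one:two:three:four')
    unpack_error = None
except ValueError as err:
    unpack_error = str(err)

print(first, second, returned, unpack_error)
```

If a function is meant to produce a new list, returning it (as rebind does) is usually clearer than relying on side effects.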

Scoping

i686:0:/home/jesse > python
Python 2.3.3 (#1, May 7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>>
>>> value = 42
>>> value
42
>>>
locals() and globals() return a dictionary of the variables in the current local and global scope at the time they are called. At the top level of the interpreter the two are the same.
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', '__doc__': None, 'value': 42}
>>>
>>>
>>> globals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', '__doc__': None, 'value': 42}
>>>
>>>
Get comfortable with the scoping rules.
Experiment yourself, and see what happens.
>>> def foo(x):
...     x = 99
...
>>> foo(value)
>>> value
42
>>>
>>>
>>> def foo():
...     value = 99
...
>>> foo()
>>> value
42
>>>
>>>
>>> def foo(x):
...     x += 1
...
>>> foo(value)
>>> value
42
>>>
>>>
>>> def foo():
...     value += 1
...
>>> foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in foo
UnboundLocalError: local variable 'value' referenced before assignment
>>>
>>>
>>> def foo(x):
...     print locals()
...     print globals()
...
>>> foo(value)
{'x': 42}
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', 'foo': <function foo at 0xf7017844>, '__doc__': None, 'value': 42}
>>>
>>>
>>> def wrong():
...     d = globals()
...     d['value'] = 99
...
>>> wrong()
>>> value
99
>>>
i686:0:/home/jesse >
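
The UnboundLocalError above happens because assigning to value anywhere inside a function makes it a local name for the whole function. Writing through globals() works but hides the intent; the global statement, or simply returning the new value, are the more usual choices. A sketch assuming Python 3, run as a top-level script, with hypothetical function names:

```python
# Assumes Python 3 and that this runs at module level; names are illustrative.
value = 42

def broken():
    value += 1             # 'value' is local here and has no value yet

def with_global():
    global value           # declare that the module-level name is meant
    value += 1

def pure(x):
    return x + 1           # no side effects; the caller decides what to keep

try:
    broken()
    failed = False
except UnboundLocalError:
    failed = True

with_global()              # the module-level value is now 43
result = pure(value)       # 44, and value itself is unchanged

print(failed, value, result)
```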

Input/Output

Working with files, like with the rest of these examples, is rather easy.
Here is a text file called t.t that I prepared ahead of time.
i686:0:/home/jesse > cat t.t
Man who run behind car get exhausted
Man who stand on toilet is high on pot
Man who drop watch in toilet have shitty time
Man who cooks carrots and peas in same pot unsanitary
Elevator smell different to midget
War not determine who right, war determine who left
i686:0:/home/jesse >
i686:0:/home/jesse > python
Python 2.3.3 (#1, May 7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
To use that file in my program, I could do something like this...
You need to use rstrip() on the lines as they retain the newline character when read in from the file.
>>> infile = file('t.t', 'r')
>>> for line in infile:
...     print line.rstrip('\n')
...
Man who run behind car get exhausted
Man who stand on toilet is high on pot
Man who drop watch in toilet have shitty time
Man who cooks carrots and peas in same pot unsanitary
Elevator smell different to midget
War not determine who right, war determine who left
>>>
>>>
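Writing to a file can be done in several ways; this is just one example. The data is only guaranteed to be on disk once the file is closed, which here happens when the interpreter exits.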
>>> outfile = open('out.dat', 'w')
>>> for i in range(5):
...     outfile.write(str(i) + '\n')
...
>>>
i686:0:/home/jesse > cat out.dat
0
1
2
3
4
i686:0:/home/jesse >
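
In later Python versions the with statement closes a file automatically, even if an error occurs part way through. The sketch below writes the same five lines and reads them back. It assumes Python 3 (where file() no longer exists and open() is used for both reading and writing) and puts out.dat in a temporary directory so nothing in the current directory is overwritten.

```python
# Assumes Python 3; the temporary directory is only there to keep the example tidy.
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'out.dat')

# The file is flushed and closed when the with block ends.
with open(path, 'w') as outfile:
    for i in range(5):
        outfile.write(str(i) + '\n')

# Each line read back still ends with a newline, so strip it.
with open(path) as infile:
    lines = [line.rstrip('\n') for line in infile]

print(lines)
```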

i686:0:/home/jesse > python
Python 2.3.3 (#1, May 7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Obtaining input from the user (or stdin) can be done with a call to input(), whose argument is the prompt you wish to use.
Do note, however, that input() tries to evaluate the text typed in as a Python expression, so it is dangerous and error-prone.
It is safer to use raw_input(), which returns the text as a plain string.
>>> val = input(' > ')
 > a string
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1
a string
      ^
SyntaxError: unexpected EOF while parsing
>>>
>>>
>>> val = input(' > ')
 > 'a string'
>>>
>>> print val
a string
>>>
>>>
>>> val = input(' > ')
 > x,y,z
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 0, in ?
NameError: name 'x' is not defined
>>>
>>>
>>>
Using the same input situations with raw_input() is a smoother operation.
>>> val = raw_input(' > ')
 > a string
>>> print val
a string
>>>
>>>
>>> val = raw_input(' > ')
 > x,y,x
>>> print val
x,y,x
>>>
>>>
Everything is an object...
So calling split on the string returned from the call to raw_input is perfectly valid and, depending on the input, might be a simpler solution than, say, reading first and parsing after the fact.
>>> list = raw_input(' > ').split(',')
 > type,words,separated,by,commas,here
>>> list
['type', 'words', 'separated', 'by', 'commas', 'here']
>>>
i686:0:/home/jesse >
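
Because raw_input() always returns a plain string, any conversion is up to you, which is safer than letting input() evaluate what the user typed. The helpers below are a hypothetical sketch of that idea, assuming Python 3, where input() behaves like the raw_input() shown above. They take the text as an argument so they can be tried without typing anything.

```python
# Assumes Python 3; parse_fields and parse_int are hypothetical helper names.
def parse_fields(line, sep=','):
    """Split one line of user input into fields with whitespace stripped."""
    return [field.strip() for field in line.split(sep)]

def parse_int(text, default=None):
    """Convert text to an int without evaluating it as code."""
    try:
        return int(text)
    except ValueError:
        return default

fields = parse_fields('type, words,separated ,by,commas,here')
number = parse_int('42')
not_a_number = parse_int('x,y,z')

# In an interactive program the line would come from the user instead:
#     fields = parse_fields(input(' > '))    # raw_input(' > ') in Python 2

print(fields, number, not_a_number)
```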

Notes

TBA

Source Code

  1. first.py
  2. html_str.py
  3. lexer.py (sample input, and engine)
  4. matrix_print.py (example use)