Lab - Regular Expressions

The directory for this lab is ~kschmidt/public_html/CS265/Labs/Regexp


For this lab you will supply regular expressions that return the requested matches. You may test your answers on the input file bright_side_of_life.

I would recommend egrep (grep -E). Utilities implement REs slightly differently. In some, the special (meta-) characters that we've discussed have their special behavior by default, and must be escaped to be taken literally. In others (like plain grep and vim), some of the meta-characters are literal by default, and must be escaped to invoke their special meanings.
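To make the escaping difference concrete, here is a small sketch comparing plain grep (basic REs) with grep -E (extended REs) on the + character. The sample file and its contents are made up for illustration and are not part of the lab; the behavior shown assumes a POSIX-style grep such as GNU grep.

```shell
# Hypothetical sample input, written to a temporary file (not a lab file).
sample=$(mktemp)
printf 'ab\nabab\na+b\n' > "$sample"

# Basic REs (plain grep): an unescaped + is a literal plus sign,
# so only the line "a+b" matches.
bre_literal=$(grep 'a+b' "$sample")

# Extended REs (grep -E): + means "one or more of the previous item",
# so it must be escaped to match a literal plus sign.
ere_literal=$(grep -E 'a\+b' "$sample")

# Extended REs, unescaped: one or more a's followed by b.
# Matches "ab" and "abab", but not "a+b".
ere_repeat=$(grep -E 'a+b' "$sample")

printf 'grep    a+b  : %s\n' "$bre_literal"
printf 'grep -E a\\+b : %s\n' "$ere_literal"
printf 'grep -E a+b  :\n%s\n' "$ere_repeat"

rm -f "$sample"
```

The same pattern text, a+b, selects different lines depending on which flavor of RE the tool uses, which is why you are asked to stick with one tool throughout.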

So, for example, if I want to print all lines that contain the string kurt in the file bright_side_of_life, it would look like this:

$ egrep 'kurt' bright_side_of_life

You could use AWK to do this:

$ awk '/kurt/' bright_side_of_life

Or in sed:

$ sed -n '/kurt/p' bright_side_of_life

Whichever way you go, please use the same utility for all questions in the lab, and specify above question 1, quite clearly, which one you used. Otherwise, it'll be assumed you used grep.

Note that egrep is just grep -E, using extended regular expressions.
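As a quick sanity check, the three commands shown above should select the same lines for a simple pattern like kurt. The sketch below runs all three on a made-up sample file (the file contents are an assumption for illustration, not the lab's input file) and compares the results.

```shell
# Hypothetical sample input, written to a temporary file (not a lab file).
sample=$(mktemp)
printf 'hello kurt\nno match here\nkurtosis\n' > "$sample"

# The same search expressed in each of the three tools.
with_grep=$(grep -E 'kurt' "$sample")
with_awk=$(awk '/kurt/' "$sample")
with_sed=$(sed -n '/kurt/p' "$sample")

# Report whether the three tools printed identical lines.
if [ "$with_grep" = "$with_awk" ] && [ "$with_awk" = "$with_sed" ]; then
    echo "all three tools agree:"
    printf '%s\n' "$with_grep"
else
    echo "the tools disagree"
fi

rm -f "$sample"
```

Note that kurtosis is selected too, since the pattern matches anywhere in the line. For less trivial patterns the tools can diverge, because each handles meta-characters in its own way.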


For each of the following questions, provide a regular expression that returns the requested matches.

Q 1: match all lines that contain the string the

Q 2: match all lines that contain the word the (not as a substring of a larger word)

Hint: AWK (gawk, at least), grep, and vi all use \< and \> as word anchors (beginning and end, respectively). AWK doesn't understand the \b word boundary that GNU grep does; in AWK, \b is a backspace.
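To see what the word anchors change, here is a sketch using a different word (cat) on a made-up sample file, so it does not answer the question for you. It assumes GNU grep, which supports \< and \> as well as the -w option; the file contents are hypothetical.

```shell
# Hypothetical sample input, written to a temporary file (not a lab file).
sample=$(mktemp)
printf 'a cat sat\nconcatenate\ncategory\ncat\n' > "$sample"

# Without anchors: cat matches as a substring, so all four lines are printed.
substring=$(grep -E 'cat' "$sample")

# With word anchors: cat must start and end at word boundaries,
# so only "a cat sat" and "cat" are printed.
whole_word=$(grep -E '\<cat\>' "$sample")

# grep -w is a shorthand that selects whole-word matches the same way here.
with_w=$(grep -w 'cat' "$sample")

printf 'substring matches:\n%s\n\n' "$substring"
printf 'whole-word matches:\n%s\n' "$whole_word"

rm -f "$sample"
```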

Q 3: match all lines that contain the word Just

Q 4: match all lines that contain Just or just

Q 5: match all lines that start with the word Just or just

Q 6: match all lines that contain the word bad or mad

Q 7: match all lines that contain the word death or breath

Q 8: match all lines that end with you. Trailing punctuation is acceptable (so, possibly followed by a period or a comma)

Q 9: match all lines that have leading whitespace

Q 10: match all blank lines

Q 11: How many blank lines are there?
