Finding corpus examples

As you know from Assignment 6, the version of the coded corpora that you search with grep consists of only coding strings. If you open an Emacs window containing the file with coding strings, you will see that it contains no text whatsoever. The discussion below uses the PPCEME as the example corpus.

What if you want to look at the actual examples that correspond to a particular coding string? You might want to do so in order to follow up on a hypothesis or to find illustrative examples for your paper. In that case, you need to look at the coded corpus itself.

emacs ppceme.cod
In the .cod file, each coding string is embedded in the clause that it belongs to. In addition, the beginning of the .cod file contains the CorpusSearch coding query that generated the file, so the query is the first thing you see when you open a .cod file. Page past it, and eventually you will reach the coded corpus.

To speed things up, you can use C-s to search for the string CODING. See the Emacs tutorial for instructions on how to search in an Emacs window.

In addition to searching for ordinary strings, you can also search for regular expressions within an Emacs window (see Searching and counting with grep). Instead of using C-s, use Esc C-s. In an ordinary search, the last line in your Emacs window says "I-search:". In a regular expression search, the last line says "Regexp I-search:".

If Esc doesn't give the desired result, try Alt.

Searching for coding strings within a .cod file is essentially the same as searching the file that contains only the coding strings. However, since the coding strings in the .cod file are not at the beginning of a line, you shouldn't begin your regular expression searches within the .cod file with a caret to indicate beginning of line. Rather, you need to "anchor" your searches with the string (CODING. For instance, in order to search for all instances of the old grammar in the negative context in the ppceme.cod file, your regular expression would be:


(The above search doesn't include main verb be or have. Why not?)
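The difference between anchoring with a caret and anchoring with (CODING can be illustrated with a small Python sketch. Everything in it is made up for illustration: the coding values x:y:z and x:q:z are not real PPCEME codes, and the shape of the sample .cod lines, with the coding string sitting inside a (CODING ...) node, is an assumption about what the file looks like. It is not the search asked about above.

```python
import re

# Hypothetical data: a file containing only coding strings, one per line.
coding_only = ["x:y:z", "x:q:z"]

# Hypothetical data: the same coding strings embedded in their clauses,
# assuming each one sits inside a "(CODING ...)" node.
cod_file = [
    "( (IP-MAT (CODING x:y:z)",
    "          (NP-SBJ (PRO I))",
    "( (IP-MAT (CODING x:q:z)",
]

# A caret anchors the match at the beginning of the line.
caret_pattern = re.compile(r"^x:y")

# Anchoring with "(CODING " instead. Python needs the backslash to make
# the parenthesis literal; in an Emacs regexp search an unescaped "(" is
# already literal, so there you would type (CODING without a backslash.
anchored_pattern = re.compile(r"\(CODING x:y")

caret_hits_coding_only = [s for s in coding_only if caret_pattern.search(s)]
caret_hits_cod = [s for s in cod_file if caret_pattern.search(s)]
anchored_hits_cod = [s for s in cod_file if anchored_pattern.search(s)]

print(caret_hits_coding_only)
print(caret_hits_cod)
print(anchored_hits_cod)
```

In this sketch the caret pattern finds the coding string in the coding-strings-only data but nothing in the .cod-style lines, because there the coding string no longer starts the line. The (CODING anchor picks out the one matching clause.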