Using CorpusSearch on babel

Contents of this chapter:

your .cshrc file
your query/output directory

your .cshrc file

Add these lines to your .cshrc file:

prepend PATH /pkg/java-1.2ea6/bin
setenv CLASSPATH /pkg/ling/MIDENG/PPCME2/clean_search
set mecorpus = /home/ataylor/MIDENG/PPCME2/SearchMe

The line beginning "prepend PATH" adds the java directory to your search path, so that your account can run java programs.

The line beginning "setenv CLASSPATH" ensures that java will be able to find CorpusSearch when you call it from any directory in your account.

The line beginning "set mecorpus" saves typing. Instead of typing "/home/ataylor/MIDENG/PPCME2/SearchMe" (where the corpus is stored) in your java command, you can type "$mecorpus" to get the same result.
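To check that the three lines took effect, one possibility is to log in again (or type "source ~/.cshrc") and then echo each setting. This is only a sketch; it assumes that "prepend" is babel's local command for adding a directory to PATH. The first command should print a list that includes /pkg/java-1.2ea6/bin, and the other two should print the paths given above.

```shell
echo $PATH
echo $CLASSPATH
echo $mecorpus
```

If csh answers "Undefined variable" for one of these, the corresponding line in your .cshrc file has not been read yet.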

your query/output directory

Make a new directory in your account; you might call it "corpus_stuff". This directory will hold your query files (ending with ".q") and your output files (ending with ".out").
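For example, assuming you keep the suggested name "corpus_stuff" and start from your home directory, you could create the directory and move into it like this:

```shell
mkdir corpus_stuff
cd corpus_stuff
```

Query files written here can then be named in the java command without a path.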

Here's a typical java command (I'm using a query file called "inv_pro_subj.q"):

java CorpusSearch inv_pro_subj.q $mecorpus/*

This command will search the entire corpus, because the "/*" after "$mecorpus" matches every file in the corpus directory. The output will appear in a file called "inv_pro_subj.out".

Be patient; a search of the entire corpus currently takes about 10 minutes, depending on the complexity of the query. To run a search in the background, add "&" at the end of your command:

java CorpusSearch inv_pro_subj.q $mecorpus/* &
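While the search runs in the background, you can keep working in the same shell. As a small sketch of how you might check on it: the built-in command "jobs" lists the background commands started from the current shell, and "ls" shows which files are in the directory so far.

```shell
jobs
ls
```

Once the search no longer appears in the list printed by "jobs", you can read the results with a pager, for example "more inv_pro_subj.out".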