MANUAL

     DONE - add section tags for disfluent text (BREAK, ELAB, FS, REP, TAG)

     DONE - reconcile HREF and NAME

     DONE - copy working copy of manual as reference manual to /hist-corpora

DONE - PRELIMINARIES

      NEW - for PPCMBE stage-2, make sure that the QTXT issue
      (discrepancies between blaise text and POS-tagged text) is
      resolved before continuing

      DONE - review (X, (XX, -XXX

      DONE - run sanity-check-ref on .ref files

      DONE - in working-files directory, divvy up all-[123] files into .ref files

      DONE for ME2 and MBE - compare page numbers in .ref file and matching .info file

      DONE - names for parsed files and info file match?

      DONE - names for parsed files and filenames in info file match?

      DONE - ensure consistency for bible files

DONE - RENUMBERING

      DONE - CST reformat *.ref

      DONE - find highest index number

      DONE - edit DEBUG1 and DEBUG2 to run on *.ref.fmt files

      DONE - run DEBUG1

      DONE - run DEBUG2

      DONE - fix errors in *.ref files

DONE - PSD FILES

      DONE - edit ref-to-psd

      Q="/home/migration/other/MIDENG/PPCMBE/queries"
      UT="/home/migration/other/MIDENG/PPCMBE/ut"

      DONE - CST $Q/reformat.q *.ref

      DONE - $UT/fmt-to-current *.ref.fmt

      DONE - $UT/re-ref *.ref

      DONE - $UT/ref-to-psd *.ref

      DONE - sanity checks
            DONE - cs $Q/SANITY-CHECK/5/2id.q *.psd; grep ZZZ *.psd.out
            DONE - cs $Q/SANITY-CHECK/5/bad-id-1.q *.psd; grep ZZZ *.psd.out
            DONE - cs $Q/SANITY-CHECK/5/bad-id-2.q *.psd; grep ZZZ *.psd.out
            DONE - cs $Q/SANITY-CHECK/5/missing-id.q *.psd; grep ZZZ *.psd.out
            DONE - $UT/sanity-check-ref *.psd
            DONE - $UT/sanity-check-psd *.psd
            DONE - $UT/sanity-check-psd-2 *.psd
            DONE - $UT/sanity-check-tags-psd
            DONE - diff-psd
            DONE - cat *.psd > ALL
            DONE - in ALL, count-matches '( ('
            DONE - C-x C-f should agree with count-matches
            DONE - after running trivial.q on *.psd, count-matches and CorpusSearch should agree
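
      The count-matches step above can also be done from the shell. A
      minimal sketch, assuming every tree in ALL opens with the literal
      string '( ('. The function name count_tree_starts is made up for
      this note and is not one of the $UT scripts.

```shell
# Count non-overlapping occurrences of the literal string '( (' in a file.
# -F treats the pattern as a fixed string, -o prints one match per line.
count_tree_starts() {
    grep -o -F '( (' "$1" | wc -l | tr -d ' '
}
```

      Usage: count_tree_starts ALL, then compare the number with the
      token count that CorpusSearch reports.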

      DONE - CST $Q/reformat.q *.psd

      DONE - $UT/fmt-to-current *.psd.fmt

      DONE - tail -1 *.psd | grep -v '(ID'
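
      When tail is given several files it normally prints a
      "==> name <==" header before each one, and grep -v lets those
      headers through. A per-file loop avoids that noise. This is a
      sketch only; files_missing_final_id is a hypothetical name.

```shell
# Print the name of every given file whose last line does not contain '(ID'.
files_missing_final_id() {
    for f in "$@"; do
        if ! tail -n 1 "$f" | grep -q -F '(ID'; then
            printf '%s\n' "$f"
        fi
    done
}
```

      Usage: files_missing_final_id *.psd (no output means every file
      ends in an ID line).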

DONE - POS FILES

      DONE - replace '/' in corpus by ''

      DONE - edit and run psd-to-pos

      DONE - sanity checks
            DONE - sanity-check-pos
            DONE - diff-pos

      DONE - compare IDs for tagged version against parsed version, using get-id-psd and get-id-pos

DONE - TXT FILES

      DONE - edit pos-to-txt

      DONE - run pos-to-txt

      DONE - compare IDs for text version against parsed version, using get-id-txt

WORD COUNTS

     generate word counts for each file, using pos-to-wordcount

     post word count to info files

     compare word count info in info files and using pos-to-wordcount

     generate WORDCOUNT-PPCMBE file from info files, using "paste"
     (name, date, wordcount, genre)
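
     A possible shape for the "paste" step, assuming the four fields
     have already been pulled out of the info files into four
     one-column files in the same file order. How to extract them
     depends on the info-file layout, which is not covered here. The
     function names are hypothetical.

```shell
# Hypothetical helper: number of lines in a file, without wc's padding.
line_count() {
    wc -l < "$1" | tr -d ' '
}

# Join four one-column files (name, date, wordcount, genre) into
# tab-separated rows. Refuses to run if the columns differ in length,
# since paste would otherwise silently misalign the rows.
make_wordcount() {
    expected=$(line_count "$1")
    for column in "$2" "$3" "$4"; do
        if [ "$(line_count "$column")" != "$expected" ]; then
            echo "line count mismatch: $column" >&2
            return 1
        fi
    done
    paste "$1" "$2" "$3" "$4"
}
```

     Usage: make_wordcount names dates counts genres > WORDCOUNT-PPCMBE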

HIST-CORPORA DIRECTORY

     toc-short
     titlepage
     corpus description
     philological information
     CorpusSearch
     PPCME2 home
     PPCEME home

     order information
     modification date

     delete any tilde files
     grep -i "Helsinki"
     grep "<font color"
     delete any comments <!-- ... -->

     deal with .htm and .html confusion
     update .html files
     reconcile HREF and NAME
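
     The file checks in this section can be gathered into one pass.
     A sketch, assuming the pages live in a single directory tree;
     check_html_dir and html_files_with are names invented here. It
     only lists files, and what to do with each hit is still a manual
     decision. Unlike the plain grep above, the '<font color' search
     here ignores case.

```shell
# List .html/.htm files under directory $1 that contain fixed string $2
# (case-insensitive).
html_files_with() {
    find "$1" -type f \( -name '*.html' -o -name '*.htm' \) \
        -exec grep -l -i -F "$2" {} +
}

# Report leftovers, one labeled line per file and problem.
check_html_dir() {
    dir=$1
    find "$dir" -type f -name '*~'    | sed 's/^/tilde: /'
    html_files_with "$dir" 'helsinki' | sed 's/^/helsinki: /'
    html_files_with "$dir" '<font color' | sed 's/^/font: /'
    html_files_with "$dir" '<!--'     | sed 's/^/comment: /'
    find "$dir" -type f -name '*.htm' | sed 's/^/htm: /'
}
```

     Usage: check_html_dir /htdocs/hist-corpora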

MAKE THE RELEASE

     MIDENG="/home/migration/other/MIDENG"
     HIST="/htdocs/hist-corpora"
     Q="/home/migration/other/MIDENG/PPCMBE/queries"
     UT="/home/migration/other/MIDENG/PPCMBE/ut"
     
     PPCME2 info files are already html, so no need for info-to-html

     cd $HIST/PPCME2-RELEASE-3/info
     /bin/rm *.html WORDCOUNT-*
     cp $MIDENG/PPCME2/info/*.html .
     cp $MIDENG/PPCME2/info/WORDCOUNT-* ../.
     
     cd $MIDENG/PPCEME/info
     $UT/info-to-html *.info
     cd $HIST/PPCEME-RELEASE-2/info
     /bin/rm *.html WORDCOUNT-*
     cp $MIDENG/PPCEME/info/*.html .
     cp $MIDENG/PPCEME/info/WORDCOUNT-* ../.
     
     cd $MIDENG/PPCMBE/info/stage-1
     $UT/info-to-html *.info
     cd $HIST/PPCMBE-RELEASE-1/info
     /bin/rm *.html WORDCOUNT-*
     cp $MIDENG/PPCMBE/info/stage-1/*.html .
     cp $MIDENG/PPCMBE/info/stage-1/WORDCOUNT-* ../.
     
     double-check that the descriptions for the corpora reflect 
     the current WORDCOUNT file
     
     make the psd, pos, and txt files by running the following in each
     working-files directory (or subdirectory)
     
     $UT/ref-to-psd *.ref
     $UT/sanity-check-psd *.psd
     $UT/sanity-check-psd-2 *.psd
     $UT/re-ref *.psd
     CST $Q/reformat.q *.psd
     $UT/fmt-to-current *.psd.fmt
     $UT/sanity-check-psd *.psd
     $UT/sanity-check-psd-2 *.psd
     $UT/psd-to-pos *.psd
     $UT/sanity-check-pos *.pos
     $UT/pos-to-txt *.pos
     
     sanity checks on psd tags (diff-psd), pos tags (diff-pos), IDs (get-id-all)
     
     prepare archive directory
     
     cd $ARCHIVE
     OLD directory is out of the way, right?
     /bin/rm -r "${ARCHIVE:?}"/*
     cp -r $HIST/* $ARCHIVE/.
     
     cd PPCME2-RELEASE-3
     mkdir corpus with subdirectories psd, pos, txt
     mv the appropriate files
     
     cd PPCEME-RELEASE-2
     mkdir corpus with subdirectories psd, pos, txt
     in each, mkdir helsinki, penn1, penn2
     mv the appropriate files
     
     cd PPCMBE-RELEASE-1
     mkdir corpus with subdirectories psd, pos, txt
     mv the appropriate files
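
     The mkdir steps for the three releases can be sketched as one
     helper. make_corpus_dirs is a hypothetical name. It only creates
     the directories; moving the files stays a manual step.

```shell
# Create RELEASE/corpus/{psd,pos,txt}. Any further arguments are created
# as subdirectories inside each of the three format directories.
make_corpus_dirs() {
    release=$1
    shift
    for fmt in psd pos txt; do
        if [ "$#" -eq 0 ]; then
            mkdir -p "$release/corpus/$fmt"
        else
            for sub in "$@"; do
                mkdir -p "$release/corpus/$fmt/$sub"
            done
        fi
    done
}
```

     Usage:
     make_corpus_dirs PPCME2-RELEASE-3
     make_corpus_dirs PPCEME-RELEASE-2 helsinki penn1 penn2
     make_corpus_dirs PPCMBE-RELEASE-1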
     
     sanity checks
     ffind "*~"
     ffind ".Zap*"
     ffind "*rest*"
     ffind "[A-Z]*"
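
     If ffind is not available, the same checks can be approximated
     with plain find. This assumes ffind is a local wrapper that
     matches file names recursively, which is a guess; find_leftovers
     is a made-up name.

```shell
# List everything under $1 (default: current directory) whose name matches
# one of the four leftover patterns. LC_ALL=C keeps [A-Z] restricted to
# ASCII uppercase letters regardless of the user's locale.
find_leftovers() {
    LC_ALL=C find "${1:-.}" \
        \( -name '*~' -o -name '.Zap*' -o -name '*rest*' -o -name '[A-Z]*' \) \
        -print
}
```

     Usage: cd $ARCHIVE; find_leftovers . (the [A-Z]* pattern also
     matches the release directories themselves, so expect those in
     the output).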