(1) a. ( (IP-MAT ...
))
b. (IP-MAT ...
)
C-u 400 C-x C-f
A command like the one just above will either go to completion or result in an error message. If the latter, you know that the mismatched paren is somewhere between the point where you started the command and the expected end.
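The same narrowing-down can be sketched outside the editor. The following is only an illustrative sketch, not part of any existing tool: the function name find_paren_mismatch is hypothetical, and it assumes that parentheses occur in the file only as tree brackets, never as literal text.

```python
def find_paren_mismatch(text):
    """Return None if parens balance, else (kind, line_number).

    Hypothetical helper; assumes every '(' and ')' is a tree bracket.
    """
    open_lines = []  # line numbers of parens that are still open
    for line_no, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            if ch == "(":
                open_lines.append(line_no)
            elif ch == ")":
                if not open_lines:
                    # A close paren with nothing left to close.
                    return ("unmatched close", line_no)
                open_lines.pop()
    if open_lines:
        # Report the earliest open paren that never gets closed.
        return ("unclosed open", open_lines[0])
    return None
```

A result of None corresponds to the command going to completion. Any other result gives a line number from which to start looking, in the same way that the error message bounds the region to inspect.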
(2) a. ( (IP-MAT (NP-SBJ (PRO$ My) (N neighbor))
(VBD told)
me ← bare word (missing preterminal over terminal)
(. .)))
b. ( (IP-MAT (NP-SBJ (PRO$ My) (N neighbor))
(VBD told)
(NP-OB2 (PRO m e)) ← terminal contains space
(. .)))
c. ( (IP-MAT (NP-SBJ (PRO$ My ← preterminal isn't unary-branching
(N neighbor)))
(VBD told)
(NP-OB2 (PRO me)))
(. .))
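The three error types in (2) can be detected from the token sequence alone. The sketch below is a heuristic illustration under stated assumptions, not an existing validator: the name check_tree and the message strings are hypothetical, the input is assumed to contain only the bracketed tree (no arrow annotations), and parentheses are assumed to be balanced.

```python
import re

# A token is a single paren or a run of non-space, non-paren characters.
TOKEN = re.compile(r"[()]|[^\s()]+")

def check_tree(text):
    """Return a list of (word, problem) pairs for the errors shown in (2)."""
    tokens = TOKEN.findall(text)
    problems = []
    for i, tok in enumerate(tokens):
        if tok in ("(", ")"):
            continue
        prev = tokens[i - 1] if i > 0 else None
        before_prev = tokens[i - 2] if i > 1 else None
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if prev == "(":
            continue  # first word after an open paren is a node label
        if prev == ")" or prev is None:
            # A word that follows a closed node has no label of its own.
            problems.append((tok, "bare word"))
        elif before_prev == "(":
            # tok is the first word after a label, so it is a terminal.
            if nxt == "(":
                problems.append((tok, "preterminal is not unary-branching"))
            elif nxt not in (")", None):
                problems.append((tok, "terminal contains a space"))
    return problems
```

Applied to the trees in (2), this sketch flags me in (2a), m in (2b), and My in (2c). A well-formed tree yields an empty list.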
Some scripts (such as the one that restores the subtags in the Cordial corpus) don't handle accented characters, regardless of the font encoding. The workaround is to translate the accented characters into the corresponding HTML character entities, run the script on the resulting ASCII text file, and then restore the accented characters in the output. The relevant translation scripts are utf8-html and html-utf8.
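The round trip can be illustrated with a short Python sketch. This is not the code of utf8-html or html-utf8, whose implementation is not shown here. The function names to_entities and from_entities are hypothetical stand-ins, and the sketch assumes the input does not already contain literal entity strings such as &eacute;.

```python
import html
import re
from html.entities import codepoint2name

def to_entities(text):
    """Replace each non-ASCII character with an HTML character entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        elif cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")  # named entity
        else:
            out.append("&#%d;" % cp)  # numeric fallback
    return "".join(out)

# Matches named entities and decimal numeric entities.
ENTITY = re.compile(r"&(#\d+|[A-Za-z][A-Za-z0-9]*);")

def from_entities(text):
    """Restore entities that decode to one non-ASCII character."""
    def restore(match):
        decoded = html.unescape(match.group(0))
        if len(decoded) == 1 and ord(decoded) > 127:
            return decoded
        return match.group(0)  # leave &amp; and unknown entities alone
    return ENTITY.sub(restore, text)
```

For example, to_entities turns (ADJ côté) into the pure-ASCII string (ADJ c&ocirc;t&eacute;), which an ASCII-only script can process, and from_entities turns it back. Entities that decode to ASCII, such as &amp;, are deliberately left untouched so that the restore step does not alter anything the first step did not produce.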