This course is intended to give undergraduate linguistics majors sustained hands-on experience carrying out linguistic research. The course involves two projects, one related to syntax and the other to phonology. In keeping with the research strengths of the Penn linguistics department, the projects focus on analyzing naturally occurring data with computational and quantitative methods, and they address issues of structure, diachrony, and acquisition in an integrated way.
The character of the course is a cross between a lab and a seminar, leaning towards the lab side.
Most of the work for the tutorial will be done on a laptop during class. You will need the following resources:
- the current version of AirPennNet
It's recommended that you install the most recent version.
- Unix or Linux
Not all of the software that we use in this class is invoked via a graphical user interface (GUI), and you will often need to interact with your computer's operating system by typing commands at a command line rather than by pointing and clicking with a mouse. In particular, for the syntax project, you will need to issue commands to an operating system called Unix (the proprietary name) or Linux (the open-source equivalent of Unix).
All recent Macintosh computers run Unix as their operating system, and they come with a command line interface (CLI) called Terminal. You can put Terminal in the dock for convenience just like any other application.
Windows computers come with their own operating system, which is neither Unix nor Linux. There are various ways of getting access to Linux, but a very convenient one is to download VirtualBox, which lets you run a version of Linux called Ubuntu Linux as if it were a Windows application, while maintaining full access to your Windows applications.
It is also possible to install Ubuntu Linux without VirtualBox, but this forces you to choose between running your computer as a Windows machine and running it as a Linux machine.
- a text editor
In the second half of the syntax project, you will be submitting scripts to Unix/Linux. These scripts must be plain text files. You can't use Word to create such files because Word introduces various invisible control characters that Unix/Linux can't interpret. Instead, you'll have to use a standard text editor, such as (in alphabetical order):
- emacs
- Notepad (Windows, proprietary)
- Notepad++ (Windows, open source; environmentally friendly (!))
- TextEdit (Mac)
- a spreadsheet program, such as:
- Excel (available as part of Penn's supported Office Suite)
- (for Mac users) NeoOffice
- Open Office (the download page is very slow to load!)
To judge from the experience of past classes, Excel is the least buggy spreadsheet program.
Readings
Given its goal, the class does not focus on readings from the literature. Nevertheless, as the occasion arises, you may be asked to read background literature or the results of other researchers' work as they are presented in the primary literature. Links to the readings will be posted on the syllabus.
Assignments
Your grade will depend on completing several assignments, which come in three types:
- A assignments are prerequisites for other assignments - for instance, reading background material, installing software, or familiarizing yourself with online resources or tools. These assignments are not submitted and carry no credit.
- B assignments have two main functions: they allow me (and you) to see if you understand the work, and they form the empirical basis of the research. The progress of the class sometimes depends on the quality and timely submission of these assignments. Therefore, each B assignment counts 5 points towards your grade. I reserve the right to give partial or no credit for late, incomplete, or otherwise unsatisfactory B assignments.
- C assignments include your final reports (10-15 pages each) on the two topics that we investigate, and possibly one or two additional shorter reports. C assignments are graded and furnish the remaining points in your grade.
Extra credit
There is none. Offering it would be unfair to the other students in the class.
Academic integrity
As with any other class at Penn, your work for the tutorial is subject to Penn's Code of Academic Integrity. If I have reason to believe that you are violating this code, I will contact the Office of Student Conduct (OSC) to initiate an investigation. If the investigation finds that you have violated the Code of Academic Integrity, you will fail the class.
Guidelines for submission
- Unless otherwise noted, assignments are due by 11:59 p.m. on the deadline date.
- Please submit your assignments by email. In order to facilitate record-keeping, the subject line of your email and the names of any attachments should include the following information:
- the class (Ling 300)
- the assignment number
- your last name
Example: Ling 300 - A1 - Bloomfield
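The naming convention above can be sketched as a small helper. This is only an illustration, not a course tool: the function name `submission_label` and the pattern for assignment numbers (a type letter A, B, or C followed by digits, as in the example "A1") are assumptions.

```python
import re


def submission_label(assignment: str, last_name: str, course: str = "Ling 300") -> str:
    """Build a subject line such as 'Ling 300 - A1 - Bloomfield'.

    Hypothetical helper: the A/B/C-plus-digits pattern is inferred from the
    example in the guidelines and may not cover every assignment number.
    """
    assignment = assignment.strip().upper()
    if not re.fullmatch(r"[ABC]\d+", assignment):
        raise ValueError(f"unexpected assignment number: {assignment!r}")
    last_name = last_name.strip()
    if not last_name:
        raise ValueError("last name must not be empty")
    # Class, assignment number, and last name, separated as in the example.
    return f"{course} - {assignment} - {last_name}"


if __name__ == "__main__":
    print(submission_label("a1", "Bloomfield"))
```

The same string can serve as the base of an attachment's file name, so that the email and its attachments carry identical information.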
Guidelines for content and style
- The two substantial papers that you submit will likely each be 10-15 pages long. You are welcome to draw on the readings or other relevant literature, but the paper should not be a literature review. Rather, the point of the paper is to present your own quantitative findings as cogently as you can.
- Number tables and figures for ease of reference.
- Unless there is a special reason, scale the y-axes of any graphs that report percentages from 0% to 100%. This facilitates the comparison of results within the same paper and across different papers.
- Round percentages to the nearest integer (or, if reporting decimals, to two decimal places). Given the size of the datasets we are working with, any greater precision is spurious.
- For each graph that reports percentages, include an associated table with the Ns underlying each percentage. This enables the reader to assess the reliability of the percentages, especially that of outliers. For an example, see Kroch 1989, Figure 6 and the associated Table 3.
- It is fine to explicitly omit data points based on small total Ns (say, where N < 5).
- In reporting a historical development, don't report both the increase and the decrease if the one is simply the inverse of (predictable from) the other. Reporting both datasets is confusing because it gives the false impression that there is information in the second set beyond the information contained in the first.
- When citing a specific point in a reference, please include the page number(s). It makes the relevant passage much easier to find for the person reading your paper.
- Use any commonly accepted style sheet for formatting any bibliographical references.
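
The guidelines on percentages, Ns, and small samples can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a required workflow: the function names, the period labels, and the counts are all hypothetical, and the threshold of 5 is the one the guidelines suggest ("say, where N < 5"). Rounding uses integer arithmetic so that halves always round up; Python's built-in `round` rounds halves to the nearest even number instead.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

MIN_TOTAL = 5  # data points with fewer total tokens are omitted


class Row(NamedTuple):
    label: str               # e.g. a time period
    hits: int                # tokens showing the variant of interest
    total: int               # all relevant tokens (the N)
    percent: Optional[int]   # None when the data point is omitted


def round_percent(hits: int, total: int) -> int:
    """Percentage rounded to the nearest integer, with halves rounded up."""
    return (200 * hits + total) // (2 * total)


def summarize(counts: Dict[str, Tuple[int, int]],
              min_total: int = MIN_TOTAL) -> List[Row]:
    """Pair each percentage with its N, omitting percentages for small Ns."""
    rows = []
    for label, (hits, total) in counts.items():
        # max(..., 1) also guards against division by zero.
        keep = total >= max(min_total, 1)
        percent = round_percent(hits, total) if keep else None
        rows.append(Row(label, hits, total, percent))
    return rows


def format_table(rows: List[Row]) -> str:
    """Render the rows as a plain-text table; '-' marks omitted percentages."""
    lines = [f"{'Period':<12}{'Hits':>6}{'N':>6}{'%':>6}"]
    for row in rows:
        shown = "-" if row.percent is None else str(row.percent)
        lines.append(f"{row.label:<12}{row.hits:>6}{row.total:>6}{shown:>6}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented counts, for illustration only.
    example_counts = {"period 1": (12, 40), "period 2": (1, 8), "period 3": (3, 4)}
    print(format_table(summarize(example_counts)))
```

Keeping the hits and Ns alongside each percentage means the table that accompanies a graph can be produced from the same data as the graph itself.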