Summer 2010 Week 1

Summer 2010: Intro



This is the website for the Summer 2010 meetings of the UPenn R Study Group. It has been organized by Josef Fruehwald and attended mostly by researchers affiliated with the Linguistics and Psychology Departments at Penn.

Check out the Google Group for e-mail updates:

Agenda for the R Study Group 2010

Hopefully, this Study Group will operate more like a workshop. That is, it should be maximally flexible and participatory. It would be most useful if we each brought datasets that we care about to work with.

Goals for Data Structure and Manipulation

The first half of the workshop will be devoted to structuring, manipulating and aggregating your data. Time spent figuring out how to do these things on your own for the first time constitutes an enormous time suck. I think that improving our abilities in these areas will vastly improve our data analysis efficiency.

  • General: Discuss the ideal and most flexible formats for data.
  • General: Discuss data collection and storage.
  • R: Reshaping Data. (reshape package)
  • R: Manipulating And Subsetting Data.
  • R: Data Aggregation. (reshape and plyr packages)
  • R: Efficient workflows.
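As a rough preview of the reshaping and aggregation topics listed above, here is a minimal sketch. It assumes the reshape and plyr packages are installed, and the data frame and its column names (subject, group, cond1, cond2) are made up purely for illustration. It is one possible workflow, not necessarily the one we will settle on in the meetings.

```r
# Assumes both packages are installed:
# install.packages(c("reshape", "plyr"))
library(reshape)
library(plyr)

# Hypothetical toy data in "wide" format:
# one row per subject, one column per condition
wide <- data.frame(
  subject = c("s1", "s2", "s3"),
  group   = c("a", "a", "b"),
  cond1   = c(510, 480, 600),
  cond2   = c(530, 470, 640)
)

# Reshape to "long" format: one row per subject per condition.
# melt() stacks the non-id columns into "variable" and "value".
long <- melt(wide, id.vars = c("subject", "group"))

# Aggregate: split by group and condition, then take the mean of each piece
means <- ddply(long, .(group, variable), summarise,
               mean_value = mean(value))

print(long)
print(means)
```

The long format is usually the more flexible one to store and work with, since a single call like the ddply() line can then aggregate over any combination of the id columns.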

Goals for Graphical Display

The second half of the workshop will be devoted to generating appropriate and effective graphics, focusing especially on using the ggplot2 package.

  • General: Discuss appropriate kinds and designs of graphics.
  • R: Designing graphics in ggplot2.
  • R: Fine-tuning graphics in ggplot2.

Sources and References

Data Manipulation

  • Data Manipulation with R, Phil Spector
    This is a pretty good and broad book on how to do data manipulation.
  • Plyr webpage
    A plyr course
    Plyr documentation [pdf]
    Plyr is an excellent data aggregation package, and also allows for powerful batch application of a function over subsets of your data.
  • Reshape webpage
    Intro to Reshape
    Reshape is a package for reshaping your data.

Graphical Display