Over the last few months, I’ve been focussing more and more on development of the PuffR package. What’s PuffR? It’s currently an R package that streamlines the process of running CALPUFF models. What’s CALPUFF? That’s an atmospheric dispersion model, and a reasonably advanced one. Getting estimates of pollutant levels at a given location at some specified time is greatly aided by running a model like this.
CALPUFF is really a collection of modeling utilities. It can be compiled for a number of platforms because the Fortran source code is available (pre-built binaries for Windows are also available). So far, it’s sounding rather inexpensive, right? It’s true, and I’ve always prepared the model inputs by using several free or relatively inexpensive software tools (e.g., GIS software, FTP software, and text-editing software). I manually edited text files, extracted terrain heights from DEMs, collected and processed meteorological data, and made sure that the inputs were valid and consistent (if I wanted the models to run).
My Interest in Working on this R Package
Does the process I’ve described sound a bit janky? It is. All that moving about between different applications is fine, but it can be time-consuming. Worse, it is easy to make mistakes within those different applications. Or with file-handling. Or when transcribing values. As a result, it can be difficult to trace the sources of errors (and this may mean re-doing earlier analysis steps since the model likely won’t run).
Developers did get into the space and now there are several frontends to CALPUFF. These applications all have GUI-driven interfaces and they all provide a workflow for CALPUFF. Sounds great, right? Well, they are a little expensive:
Whoa… Aside from those prices (likely a deal-breaker for many), using the point-and-click paradigm for conducting an analysis doesn’t easily allow for reproducibility of the modeling exercise. The individual steps aren’t likely to be documented, and sharing a dispersion modeling project with someone else would require the same paid software.
This is where PuffR comes in. The goal is to use the advantages that come with using R (free, scriptable, thousands of free packages, etc.) and create a zero-cost interface for atmospheric dispersion modeling with CALPUFF. R’s got pretty much everything we need to collect data, process that data, run the models, analyze the output, and visualize it. And we should be able to do it without wads of cash.
What I’ve done so far
I created a GitHub repository for this project quite a while back, but development has been more active lately. Aside from all the source files in the repo, there is a fairly detailed (but admittedly incomplete) README. Yeah, it’s a bit long, but I tried to make it a fun read. More recently (late last night), I finished up a presentation that explains the purpose and use of the PuffR package. Warning: that presentation, available on SlideShare, is also quite lengthy (hey! I’ve got a lot to say!) but if you download the Keynote edition you’ll be treated to some spiffy transitions à la Magic Move¹.
The last thing I’ll talk about here is the name: PuffR. It’s not terribly original, and I know that. If you haven’t guessed, it’s a riff on CALPUFF (yes, I think it’s always capitalized like that) and R. CALPUFF, itself, pulls together ‘California’ and ‘puff’ (for puff model). These are both highly unusual portmanteaux². Anyhow, throwing an R (either uppercase or lowercase) into a package name is still considered cool. I did it without hesitation.