R Reporting Part 1: Tools

This is the first in a series of articles on how to use R, RStudio and Texmaker to prepare presentations and batch jobs for automated reporting on a web server or Microsoft SharePoint server. The series is based on the presentation that I gave at the February 27, 2016 Dallas R User Group Meetup. Because the presentation was primarily a demonstration, there really isn’t a presentation to distribute; this series covers the topics from the demonstration instead. The series will eventually include the following articles as I complete them over the next couple of weeks:

The series describes the process for the daily batch jobs that generate the Daily Econometric Graphs web page, which includes links to the same econometric charts in several formats, all generated through the same R code:

All of the examples are based on the knitr R package; you should refer to the knitr documentation, as this article is not a replacement for it.

Example source is available in images/documents/econometric_source.zip.

Software to Install

To use R for presentations and batch reports, there are a number of software applications that you will need to install on your desktop and your web server. The sections that follow describe the installation of the various packages that you will need. All of the software applications in this section are available on Linux, Windows and OS X.

R and Related Packages

First you will need to install R from CRAN. Install files and instructions are available on the various CRAN mirrors.

RStudio

RStudio is a popular integrated development environment (IDE) for R. Although it is not required for any of the presentations here, it makes a number of things very convenient. Other R IDEs include Emacs Speaks Statistics, which has the advantage of working with other statistical and programming languages, and StatET for Eclipse users. The examples for this series are done in RStudio, but would work in the other environments with minimal modification.

Once you have R and RStudio installed, you will need to install knitr for any presentation or reporting use, and you will need the fImport and ggplot2 packages to run the examples in this series of articles. Sweave does not need to be installed separately; it is part of the utils package that ships with R. Use the following command to install the packages:

install.packages(c("knitr", "fImport", "ggplot2"))
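
For a machine that also runs scheduled jobs, it can help to check which packages are missing before installing anything. The following is only a minimal sketch: the helper name missing_packages is hypothetical, and the package list is simply the one named in this section.

```r
# Hypothetical helper: report which of the requested packages are not installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Packages named in this article
needed <- c("knitr", "fImport", "ggplot2")
todo <- missing_packages(needed)

# Install only in an interactive session, so that a batch job
# never starts downloading packages on its own.
if (interactive() && length(todo) > 0) {
  install.packages(todo)
}
```

In a batch script, you could instead stop with an error when todo is not empty, so that a missing package is noticed before any report is generated.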

The HTML Editor of Your Choice

Although RStudio is a great IDE for R, it does not do a good job of HTML syntax highlighting and spell checking. Once you have the R portions of your code working well, you will want to use a dedicated HTML editor to do the writing and HTML formatting in your Rhtml documents. You can use any editor that you want; Bluefish is available on Linux, Windows and OS X.

LaTeX

If you need heavy formatting, tables of contents, figure cross references and bibliography management in your presentations and reports, you will want to use LaTeX. It was developed primarily for academic writing in math and science and thus does a very good job of handling mathematical symbols and equations, automatic tables of contents, indexing, cross referencing, bibliographies and citations. It is a markup language, like HTML.

On Linux, most package managers allow an easy install of the TeX Live distribution, although not necessarily the most recent one. On Windows, MiKTeX is the preferred way to install LaTeX, but you can also install it via Cygwin. On OS X, MacTeX and MacPorts are perhaps the most convenient ways to install the LaTeX distribution. You should install the Beamer package; it is not part of the default installation.

Texmaker

If you will be using LaTeX for presentations and articles, you will want to embed your R code in LaTeX documents. Although RStudio has great capabilities for the R portion of this workflow, once you start working on the writing tasks you will want to switch to a LaTeX IDE. Texmaker runs on Linux, Windows and OS X; it allows you to run R using Sweave and knitr in the same way that RStudio does, but has spell checking features that make working with the LaTeX document easier.

By default, Texmaker uses Sweave as shown in Figure 1. For most uses today, and particularly for the examples in this article, you will want to switch it to knitr by changing the Sweave command to

/usr/bin/Rscript -e "require('knitr'); knit('%.Rnw')"

as shown in Figure 2. The final configuration step is changing the Quick Build (F1) key to run Sweave/Knitr before running pdflatex as shown in Figure 3.

Figure 1. Texmaker configuration panel showing default use of Sweave.
Figure 2. Texmaker configuration panel showing changed command to call Knitr instead of Sweave.
Figure 3. Texmaker configuration panel showing change to Sweave/Knitr workflow for Quick Build (F1).
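
To see what the Rscript command configured above does, the same knit() call can be tried from an R session. This is a sketch under assumptions: the file name example.Rnw and its contents are made up for illustration, and the knitr package must already be installed.

```r
library(knitr)

# Write a tiny, hypothetical .Rnw file to a temporary directory.
rnw_file <- file.path(tempdir(), "example.Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<example-chunk>>=",
  "summary(c(1, 2, 3, 4))",
  "@",
  "\\end{document}"
), rnw_file)

# knit() runs the R chunks and writes a .tex file; it returns the output path.
tex_file <- knit(rnw_file,
                 output = file.path(tempdir(), "example.tex"),
                 quiet = TRUE)

# The second half of the Quick Build is the LaTeX run, which needs a
# TeX installation, so it is left commented out here:
# tools::texi2pdf(tex_file)
```

Texmaker replaces the % in the configured command with the name of the current document, so the call it makes is equivalent to the knit() line here with your own file name.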

Secure File Transfer Utilities–scp, rsync or Something Else

For batch reporting through cron or some other scheduler, you will almost invariably need some way to transfer files between systems. Secure copy, or scp, is probably the most universal way to do this. It is installed by default on most Linux and OS X systems. On Windows, scp is available as part of Cygwin. In a corporate Windows environment, you should talk to your IT group about what tools to use on your network; in Windows environments, shared drives are a common way to handle file copies. Another alternative is rsync, which is routinely available on Linux; on OS X it can be installed via MacPorts, and on Windows it can be installed via Cygwin.

For scp and rsync, you will want to use ssh-keygen to allow secure connections without using passwords and potentially ssh-agent for additional security.
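
A transfer like this can be started from the same R script that builds the report. The sketch below is only one possible approach: the helper sync_figures is hypothetical, and the user, host and remote path are placeholders. It assumes that key-based login is already set up, so that rsync does not prompt for a password when run from cron.

```r
# Hypothetical helper: copy a directory of generated files to a server with rsync.
# With dry_run = TRUE it only returns the command line it would run.
sync_figures <- function(local_dir, remote, dry_run = TRUE) {
  # -a preserves file attributes and recurses; -z compresses during transfer.
  args <- c("-az", shQuote(paste0(local_dir, "/")), shQuote(remote))
  if (dry_run) {
    return(paste("rsync", paste(args, collapse = " ")))
  }
  # system2() returns the exit status of the command; 0 means success.
  status <- system2("rsync", args)
  if (status != 0) stop("rsync failed with exit status ", status)
  invisible(status)
}

# User, host and paths below are placeholders, not taken from the article.
cmd <- sync_figures("images/figures",
                    "reports@www.example.com:/var/www/html/images/figures/")
print(cmd)
```

Checking the dry-run output first makes it easier to catch a wrong path before anything is copied; calling the function with dry_run = FALSE then performs the transfer.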

Optipng and Other Image Compression Tools

The PNG and other images that R generates are not compressed as fully as possible. To speed up web pages, you will want to compress images before uploading them using optipng or some other compression optimization tool. Optipng is routinely available on Linux, Cygwin and MacPorts. To call it from R, use

system("optipng images/figures/*.png")

where images/figures/*.png is the path to the image files that your R script created.
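
In a batch job it may be worth guarding this call, since system() fails with little explanation when optipng is not installed or no files match. A minimal sketch, assuming a hypothetical helper named optimize_pngs:

```r
# Hypothetical helper: run optipng on every PNG file in a directory.
optimize_pngs <- function(dir) {
  files <- Sys.glob(file.path(dir, "*.png"))
  if (length(files) == 0) {
    message("No PNG files found in ", dir)
    return(invisible(character(0)))
  }
  if (!nzchar(Sys.which("optipng"))) {
    warning("optipng was not found on the PATH; files left unchanged")
    return(invisible(character(0)))
  }
  # Quote each file name so that paths containing spaces survive the shell.
  status <- system2("optipng", shQuote(files))
  if (status != 0) warning("optipng exited with status ", status)
  invisible(files)
}

optimize_pngs("images/figures")
```

The function returns the files it processed invisibly, so a calling script can log them or pass them on to the upload step.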

ImageMagick Image Resizing and Conversion Tools

For web applications, you will probably want additional image sizes for use in links that are specific to different social media sites. ImageMagick is the most convenient tool for converting and resizing images in a script. It is available for Linux, Windows (Cygwin) and OS X (MacPorts). To create a 450-pixel-wide image in R code for use on Facebook or some other social media site, you would use something like

system("convert images/figures/ncid_daily_plot-1.png -resize 450x images/figures/ncid_daily_plot-1_shrink.png")
system("optipng images/figures/*.png")
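
When several images need the same treatment, the resize call can be wrapped in a small function. The sketch below is an assumption-laden illustration: shrunk_path and resize_image are hypothetical names, and it assumes the ImageMagick command is called convert on your system (ImageMagick 7 installs it as magick).

```r
# Hypothetical helper: build the output name used in the example above,
# for instance "plot.png" becomes "plot_shrink.png".
shrunk_path <- function(input, suffix = "_shrink") {
  paste0(tools::file_path_sans_ext(input), suffix, ".", tools::file_ext(input))
}

# Resize one image to a given width in pixels; the height follows the aspect ratio.
resize_image <- function(input, width = 450) {
  output <- shrunk_path(input)
  if (!file.exists(input) || !nzchar(Sys.which("convert"))) {
    warning("Skipping ", input, ": file or ImageMagick 'convert' not found")
    return(invisible(NA_character_))
  }
  status <- system2("convert",
                    c(shQuote(input), "-resize", paste0(width, "x"), shQuote(output)))
  if (status != 0) stop("convert failed with exit status ", status)
  invisible(output)
}

print(shrunk_path("images/figures/ncid_daily_plot-1.png"))
# prints "images/figures/ncid_daily_plot-1_shrink.png"
```

Calling resize_image() on each file returned by Sys.glob("images/figures/*.png") would produce the whole set of reduced images in one pass.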

Calibre and latex2html for E-book Tools

To create ebooks in EPUB for most e-readers and AZW3 for Kindle e-readers, you will need latex2html (or some other LaTeX to HTML converter) and Calibre. latex2html is not being actively maintained, so it is not a good choice for a production environment, but it is available for all platforms.
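
Calibre includes a command line tool, ebook-convert, that picks the output format from the file extension, so the conversion step can be scripted from R as well. The following sketch only builds the commands: the helper ebook_commands is hypothetical, and report/index.html is a placeholder for whatever HTML file your LaTeX to HTML converter produces.

```r
# Hypothetical helper: build one Calibre ebook-convert command per output format.
ebook_commands <- function(html_file, formats = c("epub", "azw3")) {
  base <- tools::file_path_sans_ext(html_file)
  vapply(formats, function(fmt) {
    paste("ebook-convert", shQuote(html_file), shQuote(paste0(base, ".", fmt)))
  }, character(1), USE.NAMES = FALSE)
}

# "report/index.html" is a placeholder for the HTML from the conversion step.
cmds <- ebook_commands("report/index.html")
print(cmds)

# Each command could then be run with system(), for example:
# for (cmd in cmds) system(cmd)
```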

Server Side Include Software on Web Server

If you are posting your R document to a web server running a content management system like WordPress, Joomla or Drupal, you will need an extension to enable server side includes. This will probably require higher administrative rights than are typically given to normal authors, so check with your CMS administrator before you start on a big project. On Joomla, Sourcerer is one of several extensions that allow server side includes. It uses the syntax

{source} <?php include("images/interactive/econometric_charts_home_page.html"); ?> {/source}