R Reporting Part 2: Choosing the Right Markup for the Task
This is the second in a series of articles on how to use R, RStudio and Texmaker to prepare presentations and batch jobs for automated reporting on a web server or Microsoft SharePoint server. The series is based upon the presentation that I gave at the February 27, 2016 Dallas R User Group Meetup. Because the presentation was primarily a demonstration, there really isn’t a slide deck to distribute; this series covers the topics from the demonstration instead. The series will eventually include the following articles as I complete them over the next couple of weeks:
- R Reporting Part 1: Tools
- R Reporting Part 2: Choosing the Right Markup for the Task
- R Reporting Part 3: Using Rhtml for Batch Web Reporting
- R Reporting Part 4: Using Markdown for Interactive Presentations
- R Reporting Part 5: Using LaTeX/Beamer for PDF Presentations
- R Reporting Part 6: Using LaTeX for PDF Articles
- R Reporting Part 7: Converting R Documents to E-books
- R Reporting Part 8: Using LaTeX to Create Posters
The series describes the daily batch jobs that generate the Daily Econometric Graphs web page, which includes links to the same econometric charts in several formats, all generated from the same R code:
- Econometric graphs in PDF form for use as slides on a projector
- Econometric graphs in PDF form for printing
- Econometric graphs in EPUB format
- Econometric graphs in Kindle AZW3 format
- Econometric graphs in A0 poster format
All of the examples are based upon the knitr R package; you should consult the knitr documentation, as this article is not a replacement for it.
Example source is available in images/documents/econometric_source.zip.
Choosing the Right Markup Language for the Task
There are several markup languages in which you can embed R code to make it easier to write and maintain presentations, articles and web pages. This article discusses when each of these languages works well, and when it does not. There are three basic markup languages available in RStudio and other R-based integrated development environments (IDEs), most with a couple of variants:
- Markdown
  - Presentation
  - Document
  - MS Word Document
- HTML
- LaTeX
  - Presentation
  - Article
  - Poster
  - E-book
Table 1 shows a simplified summary of the features of each markup language. The sections that follow discuss those features in more detail. Generally speaking, Markdown is the easiest to learn and is good for documents that are not going to be printed, Rhtml is the best choice for something that will be included in a web page, and LaTeX is best for something that is large and complex. LaTeX was originally developed for mathematical and scientific publishing and has tremendous capabilities for automated table of contents maintenance, cross referencing, citation, bibliography maintenance and indexing.
Markup Language | Presentations with Static Graphics | Presentations with Interactive Graphics | Presentations with Running Table of Contents | Include Mathematics (Portable) | Include Mathematics (non-Portable) | Include Complex Tables | Print Articles with Table of Contents, cross references and bibliography | Web Articles | E-book | Poster | MS Word .docx |
---|---|---|---|---|---|---|---|---|---|---|---|
Markdown | Y | Y | Y | Y | Y | Y | |||||
Rhtml | Y | Y | Y | Y | |||||||
LaTeX | Y | Y | Y | Y | Y | Y | |||||
LaTeX/Beamer | Y | Y | Y | Y | Y | Y | Y | Y |
Markdown
Markdown is by far the easiest of the markup languages to learn, and it has some important advantages and disadvantages compared with Rhtml and LaTeX:
- Markdown can output directly to a Microsoft Word document and is the best choice by far if this is your target; you can convert both Rhtml and LaTeX to .docx, but it is not simple, nor are the results necessarily pleasing.
- Markdown can embed some of the HTML and JavaScript based interactive graphics. This is possible for Rhtml with some knowledge, but it is impossible in LaTeX.
- Markdown cannot do the automated Table of Contents, cross referencing, and other large-publication features possible in LaTeX.
Mathematics
Markdown can use MathJax to render mathematics using LaTeX mathematics syntax, but this is not necessarily portable from one machine to another. By default, MathJax retrieves JavaScript libraries from https://www.mathjax.org/; if the presentation machine does not have an Internet connection, the mathematics rendering will fail unless MathJax is installed locally on the machine. This is especially important to understand if you are using Markdown for a presentation at a conference; you may be able to transfer your presentation file to another machine, but the math may not render on that machine unless it has a working Internet connection.
If you are using Markdown for a presentation at a conference, you should generate a PDF version in case the conference machines do not have Internet connections. You can do this reasonably conveniently by changing the output type within the Markdown file.
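Besides keeping a PDF fallback, you can ask for a local copy of MathJax to be shipped with the HTML output. The header below is a minimal sketch, assuming the rmarkdown package's `ioslides_presentation` format and its `mathjax` and `self_contained` options; the title is only a placeholder, and you should check the rmarkdown documentation for the version you have installed.

```yaml
---
title: "Daily Econometric Graphs"   # placeholder title
output:
  ioslides_presentation:
    mathjax: local          # use a local copy of MathJax instead of loading it from the network
    self_contained: false   # rmarkdown requires this when mathjax is "local"
---
```

With these settings the output is no longer a single file, so copy the whole output directory to the presentation machine, not just the .html file.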
Interactive Graphics
Markdown can embed most of the interactive graphic types, and is the best choice for interactive graphics unless you are specifically planning to incorporate the output into a web article. If you plan to include interactive graphics in a web article, make sure that you are allowed to use the iframe tag on the web server. For a variety of good security reasons, web site administrators do not routinely grant permission to use this tag; you do not want to get to the end of a long development project only to learn that you cannot get the permissions needed to display your results.
Presentation Output
Markdown can output to three different presentation types:
- HTML/ioslides
- HTML/slidy
- PDF (LaTeX/Beamer)
You can put all three output types into your Markdown file and comment them out as needed. For interactive graphics, some types work well in one of the HTML formats but not in the other, so you should experiment a little with your particular presentation. For a conference, you should always run a copy using Beamer output so that you have something that will work on another machine.
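As an illustration of keeping all three presentation types in one file, here is a sketch of an R Markdown header, assuming the rmarkdown output format names `ioslides_presentation`, `slidy_presentation` and `beamer_presentation`. The title is a placeholder.

```yaml
---
title: "Daily Econometric Graphs"   # placeholder title
output:
  ioslides_presentation: default
  # slidy_presentation: default     # uncomment to try the other HTML format
  # beamer_presentation: default    # uncomment to produce the PDF fallback
---
```

Move the `#` markers to switch formats. In a batch job you could instead leave all three uncommented and call `rmarkdown::render()` with `output_format = "all"`, which should produce every format listed in the header.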
MS Word Output
If you need to generate MS Word output (a .docx file), Markdown is by far the best choice, as you can easily change the output type to MS Word format and the output looks quite good with little tweaking.
Document Output
Markdown can generate PDF output (it uses LaTeX internally), but it is probably not the best choice if you are writing a longer document and need cross referencing.
Rhtml
Rhtml allows you to create whatever you want in an HTML file and is the preferred choice if the plan is to include the document in a web article. All of the data analysis articles on this web site were generated as Rhtml documents. You can include interactive graphics in an Rhtml document, but this requires an iframe tag, which may not be available without special security permissions on the target web site.
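For readers who have not seen an Rhtml file, the fragment below is a minimal sketch using knitr's HTML chunk syntax (`begin.rcode` / `end.rcode` comments and `rinline`). The chunk label, the figure sizes and the use of R's built-in `cars` data set are illustrative assumptions, not taken from the Daily Econometric Graphs source.

```html
<html>
<body>

<h1>Example report</h1>

<!--begin.rcode example-plot, echo=FALSE, fig.width=7, fig.height=5
# Illustrative chunk: plot R's built-in cars data set
plot(cars)
end.rcode-->

<p>The mean speed is <!--rinline mean(cars$speed) --> mph.</p>

</body>
</html>
```

Saved as, say, `example.Rhtml` (a hypothetical file name), this can be processed from a batch job with `knitr::knit("example.Rhtml")`, which writes an HTML file with the plot and the computed value filled in.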
Mathematics
Rhtml can use MathJax to render math symbols using LaTeX syntax. For web use, portability really is not an issue, so it is reasonable to use the MathJax CDN URLs to load the MathJax JavaScript files, though this can occasionally create problems if your markup is somehow incompatible with a new version of MathJax. In that case, you can choose to install MathJax locally.
Document Output
Although the table does not list Microsoft Word as an output type, you can import HTML into Word to generate a Word document.
LaTeX
LaTeX was created in the 1980s to make it easier for academic math and science authors to create publication-quality output that included mathematics and to manage bibliographies, citations and cross references through the related BibTeX program. It is still far and away the preferred tool for quantitative PhD dissertations and most publications; this is why RStudio and other tools make it easy to work with LaTeX. If you plan to use R in your career, you should develop at least a minimal proficiency in LaTeX, and especially the mathematics markup, as LaTeX math syntax is used by MathJax, which is the preferred way to mark up mathematics and is used by Markdown and other tools.
LaTeX has document types for articles, dissertations, books, presentations and posters: everything that a math or science professor would need to produce.
Beamer Presentation Document Type
The most widely used LaTeX presentation document type is Beamer, which makes it easy to generate a running table of contents, something that is difficult or impossible to do in Microsoft PowerPoint or in Markdown’s presentation formats. LaTeX/Beamer produces PDF output; if you plan to present on both 4x3 and 16x9 projectors, you will need to create PDF files for both sizes in order to get a good-looking presentation on both projector types.
Beamer supports all of the LaTeX math syntax, but because the fonts are much larger, equations that work in articles and books may not work in a Beamer presentation. All of LaTeX’s cross referencing and citation capabilities work in Beamer.
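The skeleton below sketches how the two projector sizes and the running table of contents might be handled. It assumes Beamer's `aspectratio` class option and the standard Berkeley theme, which shows the section list in a sidebar; the title and section names are placeholders.

```latex
% Use aspectratio=43 for 4x3 projectors and aspectratio=169 for 16x9;
% compile the file once for each value to get both PDFs.
\documentclass[aspectratio=169]{beamer}

% Berkeley is one of the standard themes with a sidebar that lists the
% sections and highlights the current one on every slide.
\usetheme{Berkeley}

\title{Daily Econometric Graphs}  % placeholder title

\begin{document}

\begin{frame}
  \titlepage
\end{frame}

\section{First topic}  % placeholder section name
\begin{frame}{First topic}
  Slide content goes here.
\end{frame}

\section{Second topic} % placeholder section name
\begin{frame}{Second topic}
  Slide content goes here.
\end{frame}

\end{document}
```

Other standard themes place the navigation differently, so it is worth compiling a short test deck with a few themes before committing to one.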
Article Document Type
If you need to use extensive cross referencing and bibliography for a web article, the LaTeX article document type may have some advantages over the Rhtml format; there are translators that will take LaTeX input, compile it and generate an HTML file. Graphics do not necessarily work perfectly the first time, but this approach is probably the way to go if you need a lot of citations.
Poster Document Type
Many academic conferences have “poster sessions” where an author presents an A0-sized poster (roughly 33x47 inches or 841x1189 mm). If you need to generate a poster for a conference, a conference room display, a cafeteria display or any other purpose, LaTeX’s poster capabilities may well be the easiest way to do this.
Summary
The remaining articles in the series discuss specific examples of the output types discussed above. The various articles show how to create the components of the Daily Econometric Graphs article on this web site.