R programming language gaining ground on traditional statistics packages

Joab Jackson

The R programming language is quickly gaining ground on traditional statistics packages such as SPSS, SAS and MATLAB, at least according to one statistician who teaches the language.

"It is very likely that during the summer of 2014, R became the most widely used analytics software for scholarly articles, ending a spectacular 16-year run by SPSS," wrote Robert Muenchen, in a blog post summarizing his analysis.

Muenchen gauged the popularity of statistical software packages by tracking how often they have been used for published scientific research and the number of mentions they get in online discussion forums, blogs, job listings and other sources.

Scholarly citations are a "good leading indicator of where things are headed," Muenchen wrote. Students who learn to use these software packages later go on to use them in their professional careers, either in academia or industry.

In his latest survey, Muenchen found that researchers continue to do most of their work on traditional software packages, namely SAS and MATLAB, each named for the company that sells it, as well as IBM's SPSS.

SPSS led the pack with over 75,000 citations in scientific papers, which were culled through a search on Google Scholar. SAS followed in second place with almost 40,000 citations. R was used in well over 20,000 research projects.

Moreover, when Muenchen examined the number of citations since 1995, he found that SPSS citations have declined since 2007. SAS trailed SPSS in usage, peaking in 2008. The use of R, in contrast, has been growing dramatically, faster than other packages such as Statistica and Stata.

"Extending the downward trend of SPSS and the upward trend of R make it likely that sometime during the summer of 2014 R became the most dominant package for analytics used in scholarly publications," Muenchen wrote. "Due to the lag caused by the publication process, getting articles online, indexing them, etc. we won't be able to verify that this has happened until well into 2015."

R is an open-source functional programming language designed for statistical computing and graphics.

Muenchen, a certified statistician who manages research computing support at the University of Tennessee, may not be the most impartial person to declare a victory for R -- he also works as an R instructor on behalf of Revolution Analytics. But he has long been recognized as an expert in computer analytics, contributing code to SAS, SPSS and various R packages. He served on the advisory boards of SAS and of SPSS, the latter before it was acquired by IBM in 2009.

In the blog post summarizing his findings, Muenchen did not speculate about why R is gaining popularity.
