June 9, 2014

What makes the R Language Good? What makes it Bad?

I've yet to see any qualities that make it great. It seems like the arguments for why it's "good" are generally weak and amount to "well, we had nothing before."

It's slow. The language is clumsy and cumbersome, and it silos statisticians away. Its approach to OO is foreign and impractical for the vast majority of R users. The official documentation gives up trying to explain it.

People with no computer science background are encouraged to write half their code in C.

I find that to be utterly insane. For edge-cases, fine. But it's often trivial, day-to-day code that the language should be able to optimize away. The "language of statistics" does a terrible job of documenting statistical algorithms.

Am I wrong in making these accusations? What are the redeeming qualities of R, itself?

I agree with this point: R is not a good programming language. R's syntax is not only confusing for non-professionals but also strange to experts. I guess that when mathematicians were inventing R, they never imagined it would be used commercially by others. I also don't know why its interpreter runs so slowly. Given R's current performance, the claims many people once made about doing big data analytics in R now look like pure nonsense.

That said, R is popular for a reason and has its own advantages. So far no other free language has offered such rich libraries of statistical functions. Mathematicians use either the free R or the expensive SPSS or MATLAB; there is hardly any other choice. Mathematicians are widely admired, so what they advocate tends to become what the rest of us adopt.

Mathematicians often need to compute things involving probability and statistics, but not everyone does. Day-to-day data analysis and processing rarely involves such esoteric operations; it is mostly filtering, aggregating, grouping, and the like.

What annoys you is simply the large number of processing steps, and that is where Python or esProc does well (Python now tends to replace R, especially since the data analysis package pandas became available), because both outperform R several times over and have more understandable syntax.
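To make the filtering, grouping, and aggregating steps concrete, here is a minimal sketch in plain Python. The sample records, the field names, and the `total_by_region` helper are all made up for illustration, and the sketch uses only the standard library so it runs without extra packages. In pandas the same steps are usually written as a boolean filter followed by `groupby` and `sum`.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are invented for this example.
orders = [
    {"region": "east", "amount": 120.0},
    {"region": "west", "amount": 80.0},
    {"region": "east", "amount": 45.5},
    {"region": "west", "amount": 10.0},
    {"region": "north", "amount": 300.0},
]


def total_by_region(rows, min_amount):
    """Keep rows with amount >= min_amount, group by region, sum each group."""
    totals = defaultdict(float)
    for row in rows:
        if row["amount"] >= min_amount:              # filtering
            totals[row["region"]] += row["amount"]   # grouping + aggregating
    return dict(totals)


result = total_by_region(orders, min_amount=40.0)
print(result)
```

The three steps collapse into one pass over the data, which is the kind of routine processing the paragraph above has in mind.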

If you are doing structured data analytics, esProc will be more convenient, because it offers a more powerful data object than R's data frame, and it also supports file data and simpler parallel computing.