What is R?

DavidMSmith
What is R? It seems like a simple question, but I fear this is going to be a long post.

My main motivation is to clear up a bit of confusion around the distinction between R itself and user-contributed packages for R. It was prompted by a recent discussion about R on the MedStats mailing list, which included this comment:

After all the positive comments, I would like to raise some concern about some of the non-standard R packages. I have twice experienced a serious error in R packages (not a bug, an error in the algorithm). The authors of the first one did not reply; the authors of the second one said they know about it but do not have the time to fix it. I wonder how many — especially of the PhD/postdoc written packages, which I am sure work for their project — are really working correctly in all situations? Not all of them work on their packages as hard and great work as e.g. D. Bates with his lme4 (GLMM) package, and he and users still discover bugs and flaws in it. I do not want to criticize R; I am using it and I believe that the core packages are as valid as from commercial software (or better). But as I said, I have doubts about some hardly used ones.

It’s a fair point, but it needs some clarification for readers not familiar with R. The distinction is between “official R” and user-contributed code (which is what the commenter above is discussing).

By “official R” I mean the R project, under the control of the R Core Group. This is what you get when you download R from the CRAN website, and what’s included in the REvolution R distribution. This includes both the R interpreter (the code that implements the language at the heart of R) and the various statistical functions included in the official R distribution. These components and functions are all managed under a strict software development lifecycle, and have the highest reputation for accuracy and reliability. This is what makes R suitable for all statistical analysis applications where you need the utmost confidence in the result, such as the analysis of clinical trial data. This is R.

Now, R isn’t just a closed statistical analysis environment. It’s also designed to be a platform for other individuals to create their own methods and applications. Research institutions, academics and, yes, students, use R to implement brand-new statistical methods as part of research projects (or, sometimes, just for fun). They collect these new functions into collections called “packages” and upload them to a section of CRAN dedicated to user contributions. (This is distinct from the area in CRAN where the official R distribution is found.) Some of these user-contributed packages are major bodies of work in their own right, regularly maintained and tested by their respective authors. Some are student projects, long since abandoned. Just as when using a SAS macro downloaded from a website, or installing a third-party Excel add-in, you’ll need to rely on the reputation of the author (or the recommendation of trusted peers) when deciding whether to use such third-party code.

If you’re in the habit of downloading packages from CRAN, how do you tell if a function you’re using is an official R function, and not a user-contributed one? One easy way is to use the function find, which tells you which package a function comes from. For example, let’s check the function nls (nonlinear least squares):

> find("nls")
[1] "package:stats"

This tells me that the nls function comes from the stats package. The official R distribution includes a number of standard packages. (These packages are divided into two groups, the “base” and “recommended” packages, but the distinction isn’t important here as they all fall under the same software development lifecycle and are all part of “official R” as defined above.) If the function comes from any of the following packages, it’s considered official:

Official R packages (Base and Recommended)

base, boot, class, cluster, codetools, datasets, foreign, graphics, grDevices, grid, KernSmooth, lattice, MASS, methods, mgcv, nlme, nnet, rpart, spatial, splines, stats, stats4, survival, tcltk, tools, utils
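
The check described above can be sketched as a small helper. This is only an illustration: is_official() and official_packages are hypothetical names, not part of R, and the package list is simply copied from the list above, so it would need updating for other R versions.

```r
# Package names copied from the article's list (R 2.7.2 and above).
official_packages <- c(
  "base", "boot", "class", "cluster", "codetools", "datasets", "foreign",
  "graphics", "grDevices", "grid", "KernSmooth", "lattice", "MASS",
  "methods", "mgcv", "nlme", "nnet", "rpart", "spatial", "splines",
  "stats", "stats4", "survival", "tcltk", "tools", "utils"
)

# Hypothetical helper: TRUE if the first place R finds `fn_name` is one of
# the packages listed above, FALSE if it is found somewhere else first, and
# NA if the name is not on the search path at all.
is_official <- function(fn_name) {
  where <- find(fn_name)                 # e.g. "package:stats"
  if (length(where) == 0) return(NA)
  pkg <- sub("^package:", "", where[1])  # strip the "package:" prefix
  pkg %in% official_packages
}

is_official("nls")
```

Because find only searches what is currently on the search path, a function from a package that has not been loaded gives NA here rather than FALSE.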

This list has grown as R has matured, but the list above is applicable to R version 2.7.2 and above, and REvolution R version 1.2.3 and above. 
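
Since the list changes between versions, one way to see what applies to your own installation is to ask R directly. The sketch below assumes a current version of R, in which installed.packages accepts a priority argument and reports each package's priority in a "Priority" column; the variable names are illustrative only.

```r
# Packages in this installation that are marked "base" or "recommended".
official <- installed.packages(priority = c("base", "recommended"))

# Group the package names by priority for easier reading.
by_priority <- split(unname(official[, "Package"]), official[, "Priority"])
print(by_priority)
```

Any package that does not appear in this output was installed separately and falls under the user-contributed category discussed above.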

So, to sum up: R, drawing on the expertise and control of the R Core Group, has an excellent reputation for accuracy and reliability, on par with or even exceeding that of commercial software packages like SAS or SPSS. It’s suitable for any statistical analysis where you must rely on the results. All of this applies to the R distribution on CRAN, and to the REvolution R distribution, both of which comprise the official packages listed above. When it comes to user-contributed packages you download and install yourself, you’re no longer using code under the control of the R Core Group, in which case — as with all third-party code — you must rely on the reputation of the author of that package.

That’s a long answer to a seemingly simple question. But I hope it clears things up.
