Profile photo for Adrian Olszewski

I have employed R many times to create both web-based and Windows-based, full-featured reporting systems and data-transformation adapters (HL7, XML, databases, data from sensors and laboratory machinery), both as a freelancer and as a contracted programmer. The CRO company I currently work for uses R extensively this way. But yes, this is still, in some sense, a method of data analysis and presentation :)

When it comes to creating a small reporting system (50 concurrent users at most), I build it directly in R. I strongly prefer OpenCPU for this task. R gives me everything I need in my everyday work, including the ability to:

  • query virtually any database engine, Active Directory, or the GPIO bus on a Raspberry Pi to read data from sensors,
  • call external web services,
  • interact with the host operating system,
  • produce documents in all commonly used formats (docx, xlsx, pptx, rtf, OpenDocument, pdf, PostScript, static and dynamic HTML), both static and dynamic (knitr, R Markdown, Sweave, odfWeave),
  • generate high-quality, complex graphs (often hard to obtain in typical graphing libraries), which can be turned into dynamic, JS-based presentations,
  • create complete, Bootstrap-based web pages or embeddable partial HTML views,
  • call code written in many other languages: C++ (I can even mix R and C++ via Rcpp), C#, Java. There are adapters (bridges) between R and the majority of programming languages, including Python and Perl,
  • exchange data with R via various channels, including DDE, COM+, and TCP/IP. Thanks to this, R can be accessed from, say, Excel, Word, or other applications, allowing the user to send commands and receive data this way,
  • expose written code as a RESTful web service with OpenCPU, and create interactive (and responsive) web applications with Shiny,
  • talk to SAS, SPSS, and other statistical packages.
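
To make the OpenCPU point concrete, here is a minimal sketch (the function and payload are invented for illustration): any plain R function placed in an installed package is automatically callable over HTTP, so the same code works locally and as a web endpoint.

```r
# A plain R function, as it might live in a package's R/ directory.
# OpenCPU would expose it at POST /ocpu/library/<pkg>/R/summarise_batch/json
# (the package name and endpoint here are hypothetical).
summarise_batch <- function(values) {
  values <- as.numeric(values)
  list(n = length(values), mean = mean(values), sd = sd(values))
}

# Called locally it behaves like any R function; over OpenCPU the very same
# call arrives as an HTTP POST with a JSON body and returns JSON.
res <- summarise_batch(c(10, 12, 14))
```

No wrapper code is needed on the R side, which is part of why the "no additional tiers" argument below holds.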

Such an approach has some significant advantages:

  • no additional layers (tiers) that would reduce system performance (translating R objects into PHP/C++/C#/Java/Python structures takes resources),
  • no external programming languages; R programmers are easy to find in a CRO company, but C++ or PHP programmers, not necessarily :) It's easier to keep a consistent programming environment.
  • perfect integration with R; in fact, it's built on top of R. This matters when the majority of a process is done by R, in which case setting up additional runtime frameworks or launchers seems questionable.

In certain cases, when I need more flexibility or when I’m going to create something more advanced, I combine R and ASP.NET MVC framework. I followed this pattern multiple times with good results on both Windows and Linux (Debian) systems.

One just has to feel the fine line between using the proper tool for a given task and using a hammer as a screwdriver all the time.

Let me also share a personal feeling about one thing. For me, having been a C++ and C# programmer for over 15 years, it's a really strange experience to find R's syntax far handier when it comes to playing with data. It now happens that I launch R much more often than Visual Studio to play with algorithms or mangle data. With just five lines of code I can query two completely different data sources and write the result into a third one. Sometimes the lack of full control over a process (which "true languages" like C++ provide) is nothing bad; it's just a matter of needs. Sometimes we really don't need anything more. But, actually, I would hate to lose my beloved strongly typed languages for more serious projects :)
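
The "query two sources, write a third" workflow really is only a few lines. A hedged sketch in base R follows; the data and file names are invented, and in real use the sources could just as well be DBI connections, web services, or sensor feeds.

```r
# Two independent data sources, simulated here as CSV files on disk.
src_a <- tempfile(fileext = ".csv")
src_b <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, price = c(9.5, 12.0, 7.25)), src_a, row.names = FALSE)
write.csv(data.frame(id = 1:3, qty   = c(4L, 1L, 6L)),      src_b, row.names = FALSE)

# Query both sources, combine them, and write the result to a third place.
combined <- merge(read.csv(src_a), read.csv(src_b), by = "id")
combined$total <- combined$price * combined$qty
out <- tempfile(fileext = ".csv")
write.csv(combined, out, row.names = FALSE)
```

The same shape of script works unchanged whether the reads come from files, databases, or HTTP.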

Profile photo for Sairaam Varadarajan

Okay, R with NO ML. Let's see the related topics of the R programming language on Quora:

  1. R-Programming
  2. SAS
  3. RStudio
  4. SAS Software
  5. Data Analysis
  6. Data Science (ML!!)

Ref: R (programming language)

I believe the list in this Quora link is ordered by strength of affinity between R and its related topics.

Let's move on and look at the most downloaded R packages. The 5 most popular R packages:

  1. dplyr (data manipulation)
  2. devtools (generic!)
  3. foreign (read data)
  4. cluster (Damn ML)
  5. ggplot2 (Visualization)

I wish data.table were in the top 15.

Lastly, let's go to Stack Overflow and look at the tags related to R: Newest 'r' Questions

  1. ggplot2
  2. dataframe
  3. plot
  4. shiny
  5. data.table
  6. dplyr

My takeaway is that, more than ML, R is extensively used for visualization and data manipulation. ML is a small piece of it. Data professionals spend more time on data cleansing and manipulation than on ML.

How people spend their time in R defines which packages get built and explored.

I would think that the sample of questions you saw on Quora relating ML and R is skewed. It may be that you happen to upvote ML-related and R-related questions, and so you got that specific blend in your feed.

Never mind. Let's discuss applications of R other than ML, data manipulation, and visualization. I got the opportunity to use it for:

  • Shiny dashboards
  • SparkR (Hive queries and data manipulation with sparklyr + dplyr)
  • a short-lived blog hosted with R Markdown
  • web scraping
  • all my presentations, made with R Markdown; some of them don't contain any R script, but I still tend to compile the slides via .Rmd
  • ETL jobs (not recommended for production runs)

I have also seen R scripts that merge Google and Outlook calendars and notify about the overlaps via email.

  • Dropbox integration with R
  • IFTTT with R
Profile photo for Jody Diaz

Yes, R is used outside of traditional statistics and data analysis. It is widely utilized in fields like bioinformatics, finance, and social sciences for tasks such as data visualization, reporting, and even machine learning applications. Its versatility makes it a valuable tool for various types of data-related tasks. For more insights on R's applications, check out my Quora Profile!

Profile photo for Jeremy Miles

I’m not much of a programmer, but I’m better in R than anything else. So if I need to write a program to do something, and it’s possible to do it in R, that’s where I’m going.

A couple of examples:

  1. Need to move a whole bunch of files around into separate folders, based on file name? R.
  2. Need to scrape PDFs from the web, convert them to text, and save as csv files? R.
Profile photo for Assistant

Yes, R is used outside of traditional statistics and data analysis in several fields and applications. Here are some notable areas where R is applied:

  1. Machine Learning: R has numerous packages (like caret, randomForest, and xgboost) for building predictive models and performing machine learning tasks.
  2. Bioinformatics: R is widely used in bioinformatics for analyzing biological data, including genomics and proteomics. Packages like Bioconductor provide tools for analyzing genomic data.
  3. Finance: In finance, R is used for risk analysis, portfolio management, and financial modeling. The quantmod and TTR packages are popular for quantitative trading and analysis.
  4. Marketing Analytics: R is leveraged for customer segmentation, A/B testing, and analyzing marketing campaign effectiveness, often using visualization packages like ggplot2.
  5. Social Sciences: Researchers in sociology, psychology, and political science use R for survey analysis, experimental data analysis, and social network analysis.
  6. Geospatial Analysis: R has strong capabilities for handling spatial data with packages like sf, sp, and raster, making it useful in geography and environmental science.
  7. Web Development: Through packages like Shiny, R can be used to build interactive web applications for data visualization and analysis.
  8. Text Mining: R can be used for natural language processing and text mining, with packages like tm and text.

In summary, R's versatility and rich ecosystem of packages make it a valuable tool across various disciplines beyond just statistics and data analysis.
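
As a tiny, package-free illustration of the text-mining idea from the list above (real projects would typically reach for tm or similar): tokenize, normalize, and count term frequencies.

```r
# Two toy documents; the text is invented for the example.
docs <- c("R makes data analysis simple",
          "Text mining in R counts words in text")

# Lowercase, split on anything that isn't a letter, drop empty tokens.
tokens <- unlist(strsplit(tolower(docs), "[^a-z]+"))
tokens <- tokens[nzchar(tokens)]

# Term-frequency table, most frequent terms first.
freq <- sort(table(tokens), decreasing = TRUE)
```

Dedicated packages add stemming, stop-word lists, and document-term matrices on top of exactly this kind of counting.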

Profile photo for Rishabh Thukral

Recently, I was reading about steganography, and I thought about creating my own tool that can hide data in the form of text inside images without doing much damage to the image quality. I chose R for this entire project because of the large number of image-processing packages available in R. Moreover, the documentation for R and its packages is very neat, which substantially reduces development time.

With the platform and idea settled, I developed software that hides textual data in images by encoding it into the intensity values of the image's pixels.
Here are some screenshots of the app:

Consider this image, which we will use to hide our data in.

Then, we execute the application step by step:

  1. It asks us to select an input image.

  2. Then, the system asks for the message you want to hide inside the image.

Note: The maximum number of characters depends on the resolution of the input image.

  3. It performs certain operations on the image and produces an output image, showing a success message once the process is over. It also shows the percentage loss in image quality at the pixel level. The user specifies a save location for the output image and the system writes the image file there.

After saving the image, we can see that there is not much difference visible in the new image.

Note: This image now also stores some textual information. The quality seems pretty much the same thanks to the low information loss for a short message. Image quality will decrease for longer texts, but the difference will not be clearly visible to the eye.

Obtaining the data back from the image is the exact reverse of the above-mentioned process.

  1. Select the image you wish to decode.

  2. Based on the technique used for hiding the data in the image, we extract it back out.

We obtain the same message back.
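
The encode/decode round trip above can be sketched in a few lines of base R. This is a hedged reconstruction: the original app's exact scheme isn't shown, so this assumes the common least-significant-bit (LSB) variant, which changes any pixel intensity by at most 1 and so matches the "low information loss" described.

```r
# Hide a message in the lowest bit of each pixel intensity (0-255).
hide_text <- function(pixels, msg) {
  bits <- as.integer(rawToBits(charToRaw(msg)))   # 8 bits per character
  stopifnot(length(bits) <= length(pixels))       # capacity depends on resolution
  out <- pixels
  idx <- seq_along(bits)
  out[idx] <- bitwOr(bitwAnd(pixels[idx], 254L), bits)  # overwrite the lowest bit
  out
}

# Decoding is the exact reverse: read the lowest bits back and repack them.
reveal_text <- function(pixels, n_chars) {
  bits <- bitwAnd(pixels[seq_len(8L * n_chars)], 1L)
  rawToChar(packBits(as.logical(bits)))
}

set.seed(42)
img  <- sample(0:255, 1000, replace = TRUE)  # stand-in for pixel intensities
steg <- hide_text(img, "hello")
```

Since each intensity moves by at most 1, the stego image is visually indistinguishable from the original for short messages.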

Steganography has an advantage over cryptography in terms of security: it doesn't attract the attention of malicious attackers.

This entire application was developed in RStudio using R. If you want to look at the code, the source is available in my GitHub repository:
supercool276/StegnographyR


Profile photo for Rakesh Kumar

I’ve used R for really weird reasons. I started using R as a tool to learn data science, was fascinated by the tidyverse packages, and got a good hold of all the data-manipulation tools within R. One fine (not so fine) day there was a big production issue with wrong prices displayed to customers on our site (you might guess I work for a big retail company). The data model being highly complicated, none of us could figure out how to identify and correct all the pricing. I used all my data-manipulation skills and was able to figure out all the differences within about 15 minutes. Most people thought it was magic, and some didn't even trust it. Fortunately or unfortunately, management had no choice but to use my analysis to correct the data. Surprise: all prices were recovered in 2 hours, and we were able to mitigate a lot of potential losses. Now I have R programs running between multiple systems for data-consistency checks.

Folks still cannot understand how dplyr joins between multiple tables with millions of records execute in a matter of seconds. And I proudly say, ‘It’s magic’.
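
A hedged reconstruction of that kind of cross-system price check follows. The actual work used dplyr joins; base R's merge is shown here so the sketch needs no extra packages, and the table and column names are invented.

```r
# Prices as reported by two systems (toy data standing in for millions of rows).
system_a <- data.frame(sku = c("A1", "A2", "A3"), price = c(10.0, 20.0, 30.0))
system_b <- data.frame(sku = c("A1", "A2", "A3"), price = c(10.0, 25.0, 30.0))

# Join on the key and keep only rows where the two systems disagree.
both <- merge(system_a, system_b, by = "sku", suffixes = c("_a", "_b"))
mismatches <- both[both$price_a != both$price_b, ]
```

With dplyr the same check is an inner_join plus a filter, and its optimized joins are what make it fast on tables of millions of rows.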

Thanks to Hadley Wickham for the tidyverse suite of packages

Profile photo for Håkon Hapnes Strand

My first introduction to R was through a university course called Statistical Modeling and Simulation. It was taught with the textbook Statistical Computing with R by Maria Rizzo. The book does not touch machine learning even once in its approximately 400 pages, nor did the course.

I have later used techniques from that course in my job in at least two projects. One where I used Markov chains to model a queue, and another where I used kernel density estimation and simulations to model future workflows in a machine shop.

In fact, I rarely use R for machine learning, but I use it all the time for things that could be considered part of data science, like visualization, statistical analysis and simulations.
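
As a hedged sketch of the Markov-chain queue idea (the real model isn't shown): let the states be queue lengths 0..3, and at each step the queue grows, shrinks, or stays put according to a transition matrix P (the probabilities here are invented).

```r
# Transition matrix: rows are current queue length 0..3, columns the next.
P <- matrix(c(0.5, 0.5, 0.0, 0.0,
              0.3, 0.4, 0.3, 0.0,
              0.0, 0.3, 0.4, 0.3,
              0.0, 0.0, 0.5, 0.5),
            nrow = 4, byrow = TRUE)

# Simulate one path of the chain.
simulate_queue <- function(P, steps, start = 1L) {
  state <- integer(steps)
  state[1] <- start
  for (t in 2:steps) {
    state[t] <- sample.int(nrow(P), 1L, prob = P[state[t - 1], ])
  }
  state - 1L  # report queue length 0..3 rather than the state index
}

set.seed(1)
path <- simulate_queue(P, 5000)
```

From such a simulated path one can estimate waiting-time distributions, average queue length, and similar quantities without any closed-form analysis.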

Profile photo for Alket Cecaj

Yes, it can be used for other things, for example web development. There is a package called Shiny that you can use to develop web applications.

Profile photo for Quora User

I use R regularly for finance. In fact, our entire portfolio optimization procedure is in R, as is our dashboard of recession indicators.

I regularly test pet theories, and whip up econometric regressions. R is VERY powerful for finance, and I have long since abandoned Excel.
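
A hedged stand-in for the kind of econometric regression described (the author's actual data and model aren't shown): simulate a return series driven by one factor, then recover the coefficient with lm().

```r
# Simulated single-factor model: asset returns loading on a market factor.
set.seed(7)
factor_ret <- rnorm(250)                                  # e.g. a market factor
asset_ret  <- 0.5 + 1.8 * factor_ret + rnorm(250, sd = 0.5)

# One line to fit; summary(fit) gives the full regression table.
fit  <- lm(asset_ret ~ factor_ret)
beta <- unname(coef(fit)[2])
```

This is the kind of five-minute workflow that makes R an easy replacement for Excel in finance: simulate or load, fit, inspect.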

I have tried (and so far failed) to develop an ML algorithm for bubble detection. However, my first choice for such an attempt is R.

Here are some sample outputs:

Profile photo for Quora User

The most recent R program I ran was a simulation of stochastic larval dispersal (283 dual-socket Broadwell compute nodes; it took about five hours to finish).

The biggest program prior to that was an exploration of whether 4200 Broadwell processors could be distinguished based solely on their performance. The answer is “probably yes”, but there’s still quite a bit of work to do for that result to be useful.

I also do all of my data visualization in R.

Machine learning? Never tried it.

Profile photo for Sandeep Kale

R is a programming language that was originally developed by Ross Ihaka and Robert Gentleman in the 1990s. It is used mostly for statistical analysis and data manipulation. R is one of the most popular programming languages in use today, with over 2 million users worldwide.

Why is R fading away as a machine-learning language even though it is still used routinely for statistical work?

In my opinion, R is fading away as a machine learning language because there are better options out there for machine learning: Python, Julia, and MATLAB. These languages are more suited for machine learning because they were designed specifically for the purpose of creating algorithms and models.

Yes, R is fading away as a machine learning language. Although it is being used in statistical functions normally, there are other languages that have taken its place in the field of machine learning.

The reason R is fading away as a machine-learning language is that many other languages have more functionality and can do the same things as R. Also, R has been around for quite some time now, and newer languages are more capable because they were built with machine learning in mind.

Profile photo for David Johnston

Yes. It was never that popular to begin with among data scientists who actually deliver applications to production. It's fine for exploratory work and has good visualizations, but it's not great for large datasets and isn't well integrated into the rest of the modern software stack. While it has loads of useful packages, it is really only good when there is very little programming to do. R is a serviceable scripting language for gluing together some library API calls, but it's not a good, modern general-purpose language like Python.

A data scientist needs to be more than someone who can read in a csv file, plug data into libraries, and make charts. They need to be software developers to some degree and well versed in modern software stacks. Python is the alternative used almost exclusively for that. R users who want to cross that divide and become full-stack data scientists need to make the leap. Leaving R behind is part of that process.

The R community is quite insular. R users tend to be the types that don't want to learn other languages and don't want to really ever cross the divide to become real software developers. The ideal users of R actually aren't data scientists. It's academic researchers whose real interest is their science not scientific computing or analytics. And R is great for them most of the time just as Matlab is. But when you become more serious about delivering data science solutions to business production environments you gotta graduate from that and embrace Python and general purpose computing.

Profile photo for Thomas Subia

Nelson in a previous post wrote: “No. R wasn't intended to be used to do anything but statistics. Everything else is a hack.”

I’m not sure what motivated that response but R can be used for many things not related to statistics. Here is a good example.

I have to create SPC charts from Excel files. Since there are literally thousands of files to comb through, cutting and pasting that data into an Excel file might take literally weeks.

Fortunately, R can do this easily. Here is how it's done.

Let’s say our data exists in cell B9 of each spreadsheet. We want R to go through all the Excel files and copy this data into a single file.

# You will need these libraries
library(plyr)    # loaded in the original workflow; readxl does the actual reading
library(readxl)

# Read in the names of all Excel files in the working directory
files <- list.files(pattern = "\\.xls$", full.names = FALSE)

View(files)  # check that all the files were picked up correctly

# Extract the Work Order from cell B9 of Sheet1 in every file.
# col_names = FALSE keeps the cell's value as data rather than a column name.
WO <- lapply(files, read_excel, sheet = "Sheet1", range = "B9", col_names = FALSE)

WO_list <- as.data.frame(WO)  # one column per file

trans_WO <- t(WO_list)        # transpose so each file becomes one row

write.table(trans_WO, "WO.txt")

# Reading through more than 300 files took less than 10 seconds to run.

While Nelson claims that R was intended solely as a statistical analysis tool, R can also be an efficient, time-saving solution for data collection and storage. Efficient, time-saving solutions are hardly a hack.

Profile photo for Jon Wayland

Not in the slightest.

I think any perceived popularity loss is due in part to those with CS-before-stats backgrounds investing their time in Python, a language that had wider adoption in that community to begin with.

On the other hand, those with stats-before-CS backgrounds are typically introduced to R first, and thus form their most productive skills with this language.

It really boils down to which language someone dedicated most of their time to creating valuable data solutions in. For those who studied CS in school, this tends to be Python. For those who studied stats in school, this tends to be R.

In the working world, many tech-focused industries prefer Python. Conversely, many companies whose product isn’t primarily digital in its delivery, such as healthcare or theme parks, prefer R.

One thought I’d like to end with: I have found that more and more entry-level candidates are coming out with stats degrees — no doubt because of their interest in data science — and are entering the workforce with R as their primary tool. If stats degrees continue to be popular for aspiring data scientists, I would bet that R’s popularity will only increase.

Profile photo for Christopher Stern

For numerical work MATLAB is a much better choice than R. R has some support for matrices and such, but outside of stats it's slow and incomplete compared to more appropriate tools. I'm not sure what area of pure math you are looking into; perhaps you're not sure yet either. But being at least somewhat acquainted with the most commonly used tools in your field is just part of 'speaking the language'. You don't have to be an expert in everything, few people are, but it's strange not to have expertise in one or two. If everyone around you is sharing MATLAB recipes, you don't want to be trying to reinvent everything in R.

Profile photo for Craig Slinkman

I know this answer will conflict with a prior answer to this question. His advice is to stick with Python and to develop your Python skills at a deeper level.

You should be aware that when you ask this question in an online forum you will get biased answers. Since I use R, I will answer R. If a Python user answers this she or he will say Python.

My real answer is that you should know both. If you are gathering data by screen scraping, for example, I would recommend Python. If you are building a regression model to predict a response variable, I would recommend R.

I started my computing career in 1967. The preferred computer languages over that time have been FORTRAN IV, PL/I, Pascal, LISP, C, C++, and Python. They come and go. If you are new to the profession, I guarantee that you will need to learn other languages besides Python and R. The important thing is to have enough ambition to be willing to learn and study the language. This means either getting your organization to send you to professional courses or spending nights writing code and checking your results. P.S. You need to know the correct answers! Thus, you should have a book by a really competent author.

Before my fellow R users call for my head I should point out that R is an environment. This is especially true when you use tools like RStudio.

Be aware that there is no such thing as a perfect all-purpose language. You should be intimately involved with both languages, as they are the heart of data science practice at the current time.

Incidentally, it is important to know SQL and database management.

Don’t bet your career on a single technology. Be adaptable and be able to adapt as the technical environment changes. If for no other reason, this is the reason to learn a second language.

Profile photo for Jeremy Deats
  1. R was created for the purpose of Statistical Computing. Prior to R, FORTRAN or C would have been the most popular choice, but R’s syntax was designed around this purpose and it has rich graphing/visualization features baked in.
  2. R is Free and available on multiple OS environments
    (Windows, Mac OS, Linux)
  3. R has been adopted by Universities to teach Data Science. What’s used in the classroom tends to bleed over to the work environment.
  4. R has a massive repository of community-created and community-supported libraries, with a group that manages the repo and ensures the quality of the packages hosted in it. This last point is very important because having access to the right library for the job, and a rich community to support those libraries, can greatly increase productivity, which in the business world translates into saving money.
Profile photo for Albert de Koninck

I will answer in terms of pure, not applied mathematics. (For applied mathematics, I would recommend either R or MATLAB.)

It depends on what you want a language for. To do number theory and some general calculations, I use Pari Droid (e.g., it can do 1001! instantaneously with complete precision). If you want to perform set-theoretical calculations and programs, then Setlx will do the job. Both applications are available for Android as well as Linux and Windows. Finally, if you want to use a completely different programming paradigm similar to the old APL, then J is your language. J is an array-based language available for iOS, Android, Linux, and Windows, and it requires completely rethinking what you know about programming.
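As an aside, exact big-integer arithmetic of this kind is not unique to computer algebra systems; Python's built-in integers are arbitrary precision too, so the same factorial is exact (a minimal sketch, not a claim about Pari Droid itself):

```python
import math

# Python integers are arbitrary-precision, so this result is exact,
# not a floating-point approximation.
n = math.factorial(1001)

# Sanity check via the recurrence n! = n * (n-1)!
assert n == 1001 * math.factorial(1000)

print(len(str(n)))  # number of decimal digits in 1001!
```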

Profile photo for Debdatta Chatterjee

I just love this question. It is a plausible and trending debate topic amongst data science enthusiasts. It is hard to pick one out of these two amazingly flexible data analytics languages. Both are free and open source, and were developed in the early 1990s — R for statistical analysis and Python as a general-purpose programming language. For anyone interested in machine learning, working with large datasets, or creating complex data visualizations, they are absolutely essential. The answer can be reasoned out from the following points.

Process of Data Science

Now, it is time to look at these two languages a little bit deeper regarding their usage in a data pipeline, including:

  1. Data Collection
  2. Data Exploration
  3. Data Modeling
  4. Data Visualization

Data Collection

Python

Python supports all kinds of different data formats. You can play with comma-separated value documents (known as CSVs) or you can play with JSON sourced from the web. You can import SQL tables directly into your code.

You can also create datasets. The Python requests library is a beautiful piece of work that simplifies HTTP requests into a line of code, letting you take data from different websites. You'll be able to take data from Wikipedia tables, and once you've organized the data you get with BeautifulSoup, you'll be able to analyze it in depth.

You can get any kind of data with Python. If you’re ever stuck, google Python and the dataset you’re looking for to get a solution.
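A minimal sketch of the requests/BeautifulSoup workflow described above; the HTML is inlined here so the example is self-contained, but in practice it would come from a `requests.get(...)` call against a real URL:

```python
from bs4 import BeautifulSoup

# In practice the HTML would come from the web, e.g.:
#   import requests
#   html = requests.get(some_wikipedia_url).text
# Here we parse a small inline snippet instead.
html = """
<table>
  <tr><th>Country</th><th>Population</th></tr>
  <tr><td>France</td><td>68</td></tr>
  <tr><td>Japan</td><td>124</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = [
    [cell.get_text() for cell in tr.find_all(["th", "td"])]
    for tr in soup.find_all("tr")
]
print(rows)  # first row is the header, the rest are data rows
```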

R

You can import data from Excel, CSV, and text files into R. Files built in Minitab or in SPSS format can be turned into R data frames as well. While R might not be as versatile at grabbing information from the web as Python is, it can handle data from your most common sources.

Many modern packages for R data collection have been built recently to address this problem. rvest will allow you to perform basic web scraping, while magrittr's pipe operator helps you chain the cleaning and parsing steps. Together they play a role analogous to the requests and BeautifulSoup libraries in Python.

Data Exploration

Python

To unearth insights from the data, you’ll have to use Pandas, the data analysis library for Python. It can hold large amounts of data without any of the lag that comes from Excel. You’ll be able to filter, sort and display data in a matter of seconds.

Pandas is organized into data frames, which can be defined and redefined several times throughout a project. You can clean data by filling in non-valid values such as NaN (not a number) with a value that makes sense for numerical analysis such as 0. You’ll be able to easily scan through the data you have with Pandas and clean up data that makes no empirical sense.
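The cleaning and filtering workflow described above might look like this; a minimal sketch with made-up values:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "price": [10.0, np.nan, 30.0, 25.0],
    "units": [5, 3, np.nan, 8],
})

# Replace non-valid NaN entries with a value that makes sense
# for numerical analysis, here 0.
clean = df.fillna(0)

# Filter and sort in one chain.
expensive = clean[clean["price"] > 20].sort_values("price")
print(expensive)
```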

R

R was built to do statistical and numerical analysis of large data sets, so it’s no surprise that you’ll have many options while exploring data with R. You’ll be able to build probability distributions, apply a variety of statistical tests to your data, and use standard machine learning and data mining techniques.

Basic R functionality encompasses the basics of analytics: optimization, statistical processing, random number generation, signal processing, and machine learning. For some of the heavier work, you'll have to rely on third-party libraries.

Data Modeling

Python

You can do numerical modeling analysis with Numpy. You can do scientific computing and calculation with SciPy. You can access a lot of powerful machine learning algorithms with the scikit-learn code library. scikit-learn offers an intuitive interface that allows you to tap all of the power of machine learning without its many complexities.
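A minimal sketch of scikit-learn's uniform fit/predict interface, using its bundled iris dataset; the particular model chosen here is illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The fit/predict interface is the same across scikit-learn models,
# which is what hides machine learning's many complexities.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
acc = model.score(X_test, y_test)
print(acc)  # accuracy on held-out data
```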

R

In order to do specific modeling analyses, you’ll sometimes have to rely on packages outside of R’s core functionality. There are plenty of packages out there for specific analyses such as the Poisson distribution and mixtures of probability laws.

Data Visualization

Python

The IPython Notebook that comes with Anaconda has a lot of powerful options to visualize data. You can use the Matplotlib library to generate basic graphs and charts from the data embedded in your Python. If you want more advanced graphs or better design, you could try Plot.ly. This handy data visualization solution takes your data through its intuitive Python API and spits out beautiful graphs and dashboards that can help you express your point with force and beauty.
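A basic Matplotlib chart takes only a few lines; the Agg backend and output file name here are illustrative choices so the script runs headlessly:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs anywhere
import matplotlib.pyplot as plt

xs = list(range(10))
ys = [x ** 2 for x in xs]

fig, ax = plt.subplots()
ax.plot(xs, ys, marker="o")
ax.set_xlabel("x")
ax.set_ylabel("x squared")
fig.savefig("basic_plot.png")  # write the chart to an image file
```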

You can also use nbconvert to turn your Python notebooks into HTML documents. This can help you embed snippets of nicely formatted code into interactive websites or your online portfolio. Many people have used this tool to create online tutorials and interactive books on how to learn Python.

R

R was built to do statistical analysis and demonstrate the results. It's a powerful environment suited to scientific visualization, with many packages that specialize in graphical display of results. The base graphics module allows you to make all of the basic charts and plots you'd like from data matrices. You can then save these files into image formats such as JPEG, or you can save them as separate PDFs. You can use ggplot2 for more advanced plots, such as complex scatter plots with regression lines.

Profile photo for Adrian Dușa

Well yes of course…

Statistical analysis is closely associated with the term “quantitative analysis” that involves numbers, correlations, regressions etc.

In the social sciences, there is something else called “qualitative analysis” that involves in-depth case studies, text analysis, the kind of stuff nobody would think R is suitable for. And yet, there are packages such as QCA for instance, which stands for “Qualitative Comparative Analysis” and it has absolutely nothing to do with the traditional quantitative analysis.

Among the 16000+ packages on CRAN, more than one would expect are not necessarily related to statistics but to general programming. R can analyse text, process images, make animated GIFs, and harvest data from the web; these are the sort of things that don't necessarily spring to mind when thinking about statistical analysis.

Profile photo for Quora User

I use R for doing statistical analysis. Often I first create and manipulate data sets using Python since it’s so much easier.

R and Python and whatever else is out there are mere tools. It's like a carpenter's toolbox with hammers, saws, drills, clamps, glue, and so on. When I need to pound a nail I use a hammer; when I need to drill a hole I use a drill.

Python and R are no different. R has great statistical capabilities and I use that when I need to. Python is great for building data files, scanning the web and all that stuff.

Don’t limit yourself, learn it all.

Profile photo for Varsha Nayak

Data analysis encompasses a wide range of statistical techniques that help researchers and analysts make sense of data. The choice of technique depends on the nature of the data, the research objectives, and the questions being asked. Here are some common statistical techniques used for data analysis:

  • Descriptive Statistics:
    • Mean: Calculates the average value of a dataset.
    • Median: Identifies the middle value in a dataset, separating it into two equal halves.
    • Mode: Identifies the most frequently occurring value in a dataset.
    • Variance: Measures the spread or dispersion of data points.
    • Standard Deviation: Indicates how much individual data points deviate from the mean.
    • Range: Measures the difference between the maximum and minimum values in a dataset.
  • Inferential Statistics:
    • Hypothesis Testing: Determines if there is a significant difference between groups or conditions.
    • Confidence Intervals: Estimates a range of values within which a population parameter is likely to fall.
    • Regression Analysis: Examines the relationship between one or more independent variables and a dependent variable.
    • Analysis of Variance (ANOVA): Compares means between three or more groups to assess if they are statistically different.
    • Chi-Square Test: Analyzes categorical data to determine if there is an association between variables.
    • T-Tests: Compares means between two groups to assess if they are statistically different.
  • Exploratory Data Analysis (EDA):
    • Histograms: Visualize the distribution of data.
    • Box Plots: Display data distribution, including outliers.
    • Scatter Plots: Show the relationship between two continuous variables.
    • Heatmaps: Visualize relationships and correlations in large datasets.
  • Time Series Analysis:
    • Time Series Plot: Visualizes data points over time.
    • Moving Averages: Smooth out fluctuations in time series data.
    • Seasonal Decomposition: Separates time series data into trend, seasonal, and residual components.
    • Autocorrelation: Measures the correlation between a time series and its lagged values.
  • Multivariate Analysis:
    • Principal Component Analysis (PCA): Reduces the dimensionality of data while retaining important information.
    • Cluster Analysis: Groups similar data points together.
    • Factor Analysis: Identifies underlying factors that explain patterns in data.
    • Discriminant Analysis: Distinguishes between two or more groups based on predictor variables.
  • Non-parametric Tests:
    • Mann-Whitney U Test: A non-parametric alternative to the t-test for comparing two groups.
    • Kruskal-Wallis Test: A non-parametric alternative to ANOVA for comparing multiple groups.
  • Survival Analysis:
    • Kaplan-Meier Survival Curve: Estimates survival probabilities over time.
    • Cox Proportional-Hazards Model: Examines factors affecting survival time.
  • Bayesian Analysis:
    • Bayesian Inference: Uses prior knowledge and probability distributions to update beliefs about parameters.
  • Machine Learning: Various machine learning algorithms, such as decision trees, random forests, support vector machines, and neural networks, are used for predictive modeling and classification tasks.

The choice of statistical technique depends on the research question, data type, sample size, and assumptions made about the data. Often, a combination of these techniques is used to gain a comprehensive understanding of the data and draw meaningful conclusions.
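As an illustration, the two non-parametric tests listed above are one-liners with scipy.stats; the three groups below are made up purely for illustration:

```python
from scipy import stats

a = [12.1, 11.8, 12.5, 12.0, 12.3]
b = [12.8, 13.1, 12.9, 13.4, 12.7]
c = [14.0, 13.8, 14.2, 13.9, 14.1]

# Mann-Whitney U: non-parametric alternative to the two-sample t-test.
u_stat, p_u = stats.mannwhitneyu(a, b, alternative="two-sided")

# Kruskal-Wallis: non-parametric alternative to one-way ANOVA
# for comparing more than two groups.
h_stat, p_kw = stats.kruskal(a, b, c)

print(f"Mann-Whitney p = {p_u:.4f}, Kruskal-Wallis p = {p_kw:.4f}")
```

With groups this clearly separated, both p-values come out well below the usual 0.05 threshold.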

Profile photo for John Frain

I use R because, in my opinion, it is the best tool for the statistical analysis I want to do. I should point out that I also use gretl, Octave (a free Matlab-like program), and Maxima when it is easier to work with these. I have looked at Python and used it some time ago. As I already knew R, and time is limited, I did not invest much time in learning Python. Some people think that Python is the best, and they are probably correct.

To advise you on what statistical software is appropriate for you I would need to have some idea of

  1. Your knowledge of statistics
  2. The discipline that you are working in
  3. Some idea of the level and amount of statistical analyses that you are doing
  4. Your experience in computing
  5. Your experience in programming
  6. What statistical packages are supported in your school or organization.

If you are learning statistics, start with a simpler package. I recommend gretl if you are studying econometrics or time series. You can always move to R or Python or Stata or SPSS or SAS or Matlab or Mathematica or … later, if that is what is best for you.

Profile photo for Hariharan Sampathkumar

Statistical analysis gives meaning to otherwise meaningless numbers, thereby breathing life into lifeless data. The results and inferences are precise only if proper statistical methods are used.

The methods mentioned below are used to analyze the individual variables in the data (univariate analysis) and the combinations between them (bivariate analysis). The analysis used also differs depending on the variable type: continuous or categorical. The following are some of the methods which I commonly use in any analysis.

Univariate Analysis: Analyzing one variable at a time.

  • Continuous Variable: The most commonly used statistical method to analyze continuous variables is descriptive statistics. This gives a statistical summary of data such as mean, median, quartiles, etc.
  • Categorical Variable: Tabular method, also known as the frequency table, is used to analyze the distribution of the categorical variables.
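The two univariate summaries above can be sketched in a few lines; pandas is one common tool for this, and the data below is invented for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "age": [23, 35, 31, 42, 35, 28],            # continuous variable
    "segment": ["A", "B", "A", "A", "C", "B"],  # categorical variable
})

# Continuous variable: descriptive statistics (mean, quartiles, ...)
print(df["age"].describe())

# Categorical variable: frequency table of the category counts
print(df["segment"].value_counts())
```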

Bivariate Analysis: Analyzing two variables at a time.

  • Continuous - Continuous Variable: The method used to find the relationship between two continuous variables is correlation. The correlation coefficient takes a value between -1 and 1, where -1 signifies a strong negative correlation and 1 a strong positive correlation.
  • Continuous - Categorical Variable: Student’s t-test is used for the analysis in these situations. It tests the null hypothesis that there is no significant difference between the means of the two groups.
  • Categorical - Categorical Variable: A two-way table and the chi-square test are used to understand the relationship here. The two-way table gives the frequency or relative frequency distribution of the two categorical variables under consideration. The chi-square test is used to test the null hypothesis that no relationship exists between the categorical variables, i.e., that they are independent.
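Each of the three bivariate cases above is a one-liner in scipy.stats; the data below is made up purely for illustration:

```python
import numpy as np
from scipy import stats

# Continuous - continuous: correlation coefficient, always in [-1, 1].
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])
r, _ = stats.pearsonr(x, y)

# Continuous - categorical: two-sample t-test on the group means.
t_stat, p_t = stats.ttest_ind([5.1, 4.9, 5.3], [6.2, 6.0, 6.4])

# Categorical - categorical: chi-square test on a two-way table.
table = np.array([[30, 10],
                  [20, 40]])
chi2, p_chi, dof, expected = stats.chi2_contingency(table)

print(round(r, 3), round(p_t, 4), round(p_chi, 4))
```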

This is just the tip of the iceberg and there are still many methods out there that could be used depending upon the data and the business context.

Profile photo for Shrey

Yes, R is a statistical programming language, widely used for machine learning, statistical analysis, and many other data science applications. Unlike programming languages such as Python or C++, R is relatively easy: you do not need a heavy programming background to learn and implement it. Just a basic knowledge of programming, loops, and functions will be fine.

Profile photo for Business Science Solutions

R is a great one-stop shop for many fundamentals related to statistics, machine learning, data mining, data analysis, etc. R is derived from the S language and has been geared more towards statisticians.

We can break down a couple of kinds of data science and analytics projects you can do in R with either base R or add-on packages (which are incredibly easy to install).

1. Data visualizations - Using base R, lattice, ggplot2, and many others, professional graphics can be created from your data. Some packages are expanding these capabilities to allow interactive visuals as well. Visualizations are key for any data analyst.
2. Statistical Hypothesis Testing - Statistical tools such as the Student t-test and ANOVA are easily done with the base R stats package.
3. Data cleaning tools - Hadley Wickham has led a transformation in R by helping build an ecosystem of tidy data tools. Packages such as dplyr and tidyr are examples of ways that data can be shaped, rearranged, joined, etc. This makes for ease of working when importing a data set that may not be in the cleanest of shape.
4. Machine Learning and Data Mining - This is less for an analyst, but is a huge asset in R. Packages such as caret allow machine learning algorithms to be run rather quickly out of the box. There are very few algorithms that do not exist in R.
5. Operations Research - R allows you to do linear programming, Markov chains, and non-linear modeling with various add-on packages.
6. Text mining - While this is not the core competency of R, it is possible to do text mining in R using Markov chains, corpus creation, and n-grams.
7. Time series analysis - R has packages for forecasting and other time-series-related analyses.
8. Reporting and presenting - R allows the creation of presentations and reports directly from your analyses.

Profile photo for Quastech

Statistics data analysis and data science are closely related fields that share some common principles but also have distinct differences in their focus and scope.

Statistics Data Analysis: Statistics data analysis is primarily concerned with the collection, organization, interpretation, and presentation of numerical data to draw meaningful conclusions and make informed decisions. It involves applying statistical methods to analyze data and uncover patterns, relationships, and trends. Statistical techniques are used to summarize and describe data, make inferences about populations based on sample data, and assess the reliability of conclusions.

Key aspects of statistics data analysis include:

  1. Descriptive Statistics: Summarizing and presenting data using measures such as mean, median, mode, standard deviation, etc.
  2. Inferential Statistics: Making predictions and drawing conclusions about a population based on a sample, often involving hypothesis testing and confidence intervals.
  3. Probability Theory: Studying uncertainty and randomness, which underlies many statistical methods.
  4. Statistical Software: Utilizing specialized software (e.g., R, SAS, SPSS) to perform data analysis and calculations.
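As a tiny illustration of the descriptive statistics mentioned above, even Python's standard library can compute the basic summaries; the sample data is made up:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

print(statistics.mean(data))     # average of the values
print(statistics.median(data))   # middle value of the sorted data
print(statistics.mode(data))     # most frequent value
print(statistics.pstdev(data))   # population standard deviation
```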

Data Science: Data science is a broader interdisciplinary field that encompasses various aspects, including statistics, machine learning, data engineering, and domain expertise. It involves extracting knowledge and insights from large and complex datasets to solve real-world problems. While statistics plays a crucial role within data science, data science goes beyond traditional statistics to incorporate a wider array of techniques and skills.

Key aspects of data science include:

  1. Data Collection and Cleaning: Gathering and preprocessing data from various sources, dealing with missing values and outliers.
  2. Machine Learning: Utilizing algorithms to build predictive models, classify data, and automate decision-making processes.
  3. Big Data Processing: Handling and analyzing large volumes of data using technologies like Hadoop and Spark.
  4. Data Visualization: Creating visual representations of data to communicate findings effectively.
  5. Domain Knowledge: Understanding the context and domain-specific insights to provide actionable recommendations.
  6. Feature Engineering: Selecting, transforming, and creating relevant features from raw data for machine learning models.
  7. Business Impact: Focusing on creating value for organizations by addressing specific business challenges.
Profile photo for Quora User

The answer to this question is extremely vast. However, let's look into some business problems which can be addressed by R.

Solution example: customer churn

One of the most canonical uses for prediction science is customer churn. Customer churn is defined as the number of lost customers divided by the number of new customers gained. As long as you're gaining new customers faster than you're losing them, that's a good thing, right? It's not, for multiple reasons. The primary reason customer churn is a bad thing is that it costs far more to gain a customer or regain a lost one than it does to keep an existing customer. Over time, too much customer churn can slowly drain the profits from a company. Identifying customer churn and the factors that cause it are essential tasks for a company to stay profitable.

Interestingly, customer churn extrapolates out to other users as well. For instance, in a hospital, you want customers to churn, that is, to not come back: you want them to stay healthy after their hospital visit.

In this example, we'll show you how to calculate and locate customer churn by using R and SQL Server data.
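Using the definition above (lost customers divided by new customers gained), the headline churn figure itself is a one-line calculation; this sketch uses hypothetical counts rather than real SQL Server data:

```python
def churn_ratio(lost_customers: int, new_customers: int) -> float:
    """Churn as defined above: customers lost divided by customers gained."""
    if new_customers == 0:
        raise ValueError("no new customers gained in this period")
    return lost_customers / new_customers

# A ratio below 1.0 means you are gaining customers faster than
# you are losing them (hypothetical period counts).
print(churn_ratio(50, 200))  # 0.25
```

The interesting analytical work, of course, is not this ratio but locating which customers churn and why.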

Solution example: predictive maintenance and the Internet of Things

It is critical for businesses operating or utilizing equipment to keep those components running as effectively as possible, because equipment downtime or failure can have a negative impact beyond just the cost of repair. Predictive maintenance is defined as a technique to forecast when an in-service machine will fail so that maintenance can be planned. It includes more general techniques that involve understanding faults, failures, and timing of maintenance. It is widely used across a variety of industries, such as aerospace, energy, manufacturing, and transportation and logistics.

New predictive maintenance techniques include time-varying features and are not as bound to model-driven processes. Emerging Internet of Things (IoT) technologies have opened up the door to a world of opportunities in this area, with more sensors being installed on devices and more data being collected about those devices. As a result, data-driven techniques now promise to unleash the potential of using data to understand when to perform maintenance.

In this example, we'll show you different ways of formulating a predictive maintenance problem and then show you how to solve them by using R and SQL Server.

Solution example: forecasting

Forecasting is defined as the process of making future predictions by using historical data, including trends, seasonal patterns, exogenous factors, and any available future data. It is widely used in many applications, and critical business decisions depend on having an accurate forecast. Meteorologists use it to generate weather predictions; CFOs use it to generate revenue forecasts; Wall Street analysts use it to predict stock prices; and inventory managers use it to forecast demand and supply of materials.

Many businesses today use qualitative, judgement-based forecasting methods and typically manage their forecasts in Microsoft Excel or locally on an R workstation. Organizations face significant challenges with this approach because the amount and availability of relevant data have grown exponentially. Using SQL Server R Services, it is possible to create statistically reliable forecasts in an automated fashion, giving organizations greater confidence and business responsiveness.
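As a toy illustration of the forecasting idea (not of SQL Server R Services specifically), one of the simplest techniques, a moving-average forecast, can be written in a few lines; the revenue figures are invented:

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("not enough history for the chosen window")
    recent = history[-window:]
    return sum(recent) / window

monthly_revenue = [100, 104, 103, 108, 110, 113]
print(moving_average_forecast(monthly_revenue))  # (108 + 110 + 113) / 3
```

Real forecasting systems go well beyond this, adding trend, seasonality, and exogenous factors, but the structure is the same: fit on history, predict forward.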

SOURCE: Data Science with Microsoft SQL Server 2016

Profile photo for Aleksandras Urbonas

It has been a while since I HAD to use R now that we have Python.

Back in 2015, R was my entry point to modelling. I had stumbled upon the Titanic dataset analysis. This dataset allows many different things to be tested: data processing, feature engineering, charts, and statistics. R had good packages for many machine learning algorithms and for biostatistics (remember the iris dataset?). From there I went to credit risk, clustering, and deep learning with MNIST. I also liked having a development environment called RStudio, which was perfect for managing projects. There are cons, of course, but that is a separate question.

Profile photo for Adam Hood

There are really two branches of tools for analytics, data science, and statistical data analysis. One branch is for those who are more programming or developer minded; the other is for the business user who has a deep understanding of stats. That said, the tools, as they should, align with their primary users, with a few exceptions.
First, let's look at the developer track. Tools in this area include things like R, Python, Perl, Java, in-database coding (think stored procedures, UDFs), etc. This track is really taking off because of the flexibility it offers in implementation and integration. However, the learning curve can be steep, and to non-developers the results appear to come from a black box.
Second are the tools for business users who have a background in statistics. These tools are focused on user interfaces and the scientific process workflow. Tools in this space include large vendors like SAS, SPSS, and then some up and coming vendors like RapidMiner, Alteryx (which utilizes R), and others.
All that said, I think it is important to note another trend we're seeing: the integration of R into user-friendly tools. Assuming that your question is based on a decision to investigate or learn tools, R makes a great choice because of its growing integration with more user-friendly tools.

Profile photo for Richard Orama

I will just give a few factors:

  1. Your current environment: what is used at your current work setting? Is it Python or R? Choose one that is currently being used.
  2. Your professional or technical inclination: are you a programmer (more like a software engineer) or not? If Yes, you might prefer Python, else R.
  3. Anticipated level of software integration: do you plan to integrate with other software systems? If Yes, Python may be better as it integrates more easily, e.g. with web applications.
  4. Etc …
Profile photo for Quora User

Well, that depends.

Here's a post from Revolution Analytics (now part of Microsoft): Companies Using R

R is a lot of things. It's the open source version. It's the Microsoft version. It's Renjin.org | The JVM-based interpreter for the R language for statistical computing

And there are others.

Now that there's the R Consortium, and Microsoft is behind R... well... the open source version will hopefully improve as a result. R, regular R, is a horrible memory hog, for now. Maybe that will improve. Revolution certainly improved it, and we'll see how it does inside of Microsoft. Microsoft is giving its version of R to developers for free (for now), but only RedHat and SUSE Linux builds are available, as well as the Windows one. I could probably make it work on another distro, but to be honest, regular R is sufficient for what I do. And they do stress that the Windows version is the most powerful, the most complete.

The problem with that is for me: I use Windows for entertainment. All development and data projects are in Linux. (As far as RedHat and SUSE go, well, Linux for me means free.)

R is great. Don't get me wrong. I love R. I've loved it since I met it. But there are so many syntax differences across its huge number of packages. Hadley Wickham is doing tremendous work, as are others, but Hadley is, well, Hadley. Take ggplot2: it is completely different from base or lattice graphics. Which is, in some senses, a good thing. However, there are no real design principles common to every package.
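
To make that syntax gap concrete, here is the same scatter plot in base graphics and in ggplot2 (a generic sketch on the built-in `mtcars` data; ggplot2 must be installed):

```r
library(ggplot2)  # assumed installed; not part of base R

# Base graphics: imperative style, one call draws the whole plot
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight", ylab = "Miles per gallon", main = "Base graphics")

# ggplot2: declarative grammar of graphics, layers composed with `+`
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight", y = "Miles per gallon", title = "ggplot2")
print(p)
```

Both produce a weight-vs-mpg scatter plot, but the mental models (a sequence of drawing commands versus a composed plot object) have almost nothing in common, which is exactly the inconsistency complained about above.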

There's another problem with R, one that I haven't seen discussed much, although I haven't looked lately. It's not a language. It's an environment equipped with a language. The R environment with the R language. The latter does not exist without the former.

Profile photo for Shivanshu Mishra

I prefer coding in R because I find it:

  1. Intuitive
  2. Suitable for statistical analysis (sampling distributions, the Central Limit Theorem, hypothesis testing, types of errors, ANOVA, chi-square, t-tests)
  3. Easy for data visualization (the ggplot2 package)
  4. Easy for data manipulation (the dplyr package)
  5. Convenient for converting data to tidy format with the tidyr package.
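
As a quick illustration of point 2, the classical tests are one-liners in base R (a generic example on the built-in `mtcars` data, not from any particular course):

```r
# Two-sample t-test: does mpg differ between automatic and manual cars?
# t.test() runs Welch's t-test by default (unequal variances assumed).
tt <- t.test(mpg ~ am, data = mtcars)
print(tt$p.value)

# One-way ANOVA: does mpg vary across cylinder counts?
aov_fit <- aov(mpg ~ factor(cyl), data = mtcars)
summary(aov_fit)

# Chi-square test of independence on a 2x2 contingency table
chisq.test(table(mtcars$am, mtcars$vs))
```

Each of these returns a rich result object (statistic, p-value, confidence interval) that can be inspected or fed into a report.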

It has an amazing set of well-written libraries created by prominent statistical experts, thanks in no small part to Hadley Wickham (creator of the dplyr package).
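
A minimal dplyr/tidyr sketch of points 4 and 5 above (both packages must be installed; `mtcars` and the mpg-to-km/l conversion are used purely for illustration):

```r
library(dplyr)
library(tidyr)

# dplyr: filter rows, derive a column, summarise by group
by_cyl <- mtcars %>%
  filter(mpg > 15) %>%
  mutate(kpl = mpg * 0.425) %>%   # miles per gallon -> km per litre
  group_by(cyl) %>%
  summarise(mean_kpl = mean(kpl), n = n())

# tidyr: reshape the wide summary into long "tidy" key-value form
long <- by_cyl %>%
  pivot_longer(cols = c(mean_kpl, n),
               names_to = "metric", values_to = "value")
print(long)
```

The pipe operator (`%>%`) is what makes these chains read top-to-bottom like a recipe, which is a large part of why the tidyverse style feels intuitive.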

You can learn R programming from Learn R, Python & Data Science Online | DataCamp, and for further study, consider reading this book.
