M. Iqbal Jeelani
Over the past decade data science has exploded, and it shows no sign of slowing down. Billions of people around the globe generate new data every day, data scientists are needed to make sense of it, and for many of them R is the language of first choice. Thanks to its flexibility and open-source nature, it has become a primary tool for statistical analysis and data visualization. At Bell Laboratories in 1976, John Chambers and his colleagues began to develop a programming language called S, which made it possible to program with data sets.
Building on this, Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, began working in the early 1990s on an open-source implementation of a language similar to S, and in 1995 it was released as free software. Since 1997 it has been maintained by the R Core Team, and its first stable version, 1.0.0, was released in 2000. R is also served by a popular integrated development environment, RStudio, first released on 28 February 2011 by a company founded by J.J. Allaire, an American software engineer. R is a favourite among statisticians and data analysts because of its interactive nature. Data analysis is basically an interactive process: what you see at one stage determines the next step you take. Naturally, when you analyse your data, you want to represent it in a natural form.
For that, you need data structures, and R offers a rich set of them, including vectors, matrices, lists and data frames. R is a beast when it comes to data visualization and has state-of-the-art graphical capabilities. One of its remarkable visualization libraries, ggplot2, created by Hadley Wickham in 2005, produces elegant and eye-catching graphs. R also supports object-oriented programming, which gives the user full control over their actions. The free and open-source nature of R is probably the most important reason why statisticians and researchers across the globe prefer it. R incorporates all the standard statistical procedures, models and analyses, and provides tools for managing and manipulating data sets. R also enjoys massive community support: thousands of people across the globe contribute by developing packages for data analysis and visualization.
R libraries such as metan, variability, agricolae, agrostab and biotools have played a significant role in the analysis of agricultural research data and are among the most preferred R libraries for statisticians working in agricultural and allied sciences. As far as career prospects in R programming are concerned, there is huge demand for R programmers across the world. In the past few years data science has gained remarkable traction, and big data and machine learning algorithms have become highly relevant, indeed the need of the hour, for scientific research.
It is worth mentioning that a main reason for R's popularity among data professionals is that the R community ensures R does not become outdated: its members keep adding and updating functionality through new libraries. Finally, R is the world's most widely used statistical programming language among statisticians, and thanks to its versatility and open-source nature it is used in big multinational companies as well as in scientific research. As B.K. Hackenberger rightly said in his paper in the Croatian Medical Journal in 2020, “R is the most unfriendly, but probably the best software”.
(The author is an Assistant Professor at SKUAST-Jammu.)