Favorite R Packages
Data Import
- data.table – fread() for fast table reading.
- readr – A fast and friendly way to read tabular data into R.
- readxl – Reads Microsoft Excel spreadsheets.
- openxlsx – Reads, writes and edits Microsoft Excel (.xlsx) files.
- haven – Reads SPSS, Stata and SAS files in R.
- httr – Reads data from web APIs.
- rvest – Scrapes data from web pages.
- jsonlite – A robust and quick way to parse JSON files in R.
- xml2 – Reads HTML and XML data.
- webreadr – Reads common web log formats.
- feather – A fast, lightweight file format used by both R and Python.
- googlesheets – Reads Google spreadsheets.
- rdrop2 – Dropbox interface from R.
- git2r – Tools to access git repositories.
- DBI – A universal interface to database management systems (DBMS).
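As a rough illustration of the import helpers listed above, the sketch below parses small inline inputs with fread() from data.table and fromJSON() from jsonlite. The sample data is made up for the example, and both packages are assumed to be installed.

```r
library(data.table)
library(jsonlite)

# Parse CSV lines held in memory; fread() also accepts a file path or URL.
csv_dt <- fread(text = c("id,score", "1,9.5", "2,7.25"))

# Parse a JSON array of objects; fromJSON() simplifies it to a data frame.
json_df <- fromJSON('[{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]')

print(csv_dt)
print(json_df)
```

For real files, the same calls would typically take a path instead of inline text.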
Data Manipulation
- data.table – Fast data manipulation in a short and flexible syntax.
- dplyr – Fast data frames manipulation and database query.
- dbplyr – Database backend for dplyr.
- tidyr – Easily tidy data by pivoting between long and wide formats with pivot_longer() and pivot_wider(), which supersede gather() and spread().
- lubridate – A set of functions to work with dates and times.
- stringr – Consistent API for string processing, built on top of stringi.
- stringi – Fast string processing facilities.
- fuzzyjoin – Join tables together on inexact matching.
- broom – Convert statistical analysis objects into tidy data frames.
- vtreat – Tools for pre-processing variables for predictive modeling.
- magrittr – A concise syntax for calling sequences of functions.
- widyr – Widen, process, then re-tidy data.
- tibble – Efficient display structure for tabular data.
- Matrix – LAPACK methods for dense and sparse matrix operations
- queryparser – Translate SQL queries into R expressions.
- sqldf – Perform SQL selects on R data frames.
- tidyquery – Query R data frames with SQL
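To give a feel for the manipulation packages above, here is a minimal sketch that summarises a small data frame with dplyr and reshapes it with tidyr. The `sales` table and its column names are invented for illustration, and both packages are assumed to be installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical example data: revenue per region and quarter.
sales <- data.frame(
  region  = c("north", "north", "south", "south"),
  quarter = c("q1", "q2", "q1", "q2"),
  revenue = c(10, 20, 5, 15)
)

# Total revenue per region, written as a pipeline with the magrittr pipe.
totals <- sales %>%
  group_by(region) %>%
  summarise(total = sum(revenue), .groups = "drop")

# Long to wide: one row per region, one column per quarter.
wide <- pivot_wider(sales, names_from = quarter, values_from = revenue)

print(totals)
print(wide)
```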
Exploratory Data Analysis
- DataExplorer – Simplifies and automates EDA process and report generation.
- inspectdf – Tools for exploring and comparing data frames.
- visdat – Preliminary visualisation of data.
- dataReporter – Reproducible data screening checks and report of possible errors.
- correlationfunnel – Exploratory data analysis using a correlation funnel.
- skimr – Compact and flexible summaries of data.
- naniar – Data structures, summaries, and visualisations for missing data.
Data Visualisation
- ggplot2 – An implementation of the Grammar of Graphics.
- ggplot2 Extensions – Showcases of ggplot2 extensions.
- ggrepel – Prevent plot labels from overlapping.
- ggtext – Implements rich-text (basic HTML and Markdown) rendering for the grid graphics engine.
- ggraph – Graphs, networks, trees and more.
- gganimate – Create easy animations with ggplot2.
- patchwork – Combine separate ggplots into the same graphic.
- cowplot – Streamlined plot theme and plot annotations for ggplot2.
- gghighlight – Highlight points and lines in ggplot2.
- ggforce – Provides a repository of additional geoms, stats, etc.
- ggfortify – A unified interface for plotting results from popular statistical packages with ggplot2, using autoplot.
- GGally – Extensions of ggplot2 such as ggcoef, ggpairs and ggsurv.
- ggmap – Maps with Google Maps, OpenStreetMap, etc.
- tmap – Thematic maps, i.e. geographical maps in which spatial data distributions are visualized.
- corrplot – A graphical display of a correlation matrix or general matrix. It also contains some algorithms to do matrix reordering.
- waffle – Make waffle (square pie) charts in R.
- r2d3 – R Interface to D3 Visualisations.
- rgl – Interactive 3D plots.
- coefplot – Plotting model coefficients.
- Color palettes:
- paletteer – Collection of most color palettes in a single R package.
- colorspace – A Toolbox for Manipulating and Assessing Colors and Palettes.
- viridis – Matplotlib viridis color palette for R.
- munsell – Munsell color palettes for R.
- RColorBrewer – ColorBrewer palettes.
- dichromat – Color-blind friendly palettes.
- Rtistic – Cookbook to help build an R package with custom RMarkdown themes and ggplot2 themes & palettes.
- ggplot2 themes:
- ggthemes – Extra themes, scales and geoms for ggplot2.
- tvthemes – TV show themes and color palettes for ggplot2
- ggtech – ggplot2 tech themes, scales, and geoms (Airbnb).
- ggthemr – Themes for ggplot2.
- hrbrthemes – Opinionated, typographic-centric ggplot2 themes and theme components.
HTML Widgets
- htmlwidgets – Framework for creating JavaScript widgets with R.
- crosstalk – Implements cross-widget interactions (such as linked brushing and filtering).
- DT – Displays R matrices or data frames as interactive HTML tables.
- leaflet – JavaScript library for interactive maps.
- plotly – Interactive ggplot2 and Shiny plotting with plot.ly.
- rbokeh – Interactive Bokeh plots.
- highcharter – Interactive Highcharts plots.
- visNetwork – Interactive network graphs.
- networkD3 – Interactive d3 network graphs.
- d3heatmap – Interactive d3 heatmaps.
- threejs – Interactive 3d plots and globes.
- rglwidget – Interactive 3d plot.
- DiagrammeR – Interactive diagrams.
- MetricsGraphics – Interactive MetricsGraphics plots.
- rCharts – Many interactive JavaScript visualizations.
- dygraphs – Charting time-series data in R.
- wordcloud2 – R interface to wordcloud2.js.
- rpivotTable – R htmlwidget visualization library built around the Javascript pivottable library.
- htmltools – Tools for HTML generation and output.
Data Profiling
- skimr – Compact and flexible summaries of data; a frictionless, pipeable approach to summary statistics.
- dlookr – Tools for data diagnosis, exploration and transformation.
- assertr – Suite of functions designed to verify assumptions about data early in an analysis pipeline so that data errors are spotted early and can be addressed quickly.
- visdat – Preliminary exploratory visualisation of data.
- daff – Diff, patch and merge for data.frames.
R Programming
- purrr – A functional programming toolkit for R.
- glue – Glue strings to data in R. Small, fast, dependency free interpreted string literals.
- formatR – tidy_source() to format R source code.
- rstudioapi – Safely access RStudio IDE’s API.
- profvis – Interactive Visualizations for Profiling R Code.
- Rcpp – C++ API for R
- RcppArmadillo – interface to ‘Armadillo’ Templated Linear Algebra Library.
- R6 – Fast, simple object class that uses reference semantics.
- RStudio Addins – List of RStudio addins.
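The "reference semantics" mentioned for R6 can be sketched as follows. The `Counter` class is a hypothetical example, not something from the R6 package itself, and R6 is assumed to be installed.

```r
library(R6)

# Hypothetical class with one field and one method.
Counter <- R6Class("Counter",
  public = list(
    count = 0,
    add = function(n = 1) {
      self$count <- self$count + n
      invisible(self)  # return the object so calls can be chained
    }
  )
)

a <- Counter$new()
a$add()$add(2)   # count is now 3

b <- a           # no copy is made: both names refer to the same object
b$add(10)        # modifying through b is visible through a

print(a$count)
```

This differs from ordinary R objects such as lists or data frames, where assignment followed by modification leaves the original unchanged.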
Package Development
- devtools – Tools to make an R developer’s life easier.
- usethis – Workflow package that automates repetitive tasks that arise during project setup and development, both for R packages and non-package projects.
- testthat – Easy-to-use system for unit testing packages.
- xpectr – Generates expectations for testthat unit testing.
- roxygen2 – Easy-to-use method for documenting packages.
- R6 – Simpler, faster, lighter-weight alternative to R’s built-in classes.
- assertr – Suite of functions designed to verify assumptions about data early in an analysis pipeline so that data errors are spotted early and can be addressed quickly.
Communication and Reproducible Research
- rmarkdown – Dynamic documents for R.
- flexdashboard – Easy-to-create interactive dashboards based on rmarkdown.
- xaringan – Create HTML5 slides with R Markdown and the JavaScript library remark.js.
- rticles – Ready to use R Markdown templates.
- tufte – Tufte handout R Markdown template.
- blogdown – Create blogs and websites with R Markdown.
- bookdown – Authoring books and long documents with R Markdown.
- thesisdown – An updated R Markdown thesis template using the bookdown package.
- rstudio4edu – Handbook filled with practical advice and resources (and templates!) for educators who teach with R and RStudio.
- rmd4edu – R Markdown templates, designed for educators.
- MonashEBSTemplates – Rmarkdown templates for use at Monash University Department of Econometrics and Business Statistics.
- sorensonimpact – R Markdown template example from Sorenson Impact.
- kableExtra – Build common complex HTML tables and manipulate table styles.
- xtable – Export Tables to LaTeX or HTML.
- packrat – Creates project specific libraries, which handle package versioning and enhance reproducibility.
- renv – Create reproducible environments for R projects.
- checkpoint – Install Packages from Snapshots on the Checkpoint Server for Reproducibility.
- installr – Functions for installing software from within R (for Windows).
- starters – One-liners to set up R Projects for packages, projects or training.
- ProjectTemplate – Automates the creation of new statistical analysis projects.
- projmgr – Task tracking and project management with GitHub.
Cloud Computing & Automation
- Rocker – R configurations for Docker.
- AzureR – Family of packages for working with Azure from R.
- ghactions – GitHub Actions for R.
- muggle – Opinionated Devops for R Data Products Strictly Without Magic.
- stevedore – Docker client for R.
Shiny
- shiny – Easy interactive web applications with R.
- shinyjs – Easily improve the user interaction and user experience in your Shiny apps in seconds.
- shinydashboard – Interactive dashboards with R.
- shinydashboardPlus – Extensions for shinydashboard.
- shinybulma – Bulma.io for Shiny.
- shinythemes – Style themes for Shiny apps.
- fresh – Creates fresh themes for use in shiny, shinydashboard and bs4Dash applications and flexdashboard documents.
- miniUI – UI elements for Shiny gadgets, interactive apps integrated into the R commandline workflow.
- rsconnect – Deploys Shiny apps to shinyapps.io.
- shinyapps.io – Hosting service for Shiny apps.
- Shiny Server Open Source – Open source server to host Shiny apps.
- ShinyProxy – Deploy Shiny apps in an enterprise context.
- promises – Abstractions for promise-based asynchronous programming.
Web Technologies and Services
- httr – User-friendly RCurl wrapper.
- rvest – Simple web scraping for R, using CSS selectors or XPath syntax.
- plumber – A library to expose existing R code as web API.
Parallel Computing
- parallel – Included with R since release 2.14.0; incorporates (slightly revised) copies of the multicore and snow packages.
- foreach – Looping construct whose iterations can be executed in parallel.
- future – A minimal, efficient, cross-platform unified Future API for parallel and distributed processing in R; designed for beginners as well as advanced developers.
- sparklyr – R interface for Apache Spark from RStudio.
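As a small sketch of the parallel package described above, the example below spreads a toy computation over two worker processes. The cluster size of 2 and the squaring function are arbitrary choices for illustration; the package ships with R, so nothing extra needs installing.

```r
library(parallel)

# Start two worker processes (works on Windows, macOS and Linux).
cl <- makeCluster(2)

# Apply the function to each element, distributing the work across workers.
squares <- parLapply(cl, 1:4, function(x) x^2)

# Always release the workers when done.
stopCluster(cl)

print(unlist(squares))
```

For work this small the overhead of starting workers outweighs any gain; the pattern pays off when each call is expensive.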
Statistical Modelling and Machine Learning
tidymodels – Meta-package for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.
- broom – Takes the messy output of built-in functions in R, such as lm, nls, or t.test, and turns them into tidy data frames.
- infer – Modern approach to statistical inference.
- recipes – General data preprocessor with a modern interface. It can create model matrices that incorporate feature engineering, imputation, and other help tools.
- rsample – Infrastructure for resampling data so that models can be assessed and empirically validated.
- yardstick – Tools for evaluating models (e.g. accuracy, RMSE, etc.)
- tidypredict – Translates some model prediction equations to SQL for high-performance computing.
- tidyposterior – Compare models using resampling and Bayesian analysis.
- tidytext – Tidy tools for quantitative text analysis, including basic text summarization, sentiment analysis, and text modeling.
parsnip – A tidy unified interface to models.
caret – Classification and Regression Training.
keras – R interface to Keras.
e1071 – Misc Functions of the Department of Statistics (e1071), TU Wien.
earth – Multivariate Adaptive Regression Spline Models.
evtree – Evolutionary Learning of Globally Optimal Trees.
gamlss – Generalised Additive Models for Location Scale and Shape.
gbm – Generalized Boosted Regression Models.
glmnet – Lasso and elastic-net regularized generalized linear models.
h2o – Scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms.
L0Learn – Fast algorithms for best subset selection.
lme4 – Mixed-effects models using Eigen C++ library.
Metrics – Evaluation Metrics for Machine Learning.
mgcv – Mixed GAM Computation Vehicle with Automatic Smoothness Estimation.
mlr – Extensible framework for classification, regression, survival analysis and clustering.
MXNet – MXNet brings flexible and efficient GPU computing and state-of-art deep learning to R.
nlme – Linear and Nonlinear Mixed Effects Models.
pROC – Tools for visualizing, smoothing and comparing ROC curves.
ROCR – Plots to visualize classifier performance.
rpart – Recursive Partitioning and Regression Trees.
- tidyrules – Converts textual rules from C5, cubist and rpart models.
randomForest – Breiman and Cutler’s random forests for classification and regression.
ranger – A Fast Implementation of Random Forests.
smurf – Sparse Multi-Type Regularized Feature Modeling.
SuperLearner and subsemble – Multi-algorithm ensemble learning packages.
xgboost – eXtreme Gradient Boosting Tree model, well known for its speed and performance.
marginaleffects – Compute and plot adjusted predictions, contrasts, marginal effects, and marginal means.
modelsummary – Summary Tables and Plots for Statistical Models and Data.
easystats – Easy Statistical Modeling, Visualization, and Reporting.
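The broom entry above describes turning model output into tidy data frames. A minimal sketch, assuming broom is installed and using the built-in mtcars data set with an arbitrary model formula:

```r
library(broom)

# Fit an ordinary linear model with base R.
fit <- lm(mpg ~ wt, data = mtcars)

# tidy() returns one row per coefficient, with the estimate,
# standard error, test statistic and p-value as columns.
coefs <- tidy(fit)

print(coefs)
```

The result is a regular data frame, so it can be filtered, joined or plotted like any other table.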
Time Series
- tidyverts – Tidy tools for time series.
- forecast – Time series forecasting using ARIMA, ETS, STLM, TBATS, and neural network models.
- forecastHybrid – Automatic ensemble and cross validation of ARIMA, ETS, STLM, TBATS, and neural network models from the “forecast” package
- zoo – Data structures for time series data.
- xts – Extensible time series class that provides uniform handling of many R time series classes by extending zoo.
- modeltime – The time series forecasting package for the tidymodels ecosystem.
- anomalize – Tidy Anomaly Detection using Twitter’s AnomalyDetection method.
- AnomalyDetection – AnomalyDetection R package from Twitter.
- BreakoutDetection – Breakout Detection via Robust E-Statistics from Twitter.
- CausalImpact – Causal inference using Bayesian structural time-series models.
- prophet – Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
Text Mining and Natural Language Processing
- tidytext – Applies Hadley Wickham’s tidy data principles to text mining.
- text2vec – Fast and memory-friendly tools for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), similarities.
- NLP Ecosystem – An overview of the NLP ecosystem in R by Jan Wijffels from Bnosac.
- tm – A comprehensive text mining framework for R.
- NLP – Basic functions for Natural Language Processing.
- topicmodels – Topic modeling interface to the C code developed by David M. Blei for Topic Modeling (Latent Dirichlet Allocation (LDA), and Correlated Topics Models (CTM)).
- stm – Structural topic models that allow for document-level metadata predictors.
- quanteda – R functions for Quantitative Analysis of Textual Data.
- readtext – Import and handling for plain and formatted text files and their metadata (.txt, .docx, .pdf, .json, etc).
- pdftools – Text extraction, rendering and converting of PDF documents.
- tesseract – Bindings to Tesseract, a powerful optical character recognition (OCR) engine that supports over 100 languages.
Spatial Data
- leaflet – One of the most popular JavaScript libraries for interactive maps.
- ggmap – Plotting maps in R with ggplot2.
- tmap – Thematic maps, i.e. geographical maps in which spatial data distributions are visualized.
- sf – Simple Features for R, improved Classes and Methods for Spatial Data.
- sp – Classes and Methods for Spatial Data.
- rgeos – Interface to Geometry Engine – Open Source.
- rgdal – Bindings for the Geospatial Data Abstraction Library.
- maptools – Tools for Reading and Handling Spatial Objects.
Actuarial Science
- actuar – An R package for actuarial science.
- ReIns – Actuarial and statistical aspects of reinsurance, including extreme value theory and the splicing of mixed Erlang (ME) distribution with EVT distributions.
Bayesian
- rjags – R interface to the JAGS MCMC library.
- rstan – R interface to the Stan MCMC software.
- brms – Bayesian Regression Models using Stan.
Mixed modelling
- lme4 – Linear Mixed-Effects Models using Eigen and S4
- nlme – Linear and Nonlinear Mixed Effects Models.
Optimization
- lpSolve – Interface to Lp_solve to Solve Linear/Integer Programs.
- minqa – Derivative-free optimization algorithms by quadratic approximation.
- nloptr – NLopt is a free/open-source library for nonlinear optimization.
- ompr – Model mixed integer linear programs in an algebraic way directly in R.
- Rglpk – R/GNU Linear Programming Kit Interface
- ROI – The R Optimization Infrastructure (‘ROI’) is a sophisticated framework for handling optimization problems in R.
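To illustrate the kind of problem these optimization packages address, here is a small linear program solved with lpSolve. The objective and constraints are made up for the example (maximise 3x + 2y subject to x + y <= 4 and x <= 2, with x, y >= 0), and lpSolve is assumed to be installed.

```r
library(lpSolve)

# Objective coefficients for the two decision variables x and y.
objective <- c(3, 2)

# One row per constraint: x + y <= 4 and x <= 2.
constraints <- matrix(c(1, 1,
                        1, 0), nrow = 2, byrow = TRUE)
directions <- c("<=", "<=")
rhs <- c(4, 2)

# lp() assumes all variables are non-negative by default.
result <- lp("max", objective, constraints, directions, rhs)

print(result$solution)  # optimal values of x and y
print(result$objval)    # optimal objective value
```

A status code of 0 in `result$status` indicates that the solver found an optimal solution.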
Finance
- quantmod – Quantitative Financial Modelling & Trading Framework for R.
- PerformanceAnalytics – Econometric tools for performance and risk analysis.
- zoo – S3 Infrastructure for Regular and Irregular Time Series.
- xts – eXtensible Time Series.
- tseries – Time series analysis and computational finance.
- fAssets – Analysing and Modelling Financial Assets.
Bioinformatics and Biostatistics
- Bioconductor – Tools for the analysis and comprehension of high-throughput genomic data.
- genetics – Classes and methods for handling genetic data.
- gap – An integrated package for genetic data analysis of both population and family data.
- ape – Analyses of Phylogenetics and Evolution.
- pheatmap – Pretty heatmaps made easy.
Network Analysis
- Network Analysis List – Network Analysis related resources.
- igraph – A collection of network analysis tools.
- ggraph – Graphs, networks, trees and more.
- tidygraph – A tidy API for graph manipulation