Favorite R Packages

R Package Hexagon Stickers

Data Import

  • data.table – fread() for fast table reading.
  • readr – A fast and friendly way to read tabular data into R.
  • readxl – Reads Microsoft Excel spreadsheets.
  • openxlsx – Reads, writes and edits Microsoft Excel (.xlsx) files.
  • haven – Reads SPSS, Stata and SAS files in R.
  • httr – Reads data from web APIs.
  • rvest – Scrapes data from web pages.
  • jsonlite – A robust and quick way to parse JSON files in R.
  • xml2 – Reads HTML and XML data.
  • webreadr – Reads common web log formats.
  • feather – A fast, lightweight file format used by both R and Python.
  • googlesheets – Reads Google spreadsheets.
  • rdrop2 – Dropbox interface from R.
  • git2r – Tools to access git repositories.
  • DBI – A universal interface to database management systems (DBMS).
    • RODBC – Connect to ODBC databases. It uses its own API rather than DBI; the odbc package is the DBI-compliant ODBC driver.
    • RMySQL – MySQL driver for DBI.
    • RPostgres – PostgreSQL driver for DBI.
    • RSQLite – SQLite driver for DBI.
    • bigrquery – Google BigQuery driver for DBI.

Data Manipulation

  • data.table – Fast data manipulation in a short and flexible syntax.
  • dplyr – Fast data frame manipulation and database queries.
  • dbplyr – Database backend for dplyr.
  • tidyr – Easily tidy data with pivot_longer() and pivot_wider(), which supersede gather() and spread().
  • lubridate – A set of functions to work with dates and times.
  • stringr – Consistent API for string processing, built on top of stringi.
  • stringi – Fast string processing facilities.
  • fuzzyjoin – Join tables together on inexact matching.
  • broom – Convert statistical analysis objects into tidy data frames.
  • vtreat – Tools for pre-processing variables for predictive modeling.
  • magrittr – A concise syntax for calling sequences of functions.
  • widyr – Widen, process, then re-tidy data.
  • tibble – Efficient display structure for tabular data.
  • Matrix – LAPACK methods for dense and sparse matrix operations.
  • queryparser – Translate SQL queries into R expressions.
  • sqldf – Perform SQL selects on R data frames.
  • tidyquery – Query R data frames with SQL.

Exploratory Data Analysis

  • DataExplorer – Simplifies and automates EDA process and report generation.
  • inspectdf – Tools for exploring and comparing data frames.
  • visdat – Preliminary visualisation of data.
  • dataReporter – Reproducible data screening checks and report of possible errors.
  • correlationfunnel – Exploratory data analysis using a correlation funnel.
  • skimr – Compact and flexible summaries of data.
  • naniar – Data structures, summaries, and visualisations for missing data.

Data Visualisation

  • ggplot2 – An implementation of the Grammar of Graphics.
  • ggplot2 Extensions – Showcases of ggplot2 extensions.
    • ggrepel – Prevent plot labels from overlapping.
    • ggtext – Implements rich-text (basic HTML and Markdown) rendering for the grid graphics engine.
    • ggraph – Graphs, networks, trees and more.
    • gganimate – Create easy animations with ggplot2.
    • patchwork – Combine separate ggplots into the same graphic.
    • cowplot – Streamlined plot theme and plot annotations for ggplot2.
    • gghighlight – Highlight points and lines in ggplot2.
    • ggforce – Provides a repository of additional geoms, stats, etc.
    • ggfortify – A unified interface for plotting the output of popular statistical packages with ggplot2, using autoplot().
    • GGally – Extensions of ggplot2 such as ggcoef, ggpairs and ggsurv.
    • ggmap – Maps with Google Maps, OpenStreetMap, etc.
  • tmap – Thematic maps, i.e. geographical maps in which spatial data distributions are visualized.
  • corrplot – A graphical display of a correlation matrix or general matrix. It also contains some algorithms to do matrix reordering.
  • waffle – Make waffle (square pie) charts in R.
  • r2d3 – R Interface to D3 Visualisations.
  • rgl – Interactive 3D plots.
  • coefplot – Plotting model coefficients.
  • Color palettes:
    • paletteer – Collection of most color palettes in a single R package.
    • colorspace – A Toolbox for Manipulating and Assessing Colors and Palettes.
    • viridis – Matplotlib viridis color palette for R.
    • munsell – Munsell color palettes for R.
    • RColorBrewer – ColorBrewer palettes.
    • dichromat – Color-blind friendly palettes.
    • Rtistic – Cookbook to help build an R package with custom RMarkdown themes and ggplot2 themes & palettes.
  • ggplot2 themes:
    • ggthemes – Extra themes, scales and geoms for ggplot2.
    • tvthemes – TV show themes and color palettes for ggplot2
    • ggtech – ggplot2 tech themes, scales, and geoms (Airbnb).
    • ggthemr – Themes for ggplot2.
    • hrbrthemes – Opinionated, typographic-centric ggplot2 themes and theme components.

HTML Widgets

  • htmlwidgets – Framework for creating JavaScript widgets with R.
  • crosstalk – Implements cross-widget interactions (such as linked brushing and filtering).
  • DT – Displays R matrices or data frames as interactive HTML tables.
  • leaflet – JavaScript library for interactive maps.
  • plotly – Interactive ggplot2 and Shiny plotting with plot.ly.
  • rbokeh – Interactive Bokeh plots.
  • highcharter – Interactive Highcharts plots.
  • visNetwork – Interactive network graphs.
  • networkD3 – Interactive D3 network graphs.
  • d3heatmap – Interactive d3 heatmaps.
  • threejs – Interactive 3d plots and globes.
  • rglwidget – Interactive 3d plot.
  • DiagrammeR – Interactive diagrams.
  • MetricsGraphics – Interactive MetricsGraphics plots.
  • rCharts – Many interactive JavaScript visualizations.
  • dygraphs – Charting time-series data in R.
  • wordcloud2 – R interface to wordcloud2.js.
  • rpivotTable – R htmlwidget visualization library built around the JavaScript pivottable library.
  • htmltools – Tools for HTML generation and output.

Data Profiling

  • skimr – Compact and flexible summaries of data: a frictionless, pipeable approach to summary statistics.
  • dlookr – Tools for data diagnosis, exploration and transformation.
  • assertr – Suite of functions designed to verify assumptions about data early in an analysis pipeline so that data errors are spotted early and can be addressed quickly.
  • visdat – Preliminary exploratory visualisation of data.
  • daff – Diff, patch and merge for data.frames.

R Programming

  • purrr – A functional programming toolkit for R.
  • glue – Glue strings to data in R. Small, fast, dependency free interpreted string literals.
  • formatR – tidy_source() to format R source code.
  • rstudioapi – Safely access RStudio IDE’s API.
  • profvis – Interactive Visualizations for Profiling R Code.
  • Rcpp – C++ API for R.
  • RcppArmadillo – Interface to the ‘Armadillo’ templated linear algebra library.
  • R6 – Fast, simple object class that uses reference semantics.
  • RStudio Addins – List of RStudio addins.

Package Development

  • devtools – Tools to make an R developer’s life easier.
  • usethis – Workflow package that automates repetitive tasks arising during project setup and development, both for R packages and non-package projects.
  • testthat – Easy-to-use system for unit testing packages.
  • xpectr – Generates expectations for testthat unit testing.
  • roxygen2 – Easy-to-use method for documenting packages.
  • R6 – Simpler, faster, lighter-weight alternative to R’s built-in classes.
  • assertr – Suite of functions designed to verify assumptions about data early in an analysis pipeline so that data errors are spotted early and can be addressed quickly.

Communication and Reproducible Research

  • rmarkdown – Dynamic documents for R.
  • flexdashboard – Easy-to-create interactive dashboards based on rmarkdown.
  • xaringan – Create HTML5 slides with R Markdown and the JavaScript library remark.js.
  • rticles – Ready to use R Markdown templates.
  • tufte – Tufte handout R Markdown template.
  • blogdown – Create blogs and websites with R Markdown.
  • bookdown – Authoring books and long documents with R Markdown.
  • thesisdown – An updated R Markdown thesis template using the bookdown package.
  • rstudio4edu – Handbook filled with practical advice and resources (and templates!) for educators who teach with R and RStudio.
  • rmd4edu – R Markdown templates, designed for educators.
  • MonashEBSTemplates – Rmarkdown templates for use at Monash University Department of Econometrics and Business Statistics.
  • sorensonimpact – R Markdown template example from Sorenson Impact.
  • kableExtra – Build common complex HTML tables and manipulate table styles.
  • xtable – Export Tables to LaTeX or HTML.
  • packrat – Creates project specific libraries, which handle package versioning and enhance reproducibility.
  • renv – Create reproducible environments for R projects.
  • checkpoint – Install Packages from Snapshots on the Checkpoint Server for Reproducibility.
  • installr – Functions for installing software from within R (for Windows).
  • starters – One-liners to set up R Projects for packages, projects or training.
  • ProjectTemplate – Automates the creation of new statistical analysis projects.
  • projmgr – Task tracking and project management with GitHub.

Cloud Computing & Automation

  • Rocker – R configurations for Docker.
  • AzureR – Family of packages for working with Azure from R.
  • ghactions – GitHub Actions for R.
  • muggle – Opinionated Devops for R Data Products Strictly Without Magic.
  • stevedore – Docker client for R.

Shiny

  • shiny – Easy interactive web applications with R.
  • shinyjs – Easily improve the user interaction and user experience in your Shiny apps in seconds.
  • shinydashboard – Interactive dashboards with R.
  • shinydashboardPlus – Extensions for shinydashboard.
  • shinybulma – Bulma.io for Shiny.
  • shinythemes – Style themes for Shiny apps.
  • fresh – Creates fresh themes for use in shiny, shinydashboard and bs4Dash applications and flexdashboard documents.
  • miniUI – UI elements for Shiny gadgets, which are interactive apps integrated into the R command-line workflow.
  • rsconnect – Deploys Shiny apps to shinyapps.io.
  • shinyapps.io – Hosting service for Shiny apps.
  • Shiny Server Open Source – Open source server to host Shiny apps.
  • ShinyProxy – Deploy Shiny apps in an enterprise context.
  • promises – Abstractions for promise-based asynchronous programming.

Web Technologies and Services

  • httr – User-friendly RCurl wrapper.
  • rvest – Simple web scraping for R, using CSS selectors or XPath syntax.
  • plumber – A library to expose existing R code as a web API.

Parallel Computing

  • parallel – Part of base R since release 2.14.0; incorporates (slightly revised) copies of the multicore and snow packages.
  • foreach – Looping construct whose iterations can be executed in parallel.
  • future – A minimal, efficient, cross-platform unified Future API for parallel and distributed processing in R; designed for beginners as well as advanced developers.
  • sparklyr – R interface for Apache Spark from RStudio.
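
As a minimal sketch of what the parallel package offers, the snippet below applies a function across a small local cluster. It uses only functions shipped with base R (makeCluster, parLapply, stopCluster); the worker count of 2 and the toy squaring function are arbitrary choices for illustration, not recommendations.

```r
library(parallel)  # ships with base R, no installation needed

# Start a local cluster; two workers is an arbitrary choice for this sketch.
cl <- makeCluster(2)

# parLapply() mirrors lapply(), but spreads the calls across the workers.
squares <- parLapply(cl, 1:4, function(x) x^2)

# Release the worker processes once the work is done.
stopCluster(cl)

print(unlist(squares))
```

For real workloads, detectCores() can inform the number of workers, and the per-task cost should be large enough to outweigh the overhead of sending data to the workers.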

Statistical Modelling and Machine Learning

  • tidymodels – Meta-package for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.

    • broom – Takes the messy output of built-in functions in R, such as lm, nls, or t.test, and turns them into tidy data frames.
    • infer – Modern approach to statistical inference.
    • recipes – General data preprocessor with a modern interface. It can create model matrices that incorporate feature engineering, imputation, and other help tools.
    • rsample – Infrastructure for resampling data so that models can be assessed and empirically validated.
    • yardstick – Tools for evaluating models (e.g. accuracy, RMSE).
    • tidypredict – Translates some model prediction equations to SQL for high-performance computing.
    • tidyposterior – Compare models using resampling and Bayesian analysis.
    • tidytext – Tidy tools for quantitative text analysis, including basic text summarization, sentiment analysis, and text modeling.
  • parsnip – A tidy unified interface to models.

  • caret – Classification and Regression Training.

  • keras – R interface to Keras.

  • e1071 – Misc Functions of the Department of Statistics (e1071), TU Wien.

  • earth – Multivariate Adaptive Regression Spline Models.

  • evtree – Evolutionary Learning of Globally Optimal Trees.

  • gamlss – Generalised Additive Models for Location Scale and Shape.

  • gbm – Generalized Boosted Regression Models.

  • glmnet – Lasso and elastic-net regularized generalized linear models.

  • h2o – Scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms.

  • L0Learn – Fast algorithms for best subset selection.

  • lme4 – Mixed-effects models using Eigen C++ library.

  • Metrics – Evaluation Metrics for Machine Learning.

  • mgcv – Mixed GAM Computation Vehicle with Automatic Smoothness Estimation.

  • mlr – Extensible framework for classification, regression, survival analysis and clustering.

  • MXNet – MXNet brings flexible and efficient GPU computing and state-of-the-art deep learning to R.

  • nlme – Linear and Nonlinear Mixed Effects Models.

  • pROC – Tools for visualizing, smoothing and comparing ROC curves.

  • ROCR – Plots to visualize classifier performance.

  • rpart – Recursive Partitioning and Regression Trees.

    • tidyrules – Converts textual rules from C5, cubist and rpart models.
  • randomForest – Breiman and Cutler’s random forests for classification and regression.

  • ranger – A Fast Implementation of Random Forests.

  • smurf – Sparse Multi-Type Regularized Feature Modeling.

  • SuperLearner and subsemble – Multi-algorithm ensemble learning packages.

  • xgboost – eXtreme Gradient Boosting Tree model, well known for its speed and performance.

  • marginaleffects – Compute and plot adjusted predictions, contrasts, marginal effects, and marginal means.

  • modelsummary – Summary Tables and Plots for Statistical Models and Data.

  • easystats – Easy Statistical Modeling, Visualization, and Reporting.

Time Series

  • tidyverts – Tidy tools for time series.
    • tsibble – Temporal data frames and tools.
    • fable – Tidy forecasting.
    • feast – Feature extraction and statistics.
  • forecast – Time series forecasting using ARIMA, ETS, STLM, TBATS, and neural network models.
  • forecastHybrid – Automatic ensemble and cross validation of ARIMA, ETS, STLM, TBATS, and neural network models from the “forecast” package.
  • zoo – Data structures for time series data.
  • xts – Extensible time series class that provides uniform handling of many R time series classes by extending zoo.
  • modeltime – The time series forecasting package for the tidymodels ecosystem.
  • anomalize – Tidy Anomaly Detection using Twitter’s AnomalyDetection method.
  • AnomalyDetection – AnomalyDetection R package from Twitter.
  • BreakoutDetection – Breakout Detection via Robust E-Statistics from Twitter.
  • CausalImpact – Causal inference using Bayesian structural time-series models.
  • prophet – Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

Text Mining and Natural Language Processing

  • tidytext – Implementing tidy principles of Hadley Wickham to text mining.
  • text2vec – Fast and memory-friendly tools for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), similarities.
  • NLP Ecosystem – An overview of the NLP ecosystem in R by Jan Wijffels from Bnosac.
  • tm – A comprehensive text mining framework for R.
  • NLP – Basic functions for Natural Language Processing.
  • topicmodels – Topic modeling interface to the C code developed by David M. Blei for Latent Dirichlet Allocation (LDA) and Correlated Topic Models (CTM).
  • stm – Structural topic models that allow for document-level metadata predictors.
  • quanteda – R functions for Quantitative Analysis of Textual Data.
  • readtext – Import and handling for plain and formatted text files and their metadata (.txt, .docx, .pdf, .json, etc).
  • pdftools – Text extraction, rendering and converting of PDF documents.
  • tesseract – Bindings to Tesseract, a powerful optical character recognition (OCR) engine that supports over 100 languages.

Spatial Data

  • leaflet – R interface to Leaflet, one of the most popular JavaScript libraries for interactive maps.
  • ggmap – Plotting maps in R with ggplot2.
  • tmap – Thematic maps, i.e. geographical maps in which spatial data distributions are visualized.
  • sf – Simple Features for R, improved Classes and Methods for Spatial Data.
  • sp – Classes and Methods for Spatial Data.
  • rgeos – Interface to Geometry Engine – Open Source.
  • rgdal – Bindings for the Geospatial Data Abstraction Library.
  • maptools – Tools for Reading and Handling Spatial Objects.

Actuarial Science

  • actuar – An R package for actuarial science.
  • ReIns – Actuarial and statistical aspects of reinsurance, including extreme value theory and the splicing of mixed Erlang (ME) distribution with EVT distributions.

Bayesian

  • rjags – R interface to the JAGS MCMC library.
  • rstan – R interface to the Stan MCMC software.
  • brms – Bayesian Regression Models using Stan.

Mixed modelling

  • lme4 – Linear Mixed-Effects Models using Eigen and S4.
  • nlme – Linear and Nonlinear Mixed Effects Models.

Optimization

  • lpSolve – Interface to Lp_solve to Solve Linear/Integer Programs.
  • minqa – Derivative-free optimization algorithms by quadratic approximation.
  • nloptr – R interface to NLopt, a free/open-source library for nonlinear optimization.
  • ompr – Model mixed integer linear programs in an algebraic way directly in R.
  • Rglpk – R/GNU Linear Programming Kit Interface.
  • ROI – The R Optimization Infrastructure (‘ROI’) is a sophisticated framework for handling optimization problems in R.

Finance

  • quantmod – Quantitative Financial Modelling & Trading Framework for R.
  • PerformanceAnalytics – Econometric tools for performance and risk analysis.
  • zoo – S3 Infrastructure for Regular and Irregular Time Series.
  • xts – eXtensible Time Series.
  • tseries – Time series analysis and computational finance.
  • fAssets – Analysing and Modelling Financial Assets.

Bioinformatics and Biostatistics

  • Bioconductor – Tools for the analysis and comprehension of high-throughput genomic data.
  • genetics – Classes and methods for handling genetic data.
  • gap – An integrated package for genetic data analysis of both population and family data.
  • ape – Analyses of Phylogenetics and Evolution.
  • pheatmap – Pretty heatmaps made easy.

Network Analysis
