```r
library(learnr)
library(tidyverse)
library(tidymodels)
library(embed)
library(corrr)
library(tidytext)
library(gradethis)
library(sortable)
library(learntidymodels)

knitr::opts_chunk$set(echo = FALSE, exercise.checker = gradethis::grade_learnr)

zoo_names <- c("animal_name", "hair", "feathers", "eggs", "milk", "airborne",
               "aquatic", "predator", "toothed", "backbone", "breathes",
               "venomous", "fins", "legs", "tail", "domestic", "catsize", "class")

anim_types <- tribble(~class, ~type,
                      1, "mammal",
                      2, "bird",
                      3, "reptile",
                      4, "fish",
                      5, "amphibian",
                      6, "insect",
                      7, "other_arthropods")

zoo <- read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/zoo/zoo.data",
                col_names = zoo_names) %>%
  left_join(anim_types) %>%
  select(-class) %>%
  rename(animal_type = type)

### correlation ###
zoo_corr <- zoo %>%
  select(-animal_name, -animal_type) %>%
  correlate() %>%
  rearrange()

### PCA ###
pca_rec <- recipe(data = zoo, formula = ~.) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_scale(all_predictors()) %>%
  step_center(all_predictors()) %>%
  step_pca(all_predictors(), id = "pca")

pca_prep <- prep(pca_rec)
pca_loading <- tidy(pca_prep, id = "pca")
pca_variances <- tidy(pca_prep, id = "pca", type = "variance")
pca_bake <- bake(pca_prep, zoo)

zoo_rec <- recipe(data = zoo, formula = ~.) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_normalize(all_predictors()) %>%
  step_pca(all_predictors())

zoo_prep <- prep(zoo_rec)
zoo_bake <- bake(zoo_prep, zoo)
zoo_juice <- juice(zoo_prep)

### UMAP ###
set.seed(123)
umap_rec <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_normalize(all_predictors()) %>%
  step_umap(all_predictors())

umap_prep <- prep(umap_rec)
umap_bake <- bake(umap_prep, zoo)
```
Dimension reduction is a commonly used unsupervised method in exploratory data analysis and predictive modeling.
This tutorial will teach you how to apply these methods using the recipes package, which is a part of the tidymodels ecosystem, a collection of modeling packages designed with common APIs and a shared philosophy.
This tutorial focuses on transforming whole groups of predictors together using two different dimension reduction algorithms:

- principal component analysis (PCA)
- uniform manifold approximation and projection (UMAP)
Here, we will apply these methods to explore our data. These methods can also be used for feature extraction prior to modeling.
While we're applying these methods, we will cover:

- `recipe`
- `role`
- `steps`
- `prep`
- `bake` or `juice`
If you are new to tidymodels, you can learn what you need with the five Get Started articles on tidymodels.org.
The second article, Preprocessing your data with recipes, shows how to use functions from the recipes package to pre-process your data prior to model fitting.
If you aren't familiar with the recipes functions yet, reading the Preprocessing your data with recipes article would be helpful before going through this tutorial.
Let's get started!
We will use the `zoo` dataset to explore these methods. `zoo` contains observations collected on `r nrow(zoo)` zoo animals.

To see the first ten rows of the data set, click on Run Code. You can use the black triangle that appears at the top right of the table to scroll through all of the columns in `zoo`.
zoo
Alternatively, use `glimpse()` from the dplyr package to see the columns in a more compact way. You can click on the Solution button to get help.
```r
library(tidyverse)
glimpse(___)
```
glimpse(zoo)
We can see that `zoo` has `r nrow(zoo)` rows and `r ncol(zoo)` columns, two of which (`animal_name` and `animal_type`) are characters.

Let's count the number of animals for each `animal_type`.
zoo %>% count(___)
zoo %>% count(animal_type)
While looking at the numbers helps, plotting is always a good idea to get an overall view of the data, especially if many sub-categories are present.
Plot the number of animals in each `animal_type` category. Fill in the blanks and click on Run Code to generate the plot.
```r
zoo %>%
  ggplot(aes(___)) +
  geom____(fill = "#CA225E") +
  theme_minimal()
```
```r
zoo %>%
  ggplot(aes(animal_type)) +
  geom_bar(fill = "#CA225E") +
  theme_minimal()
```
We can also look at the distribution of animals that lay `eggs` across animal types. The `eggs` column is coded as `0` for animals that don't lay eggs and `1` for animals that do. Let's do some data wrangling with the `recode()` function to plot it neatly. Click on Solution if you are stuck.
```r
zoo %>%
  mutate(eggs = recode(eggs, `0` = ___, `1` = ___)) %>%
  ggplot(aes(___, fill = ___)) +
  geom___() +
  scale_fill_manual(values = c("#372F60", "#CA225E")) +
  theme_minimal() +
  theme(legend.position = "top")
```
```r
zoo %>%
  mutate(eggs = recode(eggs, `0` = "doesn't lay eggs", `1` = "lays eggs")) %>%
  ggplot(aes(animal_type, fill = eggs)) +
  geom_bar() +
  scale_fill_manual(values = c("#372F60", "#CA225E")) +
  theme_minimal() +
  theme(legend.position = "top")
```
Not so surprisingly, there are very few mammals that lay eggs. Let's get the actual count.
zoo %>% count(___, ___)
zoo %>% count(animal_type, eggs)
It looks like there is one mammal that lays eggs! Can you find the name of that animal?
```r
zoo %>%
  filter(___ == ___) %>%
  # select relevant columns for a compact view
  select(animal_name, animal_type, eggs)
```
```r
zoo %>%
  filter(animal_type == "mammal", eggs == 1) %>%
  # select relevant columns for a compact view
  select(animal_name, animal_type, eggs)
```
Having some familiarity with the animal kingdom, we would expect that most animals that produce milk do not lay eggs. In other words, we would expect to see a negative correlation between these features.
Let's see how these animal features correlate with each other to get a sense of these relationships.
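As a quick spot check before building the full matrix, base R's `cor()` can compute this single pair directly (both columns are numeric 0/1 indicators):

```r
# correlation between producing milk and laying eggs;
# we expect this to be strongly negative
with(zoo, cor(milk, eggs))
```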
Run the code to plot the correlation matrix using the corrr package.
Here we are using three functions from the corrr package:

- `correlate()` generates a correlation matrix in data frame format.
- `rearrange()` groups highly correlated variables closer together.
- `shave()` shaves off the upper triangle of a correlation data frame by converting its cells to `NA`.

```r
library(corrr)

zoo_corr <- zoo %>%
  # drop non-numeric columns
  select(___, ___) %>%
  correlate() %>%
  rearrange() %>%
  shave()

zoo_corr
```
```r
library(corrr)

zoo_corr <- zoo %>%
  # drop non-numeric columns
  select(-animal_name, -animal_type) %>%
  correlate() %>%
  rearrange() %>%
  shave()

zoo_corr
```
The output is a data frame containing pair-wise correlation coefficients between variables. But it would take us a long time to get an overall sense of these relationships just by looking at raw numbers. Let's plot the correlation matrix with `corrr::rplot()` to help our brains out.
```r
zoo_corr %>%
  rplot(shape = 15,
        colours = c("#372F60", "white", "#CA225E"),
        print_cor = TRUE) +
  theme(axis.text.x = element_text(angle = 20, hjust = .6))
```
`corrr::rplot()` is quite handy because it returns a ggplot object, which can be further customized with ggplot2 functions.

So much better! We can see that producing milk and laying eggs have a very strong negative correlation. (The oddball platypus is one reason why it isn't exactly `-1`.)
Now, see if you can answer the question correctly.
```r
question("What is the pair of animal features that has the strongest _positive_ correlation?",
  answer("Tail & Backbone"),
  answer("Fins & Aquatic"),
  answer("Milk & Hair", correct = TRUE),
  answer("Feathers & Airborne"),
  incorrect = "Incorrect. While these two features have a positive correlation, it is not the strongest.",
  allow_retry = TRUE
)
```
Principal component analysis (PCA) is a handy dimension reduction technique that takes the covariance or correlation matrix of a set of observed variables (just like the one we visualized) and summarizes it with a smaller set of linear combinations called principal components (PCs). These components are statistically independent from one another and capture the maximum amount of information (i.e., variance) in the original variables. This means they can be used to combat large inter-variable correlations in statistical modeling. PCA can also help us explore the similarities between observations and the groups they belong to.
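For reference, the same computation is available outside the recipes framework through base R's `prcomp()`; a minimal sketch on the numeric zoo columns (the object name `zoo_pca` is just for illustration):

```r
# PCA with base R for comparison; scale. = TRUE standardizes each
# variable before the components are extracted
zoo_pca <- zoo %>%
  select(-animal_name, -animal_type) %>%
  prcomp(scale. = TRUE)

zoo_pca$sdev            # component standard deviations
zoo_pca$rotation[, 1:2] # loadings for the first two components
```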
Here's a scatter plot of the first two principal components (PC1 and PC2) of the zoo data:
```r
pca_bake %>%
  ggplot(aes(PC1, PC2, label = animal_name)) +
  geom_point(aes(color = animal_type), alpha = 0.7, size = 2) +
  geom_text(check_overlap = TRUE, hjust = "inward") +
  labs(color = "Animal Type") +
  theme_minimal()
```
Each dot on the plot represents an observation (an animal), colored by `animal_type` and labeled by `animal_name`.

Overall, we can see that animals of the same type cluster closely together. This suggests that these features (having hair or feathers, laying eggs, etc.) do a relatively good job of identifying the clusters in the zoo data.
Let's implement principal component analysis (PCA) using recipes.
First, we initiate our recipe with the `zoo` data.
```r
library(tidymodels)
pca_rec <- recipe(~., data = ___)
```
pca_rec <- recipe(~., data = zoo)
Here, we define two arguments:

- A formula: `~.` tells our recipe that we did not define an outcome variable and would like to use all variables in the next steps of the analysis.
- Our data: `zoo`. We are using our entire data set here, but typically this would be a training set for predictive modeling (a quick sketch of that workflow follows).
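For context, in a predictive modeling workflow that training set would typically come from rsample (also part of tidymodels); a quick sketch, with `zoo_split` as a hypothetical name:

```r
library(tidymodels)  # rsample provides initial_split()

set.seed(123)
zoo_split <- initial_split(zoo)  # by default, 3/4 training and 1/4 testing

# the recipe would then be defined on the training set only
rec <- recipe(~., data = training(zoo_split))
```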
Once we initiate the recipe, we can keep adding new roles and steps.
For example, we already told our recipe to include all variables with our formula; however, we want to exclude the identifier columns `animal_name` and `animal_type` from the analysis. On the other hand, we will need these variables later when plotting our results. By using `update_role()`, we can exclude them from the analysis steps without dropping them from the data:
```r
pca_rec <- recipe(~., data = zoo) %>%
  # update the role for animal_name and animal_type
  update_role(___, ___, new_role = "id")
```
```r
pca_rec <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id")
```
Try using `summary()` to see the defined roles in `pca_rec`, arranging the rows by the `role` column.
```r
summary(___) %>%
  arrange(___)
```
summary(pca_rec) %>% arrange(role)
We can see that the roles of `animal_name` and `animal_type` are now defined as `id`, while the remaining variables are listed as `predictor`.
Good job! Now, let's add some steps to our recipe.
Since PCA is a variance-maximizing exercise, it is important to scale variables so their variances are commensurable. We will achieve this by adding two steps to our recipe:

- `step_scale()` normalizes numeric data to have a standard deviation of one.
- `step_center()` normalizes numeric data to have a mean of zero.

We can also use the helper function `all_predictors()` to select all the variables whose role is defined as `predictor`.
```r
pca_rec <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  # add steps to scale and center
  step____(all_predictors()) %>%
  step____(all_predictors())
```
```r
pca_rec <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_scale(all_predictors()) %>%
  step_center(all_predictors())
```
Alternatively, we can accomplish scaling and centering in one single step. Take a look at this group of step functions on the recipes reference page and see if you can answer the question below correctly:
```r
question("What function can replace both centering and scaling steps?",
  answer("step_interact"),
  answer("step_regex"),
  answer("step_normalize", correct = TRUE),
  answer("step_date"),
  incorrect = "Incorrect. Try again.",
  allow_retry = TRUE
)
```
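For reference, here is what our recipe so far would look like with the two steps collapsed into one (a sketch only; the exercises below keep the two explicit steps):

```r
# step_normalize() both centers and scales the predictors
pca_rec_alt <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_normalize(all_predictors())
```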
We are ready to add the final step to compute our principal components!

Use `step_pca()` to tell the recipe to convert all variables (except `animal_name` and `animal_type`) into principal components.
```r
pca_rec <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_scale(all_predictors()) %>%
  step_center(all_predictors()) %>%
  # add step for PCA computation
  step____(___, id = "pca")
```
```r
pca_rec <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_scale(all_predictors()) %>%
  step_center(all_predictors()) %>%
  step_pca(all_predictors(), id = "pca")
```
Did you notice the additional argument `id = "pca"` there? If we take a look at the `step_pca` help page, we see that this argument allows us to provide a unique string to identify the step. Providing a step `id` will come in handy when we need to extract additional values from that step. Similarly, we could assign a unique id to any step we would like to work with later.
Now, let's print `pca_rec` by running the following code chunk.
pca_rec
We can see that `pca_rec` lists our id and predictor variables as inputs, along with the operations we added: scaling, centering, and PCA extraction.
Are you surprised that we haven't extracted the PCA components yet? This is because so far we have only defined our recipe; we have not trained it. To get the results from our PCA, we need to evaluate our recipe using `prep()`.
Let's prep our recipe and print the output:
```r
pca_prep <- prep(___)
pca_prep
```
```r
pca_prep <- prep(pca_rec)
pca_prep
```
Can you see the difference between the outputs of `pca_rec` and `pca_prep`? After prepping, we can see that the scaling, centering, and PCA extraction steps have been trained on all columns of interest.
Let's take a look at the steps this recipe contains with `tidy()`:
tidy(___)
tidy(pca_prep)
We can see that three steps are contained in this prepped recipe:

- `scale`
- `center`
- `pca`
With `tidy()`, we can extract the intermediate values computed in each step by providing the step's `number` as an argument.
For example, you can extract the mean values for each predictor variable from the second step of our recipe (`center`) using the `tidy()` method:
tidy(___, ___)
tidy(pca_prep, 2)
Using the same method, we can also extract the variable loadings for each component from our `step_pca`:
tidy(___, ___)
tidy(pca_prep, 3)
Each step returns a different set of underlying `values` when extracted with the `tidy()` method. You can find the definition of these underlying values under the Value section in the help page of the related step function. For example, take a look at the `step_scale` help document and scroll down to see the Value section.
help document and scroll down to see the Value.Alternatively, we can use the id
argument (the one we specifically provided for this step) and specify the type
of underlying value we would like to extract.
```r
pca_loading <- tidy(___, ___, ___)
pca_loading
```
```r
pca_loading <- tidy(pca_prep, id = "pca", type = "coef")
pca_loading
```
Take another look at the `step_pca` help document: the `type` argument provides more details about how to use this step with the `tidy()` method.

In the PCA setting, `loadings` indicate the correlation between a principal component and a variable. In other words, a large loading suggests that a variable has a strong effect on that principal component.
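For example, using the `pca_loading` tibble we extracted above, we can rank the variables by the size of their loadings on the first component:

```r
# variables with the largest absolute loadings on PC1
pca_loading %>%
  filter(component == "PC1") %>%
  arrange(desc(abs(value)))
```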
Let's take a look at the loadings we generated with the zoo data!

We will use the `plot_top_loadings()` function from the learntidymodels package to plot the absolute values of the loadings (so they are easy to compare), colored by the direction of the loading (positive or negative). `plot_top_loadings()` takes three arguments:

- a prepped recipe
- a conditional statement to filter the PCA data before plotting, e.g. `component_number <= 3` to plot the first three components
- `n`, the number of top loadings to plot per component

Fill in the blanks to plot the first four principal components and the top six variables with the largest absolute loadings:
```r
library(learntidymodels)

plot_top_loadings(___, component_number ___, n = ___) +
  scale_fill_manual(values = c("#372F60", "#CA225E")) +
  theme_minimal()
```
```r
library(learntidymodels)

plot_top_loadings(pca_prep, component_number <= 4, n = 6) +
  scale_fill_manual(values = c("#372F60", "#CA225E")) +
  theme_minimal()
```
It looks like PC1 (the first principal component) is mostly about producing milk or eggs and having hair. Notice that the loading directions for milk and hair are the same, and opposite to eggs. Do you remember the strongest positive correlation we found? PC2, on the other hand, seems to be about having fins and being aquatic, both of which point in the opposite direction to breathing. Both tail and feathers load strongly on PC3, and finally, PC4 is mostly about being domestic or a predator, which have opposite directions. Overall, we can say that PC1 is mostly about being a mammal, PC2 about being a fish or an aquatic animal, PC3 about being a bird, and PC4 about being domesticated.
So far, we have:

- defined our `recipe`
- trained it with `prep`

Finally, to apply these computations to our data and extract the principal components, we will use `bake()` with two arguments: our trained recipe (`pca_prep`) and the data we want to apply it to (`zoo`).
```r
pca_bake <- bake(___, ___)
pca_bake
```
```r
pca_bake <- bake(pca_prep, zoo)
pca_bake
```
Now that we have our principal components, we are ready to plot them! `r emojifont::emoji('tada')`
Let's plot the first two principal components, labeling our points with `animal_name` and coloring them by `animal_type`.
```r
library(ggplot2)

pca_bake %>%
  ggplot(aes(___, ___, label = ___)) +
  geom_point(aes(color = ___), alpha = 0.7, size = 2) +
  geom_text(check_overlap = TRUE, hjust = "inward") +
  labs(color = NULL) +
  theme_minimal()
```
```r
pca_bake %>%
  ggplot(aes(PC1, PC2, label = animal_name)) +
  geom_point(aes(color = animal_type), alpha = 0.7, size = 2) +
  geom_text(check_overlap = TRUE, hjust = "inward") +
  labs(color = NULL) +
  theme_minimal()
```
We were able to reproduce our initial plot! Let's look at it in more detail.
Look at the mammals: the majority of them are separated from the other types of animals along the PC1 axis. Recall our loadings plot: `milk` and `eggs` were the top two features with the largest loadings for PC1. Interestingly, the platypus (our favorite oddball) is placed closer to the reptiles and the penguin on the PC1 axis. This is likely driven by its laying eggs and not having teeth. On the other hand, the seal and especially the dolphin are located closer to the fish and the sea snake, separate from the rest of the mammals along the PC2 axis. Do you remember the top two loadings for PC2? They were `fins` and `aquatic`! Do you see the pattern here?
A common practice when conducting PCA is to check how much variability in the data is captured by the principal components. Typically, this is done by looking at the eigenvalues or the percentage of variance attributable to each component. Let's extract the variance explained by each principal component using the `tidy()` method.
```r
pca_variances <- tidy(pca_prep, id = "pca", type = "variance")
pca_variances
```
When we take a close look at the `terms` column of `pca_variances`, we see that several variance calculations are available for our 16 principal components.
pca_variances %>% count(___)
pca_variances %>% count(terms)
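As a quick sanity check, "percent variance" is simply each component's variance (its eigenvalue) as a share of the total, so we can recompute it from the raw variances ourselves:

```r
# recompute percent variance from the per-component variances
pca_variances %>%
  filter(terms == "variance") %>%
  mutate(percent = 100 * value / sum(value))
```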
Now, let's plot them to help our brains out once more:
```r
pca_variances %>%
  filter(terms == "percent variance") %>%
  ggplot(aes(___, ___)) +
  geom_col(fill = "#372F60") +
  labs(x = "Principal Components", y = "Variance explained (%)") +
  theme_minimal()
```
```r
pca_variances %>%
  filter(terms == "percent variance") %>%
  ggplot(aes(component, value)) +
  geom_col(fill = "#372F60") +
  labs(x = "Principal Components", y = "Variance explained (%)") +
  theme_minimal()
```
We can see that the first three principal components explain the majority of the variance in the data. But it is difficult to see the cumulative variance explained in this plot. Let's tweak the `filter()` call to plot the cumulative variance explained:
```r
pca_variances %>%
  filter(terms == "___") %>%
  ggplot(aes(___, ___)) +
  geom_col(fill = "#372F60") +
  labs(x = "Principal Components", y = "Cumulative variance explained (%)") +
  theme_minimal()
```
```r
pca_variances %>%
  filter(terms == "cumulative percent variance") %>%
  ggplot(aes(component, value)) +
  geom_col(fill = "#372F60") +
  labs(x = "Principal Components", y = "Cumulative variance explained (%)") +
  theme_minimal()
```
We can see that 50% of the variance is explained by the first two components, and using more components would capture even more of the variance in the data. That is why it is also common to plot several components against each other to get a fuller picture of the data.
Try plotting PC1 and PC3. Do you see other clusters in the data that weren't as obvious with PC1 and PC2?
```r
pca_bake %>%
  ggplot(aes(___, ___, label = ___)) +
  geom_point(aes(color = ___), alpha = 0.7, size = 2) +
  geom_text(check_overlap = TRUE, hjust = "inward") +
  labs(color = NULL) +
  theme_minimal()
```
```r
pca_bake %>%
  ggplot(aes(PC1, PC3, label = animal_name)) +
  geom_point(aes(color = animal_type), alpha = 0.7, size = 2) +
  geom_text(check_overlap = TRUE, hjust = "inward") +
  labs(color = NULL) +
  theme_minimal()
```
Uniform manifold approximation and projection (UMAP) is a non-linear, graph-based dimension reduction algorithm. It finds local, low-dimensional representations of the data and can be run unsupervised or supervised with different types of outcome data (e.g. numeric, factor).

Among its advantages, UMAP is fast and scales well to larger data sets, and it can capture non-linear structure that a linear method like PCA may miss.
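As an aside, the supervised variant only requires pointing the UMAP step at an outcome column. A sketch (not one of this tutorial's exercises; `zoo_fct` and `umap_sup_rec` are illustrative names) that would let `animal_type` guide the embedding:

```r
library(embed)

# supervised UMAP: the outcome argument passes the labels to the
# underlying algorithm; the outcome is converted to a factor first
zoo_fct <- zoo %>% mutate(animal_type = factor(animal_type))

umap_sup_rec <- recipe(animal_type ~ ., data = zoo_fct) %>%
  update_role(animal_name, new_role = "id") %>%
  step_normalize(all_predictors()) %>%
  step_umap(all_predictors(), outcome = vars(animal_type))
```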
Now that we've learned how to create a recipe for PCA, we can apply our knowledge to create one for UMAP with the `zoo` data too!
```r
umap_ord <- c(
  "recipe(~., data = zoo) %>%",
  "update_role(animal_name, animal_type, new_role = \"id\") %>%",
  "step_normalize(all_predictors()) %>%",
  "step_umap(all_predictors())"
)

question_rank(
  "Sort the following recipe steps to create a recipe with UMAP:",
  answer(umap_ord, correct = TRUE),
  allow_retry = TRUE
)
```
Fill in the blanks to create a recipe with UMAP, then prep and bake the recipe to compute UMAP components:
```r
set.seed(123)   # set a seed to reproduce random number generation
library(embed)  # load the library to use `step_umap()`

## Create the recipe accordingly
umap_rec <- recipe(~., data = ___) %>%
  update____(___, ___, new_role = "id") %>%
  step_normalize(___) %>%
  step___(___)

## Train your recipe with prep
umap_prep <- prep(___)

## Apply computations with bake
umap_bake <- bake(___, ___)
umap_bake
```
```r
set.seed(123)   # set a seed to reproduce random number generation
library(embed)  # load the library to use `step_umap()`

## Create the recipe accordingly
umap_rec <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_normalize(all_predictors()) %>%
  step_umap(all_predictors())

## Train your recipe with prep
umap_prep <- prep(umap_rec)

## Extract UMAP components with bake
umap_bake <- bake(umap_prep, zoo)
umap_bake
```
Great job!
It's time to plot our UMAP components! `r emojifont::emoji('tada')`
```r
umap_bake %>%
  ggplot(aes(___, ___, label = animal_name)) +
  geom_point(aes(color = animal_type), alpha = 0.7, size = 2) +
  geom_text(check_overlap = TRUE, hjust = "inward") +
  labs(color = NULL) +
  theme_minimal()
```
```r
umap_bake %>%
  ggplot(aes(UMAP1, UMAP2, label = animal_name)) +
  geom_point(aes(color = animal_type), alpha = 0.7, size = 2) +
  geom_text(check_overlap = TRUE, hjust = "inward") +
  labs(color = NULL) +
  theme_minimal()
```
Do you think UMAP is doing a better job than PCA?
The UMAP algorithm has multiple hyperparameters that can have a significant impact on the results. The four major ones are:

- `neighbors`
- `num_comp`
- `min_dist`
- `learn_rate`
These hyperparameters can be specified in the `step_umap()` function. You can find more details in the `step_umap` help document, and additional explanations of the hyperparameters here.
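Spelled out in a recipe step, the four hyperparameters might look like the following sketch (the values shown match `step_umap()`'s documented defaults at the time of writing; check the help page for your installed version):

```r
# all four major hyperparameters made explicit (illustrative values)
umap_tuned_rec <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_normalize(all_predictors()) %>%
  step_umap(all_predictors(),
            neighbors  = 15,    # size of the local neighborhood
            num_comp   = 2,     # number of UMAP components to return
            min_dist   = 0.01,  # minimum distance between embedded points
            learn_rate = 1)     # initial learning rate for the optimizer
```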
Try setting `num_comp` (the number of components) to `3` and plot the UMAP components again, but this time use the first and third components:
```r
set.seed(123)   # set a seed to reproduce random number generation
library(embed)  # load the library to use `step_umap()`

## Create the recipe accordingly
umap_rec <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_normalize(all_predictors()) %>%
  step_umap(all_predictors(), ___ = ___)

## Prep your recipe
umap_prep <- prep(umap_rec)

## Extract UMAP components with bake
umap_bake <- bake(umap_prep, zoo)

## Plot UMAP components
umap_bake %>%
  ggplot(aes(___, ___, label = animal_name)) +
  geom_point(aes(color = animal_type), alpha = 0.7, size = 2) +
  geom_text(check_overlap = TRUE, hjust = "inward") +
  labs(color = NULL) +
  theme_minimal()
```
```r
set.seed(123)   # set a seed to reproduce random number generation
library(embed)  # load the library to use `step_umap()`

## Create the recipe accordingly
umap_rec <- recipe(~., data = zoo) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_normalize(all_predictors()) %>%
  step_umap(all_predictors(), num_comp = 3)

## Prep your recipe
umap_prep <- prep(umap_rec)

## Extract UMAP components with bake
umap_bake <- bake(umap_prep, zoo)

## Plot UMAP components
umap_bake %>%
  ggplot(aes(UMAP1, UMAP3, label = animal_name)) +
  geom_point(aes(color = animal_type), alpha = 0.7, size = 2) +
  geom_text(check_overlap = TRUE, hjust = "inward") +
  labs(color = NULL) +
  theme_minimal()
```
Looks like increasing the number of UMAP components did not change our results that much. Try other hyperparameters and see if you can produce different plots. Don't forget to modify the plot accordingly!
Good job! You completed all the steps and applied dimension reduction to the `zoo` data set using the recipes package from tidymodels! `r emojifont::emoji('star2')`
But before you start your victory lap, let's go over what we learned one last time.
To implement dimension reduction with the recipes package, we took the following steps:

- initiated our recipe with `recipe()`
- updated variable roles with `update_role()`
- added scaling, centering, and dimension reduction steps with `step_*()` functions
- trained our recipe with `prep()`
- applied the trained recipe to our data with `bake()`
knitr::include_graphics("https://github.com/allisonhorst/stats-illustrations/raw/master/rstats-artwork/recipes.png")
Throughout this tutorial, we used `bake()` to apply the computations from a trained recipe to our data set. The `bake()` method is great because it allows us to apply a set of specifications and computations generated with `prep()` to a data set of our choice. This is especially handy when you are dealing with training, validation, or test sets during your modeling process.
However, if we simply want to extract the computations generated with `prep()` and don't need to reapply them to a new data set, we can use `juice()`. For example, throughout this tutorial we only worked with the `zoo` data, so we could have easily extracted our principal components with `juice(pca_prep)`.
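Note that recipes also exposes the same behavior through `bake()` itself: passing `new_data = NULL` returns the processed training data without reapplying the computations, so the following two calls should give the same result:

```r
# two equivalent ways to retrieve the processed training set
juice(pca_prep)
bake(pca_prep, new_data = NULL)
```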
Let's create our recipe, add steps, and train it with `prep()` one last time. Then we'll extract the principal components, first with `bake()` and then with `juice()`.
```r
zoo_rec <- recipe(data = zoo, formula = ~.) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_normalize(all_predictors()) %>%
  step_pca(all_predictors())

zoo_prep <- prep(zoo_rec)

# bake
zoo_bake <- bake(___, ___)
zoo_bake
```
```r
zoo_rec <- recipe(data = zoo, formula = ~.) %>%
  update_role(animal_name, animal_type, new_role = "id") %>%
  step_normalize(all_predictors()) %>%
  step_pca(all_predictors())

zoo_prep <- prep(zoo_rec)

# bake
zoo_bake <- bake(zoo_prep, zoo)
zoo_bake
```
Now, let's juice! `r emojifont::emoji('tropical_drink')`
```r
# juice
zoo_juice <- juice(___)
zoo_juice
```
```r
# juice
zoo_juice <- juice(zoo_prep)
zoo_juice
```
Do you see any difference between the outputs of `bake()` and `juice()`? Let's compare them with the base R function `all.equal()`, which returns `TRUE` if the compared objects are (nearly) identical:
all.equal(___, ___)
all.equal(zoo_bake, zoo_juice)
Congratulations! You've completed the tutorial! It's time for your victory lap! `r emojifont::emoji('runner')`
Equipped with the necessary know-how, you are now ready to apply these tools in the wild. If you ever face obstacles on your journey, don't forget to check out the following resources:
- The recipes main page: an introduction to the recipes package with additional examples.
- The recipes reference page: a list of all available recipes functions and their help documents in a neatly categorized layout.
- The search table for recipes steps: search and find available recipes steps for pre-processing.
- Ask about it in RStudio Community.