dplyr is a grammar of data manipulation for R. It provides a consistent set of verbs that help you solve the most common data manipulation challenges. The key verbs in dplyr are:
- `select()`: Return a subset of the columns of a data frame, using a flexible notation.
- `filter()`: Extract a subset of rows from a data frame based on logical conditions.
- `arrange()`: Reorder rows of a data frame.
- `rename()`: Rename variables in a data frame.
- `mutate()`: Add new variables/columns or transform existing variables.
- `summarize()`: Generate summary statistics of different variables in the data frame, possibly within strata (groups defined with `group_by()`).
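
As a rough illustration of the verbs listed above, the sketch below applies each one in turn to `mtcars`, a small data set that ships with R. The choice of data set, the column selection, and the names `cars`, `weight`, `hp_per_wt`, and `by_cyl` are assumptions made for this example only.

```r
library(dplyr)

# mtcars is used purely as illustrative data
cars <- select(mtcars, mpg, cyl, hp, wt)        # keep four columns
cars <- filter(cars, hp > 100)                  # keep rows where hp exceeds 100
cars <- arrange(cars, desc(mpg))                # sort rows by mpg, highest first
cars <- rename(cars, weight = wt)               # syntax is new_name = old_name
cars <- mutate(cars, hp_per_wt = hp / weight)   # add a derived column

# Summary statistics within strata: one output row per value of cyl
by_cyl <- summarize(group_by(cars, cyl),
                    mean_mpg = mean(mpg),
                    n = n())
```

Each verb takes a data frame as its first argument and returns a new data frame, which is why the result of one call can be fed straight into the next.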
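The `%>%` operator, also known as the pipe operator, connects multiple verb calls into a pipeline: it passes the result on its left as the first argument of the function on its right. Because each dplyr verb takes a data frame as its first argument and returns a data frame, several data manipulation operations can be chained in a single expression that reads from top to bottom.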
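
A minimal sketch of such a pipeline is shown below, again assuming the built-in `mtcars` data. The derived column `kpl` and the conversion factor (roughly 0.4251 km/l per US mpg) are illustrative choices, not part of dplyr.

```r
library(dplyr)

# Piped form: each step receives the previous result as its first argument
result <- mtcars %>%
  filter(cyl == 4) %>%             # keep four-cylinder cars
  mutate(kpl = mpg * 0.4251) %>%   # approximate km per litre
  arrange(desc(kpl)) %>%           # most efficient first
  select(mpg, kpl, wt)             # keep three columns

# Equivalent nested form, which has to be read from the inside out
nested <- select(
  arrange(
    mutate(filter(mtcars, cyl == 4), kpl = mpg * 0.4251),
    desc(kpl)
  ),
  mpg, kpl, wt
)
```

Both forms should produce the same data frame; the piped version simply lists the steps in the order they are carried out.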
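The dplyr package also works with its own data types, such as the `tbl_df` data frame, commonly called a tibble (in current releases the class is implemented in the tibble package, which dplyr builds on). A `tbl_df` is still a data frame, but it prints more compactly, showing only the first few rows and the columns that fit on screen, which makes large data sets easier to inspect.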
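
The short sketch below shows one way to obtain a `tbl_df` and confirm that it is still an ordinary data frame. It assumes the built-in `mtcars` data, and the name `cars_tbl` is hypothetical.

```r
library(dplyr)

# as_tibble() is re-exported by dplyr from the tibble package
cars_tbl <- as_tibble(mtcars)

class(cars_tbl)           # "tbl_df" "tbl" "data.frame"
is.data.frame(cars_tbl)   # TRUE: a tbl_df is a data frame subclass

# Printing shows only the first rows and the columns that fit the console
print(cars_tbl)
```

Because a `tbl_df` inherits from `data.frame`, the dplyr verbs and most base R functions that accept a data frame can be used on it as well.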