In this post in the R:case4base series we will look at string manipulation with base R, and provide an overview of a wide range of functions for our string working needs.
We will use simple examples to learn to perform basic string operations, concatenate strings, work with substrings, switch cases, quote, find and replace within strings and more. Some interesting bonuses will also be included.
As always, some popular alternatives to base R will also be suggested and many useful references provided for further reading.
This post aims to serve as an overview of the functionality base R provides for working with strings. Note that the term “string” is used somewhat loosely and refers to both character vectors and character strings. In R documentation, references to a character string refer to character vectors of length 1.
Also, since this is an overview, we will not examine the functions in detail, but rather list examples with simple, intuitive explanations, trading off some technical precision.
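To make the distinction between a character vector and a character string concrete, here is a minimal sketch. The variable names and values are made up for illustration.

```r
# A character vector with three elements
x <- c("apple", "banana", "cherry")
is.character(x)   # TRUE
length(x)         # 3

# A "character string" in the documentation's sense:
# simply a character vector of length 1
s <- "apple"
is.character(s)   # TRUE
length(s)         # 1
```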
String concatenation is the process of “joining” two strings together and is one of the most common operations.
Simple concatenation
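A short sketch of simple concatenation with `paste()` and `paste0()`. The example values are invented for illustration.

```r
first <- "Hello"
second <- "world"

# paste() joins its arguments, separated by a space by default
paste(first, second)              # "Hello world"

# sep controls the separator
paste(first, second, sep = ", ")  # "Hello, world"

# paste0() is shorthand for paste(..., sep = "")
paste0(first, second)             # "Helloworld"

# Both are vectorised: shorter arguments are recycled
paste0("item_", 1:3)              # "item_1" "item_2" "item_3"
```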
Concatenate a vector into a single character string
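One way to do this is the `collapse` argument of `paste()`. The sketch below uses a made-up vector `words`.

```r
words <- c("base", "R", "strings")

# collapse joins the elements of a vector into one string
paste(words, collapse = " ")   # "base R strings"
paste(words, collapse = "")    # "baseRstrings"

# sep is applied element-wise first, then collapse joins the results
paste(words, 1:3, sep = "-", collapse = "; ")  # "base-1; R-2; strings-3"

# toString() is a shortcut for collapsing with ", "
toString(words)                # "base, R, strings"
```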
String lengths
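A minimal sketch of measuring strings. Note that `length()` counts the elements of the vector, while `nchar()` counts the characters within each element. The values are invented for illustration.

```r
x <- c("a", "abc", "")

length(x)   # 3 (number of elements in the vector)
nchar(x)    # 1 3 0 (number of characters in each element)

# nzchar() is a quick test for non-empty strings
nzchar(x)   # TRUE TRUE FALSE

# For non-ASCII text, characters and bytes can differ
s <- "caf\u00e9"
nchar(s, type = "chars")   # 4
nchar(s, type = "bytes")   # depends on how the string is encoded
```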
Switching to upper/lower case
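A brief sketch of case switching with `toupper()`, `tolower()`, `casefold()` and `chartr()`. The input strings are made up for illustration.

```r
x <- "Base R Strings"

toupper(x)                 # "BASE R STRINGS"
tolower(x)                 # "base r strings"

# casefold() wraps both; upper = FALSE is the default
casefold(x, upper = TRUE)  # "BASE R STRINGS"

# chartr() translates characters one-to-one
chartr("abc", "xyz", "aabbcc")  # "xxyyzz"

# Capitalising the first letter by combining toupper() with substrings
w <- "strings"
paste0(toupper(substr(w, 1, 1)), substring(w, 2))  # "Strings"
```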
Removing white spaces
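A small sketch using `trimws()` for leading and trailing whitespace, plus a `gsub()` call as one possible way to squeeze repeated whitespace inside a string. The inputs are invented for illustration.

```r
x <- "  some text \t"

trimws(x)                   # "some text"
trimws(x, which = "left")   # "some text \t"
trimws(x, which = "right")  # "  some text"

# Collapsing runs of whitespace inside a string into a single space
gsub("\\s+", " ", "a   b \t c")  # "a b c"
```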
Encoding conversion
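A hedged sketch of a round trip between UTF-8 and Latin-1 with `iconv()`. It assumes the platform's `iconv` supports the `"latin1"` encoding name, which is common but not guaranteed. The sample string is made up.

```r
# Make sure we start from a UTF-8 encoded string
utf8 <- enc2utf8("caf\u00e9")
Encoding(utf8)   # the declared encoding mark

# Convert to Latin-1 and back again
latin <- iconv(utf8, from = "UTF-8", to = "latin1")
back  <- iconv(latin, from = "latin1", to = "UTF-8")

# The round trip should preserve the text
back == utf8     # TRUE
validUTF8(back)  # TRUE
```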
Quoting
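A minimal sketch of the quoting helpers. It assumes R 3.6.0 or later, where `sQuote()` and `dQuote()` accept the `q` argument. Passing `q = FALSE` requests plain ASCII quotes instead of the locale-dependent "fancy" ones. The inputs are invented for illustration.

```r
x <- "text"

sQuote(x, q = FALSE)   # "'text'"
dQuote(x, q = FALSE)   # "\"text\""

# shQuote() quotes a string for passing to an operating system shell
shQuote("some file.txt", type = "sh")   # "'some file.txt'"

# noquote() marks a character vector to be printed without quotes
print(noquote(x))      # [1] text
```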
Retrieving and working with substrings
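A short sketch of common substring operations: extracting with `substr()` and `substring()`, replacing in place, splitting with `strsplit()`, and testing prefixes and suffixes. `startsWith()` and `endsWith()` assume R 3.3.0 or later. The sample string is made up.

```r
x <- "base R strings"

# Extract by start and stop positions
substr(x, 1, 4)    # "base"
# substring() defaults to "until the end of the string"
substring(x, 8)    # "strings"

# Replace part of a string in place
y <- x
substr(y, 1, 4) <- "BASE"
y                  # "BASE R strings"

# Split into pieces; the result is a list with one element per input string
strsplit(x, " ", fixed = TRUE)[[1]]   # "base" "R" "strings"

# Test for prefixes and suffixes
startsWith(x, "base")   # TRUE
endsWith(x, "ings")     # TRUE
```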
Pattern matching and replacement using regular expressions is an extremely powerful feature; however, covering regular expressions themselves is out of scope for this overview.
Check the references for better resources if you are interested. A lot more useful detail can also be found in R’s documentation.
The following only shows very basic use and lists useful functions.
Replace substring with other strings
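A minimal sketch of `sub()` and `gsub()`. The former replaces only the first match in each element, the latter replaces all of them. The example vector is invented.

```r
x <- c("one fish two fish", "red fish")

# sub() replaces the first match in each element
sub("fish", "cat", x)    # "one cat two fish" "red cat"

# gsub() replaces every match
gsub("fish", "cat", x)   # "one cat two cat" "red cat"

# fixed = TRUE treats the pattern as a literal string, not a regex
gsub(".", "-", "a.b.c", fixed = TRUE)   # "a-b-c"

# ignore.case = TRUE makes matching case-insensitive
gsub("FISH", "cat", x, ignore.case = TRUE)   # "one cat two cat" "red cat"
```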
Check if a pattern is present within elements of a character vector
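A brief sketch using `grepl()` and `grep()` on a made-up vector. `grepl()` returns a logical vector, while `grep()` returns the indices (or, with `value = TRUE`, the values) of the matching elements.

```r
x <- c("apple", "banana", "cherry")

# Logical vector: does each element contain the pattern?
grepl("an", x)                # FALSE TRUE FALSE

# Indices of the matching elements
grep("an", x)                 # 2

# The matching elements themselves
grep("an", x, value = TRUE)   # "banana"

# Patterns are regular expressions by default; ^ anchors at the start
grepl("^a", x)                # TRUE FALSE FALSE
```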
Check where the matches are within the elements of a character vector
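A minimal sketch with `regexpr()`, `gregexpr()` and `regmatches()`. The first reports the position of the first match in each element, the second reports all matches, and the third extracts the matched text. The example vector is invented.

```r
x <- c("banana", "cherry")

# Position of the first match in each element; -1 means no match
m <- regexpr("an", x)
as.integer(m)              # 2 -1
attr(m, "match.length")    # 2 -1

# Positions of all matches; the result is a list, one entry per element
all_m <- gregexpr("an", x)
as.integer(all_m[[1]])     # 2 4

# Extract the matched substrings themselves
regmatches(x, m)           # "an"
regmatches(x, all_m)[[1]]  # "an" "an"
```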
We skip regexec here, as parenthesized sub-expressions are very much out of scope of this post.
Strings
Using the tidyverse’s stringr and glue
Stringr is built on top of stringi and focuses on the most important and commonly used string manipulation functions, whereas stringi provides a comprehensive set covering almost anything you can imagine.
glue glues strings to data in R, offering small, fast, dependency-free interpreted string literals.
Using stringi
- Stringi is an R package for very fast, correct, consistent, and convenient string/text processing in each locale and any native character encoding.