AzureStor: an R package for working with Azure storage

Storage endpoints

Perhaps the more relevant part of AzureStor for most users is its client interface to storage. With this, you can upload and download files and blobs, create containers and shares, list files, and so on. Unlike the ARM interface, the client interface uses S3 classes. This is for a couple of reasons: it is more familiar to most R users, and it is consistent with most other data manipulation packages in R, in particular the tidyverse.

The starting point for client access is the storage_endpoint object, which stores information about the endpoint of a storage account: the URL that you use to access storage, along with any authentication information needed. The easiest way to obtain an endpoint object is via the storage account resource object’s get_blob_endpoint() and get_file_endpoint() methods:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# get the storage account object
stor <- AzureRMR::az_rm$
 new(tenant="{tenant_id}", app="{app_id}", password="{password}")$
 get_subscription("{subscription_id}")$
 get_resource_group("myresourcegroup")$
 get_storage_account("mynewstorage")

stor$get_blob_endpoint()
# Azure blob storage endpoint
# URL: https://mynewstorage.blob.core.windows.net/
# Access key: <hidden>
# Account shared access signature: <none supplied>
# Storage API version: 2018-03-28

stor$get_file_endpoint()
# Azure file storage endpoint
# URL: https://mynewstorage.file.core.windows.net/
# Access key: <hidden>
# Account shared access signature: <none supplied>
# Storage API version: 2018-03-28

This shows that the base URL to access blob storage is https://mynewstorage.blob.core.windows.net/, while that for file storage is https://mynewstorage.file.core.windows.net/. While it’s not displayed, the endpoint objects also include the access key necessary for authenticated access to storage; this is obtained directly from the storage account resource.

More practically, you will usually want to work with a storage endpoint without having to go through the process of authenticating with Azure Resource Manager (ARM). Often, you may not have any ARM credentials to start with. In this case, you can create the endpoint object directly with blob_endpoint() and file_endpoint():

1
2
3
4
5
6
7
# same as above
blob_endp <- blob_endpoint(
 "https://mynewstorage.blob.core.windows.net/",
 key="mystorageaccesskey")
file_endp <- file_endpoint(
 "https://mynewstorage.file.core.windows.net/",
 key="mystorageaccesskey")

Notice that when creating the endpoint this way, you have to provide the access key explicitly.

Instead of an access key, you can provide a shared access signature (SAS) to gain authenticated access. The main difference between using a key and a SAS is that the former unlocks access to the entire storage account. A user who has a key can access all containers and files, and can read, modify and delete data without restriction. On the other hand, a user with a SAS can be limited to have access only to specific files, or be limited to read access, or only for a given span of time, and so on. This is usually much better in terms of security.

Usually, the SAS will be given to you by your system administrator. However, if you have the storage acccount resource object, you can generate and use a SAS as follows. Note that generating a SAS requires the storage account’s access key.

1
2
3
4
5
6
7
8
9
10
# shared access signature: read/write access, container+object access, valid for 12 hours
now <- Sys.time()
sas <- stor$get_account_sas(permissions="rw",
 resource_types="co",
 start=now,
 end=now + 12 * 60 * 60,
 key=stor$list_keys()[1])

# create an endpoint object with a SAS, but without an access key
blob_endp <- stor$get_blob_endpoint(sas=sas)

If you don’t have a key or a SAS, you will only have access to unauthenticated (public) containers and file shares.