Creating List with Iterator

In the post (https://statcompute.wordpress.com/2018/11/17/growing-list-vs-growing-queue), it is shown how to grow a list or a list-like queue based upon a dataframe. In the example, the code snippet was heavily relied on the FOR loop to do the assignment item by item, which I can’t help thinking of potential alternatives afterwards. For instance, is there an implementation that would enable us to traverse a dataframe without knowing its dimension in advance or even without using the loop?

In the previous example, if we’d want to fetch rows from a dataframe, we need to know the number of rows in advance by using the nrow() function. As shown below, we need to generate a sequence of row index and then to fetch rows by indexing,

If we don’t like to fetch rows from a dataframe by indexing, a workaround would be the split() function by splitting the dataframe into rows. The additional unname() function is doing nothing but removing redundant list names. However, we still need to know the number of rows in this solution.

With the iterators package, the coding logic can be slightly cleaner and more generic by wrapping the dataframe into a row-wise iterator object, as demonstrated below.

In addition, the iterator object is customizable. For instance, we can easily apply a filter function to the iterator.

If the use case is not creating a list, as discussed above, but growing an empty list by inserting, then a simple iterator might not be sufficient. In such case, we might need to tweak it a little by enumerating the iterator with the ienum() function in the itertools2 package. Alternatively, we can also use itertools2::izip() function to construct the enumeration manually. It is noted that, because we need to assign values with a function call within the lapply() to a list in the parent environment, the scoping assignment should be used.

Related