String tokenization inside R data frame [duplicate] - r

I have some data that looks a little bit like this:
test.frame <- read.table(text = "name amounts
JEAN 318.5,45
GREGORY 1518.5,67,8
WALTER 518.5
LARRY 518.5,55,1
HARRY 318.5,32
",header = TRUE,sep = "")
I'd like it to look more like this ...
name amount
JEAN 318.5
GREGORY 1518.5
WALTER 518.5
LARRY 518.5
HARRY 318.5
It seems like there should be a straightforward way to break out the "amounts" column, but I'm not coming up with it. Happy to take a "RTFM page for this particular command" answer. What's the command I'm looking for?

(test.frame <- read.table(text = "name amounts
JEAN 318.5,45
GREGORY 1518.5,67,8
WALTER 518.5
LARRY 518.5,55,1
HARRY 318.5,32
",header = TRUE,sep = ""))
# name amounts
# 1 JEAN 318.5,45
# 2 GREGORY 1518.5,67,8
# 3 WALTER 518.5
# 4 LARRY 518.5,55,1
# 5 HARRY 318.5,32
tmp <- setNames(strsplit(as.character(test.frame$amounts),
                         split = ','), test.frame$name)
data.frame(name = rep(names(tmp), sapply(tmp, length)),
           amounts = unlist(tmp), row.names = NULL)
# name amounts
# 1 JEAN 318.5
# 2 JEAN 45
# 3 GREGORY 1518.5
# 4 GREGORY 67
# 5 GREGORY 8
# 6 WALTER 518.5
# 7 LARRY 518.5
# 8 LARRY 55
# 9 LARRY 1
# 10 HARRY 318.5
# 11 HARRY 32
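Note that strsplit() returns character vectors, so the amounts column in the result above is character rather than numeric. A minimal base R sketch of the same idea with the conversion added follows; the names pieces and long.frame are made up for illustration, and lengths() assumes R 3.2.0 or later.

```r
# Same example data as in the question; stringsAsFactors = FALSE keeps
# both columns as character regardless of R version.
test.frame <- read.table(text = "name amounts
JEAN 318.5,45
GREGORY 1518.5,67,8
WALTER 518.5
LARRY 518.5,55,1
HARRY 318.5,32
", header = TRUE, sep = "", stringsAsFactors = FALSE)

# One character vector of pieces per input row.
pieces <- strsplit(test.frame$amounts, split = ",", fixed = TRUE)

# Repeat each name once per piece, then flatten the pieces and
# convert them to numeric.
long.frame <- data.frame(
  name    = rep(test.frame$name, lengths(pieces)),
  amounts = as.numeric(unlist(pieces)),
  stringsAsFactors = FALSE
)
```

Here long.frame has one row per individual amount, and the amounts column can be summed or averaged directly.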

The fastest way will probably be data.table:
library(data.table)
setDT(test.frame)[, lapply(.SD, function(x) unlist(strsplit(as.character(x), ','))),
                  .SDcols = "amounts", by = name]
## name amounts
## 1: JEAN 318.5
## 2: JEAN 45
## 3: GREGORY 1518.5
## 4: GREGORY 67
## 5: GREGORY 8
## 6: WALTER 518.5
## 7: LARRY 518.5
## 8: LARRY 55
## 9: LARRY 1
## 10: HARRY 318.5
## 11: HARRY 32

A generalization of David Arenburg's solution would be to use my cSplit function. Get it from the GitHub Gist or load it with "devtools":
# library(devtools)
# source_gist(11380733)
The "long" format would be what you are looking for...
cSplit(test.frame, "amounts", ",", "long")
# name amounts
# 1: JEAN 318.5
# 2: JEAN 45
# 3: GREGORY 1518.5
# 4: GREGORY 67
# 5: GREGORY 8
# 6: WALTER 518.5
# 7: LARRY 518.5
# 8: LARRY 55
# 9: LARRY 1
# 10: HARRY 318.5
# 11: HARRY 32
But the function can create wide output formats too:
cSplit(test.frame, "amounts", ",", "wide")
# name amounts_1 amounts_2 amounts_3
# 1: JEAN 318.5 45 NA
# 2: GREGORY 1518.5 67 8
# 3: WALTER 518.5 NA NA
# 4: LARRY 518.5 55 1
# 5: HARRY 318.5 32 NA
One advantage with this function is being able to split multiple columns at once.

This isn't a super standard format, but here is one way you can transform your data. First, I would use stringsAsFactors=F with your read.table to make sure everything is a character variable rather than a factor. Alternatively you can do as.character() on those columns.
First I split the values in the amounts column on the comma, then I combine the pieces with the name column:
md <- do.call(rbind, Map(cbind, test.frame$name,
                         strsplit(test.frame$amounts, ",")))
Then I paste everything back together and send it to read.table to do the variable conversion
read.table(text=apply(md,1,paste, collapse="\t"),
sep="\t", col.names=names(test.frame))
Alternatively you could just make a data.frame from the md matrix and do the class conversions yourself
data.frame(names=md[,1], amount=as.numeric(md[,2]))
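Put together, the steps above can be sketched as a self-contained script. This assumes the garbled md assignment was meant to be do.call(rbind, Map(cbind, ...)), which is what the later matrix indexing md[,1] and md[,2] suggests; the name result is made up for illustration.

```r
# Read the example data, keeping both columns as character.
test.frame <- read.table(text = "name amounts
JEAN 318.5,45
GREGORY 1518.5,67,8
WALTER 518.5
LARRY 518.5,55,1
HARRY 318.5,32
", header = TRUE, sep = "", stringsAsFactors = FALSE)

# Map pairs each name with its split amounts; cbind recycles the single
# name against the pieces, giving one small character matrix per row.
# do.call(rbind, ...) stacks those matrices into one two-column matrix.
md <- do.call(rbind, Map(cbind, test.frame$name,
                         strsplit(test.frame$amounts, ",")))

# Build the data frame and convert the amounts by hand.
result <- data.frame(name = md[, 1], amounts = as.numeric(md[, 2]),
                     stringsAsFactors = FALSE, row.names = NULL)
```

Converting with as.numeric() directly avoids the round trip through paste() and read.table(), at the cost of doing each class conversion yourself.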

Here is a plyr solution:
library(plyr)
Split.Amounts <- function(x) {
  # Split this group's comma-separated amounts into individual values
  amounts <- unlist(strsplit(as.character(x$amounts), ","))
  return(data.frame(name = x$name, amounts = amounts, stringsAsFactors=FALSE))
}
ddply(test.frame, .(name), Split.Amounts)
Using dplyr:
test.frame %>%
group_by(name) %>%


