Removing all "$" from an entire data frame - r

I have a df with several columns that have dollar values preceded by the "$" like so:
> str(data)
Classes ‘data.table’ and 'data.frame': 196879 obs. of 32 variables:
$ City : chr "" "" "" "" ...
$ Company_Goal : chr "" "" "" "" ...
$ Company_Name : chr "" "" "" "" ...
$ Event_Date : chr "5/14/2016" "9/26/2015" "9/12/2015" "6/3/2017" ...
$ Event_Year : chr "FY 2016" "FY 2016" "FY 2016" "FY 2017" ...
$ Fundraising_Goal : chr "$250" "$200" "$350" "$0" ...
$ Name : chr "Heart Walk 2015-2016 St. Louis MO" "Heart Walk 2015-2016 Canton, OH" "Heart Walk 2015-2016 Dallas, TX" "FDA HW 2016-2017 Albany, NY WO-65355" ...
$ Participant_Id : chr "2323216" "2273391" "2419569" "4088558" ...
$ State : chr "" "OH" "TX" "" ...
$ Street : chr "" "" "" "" ...
$ Team_Average : chr "$176" "$123" "$306" "$47" ...
$ Team_Captain : chr "No" "No" "Yes" "No" ...
$ Team_Count : chr "7" "6" "4" "46" ...
$ Team_Id : chr "152788" "127127" "45273" "179207" ...
$ Team_Member_Goal : chr "$0" "$0" "$0" "$0" ...
$ Team_Name : chr "Team Clayton" "Cardiac Crusaders" "BIS - Team Myers" "Independent Walkers" ...
$ Team_Total_Gifts : chr "$1,230 " "$738" "$1,225 " "$2,145 " ...
$ Zip : chr "" "" "" "" ...
$ Gifts_Count : chr "2" "1" "2" "1" ...
$ Registration_Gift: chr "No" "No" "No" "No" ...
$ Participant_Gifts: chr "$236" "$218" "$225" "$0" ...
$ Personal_Gift : chr "$0" "$0" "$0" "$250" ...
$ Total_Gifts : chr "$236" "$218" "$225" "$250" ...
$ MATCH_CODE : chr "UX000" "UX000" "UX000" "UX000" ...
$ TAP_LEVEL : chr "X" "X" "X" "X" ...
$ TAP_DESC : chr "" "" "" "" ...
$ TAP_LIFED : chr "" "" "" "" ...
$ MEDAGE_CY : chr "0" "0" "0" "0" ...
$ DIVINDX_CY : chr "0" "0" "0" "0" ...
$ MEDHINC_CY : chr "0" "0" "0" "0" ...
$ MEDDI_CY : chr "0" "0" "0" "0" ...
$ MEDNW_CY : chr "0" "0" "0" "0" ...
- attr(*, ".internal.selfref")=<externalptr>
I am trying to remove all of the "$" characters, but I have been unable to do so. I have tried the suggestions provided in this post as well as this one, but in both cases the data remains unchanged.
Help?

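The dollar sign is a metacharacter in regular expressions: it anchors the match to the end of the string (see here for more info). The gsub() function treats the pattern as a regex by default, so a bare "$" matches the empty position at the end of each string rather than a literal dollar sign, and nothing is removed.
To match a literal $ you have to escape it. In an R string that takes two backslashes (\\$).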
#sample data
df = data.frame(Team_Average = c("$176", "$123", "$306"),
Name = c("Heart Walk 2015-2016 St. Louis MO",
"Heart Walk 2015-2016 Canton, OH",
"Heart Walk 2015-2016 Dallas, TX"),
stringsAsFactors = FALSE)
df[] = lapply(df, gsub, pattern="\\$", replacement="")
Alternatively you can use gsub's option of fixed=TRUE to match the pattern literally.
df[] = lapply(df, gsub, pattern="$", replacement="", fixed=TRUE)
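Once the "$" is gone the values are still character strings, and some entries in the question (for example "$1,230 ") also carry a thousands separator and a trailing space. Below is a minimal sketch of one way to get numbers out, assuming the money columns contain nothing but digits, "$", ",", "." and whitespace. The helper name to_number and the sample data are made up for illustration.

```r
# Hypothetical helper: strip "$", "," and whitespace, then convert to numeric.
# Inside a bracket expression "$" is literal, so no escaping is needed.
to_number <- function(x) as.numeric(gsub("[$,[:space:]]", "", x))

money <- data.frame(Team_Total_Gifts = c("$1,230 ", "$738", "$2,145 "),
                    Team_Average     = c("$176", "$123", "$306"),
                    stringsAsFactors = FALSE)

# Apply the helper to every column; only do this for columns known to hold money.
money[] <- lapply(money, to_number)
str(money)
```

Any value that still contains other characters after the cleanup would come back as NA with a warning, which is a useful signal that a column was not purely monetary.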

The other answers work nicely on the example provided. However, if the data set contained any numeric columns, then running gsub() or stringr::str_replace_all() via lapply() would coerce those numeric columns to character:
library(stringr)
library(dplyr)
d <- tibble( # data_frame() is deprecated; dplyr re-exports tibble()
x = c("$200", "$191.40", "80.12"),
y = c("$test", "column", "$foo"),
z = 1:3
)
d[] <- lapply(d, gsub, pattern = "\\$", replacement = "")
# A tibble: 3 x 3
x y z
<chr> <chr> <chr>
1 200 test 1
2 191.40 column 2
3 80.12 foo 3
Note the class of z above.
Here is a tidyverse approach to removing $ from all character columns:
d %>%
mutate_if(
is.character,
~ str_replace_all(., "\\$", "") # formula lambda; funs() is deprecated in current dplyr
)
# A tibble: 3 x 3
x y z
<chr> <chr> <int>
1 200 test 1
2 191.40 column 2
3 80.12 foo 3
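The same idea can be written without any packages. This is only a sketch of a base R equivalent, using a plain data.frame in place of the tibble; the names d2 and is_chr are made up for the example.

```r
# Sample data mirroring the example above (base data.frame instead of a tibble).
d2 <- data.frame(x = c("$200", "$191.40", "80.12"),
                 y = c("$test", "column", "$foo"),
                 z = 1:3,
                 stringsAsFactors = FALSE)

# Logical index marking the character columns only.
is_chr <- vapply(d2, is.character, logical(1))

# Remove "$" literally (fixed = TRUE) in those columns; z is never touched.
d2[is_chr] <- lapply(d2[is_chr], gsub, pattern = "$", replacement = "", fixed = TRUE)
str(d2)
```

Because only the selected columns are passed through gsub(), z keeps its integer class.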

Related

Reading a dropbox file as data frame

I am trying to read a csv file from a Dropbox link into a data frame like this:
df <- read.csv("https://www.dropbox.com/s/vta51y5wyzu86m1/FY_2008.csv?dl=0", stringsAsFactors = FALSE)
However I receive this error:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
duplicate 'row.names' are not allowed
Can anyone help me figure out why this error occurs?
Change the dl=0 to dl=1 so that Dropbox serves the file itself rather than its HTML preview page.
For an abbreviated demonstration, I'll limit to just the top 10 rows:
df <- read.csv("https://www.dropbox.com/s/vta51y5wyzu86m1/FY_2008.csv?dl=1", nrows=10)
str(df)
# 'data.frame': 10 obs. of 65 variables:
# $ contract_transaction_unique_key : chr "9700_9700_0000_0_W91QUZ07D0011_0" "9700_9700_0001_0_DAJA6196A0004_0" "6940_6940_0001_1_DTNH2208D00115_0" "9700_9700_0001_17_F0470001D0020_0" ...
# $ contract_award_unique_key : chr "CONT_AWD_0000_9700_W91QUZ07D0011_9700" "CONT_AWD_0001_9700_DAJA6196A0004_9700" "CONT_AWD_0001_6940_DTNH2208D00115_6940" "CONT_AWD_0001_9700_F0470001D0020_9700" ...
# $ award_id_piid : int 0 1 1 1 1 1 1 1 1 1
# $ modification_number : int 0 0 1 17 2 0 0 0 1 1
# $ transaction_number : int 0 0 0 0 0 0 0 0 0 0
# $ parent_award_agency_id : int 9700 9700 6940 9700 9700 9700 9700 9700 9700 9700
# $ parent_award_agency_name : chr "" "DEPT OF DEFENSE" "NATIONAL HIGHWAY TRAFFIC SAFETY ADMINISTRATION" "" ...
# $ parent_award_id_piid : chr "W91QUZ07D0011" "DAJA6196A0004" "DTNH2208D00115" "F0470001D0020" ...
# $ parent_award_modification_number : chr "0" "0" "0" "P00013" ...
# $ federal_action_obligation : num 1082099 1104 0 -15741 -15927 ...
# $ total_dollars_obligated : num NA 1104 NA NA NA ...
# $ current_total_value_of_award : num NA 1104 NA NA NA ...
# $ potential_total_value_of_award : num NA 1104 NA NA NA ...
# $ disaster_emergency_fund_codes_for_overall_award : logi NA NA NA NA NA NA ...
# $ outlayed_amount_funded_by_COVID.19_supplementals_for_overall_aw: logi NA NA NA NA NA NA ...
# $ obligated_amount_funded_by_COVID.19_supplementals_for_overall_a: logi NA NA NA NA NA NA ...
# $ action_date : chr "2008-09-30" "2008-09-30" "2008-09-30" "2008-09-30" ...
# $ action_date_fiscal_year : int 2008 2008 2008 2008 2008 2008 2008 2008 2008 2008
# $ period_of_performance_start_date : chr "2008-09-30 00:00:00" "2008-09-30 00:00:00" "2008-09-30 00:00:00" "2008-09-30 00:00:00" ...
# $ period_of_performance_current_end_date : chr "2009-09-29 00:00:00" "2008-09-30 00:00:00" "2009-12-18 00:00:00" "2003-11-30 00:00:00" ...
# $ period_of_performance_potential_end_date : chr "2009-09-29 00:00:00" "2008-09-30 00:00:00" "2009-12-18 00:00:00" "2003-11-30 00:00:00" ...
# $ awarding_agency_code : int 97 97 69 97 97 97 97 97 97 97
# $ awarding_agency_name : chr "DEPARTMENT OF DEFENSE (DOD)" "DEPARTMENT OF DEFENSE (DOD)" "DEPARTMENT OF TRANSPORTATION (DOT)" "DEPARTMENT OF DEFENSE (DOD)" ...
# $ awarding_sub_agency_code : int 2100 2100 6940 5700 5700 5700 5700 5700 5700 5700
# $ awarding_sub_agency_name : chr "DEPT OF THE ARMY" "DEPT OF THE ARMY" "NATIONAL HIGHWAY TRAFFIC SAFETY ADMINISTRATION" "DEPT OF THE AIR FORCE" ...
# $ awarding_office_code : chr "W911W4" "W912PA" "00022" "FA9301" ...
# $ awarding_office_name : chr "W00Y CONTR OFC DODAAC" "ECC PARC EUROPE REGIONAL CONTRACTIN" "DEPT OF TRANS/NAT HIGHWAY TRAFFIC SAFETY ADM" "FA9301 AFTC PZIO" ...
# $ recipient_duns : int 614948396 123456787 49508120 848288408 92440044 52220485 144606436 132004701 122474104 57579807
# $ recipient_name : chr "WORLD WIDE TECHNOLOGY, INC." "MISCELLANEOUS FOREIGN AWARDEES" "WESTAT, INC." "ACCENT SERVICE COMPANY INC" ...
# $ recipient_doing_business_as_name : logi NA NA NA NA NA NA ...
# $ recipient_parent_duns : int 131784451 123456787 49508120 848288408 92440044 52220485 144606436 132004701 122474104 57579807
# $ recipient_parent_name : chr "WORLD WIDE TECHNOLOGY HOLDING CO. INC." "MISCELLANEOUS FOREIGN CONTRACTORS" "WESTAT INC." "ACCENT SERVICE COMPANY INC" ...
# $ recipient_country_code : chr "USA" "USA" "UNITED STATES" "UNITED STATES" ...
# $ recipient_country_name : chr "UNITED STATES OF AMERICA" "UNITED STATES" "" "" ...
# $ recipient_address_line_1 : chr "60 WELDON PKWY" "1800 F ST NW" "1650 RESEARCH BLVD RM RE164" "2001 LEMNOS DR" ...
# $ recipient_address_line_2 : logi NA NA NA NA NA NA ...
# $ recipient_city_name : chr "MARYLAND HEIGHTS" "WASHINGTON" "ROCKVILLE" "COSTA MESA" ...
# $ recipient_county_name : chr "ST. LOUIS" "DISTRICT OF COLUMBIA" "" "" ...
# $ recipient_state_code : chr "MO" "DC" "MD" "CA" ...
# $ recipient_state_name : chr "MISSOURI" "DISTRICT OF COLUMBIA" "" "" ...
# $ recipient_zip_4_code : int 63043 204050001 208503195 926263535 92408 329205818 769047833 223031802 782584092 782073102
# $ primary_place_of_performance_country_name : chr "UNITED STATES OF AMERICA" "GERMANY" "UNITED STATES" "UNITED STATES" ...
# $ primary_place_of_performance_city_name : chr "FORT BELVOIR" "" "ROCKVILLE" "EDWARDS" ...
# $ primary_place_of_performance_county_name : chr "FAIRFAX" "" "MONTGOMERY" "KERN" ...
# $ primary_place_of_performance_state_code : chr "VA" "" "MD" "CA" ...
# $ primary_place_of_performance_state_name : chr "VIRGINIA" "" "MARYLAND" "CALIFORNIA" ...
# $ award_or_idv_flag : chr "AWARD" "AWARD" "AWARD" "AWARD" ...
# $ award_type_code : chr "C" "C" "C" "C" ...
# $ award_type : chr "DO" "DELIVERY ORDER" "DO" "DO" ...
# $ type_of_contract_pricing_code : chr "J" "J" "3" "S" ...
# $ type_of_contract_pricing : chr "FIXED PRICE" "FIXED PRICE" "OTHER (NONE OF THE ABOVE)" "COST NO FEE" ...
# $ award_description : chr "PURCHASE OF ROUTERS, SERVERS, AND ANCILLARY EQUIPMENT. USED WORLD-WIDE IN SUPPORT OF MISSION." "LOCKSMITH SUPPLIES" "RFP FOR IDIQ CONTRACT - MULTIPLE AWARD" "BASIC CLEANING SERVICES" ...
# $ product_or_service_code : chr "7490" "4510" "R405" "S201" ...
# $ product_or_service_code_description : chr "MISCELLANEOUS OFFICE MACHINES" "PLUMBING FIXTURES AND ACCESSORIES" "OPERATIONS RESEARCH & QUANTITATIVE" "CUSTODIAL JANITORIAL SERVICES" ...
# $ naics_description : chr "WIRED TELECOMMUNICATIONS CARRIERS" "OTHER SUPPORT ACTIVITIES FOR ROAD TRANSPORTATION" "ENGINEERING SERVICES" "JANITORIAL SERVICES" ...
# $ domestic_or_foreign_entity : logi NA NA NA NA NA NA ...
# $ country_of_product_or_service_origin_code : chr "USA" "DEU" "NAN" "USA" ...
# $ extent_competed_code : chr "A" "A" "" "D" ...
# $ extent_competed : chr "FULL AND OPEN COMPETITION" "FULL AND OPEN COMPETITION" "" "FULL AND OPEN COMPETITION AFTER EXCLUSION OF SOURCES" ...
# $ parent_award_type_code : chr "" "B" "" "" ...
# $ parent_award_type : chr "" "IDC" "" "" ...
# $ cost_or_pricing_data_code : chr "N" "N" "" "N" ...
# $ cost_or_pricing_data : chr "NO" "NO" "" "NO" ...
# $ multi_year_contract_code : chr "N" "N" "N" "N" ...
# $ multi_year_contract : chr "NO" "NO" "NO" "NO" ...
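If there are many share links to fix, the dl=0 to dl=1 change can be done programmatically. The helper below is a hypothetical sketch (dropbox_direct is not part of any package) and assumes the link carries dl=0 as a query parameter.

```r
# Hypothetical helper: switch a Dropbox share link from preview (dl=0)
# to direct download (dl=1). Links without dl=0 are returned unchanged.
dropbox_direct <- function(url) {
  sub("([?&])dl=0", "\\1dl=1", url)
}

share_url  <- "https://www.dropbox.com/s/vta51y5wyzu86m1/FY_2008.csv?dl=0"
direct_url <- dropbox_direct(share_url)
direct_url

# df <- read.csv(direct_url, nrows = 10)  # uncomment to download; needs network access
```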

how to change the name of lists to unique

I have a list of data frames.
Here is some toy data, followed by the structure of the real list:
ltd <- list(structure(list(Abund = c("BROS", "KIS", "TTHS",
"MKS"), `Value: F111: cold, Sample1` = c("1.274e7", "",
"", "2.301e7"), `Value: F111: warm, Sample1` = c("", "",
"", "")), .Names = c("Abund", "Value: F111: cold, Sample1",
"Value: F111: warm, Sample1"), row.names = c(NA, 4L), class = "data.frame"),
structure(list(Abund = c("BROS", "TMS", "KIS",
"HERS"), `Value: F216: cold, Sample2` = c("1.670e6",
"4.115e7", "", "1.302e7"), `Value: F216: warm, Sample2` = c("",
"2.766e7", "", "1.396e7")), .Names = c("Abund", "Value: F216: cold, Sample2",
"Value: F216: warm, Sample2"), row.names = c(NA, 4L), class = "data.frame"),
structure(list(Abund = c("BROS", "TMS", "KIS",
"HERS"), `Value: F655: cold, Sample3` = c("7.074e4",
"1.038e7", "", "7.380e5"), `Value: F655: warm, Sample3` = c("",
"6.874e6", "", "7.029e5")), .Names = c("Abund", "Value: F655: cold, Sample3",
"Value: F655: warm, Sample3"), row.names = c(NA, 4L), class = "data.frame"))
List of 5000
$ :'data.frame': 397 obs. of 3 variables:
..$ Abund : chr [1:363] "TTT" "MMM" "GTR" "NLM" ...
..$ Value: F111: Warm, Sample1: chr [1:363] "1.274e7" "" "" "2.301e7" ...
..$ Value: F111: Cold, Sample1: chr [1:363] "" "" "" "" ...
$ :'data.frame': 673 obs. of 3 variables:
..$ Abund : chr [1:673] "MGL" "KKK" "LFT" "NKL" ...
..$ Value: F216: Warm, Sample2: chr [1:673] "1.670e6" "4.115e7" "" "1.302e7" ...
..$ Value: F216: Cold, Sample2: chr [1:673] "" "2.766e7" "" "1.396e7" ...
$ :'data.frame': 779 obs. of 3 variables:
..$ Abund : chr [1:779] "TTLS" "KIS" "KISA" "LISU" ...
..$ Value: F655: Warm, Sample3: chr [1:779] "7.074e4" "1.038e7" "" "7.380e5" ...
..$ Value: F655: Cold, Sample3: chr [1:779] "" "6.874e6" "" "7.029e5" ...
$ :'data.frame': 387 obs. of 3 variables:
..$ Abund : chr [1:387] "BRO" "BIA" "KIA" "TTHS" ...
..$ Value: F57: Warm, Sample4: chr [1:387] "6.910e6" "" "2.435e7" "3.924e6" ...
..$ Value: F57: Cold, Sample4: chr [1:387] "5.009e6" "" "" "3.624e6" ...
$ :'data.frame': 543 obs. of 3 variables:
I want to give the Abund column in each data frame a unique name by appending a number, starting from 1 and counting up through the list.
The desired output looks like this:
List of 5000
$ :'data.frame': 397 obs. of 3 variables:
..$ Abund1 : chr [1:363] "TTT" "MMM" "GTR" "NLM" ...
..$ Value: F111: Warm, Sample1: chr [1:363] "1.274e7" "" "" "2.301e7" ...
..$ Value: F111: Cold, Sample1: chr [1:363] "" "" "" "" ...
$ :'data.frame': 673 obs. of 3 variables:
..$ Abund2 : chr [1:673] "MGL" "KKK" "LFT" "NKL" ...
..$ Value: F216: Warm, Sample2: chr [1:673] "1.670e6" "4.115e7" "" "1.302e7" ...
..$ Value: F216: Cold, Sample2: chr [1:673] "" "2.766e7" "" "1.396e7" ...
$ :'data.frame': 779 obs. of 3 variables:
..$ Abund3 : chr [1:779] "TTLS" "KIS" "KISA" "LISU" ...
..$ Value: F655: Warm, Sample3: chr [1:779] "7.074e4" "1.038e7" "" "7.380e5" ...
..$ Value: F655: Cold, Sample3: chr [1:779] "" "6.874e6" "" "7.029e5" ...
$ :'data.frame': 387 obs. of 3 variables:
..$ Abund4 : chr [1:387] "BRO" "BIA" "KIA" "TTHS" ...
..$ Value: F57: Warm, Sample4: chr [1:387] "6.910e6" "" "2.435e7" "3.924e6" ...
..$ Value: F57: Cold, Sample4: chr [1:387] "5.009e6" "" "" "3.624e6" ...
To solve a problem like this, instead of attacking the big problem up front, it's best to solve one piece of it at a time. If we look at just one frame from your list, I'll call it x:
x <- structure(list(Abund = c("BROS", "KIS", "TTHS",
"MKS"), `Value: F111: cold, Sample1` = c("1.274e7", "",
"", "2.301e7"), `Value: F111: warm, Sample1` = c("", "",
"", "")), .Names = c("Abund", "Value: F111: cold, Sample1",
"Value: F111: warm, Sample1"), row.names = c(NA, 4L), class = "data.frame")
str(x)
# 'data.frame': 4 obs. of 3 variables:
# $ Abund : chr "BROS" "KIS" "TTHS" "MKS"
# $ Value: F111: cold, Sample1: chr "1.274e7" "" "" "2.301e7"
# $ Value: F111: warm, Sample1: chr "" "" "" ""
You had originally wanted to append the number after the "F" in the other column names. I'll attack that first, and then if you really want it, I'll also do the "append an incrementing number" thing.
F-number
Write a function that finds the "F" number within the second column name and appends it to the first column name. (I'm wondering if there are more diverse patterns of headers in your full dataset; I'm confident that the regex we use here can easily be manipulated to handle them, given enough varying samples.)
somefunc <- function(x) {
cn2 <- colnames(x)[2]
Fnum <- gsub(".*F([0-9]+).*", "\\1", cn2)
colnames(x)[1] <- paste0(colnames(x)[1], Fnum)
x
}
A brief explanation:
colnames(x)[2] just retrieves the second one; I'm assuming that we can base everything on the presence and makeup of this second column
gsub(".*F([0-9]+).*", "\\1", cn2) extracts just the numbers after "F"; for the record, if it weren't for the Sample, we might be able to discard any non-number, but I chose being safe here.
.* matches zero or more "anything" characters; sandwiching the rest with this on both sides of our group is essentially discarding all but the number we want
F the literal "F"
(...) this is a group, saved for later (referenced with the \\1 in the replacement string, the second argument to gsub)
[0-9]+ accepts anything within the brackets, which can be literals ([acf] matches the three letters) or a range ([0-9A-F] matches any digit and any letters between A and F); the + makes it "one or more" (contrasting with the * before which is zero or more)
colnames(x)[1] <- ... reassign the first column name
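One caveat worth seeing in action: when the pattern does not match, gsub() returns the input unchanged, so a header without an "F" number would be pasted onto the column name as-is. The sketch below shows the pattern on a few headers and a stricter variant. The function extract_fnum and the header "Abund" used as a non-matching case are assumptions made for illustration.

```r
headers <- c("Value: F111: cold, Sample1",
             "Value: F57: warm, Sample4",
             "Abund")                      # no "F<digits>" part here

# The pattern from somefunc: works when the header contains "F<digits>",
# but leaves a non-matching header untouched.
gsub(".*F([0-9]+).*", "\\1", headers)

# Hypothetical stricter variant: NA when there is no "F<digits>" part.
extract_fnum <- function(s) {
  m <- regmatches(s, regexec("F([0-9]+)", s))
  # Each hit holds the full match and the captured group, or is empty.
  vapply(m, function(hit) if (length(hit) == 2) hit[2] else NA_character_,
         character(1))
}
extract_fnum(headers)
```

Returning NA makes unexpected headers easy to spot before any column is renamed.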
The work on the "single frame":
str( somefunc(x) )
# 'data.frame': 4 obs. of 3 variables:
# $ Abund111 : chr "BROS" "KIS" "TTHS" "MKS"
# $ Value: F111: cold, Sample1: chr "1.274e7" "" "" "2.301e7"
# $ Value: F111: warm, Sample1: chr "" "" "" ""
So now the question is how to apply this function that operates on one frame across a list of frames. lapply to the rescue:
str(lapply(ltd, somefunc))
# List of 3
# $ :'data.frame': 4 obs. of 3 variables:
# ..$ Abund111 : chr [1:4] "BROS" "KIS" "TTHS" "MKS"
# ..$ Value: F111: cold, Sample1: chr [1:4] "1.274e7" "" "" "2.301e7"
# ..$ Value: F111: warm, Sample1: chr [1:4] "" "" "" ""
# $ :'data.frame': 4 obs. of 3 variables:
# ..$ Abund216 : chr [1:4] "BROS" "TMS" "KIS" "HERS"
# ..$ Value: F216: cold, Sample2: chr [1:4] "1.670e6" "4.115e7" "" "1.302e7"
# ..$ Value: F216: warm, Sample2: chr [1:4] "" "2.766e7" "" "1.396e7"
# $ :'data.frame': 4 obs. of 3 variables:
# ..$ Abund655 : chr [1:4] "BROS" "TMS" "KIS" "HERS"
# ..$ Value: F655: cold, Sample3: chr [1:4] "7.074e4" "1.038e7" "" "7.380e5"
# ..$ Value: F655: warm, Sample3: chr [1:4] "" "6.874e6" "" "7.029e5"
Incrementing number
This is both easier and harder. First, we attack the small problem:
otherfunc <- function(x, num) {
colnames(x)[1] <- paste0(colnames(x)[1], num)
x
}
Pretty straightforward. But lapply alone will not do: it iterates over a single list, and any extra arguments it is given are passed unchanged to every call, so it cannot supply a different number for each frame. One might be tempted to brute-force things with a tracking variable somewhere (global? please no), but it might be interesting to know that there is a variant of the "apply" functions that operates differently: mapply takes one or more lists (or vectors) and "zips" them together. For example:
myfunc <- c
mapply(myfunc, 1:3, 4:6, 7:9, SIMPLIFY=FALSE)
# [[1]]
# [1] 1 4 7
# [[2]]
# [1] 2 5 8
# [[3]]
# [1] 3 6 9
We started with three (could have been more) independent vectors (could have been lists, typically are), and took the first value from each and passed them to the function. So this is effectively like:
list(myfunc(1, 4, 7), myfunc(2, 5, 8), myfunc(3, 6, 9))
Ok, so realizing that we want to "zip" together each frame within ltd with a number along a sequence, those numbers are easily generated with:
seq_along(ltd)
# [1] 1 2 3
(This is considered better than 1:length(ltd), since the latter will not behave correctly if the length is 0 ... try 1:length(list()) versus seq_along(list()).)
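The difference is easy to check directly. This small sketch counts how many times a loop body would run over each index vector for an empty list; the variable names are made up for the example.

```r
empty <- list()

bad  <- 1:length(empty)   # 1:0 counts DOWN from 1 to 0
good <- seq_along(empty)  # zero-length integer vector

# Count how many times a loop body would execute for each index vector.
runs_bad <- 0L
for (i in bad) runs_bad <- runs_bad + 1L

runs_good <- 0L
for (i in good) runs_good <- runs_good + 1L

c(bad = runs_bad, good = runs_good)
```

With 1:length() the loop runs twice on a list that has no elements, which typically ends in a subscript error; with seq_along() it does not run at all.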
Okay, so let's use this new trick:
str(mapply(otherfunc, ltd, seq_along(ltd), SIMPLIFY=FALSE))
# List of 3
# $ :'data.frame': 4 obs. of 3 variables:
# ..$ Abund1 : chr [1:4] "BROS" "KIS" "TTHS" "MKS"
# ..$ Value: F111: cold, Sample1: chr [1:4] "1.274e7" "" "" "2.301e7"
# ..$ Value: F111: warm, Sample1: chr [1:4] "" "" "" ""
# $ :'data.frame': 4 obs. of 3 variables:
# ..$ Abund2 : chr [1:4] "BROS" "TMS" "KIS" "HERS"
# ..$ Value: F216: cold, Sample2: chr [1:4] "1.670e6" "4.115e7" "" "1.302e7"
# ..$ Value: F216: warm, Sample2: chr [1:4] "" "2.766e7" "" "1.396e7"
# $ :'data.frame': 4 obs. of 3 variables:
# ..$ Abund3 : chr [1:4] "BROS" "TMS" "KIS" "HERS"
# ..$ Value: F655: cold, Sample3: chr [1:4] "7.074e4" "1.038e7" "" "7.380e5"
# ..$ Value: F655: warm, Sample3: chr [1:4] "" "6.874e6" "" "7.029e5"
It should be noted that mapply, just like sapply, will by default try to simplify its result; I find it hard to trust that it always does what I want, so I typically turn off this simplification. There are times for it, yes, but here is not one of them. The apply functions (including Reduce) are typically very hard to learn to use when thinking in a linear/iterative methodology, but they can be very useful in times like these.
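To make the simplification behaviour concrete, here is a minimal sketch with a toy function (add is made up for the example). It also shows base R's Map, which behaves like mapply with the simplification switched off.

```r
add <- function(x, y) x + y

simplified <- mapply(add, 1:3, 4:6)                    # collapsed to a vector
as_list    <- mapply(add, 1:3, 4:6, SIMPLIFY = FALSE)  # always a list
same_list  <- Map(add, 1:3, 4:6)                       # Map never simplifies

str(simplified)
str(as_list)
identical(as_list, same_list)
```

When the function returns data frames, as otherfunc does, the list form is the one to rely on.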
In base R you can do it this way :
ltd2 <- Map(function(x,y) {names(x)[1] <- paste0(names(x)[1],y);x},ltd,seq_along(ltd))
str(ltd2)
# List of 3
# $ :'data.frame': 4 obs. of 3 variables:
# ..$ Abund1 : chr [1:4] "BROS" "KIS" "TTHS" "MKS"
# ..$ Value: F111: cold, Sample1: chr [1:4] "1.274e7" "" "" "2.301e7"
# ..$ Value: F111: warm, Sample1: chr [1:4] "" "" "" ""
# $ :'data.frame': 4 obs. of 3 variables:
# ..$ Abund2 : chr [1:4] "BROS" "TMS" "KIS" "HERS"
# ..$ Value: F216: cold, Sample2: chr [1:4] "1.670e6" "4.115e7" "" "1.302e7"
# ..$ Value: F216: warm, Sample2: chr [1:4] "" "2.766e7" "" "1.396e7"
# $ :'data.frame': 4 obs. of 3 variables:
# ..$ Abund3 : chr [1:4] "BROS" "TMS" "KIS" "HERS"
# ..$ Value: F655: cold, Sample3: chr [1:4] "7.074e4" "1.038e7" "" "7.380e5"
# ..$ Value: F655: warm, Sample3: chr [1:4] "" "6.874e6" "" "7.029e5"
But I would use purrr::imap and dplyr::rename_at for the same result:
library(purrr)
library(dplyr)
ltd3 <- imap(ltd,~rename_at(.,1,paste0,.y))

r: merge list of unnamed data sets

After importing data from a JSON stream, I have a list of 621 lists, each holding the same 22 variables.
List of 621
$ :List of 22
..$ _id : chr "55c79e711cbee48856a30886"
..$ number : num 1
..$ country : chr "Yemen"
..$ date : chr "2002-11-03T00:00:00.000Z"
..$ narrative : chr ""
..$ town : chr ""
..$ location : chr ""
..$ deaths : chr "6"
..$ deaths_min : chr "6"
..$ deaths_max : chr "6"
..$ civilians : chr "0"
..$ injuries : chr ""
..$ children : chr ""
..$ tweet_id : chr "278544689483890688"
..$ bureau_id : chr "YEM001"
..$ bij_summary_short: chr ""
..$ bij_link : chr ""
..$ target : chr ""
..$ lat : chr "15.47467"
..$ lon : chr "45.322755"
..$ articles : list()
..$ names : chr ""| __truncated__
$ :List of 22
..$ _id : chr "55c79e711cbee48856a30887"
..$ number : num 2
..$ country : chr "Pakistan"
..$ date : chr "2004-06-17T00:00:00.000Z"
..$ narrative : chr ""
..$ town : chr ""
..$ location : chr ""
..$ deaths : chr "6-8"
..$ deaths_min : chr "6"
..$ deaths_max : chr "8"
..$ civilians : chr "2"
..$ injuries : chr "1"
..$ children : chr "2"
..$ tweet_id : chr "278544750867533824"
..$ bureau_id : chr "B1"
..$ bij_summary_short: chr ""| __truncated__
..$ bij_link : chr ""
..$ target : chr ""
..$ lat : chr "32.30512565"
..$ lon : chr "69.57624435"
..$ articles : list()
..$ names : chr ""
...
How can I combine these lists into one data frame of 621 observations of 22 variables? Notice that all 621 lists are unnamed.
edit: Per request, here is how I got this data set:
library(rjson)
url <- 'http://api.dronestre.am/data'
document <- fromJSON(file=url, method='C')
str(document$strike)
Can you provide an example of how you generated the data? I have not tested this answer, but the following should help. If you update the question with how you produced the data, I can try it out.
update
library(rjson)
library(data.table)
library(dplyr)
url <- 'http://api.dronestre.am/data'
document <- fromJSON(file=url, method='C')
is(document)
listdata<- document$strike
df<-do.call(rbind,listdata) %>% as.data.table
dim(df)
purrr has a useful transpose function which 'inverts' a list. The $articles element causes trouble because it appears to be always empty, which scuppers the conversion to a data.frame, so I've dropped it.
library(purrr)
df <- transpose(document$strike) %>%
t %>%
apply(FUN = unlist, MARGIN = 2)
df <- df[-21] %>% data.frame %>% tbl_df
df
Source: local data frame [621 x 21]
X_id number country date
(fctr) (dbl) (fctr) (fctr)
1 55c79e711cbee48856a30886 1 Yemen 2002-11-03T00:00:00.000Z
2 55c79e711cbee48856a30887 2 Pakistan 2004-06-17T00:00:00.000Z
3 55c79e711cbee48856a30888 3 Pakistan 2005-05-08T00:00:00.000Z
4 55c79e721cbee48856a30889 4 Pakistan 2005-11-05T00:00:00.000Z
5 55c79e721cbee48856a3088a 5 Pakistan 2005-12-01T00:00:00.000Z
6 55c79e721cbee48856a3088b 6 Pakistan 2006-01-06T00:00:00.000Z
7 55c79e721cbee48856a3088c 7 Pakistan 2006-01-13T00:00:00.000Z
8 55c79e721cbee48856a3088d 8 Pakistan 2006-10-30T00:00:00.000Z
9 55c79e721cbee48856a3088e 9 Pakistan 2007-01-16T00:00:00.000Z
10 55c79e721cbee48856a3088f 10 Pakistan 2007-04-27T00:00:00.000Z
.. ... ... ... ...
Variables not shown: narrative (fctr), town (fctr), location (fctr), deaths
(fctr), deaths_min (fctr), deaths_max (fctr), civilians (fctr), injuries
(fctr), children (fctr), tweet_id (fctr), bureau_id (fctr), bij_summary_short
(fctr), bij_link (fctr), target (fctr), lat (fctr), lon (fctr), names (fctr)
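For comparison, the same reshaping can be sketched in base R on a toy stand-in for document$strike. The records below are invented and use only a few of the 22 fields. The sketch assumes every record has the same scalar fields and that articles is the only list-valued one.

```r
# Toy stand-in for document$strike: unnamed records with identical fields.
records <- list(
  list(number = 1, country = "Yemen",    deaths = "6",   articles = list()),
  list(number = 2, country = "Pakistan", deaths = "6-8", articles = list())
)

# Drop the empty list field, turn each record into a one-row data frame,
# then stack the rows.
rows <- lapply(records, function(r) {
  r$articles <- NULL
  as.data.frame(r, stringsAsFactors = FALSE)
})
strikes <- do.call(rbind, rows)
str(strikes)
```

Setting stringsAsFactors = FALSE keeps the text columns as character, which avoids the factor columns seen in the output above.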

R: changing all string nominal columns to integers

I have a dataset on which I'm planning to use ubRacing from the unbalanced package, but ubRacing only accepts numeric columns. Is there any way I can convert all the chr columns to numeric in R?
Thanks
'data.frame': 31000 obs. of 22 variables:
$ ID : int 1 2 3 4 5 6 7 8 9 10 ...
$ age : int 56 57 37 40 56 45 59 41 24 25 ...
$ job : chr "housemaid" "services" "services" "admin." ...
$ marital : chr "married" "married" "married" "married" ...
$ education : chr "basic.4y" "high.school" "high.school" "basic.6y" ...
$ default : chr "no" "unknown" "no" "no" ...
$ housing : chr "no" "no" "yes" "no" ...
$ loan : chr "no" "no" "no" "no" ...
$ contact : chr "telephone" "telephone" "telephone" "telephone" ...
$ month : chr "may" "may" "may" "may" ...
$ day_of_week : chr "mon" "mon" "mon" "mon" ...
It is not clear how the character columns should be converted to numeric. One possible option would be to convert each character column to a factor and then coerce it to numeric. We loop through the columns of the dataset with lapply.
df1[] <- lapply(df1, function(x) if(is.character(x)) as.numeric(factor(x))
else (x))
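It helps to see which numbers this conversion actually produces. The sketch below uses a made-up vector in the style of the default column. The codes are simply positions in the sorted set of distinct values, so they carry no meaningful order or distance.

```r
default <- c("no", "unknown", "no", "yes")

f <- factor(default)
levels(f)       # the distinct values, sorted
as.numeric(f)   # position of each value within levels(f)

# Keep a lookup table so the codes can be mapped back to labels later.
lookup <- data.frame(code = seq_along(levels(f)), label = levels(f))
lookup
```

Whether such integer codes are appropriate for a given model is a separate question; they are labels, not measurements.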

IBrokers Historical Index Data

How do I get historical data of an INDEX into R from Interactive Brokers? If it were futures, I would use this command (as suggested here IBrokers request Historical Futures Contract Data?):
library(twsInstrument)
a <- reqHistoricalData(tws, getContract("ESJUN2013"))
But the corresponding command with the conId of the S&P Index gives an error:
> a <- reqHistoricalData(tws, getContract("11004968"))
Connected with clientId 110.
Contract details request complete. Disconnected.
waiting for TWS reply on ES ....failed.
Warning message:
In errorHandler(con, verbose, OK = c(165, 300, 366, 2104, 2106, :
Error validating request:-'uc' : cause - HMDS Expired Contract Violation:contract can not expire.
P.S. Someone with enough points should create a tag for IBrokers
I don't have market data access to index data, but I think the following should work.
reqHistoricalData(tws, twsIndex(symbol = "SPX", exch = "CBOE"))
## waiting for TWS reply on SPX ....failed.
## NULL
## Warning message:
## In errorHandler(con, verbose, OK = c(165, 300, 366, 2104, 2106, :
## Historical Market Data Service error message:No market data permissions for CBOE IND
The following is the result of reqContractDetails using the same approach, which shows that twsIndex creates the contract object properly:
reqContractDetails(tws, twsIndex(symbol = "SPX", exch = "CBOE"))
## [[1]]
## List of 18
## $ version : chr "8"
## $ contract :List of 16
## ..$ conId : chr "416904"
## ..$ symbol : chr "SPX"
## ..$ sectype : chr "IND"
## ..$ exch : chr "CBOE"
## ..$ primary : chr ""
## ..$ expiry : chr ""
## ..$ strike : chr "0"
## ..$ currency : chr "USD"
## ..$ right : chr ""
## ..$ local : chr "SPX"
## ..$ multiplier : chr ""
## ..$ combo_legs_desc: chr ""
## ..$ comboleg : chr ""
## ..$ include_expired: chr ""
## ..$ secIdType : chr ""
## ..$ secId : chr ""
## ..- attr(*, "class")= chr "twsContract"
## $ marketName : chr "SPX"
## $ tradingClass : chr "SPX"
## $ conId : chr "416904"
## $ minTick : chr "0.01"
## $ orderTypes : chr [1:22] "ACTIVETIM" "ADJUST" "ALERT" "ALLOC" ...
## $ validExchanges: chr "CBOE"
## $ priceMagnifier: chr "1"
## $ underConId : chr "0"
## $ longName : chr "S&P 500 Stock Index"
## $ contractMonth : chr ""
## $ industry : chr "Indices"
## $ category : chr "Broad Range Equity Index"
## $ subcategory : chr "*"
## $ timeZoneId : chr "CST"
## $ tradingHours : chr "20130321:0830-1500;20130322:0830-1500"
## $ liquidHours : chr "20130321:0830-1500;20130322:0830-1500"
##
