Failing to create a couponbonds object in the termstrc package in R

I am trying to use the R package termstrc to estimate the term structure. To do that, I have to prepare the data in the couponbonds class that the package requires. I used fake data to rule out problems with the real data, but after many attempts it still does not work.
Any idea what is going wrong?
Structure of the official demo data, which works:
data("govbonds")
str(govbonds)
List of 3
$ GERMANY:List of 8
..$ ISIN : chr [1:52] "DE0001141414" "DE0001137131" "DE0001141422" "DE0001137149" ...
..$ MATURITYDATE: Date[1:52], format: "2008-02-15" "2008-03-14" "2008-04-11" ...
..$ ISSUEDATE : Date[1:52], format: "2002-08-14" "2006-03-08" "2003-04-11" ...
..$ COUPONRATE : num [1:52] 0.0425 0.03 0.03 0.0325 0.0413 ...
..$ PRICE : num [1:52] 100 99.9 99.8 99.8 100.1 ...
..$ ACCRUED : num [1:52] 4.09 2.66 2.43 2.07 2.39 ...
..$ CASHFLOWS :List of 3
.. ..$ ISIN: chr [1:384] "DE0001141414" "DE0001137131" "DE0001141422" "DE0001137149" ...
.. ..$ CF : num [1:384] 104 103 103 103 104 ...
.. ..$ DATE: Date[1:384], format: "2008-02-15" "2008-03-14" "2008-04-11" ...
..$ TODAY : Date[1:1], format: "2008-01-30"
#another two are omitted here
- attr(*, "class")= chr "couponbonds"
> ns_res <- estim_nss(govbonds, c("GERMANY"), method = "ns",tauconstr=list(c(0.2, 5, 0.1)))
[1] "Searching startparameters for GERMANY"
beta0 beta1 beta2 tau1
5.008476 -1.092510 -3.209695 2.400100
My code to prepare the fake data:
bond=list()
bond$CHINA=list()
n=30*12#suppose I have n bond
enddate=as.Date('2014/11/7')
isin=sprintf('DE%010d',1:n)#some fake ISIN
bond$CHINA$ISIN=isin
bond$CHINA$MATURITYDATE=enddate+(1:n)*30
bond$CHINA$ISSUEDATE=rep(enddate,n)
bond$CHINA$COUPONRATE=rep(5/100,n)
bond$CHINA$PRICE=rep(100,n)
bond$CHINA$ACCRUED=rep(0,n)
bond$CHINA$CASHFLOWS=list()
bond$CHINA$CASHFLOWS$ISIN=isin
bond$CHINA$CASHFLOWS$CF=100+(1:n)*5/12
bond$CHINA$CASHFLOWS$DATE=enddate+(1:n)*30
bond$CHINA$TODAY=enddate
class(bond)='couponbonds'
ns_res <- estim_nss(bond, c("CHINA"), method = "ns",tauconstr=list(c(0.2, 5, 0.1)))
The output:
Error in `colnames<-`(`*tmp*`, value = c("DE0000000001", "DE0000000002", :
attempt to set 'colnames' on an object with less than two dimensions

The problem was finally solved by adding one cash flow with amount zero to CASHFLOWS$CF.
Put another way, at least one bond must have at least two cash flows.
You may then hit another error raised by the uniroot function. Be sure to include only cash flows dated after TODAY; termstrc does not use TODAY to filter the cash flows for you.
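A minimal sketch of that filtering step, assuming the list layout shown in the question. The helper name drop_past_cashflows and the toy values are hypothetical; they are not part of termstrc, and the sketch only prepares the data without calling estim_nss.

```r
# Hypothetical helper (not part of termstrc): keep only cash flows dated after TODAY.
drop_past_cashflows <- function(group) {
  keep <- group$CASHFLOWS$DATE > group$TODAY
  # apply the same filter to ISIN, CF and DATE so the three vectors stay aligned
  group$CASHFLOWS <- lapply(group$CASHFLOWS, function(x) x[keep])
  group
}

today <- as.Date("2014-11-07")
group <- list(
  ISIN = c("DE0000000001", "DE0000000002"),
  CASHFLOWS = list(
    ISIN = c("DE0000000001", "DE0000000001", "DE0000000002", "DE0000000002"),
    CF   = c(5, 105, 5, 105),
    DATE = today + c(-30, 335, 180, 545)   # the first cash flow lies before TODAY
  ),
  TODAY = today
)

group <- drop_past_cashflows(group)

# Check that at least one bond still has two or more cash flows.
cf_per_bond <- table(group$CASHFLOWS$ISIN)
has_two <- any(cf_per_bond >= 2)
```

If has_two were FALSE after filtering, the data would be back in the situation that triggered the colnames error above.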


Split function not maintaining structure of dataframe?

I am doing hierarchical clustering in R and need the elements of each cluster separately.
When I use the following code, the data is split into a list of 3 plain numeric vectors such as num [1:2628]; none of the column information from the original data frame (dataA) is carried over.
clusterA <- hclust(dist(dataA),method = "single")
NumA = 3
label <- cutree(clusterA, NumA)
clusterXlist<-split(dataA,f=label)
str(clusterXlist[[1]])
How can I make sure that it maintains the structure of dataA?
Edit:
In my case
>str(clusterXlist[[1]])
num [1:2628] 0.0529 -0.3909 -0.4465 0.1 0.8393 ...
whereas for dataA
> str(dataA)
num [1:440, 1:6] 0.0529 -0.3909 -0.4465 0.1 0.8393 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
- attr(*, "scaled:center")= Named num [1:6] 12000 5796 7951 3072 2881 ...
..- attr(*, "names")= chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
- attr(*, "scaled:scale")= Named num [1:6] 12647 7380 9503 4855 4768 ...
..- attr(*, "names")= chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
edit2 :
for dataA
> dput(head(dataA,n=20))
structure(c(0.0528730042415329, -0.390857056063646, -0.44652098379972,
0.0999975794271863, 0.839284119671916, -0.204572661537808, 0.00993903725191922,
-0.349583518736614, -0.477357534676238, -0.473957607271904, -0.682697336282181,
0.0905884780058897, 1.55872457204484, 0.728746944991474, 1.00042486502152,
-0.138155475034538, -0.868191050016313, -0.484236457564077, 0.521904849881291,
-0.333690834823332, 0.522972471408079, 0.543838613660349, 0.408073194590386,
-0.623310408164662, -0.0523368792616442, 0.333686752405346, -0.351915064454946,
-0.113851350576777, -0.291078065290861, 0.717677967619194, -0.053285340273111,
-0.63306600713975, 0.883794139056095, 0.0557876760455718, 0.497093035238056,
-0.634420951441845, 0.409157150032062, 0.0488774601048851, 0.0719115132405076,
-0.447303143322465, -0.0410681453901357, 0.170124700204028, -0.0281250860936324,
-0.3925300807586, -0.0792659545334748, -0.297298628211157, -0.10273182626616,
0.15518230654465, -0.185125447641461, 1.15011422238562, 0.528531691780372,
-0.360751187201331, 0.400469064432042, 0.739829765498898, 0.435615257968889,
-0.434621330503326, 0.438772101699743, -0.528063904936618, 0.226000834240152,
0.159180975270399, -0.588697039406295, -0.269829034507317, -0.137379339965946,
0.68636300602308, 0.173661155768845, -0.495590877769126, -0.533904475256987,
-0.288985833251248, -0.545233764836731, -0.394039245717966, 0.273564891153861,
-0.340276616984998, -0.573659982327726, 0.00475174748902491,
-0.572218072744849, -0.551001403168238, -0.605176006067741, -0.459955112363749,
-0.178576756619561, -0.494972916519322, -0.0435191938188023,
0.0863085949200282, 0.13308015693741, -0.498021323377842, -0.23165413161966,
-0.227878848586867, 0.0542186891412866, 0.0921812574154842, -0.244448146341904,
0.952945788892319, 0.649245242698738, -0.489212329634658, 0.209634507324604,
0.802353943473126, 0.456496070080021, -0.40217108193415, 0.341140199633565,
-0.526755422016323, -0.0240135648160378, -0.0762383134363428,
-0.066263629344282, 0.0890496850231094, 2.24074190324533, 0.0933048443208461,
1.29786952218849, -0.0261942126239276, -0.347458739603052, 0.369181005457445,
-0.274766434933383, 0.203229792845712, 0.0777025935624781, -0.364479376793999,
0.498608767430271, -0.327246732938803, 0.228051555415843, -0.394620088486301,
-0.157749554245622, 1.04716972023017, 0.587257919466454, -0.36306099036142
), .Dim = c(20L, 6L), .Dimnames = list(NULL, c("Fresh", "Milk",
"Grocery", "Frozen", "Detergents_Paper", "Delicassen")))
for clusterXlist[[1]] which was obtained by split of dataA
> dput(head(clusterXlist[[1]],n=20))
c(0.0528730042415329, -0.390857056063646, -0.44652098379972,
0.0999975794271863, 0.839284119671916, -0.204572661537808, 0.00993903725191922,
-0.349583518736614, -0.477357534676238, -0.473957607271904, -0.682697336282181,
0.0905884780058897, 1.55872457204484, 0.728746944991474, 1.00042486502152,
-0.138155475034538, -0.868191050016313, -0.484236457564077, 0.521904849881291,
-0.333690834823332)
What you have there is a matrix, not a data frame.
class(dataA)
# [1] "matrix"
The quick and easy way to split() would be to do
split(as.data.frame(dataA), label)
However, this may cause issues in later calculations and you may need to resort to coercing those list elements back to a matrix. I would recommend you use lapply() to split the data, as follows.
clusterXlist <- lapply(
  unique(label),
  function(i) dataA[label == i, , drop = FALSE]
)
to properly maintain your matrix structure throughout your list elements.
str(clusterXlist[[1]])
# num [1:18, 1:6] 0.0529 -0.3909 0.1 0.8393 -0.2046 ...
# - attr(*, "dimnames")=List of 2
# ..$ : NULL
# ..$ : chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
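To see the difference on a small case, here is a sketch with a made-up 4 x 3 matrix (the names m, flat and parts are illustrative only). It splits the row indices rather than looping over unique(label), a variation that keeps the cluster labels as list names.

```r
m <- matrix(1:12, nrow = 4, dimnames = list(NULL, c("a", "b", "c")))
label <- c(1, 2, 1, 2)

# split() on a matrix treats it as a plain vector: dimensions are lost
flat <- split(m, label)
is.matrix(flat[[1]])

# splitting the row indices and subsetting with drop = FALSE keeps each piece a matrix
parts <- lapply(split(seq_len(nrow(m)), label),
                function(rows) m[rows, , drop = FALSE])
dim(parts[["1"]])
colnames(parts[["1"]])
```

The drop = FALSE argument matters when a cluster has a single row; without it that piece would collapse to a vector.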

Create column in R in a large database

My apologies if this question has already been answered; I could not find it. I'll post everything I have tried. The problem is that the database is large and my PC (Core i7, 8 GB RAM) cannot perform the calculation. I'm using Microsoft R Open 3.3.2 and RStudio 1.0.136.
I've been trying to create a new column in a large data set in R, tcm.RData (471 MB). I need a column that divides Shape_Area by the sum of Shape_Area within each COD (which I call ShapeSum). I first tried to do it in a single formula; when that failed, I tried two steps: 1) sum Shape_Area by COD and, if that succeeded, 2) divide Shape_Area by ShapeSum.
> str(tcm)
Classes ‘data.table’ and 'data.frame': 26835293 obs. of 15 variables:
$ OBJECTID : int 1 2 3 4 5 6 7 8 9 10 ...
$ LAT : num -15.7 -15.7 -15.7 -15.7 -15.7 ...
$ LONG : num -58.1 -58.1 -58.1 -58.1 -58.1 ...
$ UF : chr "MT" "MT" "MT" "MT" ...
$ COD : num 510562 510562 510562 510562 510562 ...
$ AREA_97 : num 1130 1130 1130 1130 1130 ...
$ Shape_Area: num 255266.7 14875 25182.2 5503.9 95.5 ...
$ TYPE : chr "2" "2" "2" "2" ...
$ Nomes : chr NA NA NA NA ...
$ NEAR_DIST : num 376104 371332 371410 371592 371330 ...
$ tc_2004 : chr "AREA_URBANA" "DESFLORESTAMENTO_2004" "DESFLORESTAMENTO_2004" "DESFLORESTAMENTO_2004" ...
$ tc_2008 : chr "AREA_URBANA" "AREA_NAO_OBSERVADA" "AREA_NAO_OBSERVADA" "AREA_NAO_OBSERVADA" ...
$ tc_2010 : chr "AREA_URBANA" "PASTO_LIMPO" "PASTO_LIMPO" "PASTO_LIMPO" ...
$ tc_2012 : chr "AREA_URBANA" "PASTO_SUJO" "PASTO_SUJO" "PASTO_SUJO" ...
$ tc_2014 : chr "AREA_URBANA" "PASTO_LIMPO" "PASTO_LIMPO" "PASTO_SUJO" ...
- attr(*, ".internal.selfref")=<externalptr>
> tcm$ShapeSum <- tcm[, Shape_Area := sum(tcm$Shape_Area), by="COD"]
Error: cannot allocate vector of size 204.7 Mb
Error during wrapup: cannot allocate vector of size 542.3 Mb
I also tried the following codes, but all of them failed:
> tcm$ShapeSum <- apply(tcm[, c(Shape_Area)], 1, function(x) sum(x), by="COD")
Error in apply(tcm[, c(Shape_Area)], 1, function(x) sum(x), by = "COD") :
dim(X) must have a positive length
> tcm$ShapeSum <- mutate(tcm, ShapeSum = sum(Shape_Area), by="COD", package = "dplyr")
Error: cannot allocate vector of size 204.7 Mb
Error during wrapup: cannot allocate vector of size 542.3 Mb
> tcm$ShapeSum <- tcm[, transform(tcm, ShapeSum = sum(Shape_Area)), by="COD"]
> tcm$ShapeSum <- transform(tcm, aggregate(tcm$AreaShape, by=list(Category=tcm$COD), FUN=sum))
Error in aggregate.data.frame(as.data.frame(x), ...): no rows to aggregate
Thank you very much for your attention and for any suggestions to solve this problem.
We can use data.table to create the column; it is more efficient because the assignment (:=) happens in place:
library(data.table)
tcm[, ShapeSum := sum(Shape_Area), by = COD]
Or, as @user20650 suggested, it could be (based on the OP's description):
tcm[, ShapeSum := Shape_Area/sum(Shape_Area), by = COD]
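A small sketch of the grouped assignment on made-up data (the table dt, its values and the column name ShapeShare are hypothetical). Note that in the question's first attempt, sum(tcm$Shape_Area) refers to the whole column regardless of by, and the := there overwrites Shape_Area itself; referring to the bare column name inside [ ] is what makes the sum group-wise.

```r
library(data.table)

# Toy stand-in for tcm
dt <- data.table(COD = c(1, 1, 2, 2, 2),
                 Shape_Area = c(10, 30, 5, 5, 10))

# Group total, added by reference (no copy of the table is made)
dt[, ShapeSum := sum(Shape_Area), by = COD]

# Share of each polygon within its COD
dt[, ShapeShare := Shape_Area / ShapeSum]

# Shares should add up to 1 within every COD
check <- dt[, .(total = sum(ShapeShare)), by = COD]
```

Because := modifies dt in place, there is no need to assign the result back with tcm$ShapeSum <- ..., which is one of the steps that triggered extra allocations in the question.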
library(data.table)
tcm <- fread("your_tcm_file.txt")  # read the file directly into a data.table
tcm[, newColumn := oldColumnPlusOne + 1]  # := adds the column by reference
More:
https://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.html

Scrape with a loop and avoid 404 error

I am trying to scrape Wikipedia for certain astronomy-related definitions for my project. The code works pretty well, but I am not able to avoid 404s. I tried tryCatch, but I think I am missing something.
I am looking for a way to overcome 404s while running a loop. Here is my code:
library(rvest)
library(httr)
library(XML)
library(tm)
topic<-c("Neutron star", "Black hole", "sagittarius A")
for(i in topic){
site<- paste("https://en.wikipedia.org/wiki/", i)
site <- read_html(site)
stats<- xmlValue(getNodeSet(htmlParse(site),"//p")[[1]]) #only the first paragraph
#error = function(e){NA}
stats[["topic"]] <- i
stats<- gsub('\\[.*?\\]', '', stats)
#stats<-stats[!duplicated(stats),]
#out.file <- data.frame(rbind(stats,F[i]))
output<-rbind(stats,i)
}
Build the URLs for the loop using sprintf.
Extract all the body text from the paragraph nodes.
Remove any empty paragraphs (strings with zero characters).
I added a step that keeps all of the body text, with each paragraph prefixed by [paragraph - n] for reference, because, well, friends don't let friends waste data or make multiple HTTP requests.
Build a data frame for each topic in your list, with these columns:
wiki_url: should be obvious
topic: from the topics list
info_summary: the first paragraph (the one you mentioned in your post)
all_info: all paragraphs, in case you need more
Bind all of the data frames in the list into one.
Note that I use an older, source version of rvest. For ease of understanding, I simply assign the name html to what would be your read_html.
library(rvest)
library(jsonlite)
html <- rvest::read_html
wiki_base <- "https://en.wikipedia.org/wiki/%s"
my_table <- lapply(sprintf(wiki_base, topic), function(i){
  # text of every <p> node on the page
  raw_1 <- html_text(html_nodes(html(i), "p"))
  # drop empty paragraphs
  raw_valid <- raw_1[nchar(raw_1) > 0]
  # prefix each paragraph with its index and collapse everything into one string
  all_info <- lapply(seq_along(raw_valid), function(j){
    sprintf(' [paragraph - %d] %s ', j, raw_valid[[j]])
  }) %>% paste0(collapse = "")
  data.frame(wiki_url = i,
             topic = basename(i),
             info_summary = raw_valid[[1]],
             all_info = trimws(all_info),
             stringsAsFactors = FALSE)
}) %>% rbind.pages
> str(my_table)
'data.frame': 3 obs. of 4 variables:
$ wiki_url : chr "https://en.wikipedia.org/wiki/Neutron star" "https://en.wikipedia.org/wiki/Black hole" "https://en.wikipedia.org/wiki/sagittarius A"
$ topic : chr "Neutron star" "Black hole" "sagittarius A"
$ info_summary: chr "A neutron star is the collapsed core of a large star (10–29 solar masses). Neutron stars are the smallest and densest stars kno"| __truncated__ "A black hole is a region of spacetime exhibiting such strong gravitational effects that nothing—not even particles and electrom"| __truncated__ "Sagittarius A or Sgr A is a complex radio source at the center of the Milky Way. It is located in the constellation Sagittarius"| __truncated__
$ all_info : chr " [paragraph - 1] A neutron star is the collapsed core of a large star (10–29 solar masses). Neutron stars are the smallest and "| __truncated__ " [paragraph - 1] A black hole is a region of spacetime exhibiting such strong gravitational effects that nothing—not even parti"| __truncated__ " [paragraph - 1] Sagittarius A or Sgr A is a complex radio source at the center of the Milky Way. It is located in the constell"| __truncated__
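One detail worth checking in the question's loop: paste() separates its arguments with a space by default, so the URL it builds contains a space right after wiki/. The sketch below contrasts that with sprintf; replacing the remaining spaces with underscores is my own addition, based on how Wikipedia writes article URLs, and the names bad and good are illustrative.

```r
topic <- c("Neutron star", "Black hole", "sagittarius A")

# paste() inserts sep = " " between its arguments
bad <- paste("https://en.wikipedia.org/wiki/", topic[1])

# sprintf() (or paste0()) adds nothing; underscores replace the spaces in the titles
good <- sprintf("https://en.wikipedia.org/wiki/%s", gsub(" ", "_", topic))

bad
good
```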
EDIT
A function for error handling. It returns a logical, so this becomes our first step.
library(httr)  # provides HEAD() and status_code()
url_works <- function(url){
  tryCatch(
    identical(status_code(HEAD(url)), 200L),
    error = function(e){
      FALSE
    })
}
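The same tryCatch idea can also wrap each iteration directly, so that one failing topic is skipped instead of stopping the loop. This is only a sketch: fetch_summary is a hypothetical stub standing in for the real read_html/html_text step so that the example runs offline, and its placeholder texts are made up.

```r
# Hypothetical stand-in for the scraping step; it raises an error for unknown
# topics, much as a request for a missing page would.
fetch_summary <- function(topic) {
  known <- c("Neutron star" = "placeholder text A",
             "Black hole"   = "placeholder text B")
  if (!topic %in% names(known)) stop("HTTP error 404")
  known[[topic]]
}

topics <- c("Neutron star", "No such page", "Black hole")

results <- lapply(topics, function(tp) {
  tryCatch(
    data.frame(topic = tp, summary = fetch_summary(tp), stringsAsFactors = FALSE),
    error = function(e) NULL   # skip topics whose request failed
  )
})

# Drop the skipped topics and bind the rest into one data frame
output <- do.call(rbind, Filter(Negate(is.null), results))
```

Collecting the pieces in a list and binding once at the end also avoids the problem in the original loop, where output was overwritten on every iteration.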
Based on your use of 'exoplanet', here is all of the applicable data from the wiki page:
exo_data <- (html_nodes(html('https://en.wikipedia.org/wiki/List_of_exoplanets'),'.wikitable')%>%html_table)[[2]]
str(exo_data)
'data.frame': 2048 obs. of 16 variables:
$ Name : chr "Proxima Centauri b" "KOI-1843.03" "KOI-1843.01" "KOI-1843.02" ...
$ bf : int 0 0 0 0 0 0 0 0 0 0 ...
$ Mass (Jupiter mass) : num 0.004 0.0014 NA NA 0.1419 ...
$ Radius (Jupiter radii) : num NA 0.054 0.114 0.071 1.012 ...
$ Period (days) : num 11.186 0.177 4.195 6.356 19.224 ...
$ Semi-major axis (AU) : num 0.05 0.0048 0.039 0.052 0.143 0.229 0.0271 0.053 1.33 2.1 ...
$ Ecc. : num 0.35 1.012 NA NA 0.0626 ...
$ Inc. (deg) : num NA 72 89.4 88.2 87.1 ...
$ Temp. (K) : num 234 NA NA NA 707 ...
$ Discovery method : chr "radial vel." "transit" "transit" "transit" ...
$ Disc. Year : int 2016 2012 2012 2012 2010 2010 2010 2014 2009 2005 ...
$ Distance (pc) : num 1.29 NA NA NA 650 ...
$ Host star mass (solar masses) : num 0.123 0.46 0.46 0.46 1.05 1.05 1.05 0.69 1.25 0.22 ...
$ Host star radius (solar radii): num 0.141 0.45 0.45 0.45 1.23 1.23 1.23 NA NA NA ...
$ Host star temp. (K) : num 3024 3584 3584 3584 5722 ...
$ Remarks : chr "Closest exoplanet to our Solar System. Within host star’s habitable zone; possibly Earth-like." "controversial" "controversial" "controversial" ...
Test our url_works function on a random sample of the table:
tests <- dplyr::sample_frac(exo_data, 0.02) %>% .$Name
Now let's build a reference table with the name, the URL to check, and a logical indicating whether the URL is valid, and in one step create a list of two data frames: one containing the URLs that don't exist and the other containing those that do. The ones that check out can be run through the function above with no issues. This way the error handling is done before we actually start trying to parse in a loop. It avoids headaches and gives a reference back to the items that need to be looked into further.
b <- ldply(sprintf('https://en.wikipedia.org/wiki/%s', tests), function(i){  # ldply() is from plyr
  data.frame(name = basename(i), url_checked = i, url_valid = url_works(i))
}) %>% split(.$url_valid)
> str(b)
List of 2
$ FALSE:'data.frame': 24 obs. of 3 variables:
..$ name : chr [1:24] "Kepler-539c" "HD 142 A c" "WASP-44 b" "Kepler-280 b" ...
..$ url_checked: chr [1:24] "https://en.wikipedia.org/wiki/Kepler-539c" "https://en.wikipedia.org/wiki/HD 142 A c" "https://en.wikipedia.org/wiki/WASP-44 b" "https://en.wikipedia.org/wiki/Kepler-280 b" ...
..$ url_valid : logi [1:24] FALSE FALSE FALSE FALSE FALSE FALSE ...
$ TRUE :'data.frame': 17 obs. of 3 variables:
..$ name : chr [1:17] "HD 179079 b" "HD 47186 c" "HD 93083 b" "HD 200964 b" ...
..$ url_checked: chr [1:17] "https://en.wikipedia.org/wiki/HD 179079 b" "https://en.wikipedia.org/wiki/HD 47186 c" "https://en.wikipedia.org/wiki/HD 93083 b" "https://en.wikipedia.org/wiki/HD 200964 b" ...
..$ url_valid : logi [1:17] TRUE TRUE TRUE TRUE TRUE TRUE ...
Obviously, the second item of the list contains the data frame with the valid URLs, so apply the earlier function to the url_checked column of that one. Note that I sampled the table of all planets for the purposes of explanation. There are 2400-some-odd names, so that check will take a minute or two to run in your case. Hope that wraps it up for you.
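A small offline sketch of that last step, with a made-up reference table (the data frame ref and its values are hypothetical). Picking the element by name rather than by position is my own variation; it also works if every URL happens to be valid and the list has only one element.

```r
# Toy reference table with a logical validity column
ref <- data.frame(
  name = c("planet a", "planet b", "planet c"),
  url_valid = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# split() on a logical gives list elements named "FALSE" and "TRUE"
b <- split(ref, ref$url_valid)
names(b)

# Select the valid rows by name instead of relying on b[[2]]
valid <- b[["TRUE"]]
valid$name
```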

R dataframe define column names at creation

I get monthly price values for the two assets below from Yahoo:
if(!require("tseries") | !require(its) ) { install.packages(c("tseries", 'its')); require("tseries"); require(its) }
startDate <- as.Date("2000-01-01", format="%Y-%m-%d")
MSFT.prices = get.hist.quote(instrument="msft", start= startDate,
quote="AdjClose", provider="yahoo", origin="1970-01-01",
compression="m", retclass="its")
SP500.prices = get.hist.quote(instrument="^gspc", start=startDate,
quote="AdjClose", provider="yahoo", origin="1970-01-01",
compression="m", retclass="its")
I want to put these two into a single data frame with specified column names (pandas allows this now, which is a bit ironic since it took the data frame concept from R). As below, I combine the two time series and give them names:
MSFTSP500.prices <- data.frame(msft = MSFT.prices, sp500= SP500.prices )
However, this does not preserve the column names [msft, sp500] that I specified. I need to define the column names in a separate line of code:
colnames(MSFTSP500.prices) <- c("msft", "sp500")
I tried to put colnames and col.names inside the data.frame() call but it doesn't work. How can I define column names while creating the data frame?
I found ?data.frame very unhelpful...
The code fails with an error message indicating that as.its is not available, so I added the missing code (which appears to have succeeded after two failed attempts). Once you issue the missing require() call, you can use str to see what sort of object get.hist.quote actually returns. It is neither a data frame nor a zoo object, although it resembles a zoo object in many ways:
> str(SP500.prices)
Formal class 'its' [package "its"] with 2 slots
..@ .Data: num [1:180, 1] 1394 1366 1499 1452 1421 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:180] "2000-01-02" "2000-01-31" "2000-02-29" "2000-04-02" ...
.. .. ..$ : chr "AdjClose"
..@ dates: POSIXct[1:180], format: "2000-01-02 16:00:00" "2000-01-31 16:00:00" ...
If you run cbind on those two objects you get a regular matrix with dimnames:
> str(cbind(SP500.prices, MSFT.prices) )
num [1:180, 1:2] 1394 1366 1499 1452 1421 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:180] "2000-01-02" "2000-01-31" "2000-02-29" "2000-04-02" ...
..$ : chr [1:2] "AdjClose" "AdjClose"
You will still need to change the column names, since there does not seem to be a cbind.its that lets you assign column names. I would caution against the data.frame method, since the resulting object might behave confusingly:
> str( MSFTSP500.prices )
'data.frame': 180 obs. of 2 variables:
$ AdjClose :Formal class 'AsIs', 'its' [package ""] with 1 slot
.. ..@ .S3Class: chr "AsIs" "its"
$ AdjClose.1:Formal class 'AsIs', 'its' [package ""] with 1 slot
.. ..@ .S3Class: chr "AsIs" "its"
The columns are still S4 objects. I suppose that might be useful if you were going to pass them to other its-methods but could be confusing otherwise. This might be what you were shooting for:
> MSFTSP500.prices <- data.frame(msft = as.vector(MSFT.prices),
sp500= as.vector(SP500.prices) ,
row.names= as.character(MSFT.prices@dates) )
> str( MSFTSP500.prices )
'data.frame': 180 obs. of 2 variables:
$ msft : num 35.1 32 38.1 25 22.4 ...
$ sp500: num 1394 1366 1499 1452 1421 ...
> head(rownames(MSFTSP500.prices))
[1] "2000-01-02 16:00:00" "2000-01-31 16:00:00" "2000-02-29 16:00:00"
[4] "2000-04-02 17:00:00" "2000-04-30 17:00:00" "2000-05-31 17:00:00"
MSFT.prices is a zoo object, which seems to be data-frame-like, with its own column name that gets transferred to the result. Compare:
tmp <- data.frame(a=1:10)
b <- data.frame(lost=tmp)
which drops the name lost and keeps the inner column name a.
If you do
MSFTSP500.prices <- data.frame(msft = as.vector(MSFT.prices),
sp500=as.vector(SP500.prices))
then you will get the colnames you want (though you won't get zoo-specific behaviours). Not sure why you object to renaming columns in a second command, though.
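The naming behaviour can be reproduced without the its package. The sketch below uses one-column base R matrices with their own column name as a rough stand-in for the price objects (an assumption on my part; the values are taken from the str output above, and the names x, y, kept and renamed are illustrative).

```r
# One-column matrices that carry their own column name, like the price series
x <- matrix(c(35.1, 32.0, 38.1), ncol = 1, dimnames = list(NULL, "AdjClose"))
y <- matrix(c(1394, 1366, 1499), ncol = 1, dimnames = list(NULL, "AdjClose"))

# The inner column name wins over the argument name
kept <- data.frame(msft = x, sp500 = y)
names(kept)

# Stripping the object down to a plain vector lets the argument name through
renamed <- data.frame(msft = as.vector(x), sp500 = as.vector(y))
names(renamed)
```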

read.zoo works but then as.xts fails with "currently unsupported data type"

I've a csv file of daily bars, with just two lines:
"datestamp","Open","High","Low","Close","Volume"
"2012-07-02",79.862,79.9795,79.313,79.509,48455
(That file was an xts that was converted to a data.frame then passed on to write.csv)
I load it with this:
z=read.zoo(file='tmp.csv',sep=',',header=T,format = "%Y-%m-%d")
And it is fine as print(z) shows:
Open High Low Close Volume
2012-07-02 79.862 79.9795 79.313 79.509 48455
But then as.xts(z) gives: Error in coredata.xts(x) : currently unsupported data type
Here is the str(z) output:
‘zoo’ series from 2012-07-02 to 2012-07-02
Data:List of 5
$ : num 79.9
$ : num 80
$ : num 79.3
$ : num 79.5
$ : int 48455
- attr(*, "dim")= int [1:2] 1 5
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:5] "Open" "High" "Low" "Close" ...
Index: Date[1:1], format: "2012-07-02"
I've so far confirmed it is not that 4 columns are num and one column is int, as I still get the error even after removing the Volume column. But, then, what could that error message be talking about?
As Sebastian pointed out in the comments, the problem is the single row. Specifically, the coredata is a list when read.zoo reads a single row, but something else (a matrix?) when there are two or more rows.
I replaced the call to read.zoo with the following, and it works fine whether there are 1 or 2+ rows:
d=read.table(fname,sep=',',header=T)
x=as.xts(subset(d,select=-datestamp),order.by=as.Date(d$datestamp))
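For a self-contained check of the single-row case, here is a sketch that reads the two CSV lines from a string instead of a file. Converting the value columns with as.matrix() and calling xts() directly is a variation of mine on the workaround above, not the exact call used there.

```r
library(xts)

csv <- '"datestamp","Open","High","Low","Close","Volume"
"2012-07-02",79.862,79.9795,79.313,79.509,48455'

d <- read.table(text = csv, sep = ",", header = TRUE, stringsAsFactors = FALSE)

# Every column except the date becomes a numeric matrix, even with one row
values <- as.matrix(d[, setdiff(names(d), "datestamp"), drop = FALSE])
x <- xts(values, order.by = as.Date(d$datestamp))
dim(x)
```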
