How to change the list of kdims of a holoviews Dataset - holoviews

I have a tabular data set with multiple columns that might serve as the key dimension for different plots.
ds = hv.Dataset(data_df, kdims=['time', 'forecasttime', 'group'], vdims=['speed'])
I could initially use a curve plot:
ds.to(hv.Curve, kdims=['time'], vdims=['speed'])
This would provide timeseries curves with a selector widget on 'forecasttime', 'group'.
What I would like to achieve is a curve plot that ignores the key dimensions 'forecasttime' and 'group'. While I could certainly achieve the same thing by defining my Dataset object in a different way, like the following:
ds = hv.Dataset(data_df, kdims=['time'], vdims=['speed'])
ds.to(hv.Curve, kdims=['time'], vdims=['speed'])
I was hoping that I could remove a key dimension from the kdims of ds after it has been defined. What can I do?
I am new to HoloViews. Perhaps I am not using the Dataset object the correct way; I would also appreciate any advice.

You can easily ignore the additional dimensions by declaring the groupby keyword to be empty in the call to .to, e.g.
ds.to(hv.Curve, kdims=['time'], vdims=['speed'], groupby=[])
That said, in the case of a Curve it is a bit weird to just ignore a dimension, and you may end up with the curve zig-zagging across the plot. Since I don't know the structure of your data, this may well be a valid thing to do. If you instead want to overlay each Curve, something like this might be what you want:
ds.to(hv.Curve, kdims=['time'], vdims=['speed'], groupby=['group']).overlay()
or written more simply:
ds.to(hv.Curve, 'time', 'speed', 'group').overlay()

Related

Plotting POSIXct in ggplot manually scaling x-axis

I am trying to plot this wind speed data, with years displayed on the x-axis. The data frame was set up as
wsAvg<-data.frame(date=as.POSIXct(ws07$date[1224:1559]),u.1=(ws07$u[1224:1559]),stringsAsFactors = FALSE)
wsAvg<-rbind(wsAvg,c(date=as.POSIXct(ws08$date[1032:1367]),(ws08$u[1032:1367])))
Below I use ggplot to plot my wind speed data frame:
ggplot(wsAvg,aes(x=date,y=as.numeric(u.1)))+geom_point(size=3,pch=2)+
geom_smooth(method="lm",colour="black",se=FALSE)+
#scale_x_datetime(limits=as.POSIXct(c('2006-09-01','2016-10-01')),breaks=date_breaks("1 year"),labels=date_format("%Y"))+
Without the scale_x_datetime() call in my command, the dates plot out as expected. When I add scale_x_datetime() to manually scale my x-axis to display only years, all my data lines up on 2007. Does anyone know why this is?
It is very difficult to give a definite answer, since we don't have a clear picture of your data. With that said, let's look at the information you did provide and see where the likely source of the problem is.
The issue is clearly related to the formatting/data located in your "date" column. It's best to look at this stepwise and test at each step to see what can go wrong here:
Your raw data: There is likely nothing wrong with your base data, but we don't know the format of the "date" vector coming from ws07$date[1224:1559] and ws08$date[1032:1367]. Your raw data originates from two data frames, so just confirm that the raw data from these two vectors is formatted identically, but more importantly, is it already formatted as a date? What is class(ws08$date)? Also, what does the data look like if you took a sample of that dataset? (e.g. ws07$date[sample(1224:1559, 20)]).
Conversion to POSIXct: The first code you show includes as.POSIXct(), but does not include the argument for format=. You may or may not need to specify this, but I would recommend consulting the documentation to be sure you're using the function correctly. You can try converting a small subset of the data just using as.POSIXct(ws07$date[1224:1250]) or something like that. Does it give you the dates formatted correctly? If not, try specifying the format= arg until it "works" as you intended.
Initial plot and second plot: The data is spread out in the first plot, likely roughly how you expected. What about the month/day combinations in the first plot - are they correct? If they are, it may indicate the year is being read wrong, since apparently all dates are clustered around May and June of 2007. Comparing the first and second plots, there is no obvious issue with scale_x_datetime() itself; both plots are consistent with x values that are dates ranging from May to June of 2007.
Bottom line: Hard to discern exactly where it's going wrong for you, but likely it's (1) in the conversion to date using as.POSIXct from your ws07 and ws08 datasets, or (2) the format of ws07$date or ws08$date being imported/converted incorrectly. The solution is to use the format= argument in the date conversion/import function you are using to ensure that the format is correct and years/months/dates are imported accordingly.
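To make the conversion step concrete, here is a minimal sketch; the sample strings and the %d/%m/%Y layout are made up for illustration and may not match your actual files:
# Hypothetical character dates stored in day/month/year order
d <- c("01/06/2007", "15/06/2007", "30/06/2007")
# Spelling out the layout gives the intended dates; without format=,
# as.POSIXct() only tries a couple of standard layouts and would error
# or misread values stored like these
as.POSIXct(d, format = "%d/%m/%Y")
# Checks worth running on the real data (not run here):
# class(ws07$date)                         # already Date/POSIXct, or still character?
# range(as.POSIXct(ws07$date[1224:1250]))  # do the converted dates span the expected years?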
The code that worked for me: instead of using the c() function when binding data from the other datasets, I had to use data.frame() to add the other years into the wsAvg data frame.
wsAvg<-data.frame(date=as.POSIXct(ws07$date[1224:1559]),u.1=(ws07$u[1224:1559]),stringsAsFactors = FALSE)
wsAvg<-rbind(wsAvg,data.frame(date=as.POSIXct(ws08$date[1032:1367]),u.1=(ws08$u[1032:1367])))

Efficient way to review formulas that generate named objects in R

If I have a named object (in my case a named plot) in R, is there an efficient way to double-check the formula that generated it? Right now I am scrolling back through the console, but I'm hoping there is a better way.
For example, at the start of my project I input
Boxplot <- ggplot(plotting input) + geom_boxplot(plotting input)
Now I can call Boxplot by name to plot it, but I want to be able to efficiently review my ggplot input. Is there a tool to do this?
For your example, you can see the elements of Boxplot using:
names(Boxplot)
So you can see, for example, the input data using:
Boxplot$data
Or the parameters and type of the plot using:
Boxplot$layers
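Here is a small self-contained illustration, using the built-in mtcars data as a stand-in for your plot:
library(ggplot2)
# A stand-in for the saved plot from the question
Boxplot <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot()
names(Boxplot)       # the components stored in the ggplot object
head(Boxplot$data)   # the data frame the plot was built from
Boxplot$mapping      # the aesthetics passed to aes()
Boxplot$layers       # the geoms/stats and their parameters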

r sna equiv.clust more than one graph

I would like to provide more than one graph as input to the equiv.clust function in the sna package. For example
library(ergm)
library(sna)
data(florentine)
flobusiness # first relation
flomarriage # second relation
eq<-equiv.clust(flobusiness)
b<-blockmodel(flobusiness,eq,h=10)
plot(b)
So far so good. I get the output I expect. However, how do I include both relations in the equiv.clust and blockmodel commands?
According to the Usage section in the documentation:
equiv.clust(dat, g=NULL, equiv.dist=NULL, equiv.fun="sedist",
method="hamming", mode="digraph", diag=FALSE,
cluster.method="complete", glabels=NULL, plabels=NULL, ...)
where
dat: one or more graphs.
Specifically, I would like to know how to pass two or more graphs as the dat argument. Thanks a ton.
Try entering the graphs as a list, as in:
equiv.clust(list(flobusiness,flomarriage))
I'm not sure if that will work, but in general I think you need to use lists to analyze multiple graphs. In this case, though, it depends on what you want: two separate blockmodels, in which case you could just loop or use
lapply(list(flobusiness, flomarriage), equiv.clust)
and then a slightly more complicated statement for the blockmodel; or a blockmodel of the combined network, in which case you could just add the relations together. A rough sketch of both approaches follows.
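Here is that sketch; whether equiv.clust accepts a list of networks is exactly the open question above, so treat this as untested:
library(ergm)   # provides the florentine data
library(sna)
data(florentine)
nets <- list(flobusiness, flomarriage)
# (a) one clustering/blockmodel over both relations, passing the list as dat
eq_both <- equiv.clust(nets)
b_both <- blockmodel(nets, eq_both, h=10)
plot(b_both)
# (b) a separate clustering and blockmodel per relation
eqs <- lapply(nets, equiv.clust)
bs <- lapply(seq_along(nets), function(i) blockmodel(nets[[i]], eqs[[i]], h=10))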

(wx)Maxima plot point by point, numbered

I have a list of roots and I want to plot the real/imaginary parts. Say s=allroots(), r=realpart() and i=imagpart(), all built with makelist(). Since length(s) can get ...lengthy, is there a way to plot point by point and have the points numbered? Actually, the numbering part is what concerns me most. I can simply use points(r,i) and get the job done, but I'd like to see the points' order before and after some sorting algorithms. It's not always necessary to plot all the points; I can plot up to some number, but I do have to be able to see the order in which they were sorted.
I have tried multiplot_mode but it doesn't work:
multiplot_mode(wxt)$
for i:1 thru length(s) do draw2d(points([r[i]],[i[i]]))$
multiplot_mode(none)$
All I get is a single point. Now, if this did work, draw2d's label(["label",posx,posy]) would be very handy, but can I somehow evaluate i inside the quoted label string within the for loop?
Or, is there any other way to do it? With Octave? or Scilab? I'm on Linux, btw.
Just to be clear, here's what I currently do: (I can't post images, here's the link: i.stack.imgur.com/hNYZF.png )
...and here is the wxMaxima code:
ptest:sortd(pp2); length(ptest);
draw2d(proportional_axes=xy,xrange=[sort(realpart(s))[1]-0.1,sort(realpart(s))[length(s)]+0.1],
yrange=[sort(imagpart(s))[1]-0.1,sort(imagpart(s))[length(s)]+0.1],point_type=0,
label(["1",realpart(ptest[1]),imagpart(ptest[1])]),points([realpart(ptest[1])],[imagpart(ptest[1])]),
label(["2",realpart(ptest[2]),imagpart(ptest[2])]),points([realpart(ptest[2])],[imagpart(ptest[2])]),
label(["3",realpart(ptest[3]),imagpart(ptest[3])]),points([realpart(ptest[3])],[imagpart(ptest[3])]),
label(["4",realpart(ptest[4]),imagpart(ptest[4])]),points([realpart(ptest[4])],[imagpart(ptest[4])]),
label(["5",realpart(ptest[5]),imagpart(ptest[5])]),points([realpart(ptest[5])],[imagpart(ptest[5])]),
label(["6",realpart(ptest[6]),imagpart(ptest[6])]),points([realpart(ptest[6])],[imagpart(ptest[6])]),
label(["7",realpart(ptest[7]),imagpart(ptest[7])]),points([realpart(ptest[7])],[imagpart(ptest[7])]),
label(["8",realpart(ptest[8]),imagpart(ptest[8])]),points([realpart(ptest[8])],[imagpart(ptest[8])]),
label(["9",realpart(ptest[9]),imagpart(ptest[9])]),points([realpart(ptest[9])],[imagpart(ptest[9])]),
label(["10",realpart(ptest[10]),imagpart(ptest[10])]),points([realpart(ptest[10])],[imagpart(ptest[10])]),
label(["11",realpart(ptest[11]),imagpart(ptest[11])]),points([realpart(ptest[11])],[imagpart(ptest[11])]),
label(["12",realpart(ptest[12]),imagpart(ptest[12])]),points([realpart(ptest[12])],[imagpart(ptest[12])]),/*
label(["13",realpart(ptest[13]),imagpart(ptest[13])]),points([realpart(ptest[13])],[imagpart(ptest[13])]),
label(["14",realpart(ptest[14]),imagpart(ptest[14])]),points([realpart(ptest[14])],[imagpart(ptest[14])]),*/
color=red,point_type=circle,point_size=3,points_joined=false,points(realpart(pp2),imagpart(pp2)),points_joined=false,
color=black,key="",line_type=dots,nticks=50,polar(1,t,0,2*%pi) )$
This is for 14 zeroes, only. For higher orders it would be very painful.
I gather that the problem is that you want to automatically construct all the points([realpart(...), imagpart(...)]). My advice is to construct the list of points expressions via makelist, then append that list to any other plotting arguments, then apply the plotting function to the appended list. Something like:
my_labels_and_points :
apply (append,
makelist ([label ([sconcat (i), realpart (ptest[i]), imagpart (ptest[i])]),
points ([realpart (ptest[i])], [imagpart (ptest[i])])],
i, 1, length (ptest)));
all_plot_args : append ([proportional_axes=..., ...], my_labels_and_points, [color=..., key=..., ...]);
apply (draw2d, all_plot_args);
The general idea is to build up the list of plotting arguments and then apply the plotting function to that.

Taylor diagram from existing Correlation and Standard Dev values

Is it possible to create a Taylor diagram from already calculated correlation and standard deviation values?
I am doing model evaluation, and I already have the correlation and standard deviation values. I understand that there is a package, plotrix, that creates the diagram when given the observed and modeled values. However, for the type of work that I am doing, it is easier to start from the already computed correlation and standard deviation values.
Is there any way I can do this in R?
There's no reason it shouldn't be possible, but the authors didn't seem to allow for that when they wrote the function. The function is a bit long and complex, but the part that does the calculation is at the top. It is possible to swap out that code and replace it to allow summary statistics to be passed in. Now, keep in mind that what I'm about to do is a hack and I've only tested it with version 3.5-5 of plotrix. Other versions may not work.
Here we will create a new function taylor.diagram2 that takes all the code from taylor.diagram but adds an extra if statement to check for a list of summarized data as the first argument:
taylor.diagram2<-taylor.diagram
bl<-as.list(body(taylor.diagram))
cond<-list(
as.name("if"),
quote(is.list(ref) & missing(model)), #condition
quote({R<-ref$R; sd.r<-ref$sd.r; sd.f<-ref$sd.f}), #if true
as.call(c(as.symbol("{"), bl[3:8]))) #else
bl<-c(bl[1:2], as.call(cond), bl[9:length(bl)]) #splice in new code
body(taylor.diagram2)<-as.call(bl) #update function
Now we can test the function. First, we'll do things the standard way
#test data
aref<-rnorm(30,sd=2)
amodel1<-aref+rnorm(30)/2
#standard behavior function
taylor.diagram2(aref, amodel1, main="Standard Behavior")
#summarized data
xx<-list(
R=cor(aref, amodel1, use = "pairwise"),
sd.r=sd(aref),
sd.f=sd(amodel1)
)
#modified behavior
taylor.diagram2(xx, main="Modified Behavior")
So the new taylor.diagram2 function can do both. If you pass it two vectors, it will do the standard behavior. If you pass it a list with the names R, sd.r, and sd.f, then it will do the same plot but with the values you passed in. Also, the model parameter must be empty for the modified version to work. That means if you want to set any additional parameter, you must use named parameters rather than positional arguments.
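For instance, to add a second model from made-up summary statistics to the same diagram (add and col are ordinary taylor.diagram arguments, passed by name so that model stays missing):
#summary statistics for a second, hypothetical model
yy <- list(R=0.80, sd.r=sd(aref), sd.f=1.7)
taylor.diagram2(xx, main="Two models from summary stats")
taylor.diagram2(yy, add=TRUE, col="blue")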

Resources