I am struggling with GNUPLOT binary data handling.
I have a binary file, written by MATLAB's fwrite function, which writes in column order.
I am writing an Nx2 array, i.e. a collection of points in the xy plane, which I guess is stored as x1..xn y1..yn, as consecutive records in the binary file. Do you agree? Bear in mind that I still do not have a clear idea of what binary storage means. I am used to ASCII files, with nice separators and \n's.
So I want to plot these points with gnuplot. I have been reading the "binary general" documentation and ended up trying this:
plot 'datafile.bin' binary array=N:N w l
which I take to mean that my data file is made up of two arrays, each of N elements. Gnuplot produces one line, first following the values of the first array, then following the values of the second array, both of them over the interval 1:N.
I then wanted to use the first array as the x axis of my plot and the second array as the y axis, so I tried:
plot 'datafile.bin' binary array=N:N u 1:2 w l
It again plots the two arrays consecutively, not as an xy plot. Where am I going wrong?
Many thanks
EDIT: I tried to apply the scan=xy keyword to both commands, but gnuplot told me that my file is a one-dimensional record. So I guess that u 1:2 makes no sense here.
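I don't think gnuplot can handle the data layout you describe. It doesn't know about arrays and matrices the way MATLAB does.
Write your data file as pairs of x,y values (x1 y1 x2 y2 ...). Since fwrite writes in column order, writing the transpose of your Nx2 array should give exactly that layout.
Then you can
plot 'datafile.bin' binary format='%float%float' using 1:2
(if your x,y values are both 4-byte floats; use '%double%double' if they were written as 8-byte doubles).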
The "array" keyword is meant for the case when your file only contains the function values and you want gnuplot to construct the independent variable(s). Totally different.
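If rewriting the file from MATLAB is not convenient, the existing file can also be reordered after the fact. The following is only a minimal Python sketch, under assumptions that are not stated in the question: the values are 8-byte doubles in little-endian byte order, and the file really is laid out as x1..xN y1..yN. The function names and file names are made up for illustration. It also shows what "binary storage" means in practice: the file is just the raw bytes of each number, one after the other, with no separators.

```python
import struct


def interleave_columns(raw: bytes, fmt: str = "d") -> bytes:
    """Reorder x1..xN y1..yN into x1 y1 x2 y2 ... xN yN.

    fmt is a struct type code: "d" = 8-byte double (assumed here),
    "f" = 4-byte float. Little-endian byte order is assumed.
    """
    size = struct.calcsize("<" + fmt)
    count = len(raw) // size
    if count * size != len(raw) or count % 2 != 0:
        raise ValueError("expected an even number of whole values")
    # Decode the raw bytes into a flat tuple of numbers.
    values = struct.unpack("<%d%s" % (count, fmt), raw)
    n = count // 2
    xs, ys = values[:n], values[n:]
    # Flatten (x1, y1), (x2, y2), ... into one sequence.
    pairs = [v for xy in zip(xs, ys) for v in xy]
    return struct.pack("<%d%s" % (count, fmt), *pairs)


def convert_file(src: str, dst: str, fmt: str = "d") -> None:
    """Read a column-ordered binary file and write an interleaved copy."""
    with open(src, "rb") as f:
        raw = f.read()
    with open(dst, "wb") as f:
        f.write(interleave_columns(raw, fmt))
```

After something like convert_file("datafile.bin", "pairs.bin") (both names hypothetical), the interleaved file should be readable with plot 'pairs.bin' binary format='%double%double' using 1:2 w l. If the data were written as 4-byte floats instead, pass fmt="f" and use '%float%float' on the gnuplot side.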
I need to check for and plot correlation between a few properties in R, where many of them are string-based.
Consider the following CSV example data extract for webpage hits:
id;lang;type
1;EN;browser
2;EN;ios
3;DE;android
4;DE;browser
5;FR;ios
The type and lang columns contain only strings, and (as far as I understand) cannot be used for plotting or correlation analysis. So I would need to convert them into numbers, right? But how do I reattach the strings when plotting language against browser type?
If I consider methods like PCA, are they even possible with number-converted strings, given that the distances or distributions carry no useful information that way?
One possible solution is to use if/else statements: compare each string and assign a number for it to an auxiliary variable.
Once you have your numbers in a list or a vector, you can create a new data frame with the numeric values. After that you can draw your plot.
Here is the code:
id <- 1:5
lang <- c("EN", "EN", "DE", "DE", "FR")
type <- c("browser", "ios", "android", "browser", "ios")
data <- data.frame(id, type, lang)
# One single-row data frame per input row, combined at the end
tmp <- vector(mode = "list", length = nrow(data))
auxType <- NULL
auxLang <- NULL
for (i in 1:nrow(data))
{
  # Assign a number to each language
  if (data$lang[i] == "EN")
    auxLang <- 1
  else
    if (data$lang[i] == "DE")
      auxLang <- 2
    else
      if (data$lang[i] == "FR")
        auxLang <- 3
  # Assign a number to each type
  if (data$type[i] == "browser")
    auxType <- 1
  else
    if (data$type[i] == "ios")
      auxType <- 2
    else
      if (data$type[i] == "android")
        auxType <- 3
  # Create an auxiliary data frame for this row
  tmp[[i]] <- data.frame(data$id[i], auxType, auxLang)
}
# Bind the single-row data frames into one
allData <- do.call(rbind, tmp)
names(allData) <- c("id", "type", "lang")
I am trying to get a more meaningful version of the data plotted when a categorical predictor appears in the output of a tree function.
The values are airport codes: FLR, FUE, GOA, HER etc,
If I use tree() and
plot(Simulate.tree2); text(Simulate.tree2, pretty=1)
I get:
Which is not bad, but the codes are abbreviated and not clear.
If I use maptree() and
draw.tree(Simulate.tree2)
I get:
which is not at all helpful, since the letters just indicate the position of the value in a vector (I assume)
Is there a way in either package (or both) to get the actual values printed?
Have you tried this?
plot(Simulate.tree2)
text(Simulate.tree2, pretty = 3)
From the documentation, it looks like passing an integer to pretty sets the minimum length of the labels at that integer value. So for airport codes, you'd want 3.
I'm using a fluid simulation software which can create .vtk files of scalars x-velocity, y-velocity, and z-velocity. I'm trying to view streamlines using ParaView, however that requires vectorized data. Is there an easy way to combine the scalar .vtk files to produce a vectorized .vtk file?
Thanks a lot!
You can use the calculator filter in ParaView to combine the components to a vector.
The required entities are iHat, jHat, and kHat, i.e. the vector constants representing unit vectors in the X, Y, and Z directions, respectively.
In your case the required expression would look something like iHat*Xvel+jHat*Yvel+kHat*Zvel, where
Xvel, Yvel and Zvel are the x, y, and z velocity components.
You can find your scalar data in the dropdown list 'Scalars'.
As an example, the expression coordsX*iHat+coordsY*jHat+coordsZ*kHat combines the coordinates (scalars) into a coordinate vector.
I have a very large data set that I have binned, and stored each bin (subset) as a list so that I can easily call any given subset. My problem is in calling for a specific column within a subset.
For example my data (which has diameters and strengths as the columns), is broken up into 20 bins, by diameter. I manually binned the data, like so:
subset.1 <- subset(mydata, Diameter <= 0.01)
Similar commands were used, to make 20 bins. Then I stored the names (subset.1 through subset.20) into a list:
diameter.bin<-list(subset.1, ... , subset.20)
I can successfully call each diameter bin using:
diameter.bin[x]
Now, if I only want to see the strength values for a given diameter bin, I can use the original name (that is stored in the list):
subset.x$Strength
But I cannot get this information using the list call:
diameter.bin[x]$Strength
This command returns NULL
Note that when I call any subset (either by diameter.bin[x], subset.x or even subset.x$Strength) my column headers do show up. When I use:
names(subset.1)
This returns "Diameter" and "Strength"
But when I use:
names(diameter.bin[1])
This returns NULL.
I'm assuming that the column header is part of the problem, but I'm not sure how to fix it, other than take the headers off of the original data file. I would prefer not to do this if at all possible.
The end goal is to look at the distribution of strength values for each diameter bin, so I will be doing things like drawing histograms, calculating parameters etc. I was hoping to do something along these lines to produce the histograms:
n=length(diameter.bin)
for(i in (1:n))
{
hist(diameter.bin[i]$Strength)
}
And do something similar to this to store median values for each bin in a new vector.
Any tips are greatly appreciated, as right now I'm doing it all 1 bin at a time, and I know a loop (or something similar) would really speed up my analysis.
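You need two square brackets. diameter.bin[x] returns a list of length one that contains the data frame, whereas diameter.bin[[x]] returns the data frame itself, so $Strength only works on the latter. Here is a reproducible example demonstrating the issue: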
> diam <- data.frame(x=rnorm(5), y=rnorm(5))
>
> diam.l <- list(diam, diam)
> diam.l[1]$x
NULL
> diam.l[[1]]$x
[1] -0.5389441 -0.5155441 -1.2437108 -2.0044323 -0.6914124
New to R and having a problem with a very simple task! I have read a few columns of .csv data into R. The variables take values in the natural numbers plus zero, and have missing values. After trying to use the nonparametric package, I have two problems: first, if I use the simple command bw=npregbw(ydat=y, xdat=x, na.omit), where x and y are column vectors, I get the error "number of regression data and response data do not match". Why do I get this, given that I have the same number of elements in each vector?
Second, I would like to call the data ordered and tell npregbw this, using the command bw=npregbw(ydat=y, xdat=ordered(x)). When I do that, I get the error that x must be atomic for sort.list. But how is x not atomic, it is just a vector with natural numbers and NA's?
Any clarifications would be greatly appreciated!
1) You probably have a different number of NA's in y and x.
2) Can't be sure about this, since there is no example. If it is of following type:
x <- c(3,4,NA,2)
Then ordered(x) should work fine. Please provide an example of your case.
EDIT: I assume you tried bw=npregbw(ydat=y, xdat=x) first? ordered() turns your vector into an ordered factor (see ?ordered and ?factor). The "must be atomic" error from sort.list usually means that x is a list (a data frame is a list) rather than a plain vector.
EDIT2: So the problem was the way the data was subset. Note the difference between the various ways of subsetting: data$x and data[,i] (where i is the column number of column x) give you vectors, while data[c("x")] and data[i] give you a data frame. Functions expect vectors, unless they ask for data = (your data), in which case they work with column names.