I have x,y,z data in 3 columns like this:
1 2 1
2 4 1
3 3 1
4 4 2
5 8 2
6 6 2
Say I want to plot only the (x,y) values where z=2 (i.e., just the last 3 rows). How do I do that within gnuplot?
plot 'datafile.dat' using 1:((column(3) == 2) ? column(2):NaN)
Rows where the condition fails evaluate to NaN, which gnuplot does not plot. Note that you can also use the shorthand form $3 instead of column(3); I used the latter form because it is easier to read.
This is my first question, so please let me know if I made any mistakes in the ask.
I am trying to create a data frame with multiple columns that all contain the same values in the same order, but shifted in position: in each successive column, the first value is moved to the end and everything else is shifted up.
For example, I would like to convert a data frame like this:
example = data.frame(x=c(1,2,3,4), y=c(1,2,3,4), z=c(1,2,3,4), w=c(1,2,3,4))
Which looks like this
x y z w
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
into this:
x y z w
1 2 3 4
2 3 4 1
3 4 1 2
4 1 2 3
In the new data frame, the "peak" (the value 4) moves up one row with each column.
I've seen advice on how to shift columns up and down, but it only fills the vacated positions with zeroes or NA. I don't know how to shift a column up and replace the bottom-most value with what was formerly at the top.
Thanks in advance for any help.
In base R, we can update the columns with Map: for each column, drop the first y elements with tail(x, -y) and append them at the end with head(x, y)
example[-1] <- Map(function(x, y) c(tail(x, -y),
head(x, y)), example[-1], head(seq_along(example), -1))
example
# x y z w
#1 1 2 3 4
#2 2 3 4 1
#3 3 4 1 2
#4 4 1 2 3
Another option is embed, applied to the original (unshifted) data frame
example[] <- embed(unlist(example), 4)[1:4, 4:1]
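The same wrap-around shift can also be written with an explicit helper, which may be easier to read. This is only a minimal sketch, assuming the original (unshifted) data frame from the question; rotate_up is a hypothetical name introduced here, not a base R function.

```r
# Hypothetical helper: rotate a vector up by k positions, wrapping
# the first k elements around to the end.
rotate_up <- function(v, k) {
  n <- length(v)
  if (n == 0) return(v)
  k <- k %% n                 # shifts larger than n wrap around
  if (k == 0) return(v)
  c(v[(k + 1):n], v[1:k])
}

example <- data.frame(x = 1:4, y = 1:4, z = 1:4, w = 1:4)

# Column 1 is shifted by 0, column 2 by 1, and so on.
shifts <- seq_along(example) - 1
example[] <- Map(rotate_up, example, shifts)
example
```

Because the shift is taken modulo the column length, the helper also works when there are more columns than rows.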
Suppose I have the following data.
x<- c(1,2, 3,4,5,1,3,8,2)
y<- c(4,2, 5,6,7,6,7,8,9)
data<-cbind(x,y)
x y
1 1 4
2 2 2
3 3 5
4 4 6
5 5 7
6 1 6
7 3 7
8 8 8
9 2 9
Now, if I subset this data to select only the observations with "x" between 1 and 3 I can do:
s1<- subset(data, x>=1 & x<=3)
and obtain my desired output:
x y
1 1 4
2 2 2
3 3 5
4 1 6
5 3 7
6 2 9
However, if I subset using the colon operator I obtain a different result:
s2<- subset(data, x==1:3)
x y
1 1 4
2 2 2
3 3 5
This time it only includes the first observation in which "x" was 1,2, or 3. Why?
I would like to use the ":" operator because I am writing a function in which the user inputs a range of "x" values over which the average of the "y" variable is calculated. I would prefer that they can use the ":" operator to pass this argument to the subset call inside my function, but I don't know why subsetting with ":" gives me different results.
I'd appreciate any suggestions in this regard.
You can use %in% instead of ==
subset(data, x %in% 1:3)
In general, use %in% to test membership when the two vectors have different lengths. The == operator compares element by element and recycles the shorter vector, so x == 1:3 compares x against 1,2,3,1,2,3,1,2,3 position by position. Recycling can occasionally be used on purpose when the length of one vector is a multiple of the other, but it can also silently give the wrong answer, as it does here.
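To make the difference concrete, here is a small sketch using the data from the question. The function mean_y_for is a hypothetical name, written only to illustrate how a range passed with ":" could be used inside a function.

```r
x <- c(1, 2, 3, 4, 5, 1, 3, 8, 2)
y <- c(4, 2, 5, 6, 7, 6, 7, 8, 9)
data <- cbind(x, y)

# '==' recycles 1:3 to the length of x before comparing element-wise.
recycled <- rep(1:3, length.out = length(x))
eq <- x == 1:3
identical(eq, x == recycled)   # TRUE
which(eq)                      # 1 2 3

# '%in%' asks, for each element of x, whether it occurs anywhere in 1:3.
which(x %in% 1:3)              # 1 2 3 6 7 9

# Hypothetical helper: average of y over the rows whose x is in 'values'.
mean_y_for <- function(data, values) {
  rows <- data[, "x"] %in% values
  mean(data[rows, "y"])
}
mean_y_for(data, 1:3)          # 5.5
```

Note that %in% tests membership in the set 1, 2, 3, so a non-integer x such as 2.5 would not match; for a continuous range, the x >= 1 & x <= 3 form is still the one to use.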
In R, I'm trying to generate a matrix that shows every possible result from a model together with the input values used to produce it, all of which are constrained. An example model:
Model= a^2+b^2+c^2+d^2
Where:
20≤Model≤30
a=1
2 ≤b ≤3
2 ≤c ≤3
3 ≤d ≤4
I’d like the output to look like this:
[a] [b] [c] [d] [Model]
[1] 1 3 2 3 23
[2] 1 2 2 4 25
[3] 1 3 3 3 28
[4] 1 2 3 3 23
Order doesn't matter. I just want the full permutation of feasible [integer] values. Any packages or help you could point my way?
In my example case, I want to generate all possible inputs(a,b,c,d) that hold valid, based on the parameters I set. I only want values from my output equation (Model) between 20 and 30. In this case, only 4 solutions are possible based on the criteria I'm setting.
Assuming you're only looking for integer solutions, you can use expand.grid()
dd <- expand.grid(a=1, b=2:3, c=2:3, d=3:4)
m <- with(dd, a^2+b^2+c^2+d^2)
inside <- function(x, a,b) a<=x & x<=b
cbind(dd, m)[inside(m, 20, 30),]
# a b c d m
# 2 1 3 2 3 23
# 3 1 2 3 3 23
# 4 1 3 3 3 28
# 5 1 2 2 4 25
# 6 1 3 2 4 30
# 7 1 2 3 4 30
(You said you want values <=30, but you seem to have left out the 30's in your example. You can change the inside() function if you want an open interval.)
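For reference, a sketch of the open-interval variant mentioned above. The name inside_open is hypothetical; everything else follows the code in the answer.

```r
dd <- expand.grid(a = 1, b = 2:3, c = 2:3, d = 3:4)
dd$m <- with(dd, a^2 + b^2 + c^2 + d^2)

# Strict inequalities exclude the endpoints 20 and 30.
inside_open <- function(x, lo, hi) lo < x & x < hi

res <- dd[inside_open(dd$m, 20, 30), ]
res
#   a b c d  m
# 2 1 3 2 3 23
# 3 1 2 3 3 23
# 4 1 3 3 3 28
# 5 1 2 2 4 25
```

This leaves the four rows listed in the question.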
If I have a vector numbers <- c(1,1,2,4,2,2,2,2,5,4,4,4), and I use 'table(numbers)', I get
names 1 2 4 5
counts 2 5 4 1
What if I want it to include 3 as well or, more generally, all numbers from 1:max(numbers), even if they are not represented in numbers? That is, how would I generate output like this:
names 1 2 3 4 5
counts 2 5 0 4 1
If you want R to count values that aren't present, you should create a factor and explicitly set the levels. table will return a count for each level, including zero for levels that never occur.
table(factor(numbers, levels=1:max(numbers)))
# 1 2 3 4 5
# 2 5 0 4 1
For this particular example (positive integers), tabulate would also work:
numbers <- c(1,1,2,4,2,2,2,2,5,4,4,4)
tabulate(numbers)
# [1] 2 5 0 4 1
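The two approaches differ when the range of interest is not exactly 1:max(numbers). A brief sketch, assuming for illustration that we want counts for 0:5 in one case and for 1:7 in the other:

```r
numbers <- c(1, 1, 2, 4, 2, 2, 2, 2, 5, 4, 4, 4)

# The factor approach accepts any set of levels, including 0.
with_zero <- table(factor(numbers, levels = 0:max(numbers)))
with_zero
# 0 1 2 3 4 5
# 0 2 5 0 4 1

# tabulate only counts positive integers, but nbins can pad the
# result with zeros beyond the observed maximum.
padded <- tabulate(numbers, nbins = 7)
padded
# [1] 2 5 0 4 1 0 0
```

The table result keeps the values as names, while tabulate returns a plain unnamed integer vector indexed by value.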
I'm new to R and plotting in R. This might be a very simple question, but here it is.
Suppose I have a data frame like this:
a b c d
1 5 6 7
2 3 5 7
1 4 6 2
2 3 5 NA
1 4 4 2
2 2 4 2
1 2 5 1
2 3 4 NA
Here a, b, c, d are column names. I want to plot a bar chart that has the values in column d on the x axis and the number of rows with that value on the y axis. So 7 has 2 rows, 1 has 1 and 2 has 3. It's not important to include the missing values in between (3, 4, 5, 6).
So the result would be something like a histogram. I know I can do counting on column d and then do the plotting but I feel there must be a better way to do this.
Here's an approach. If I understand your question, columns a, b, and c are immaterial to what you are doing, which is plotting the frequencies of column d.
library(ggplot2)
library(reshape)
##get frequencies of col d
test.summary<-table(test$d)
## re-shape the data
test.summary.m<-melt(test.summary)
ggplot(test.summary.m,aes(x=as.factor(Var.1),y=value))+
geom_bar(stat='identity')
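If no extra packages are wanted, the same chart can be sketched in base R. This assumes the data frame is called test, as in the code above, and rebuilds it from the values in the question.

```r
# Data from the question; 'test' is the name used in the answer above.
test <- data.frame(
  a = c(1, 2, 1, 2, 1, 2, 1, 2),
  b = c(5, 3, 4, 3, 4, 2, 2, 3),
  c = c(6, 5, 6, 5, 4, 4, 5, 4),
  d = c(7, 7, 2, NA, 2, 2, 1, NA)
)

# table() drops NA by default and counts the rows for each value of d.
counts <- table(test$d)
counts
# 1 2 7
# 1 3 2

# One bar per observed value of d; unobserved values (3 to 6) are omitted.
barplot(counts, xlab = "d", ylab = "number of rows")
```

With ggplot2, geom_bar() applied directly to the raw column also counts rows by default, so the separate counting step can be skipped there as well.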