Issue with order function [duplicate] - r

This question already has answers here:
Understanding the order() function
(7 answers)
Closed 9 years ago.
I have a function that takes a few parameters.
Part of the function looks like this:
sort.order <- order(inputs[,input.of.interest])
If I print inputs I get something like:
     Status Quo Vaccination
[1,]  10.409146   16.252537
[2,]   5.834875    9.373437
[3,]   5.784903   15.935623
[4,]  12.208484   18.654250
[5,]   9.786787   16.467321
[6,]   6.560276    9.689887
But what is input.of.interest supposed to be?
What does it mean, and how is this function used?
Should it be a number? For example, if it's 2, what would it do?

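It chooses the column to sort by. order() returns the row indices that would put that column in ascending order: if input.of.interest is 1 the rows are ordered by Status Quo, and if it's 2 they are ordered by Vaccination.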

x <- seq(20, 11, -1)
x
# [1] 20 19 18 17 16 15 14 13 12 11
order(x)
# [1] 10 9 8 7 6 5 4 3 2 1
x[order(x)]
# [1] 11 12 13 14 15 16 17 18 19 20
Hope that makes it clearer how it works.
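Applied to the matrix from the question, the same idea can be sketched as follows. The values are copied from the printed inputs; sorted_inputs is a name made up for this illustration, and the column index 2 is just one possible choice.

```r
# Rebuild the matrix printed in the question.
inputs <- cbind(
  "Status Quo"  = c(10.409146, 5.834875, 5.784903, 12.208484, 9.786787, 6.560276),
  "Vaccination" = c(16.252537, 9.373437, 15.935623, 18.654250, 16.467321, 9.689887)
)

input.of.interest <- 2  # column index; the column name "Vaccination" would work too

# order() returns the row positions that sort the chosen column ascending.
sort.order <- order(inputs[, input.of.interest])
sort.order
# [1] 2 6 3 1 5 4

# Indexing the rows with that permutation sorts the whole matrix by that column.
sorted_inputs <- inputs[sort.order, ]
sorted_inputs
```

So sort.order is not the sorted data itself, only the row order you would use to get it.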

Related

compare the freq with the threshold and store characters of corresponding frequency [duplicate]

This question already has an answer here:
R Extract rows where column greater than 40 [duplicate]
(1 answer)
Closed 5 years ago.
Let's call the data frame below df. I want to store the names of Factory in a vector such that freq is greater than 15 (threshold).
Factory freq
1 F63F5C2CC9ADEC78 93
2 437D11819C8F3086 73
3 BCCFA6F2C54A964B 72
4 0C1DFC7996E98A98 60
5 4DBE085C274FC0D2 32
6 A8FCA1AD604D3A61 31
7 B33691F8279D733C 28
8 001DD6C2202E54F1 25
9 BBBC5737EFE9C6F5 25
10 09FDC29D7442958A 21
11 4A61DE171F2743E7 19
12 62131A16C832AB49 18
13 73DF23BF482EE5FE 18
14 793C792AE6E71D33 16
15 5F3A38C49F3C3296 6
16 923963E76AF1360D 6
17 D7055DCB51E1297A 6
18 1F4D81F7A9BC7031 4
19 898C2388F2312392 2
20 CAD1A7D01E482069 2
vec <- with(df, Factory[freq > 15])
vec
[1] "F63F5C2CC9ADEC78" "437D11819C8F3086" "BCCFA6F2C54A964B" "0C1DFC7996E98A98" "4DBE085C274FC0D2" "A8FCA1AD604D3A61"
[7] "B33691F8279D733C" "001DD6C2202E54F1" "BBBC5737EFE9C6F5" "09FDC29D7442958A" "4A61DE171F2743E7" "62131A16C832AB49"
[13] "73DF23BF482EE5FE" "793C792AE6E71D33"
Another easy option could be:
> v <- df[which(df$freq > 15), "Factory"]
> v
[1] "F63F5C2CC9ADEC78" "437D11819C8F3086" "BCCFA6F2C54A964B" "0C1DFC7996E98A98"
[5] "4DBE085C274FC0D2" "A8FCA1AD604D3A61" "B33691F8279D733C" "001DD6C2202E54F1"
[9] "BBBC5737EFE9C6F5" "09FDC29D7442958A" "4A61DE171F2743E7" "62131A16C832AB49"
[13] "73DF23BF482EE5FE" "793C792AE6E71D33"
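For a self-contained check, here is a minimal sketch on three rows taken from the table above. The variable threshold is introduced only for illustration, and the example assumes Factory is stored as character (wrap it in as.character() if it is a factor).

```r
# Three rows from the question's data frame: well above, just above, and below 15.
df <- data.frame(
  Factory = c("F63F5C2CC9ADEC78", "793C792AE6E71D33", "5F3A38C49F3C3296"),
  freq    = c(93, 16, 6),
  stringsAsFactors = FALSE
)

threshold <- 15  # hypothetical name for the cut-off

# Logical indexing keeps the Factory values whose freq exceeds the threshold.
vec <- df$Factory[df$freq > threshold]
vec
# [1] "F63F5C2CC9ADEC78" "793C792AE6E71D33"
```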

How to extract the values from a raster in R

I want to use R to extract values from a raster. My raster has values from 0 to 6, and I want to extract the value of every single pixel, so that I end up with a data table containing each pixel and its value.
Thank you for your help. I hope my explanation is precise enough.
Example data
library(raster)
r <- raster(ncol=5, nrow=5, vals=1:25)
To get all values, you can do
values(r)
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
as.matrix(r)
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 2 3 4 5
#[2,] 6 7 8 9 10
#[3,] 11 12 13 14 15
#[4,] 16 17 18 19 20
#[5,] 21 22 23 24 25
Also see ?getValues
You can also use indexing
r[2,2]
#7
r[7:8]
#[1] 7 8
For more complex extractions using points, lines or polygons, see ?extract
x is the raster object you are trying to extract values from; y may be a SpatialPoints, SpatialPolygons, SpatialLines or Extent object, or a vector of cell numbers (take a look at ?extract). Your code values_raster <- extract(x = values, df=TRUE) will not work because you are not giving the function any y object/vector.
You could try building a vector with all the cell numbers of your raster. Imagine your raster has 200 cells. If you do values_raster <- extract(x = values,y=seq(1,200,1), df=TRUE) you'll get a data frame with the value of each cell.
How about simply doing
as.data.frame(s, xy=TRUE) # s is your raster object

R: counting amount of patterns of numbers [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I'm fairly new here and also fairly new to R so apologies if anything is unclear.
Basically, I have a CSV table of numbers for each person: one number per week for 38 weeks.
For example, Anthony has number 6 in week 1, 12 in week 2 and so on. These numbers are fairly random and range from 1 to 20.
I have taken the numbers from the table and saved them into a string, so Anthony's string when printed looks like
"6 12 18 7 17 4 16 11 20 15 3 5 19 10 8 9 1 14 13 19 11 16 18 4 17 7 6 12 14 1 10 13 20 15 3 5 8 9"
What I'm trying to do is count the number of times numbers between 1 and 10 occur consecutively in groups of 3, then in groups of 4, and possibly 5.
For example, in this string 8, 9 and 1 occur consecutively, and then 3, 5, 8 and 9 occur consecutively, meaning the number of occurrences is 2.
I've tried using str_count from the stringr package and also tried a few different functions located here - Count the number of overlapping substrings within a string
I can't seem to find a method/function to get this to output what I want (a simple count of the number of occurrences).
If anyone could provide any insight/help it would be greatly appreciated.
It would be easier to keep these as numbers. Here I use scan() to turn your string into a numeric vector, test whether each number is less than 10, and then call rle() on the result to calculate run lengths
x <- "6 12 18 7 17 4 16 11 20 15 3 5 19 10 8 9 1 14 13 19 11 16 18 4 17 7 6 12 14 1 10 13 20 15 3 5 8 9"
rr <- rle(scan(text=x)<10)
Now I can mangle this into a data.frame and see which runs were longer than 2
subset(as.data.frame(unclass(rr)), values & lengths > 2)
# lengths values
# 9 3 TRUE
# 17 4 TRUE
So we can see that we had a run of 3 and a run of 4.
I could clean this up by defining a function to turn the rle into a data.frame more easily and track the starting indexes
as.data.frame.rle <- function(x, ...) {
  # use the object passed in (x) rather than the global rr
  data.frame(unclass(x), start = head(cumsum(c(0, x$lengths)) + 1, -1))
}
and can then run
subset(as.data.frame(rle(scan(text=x)<10)), values & lengths > 2)
# lengths values start
# 9 3 TRUE 15
# 17 4 TRUE 35
so we can see those runs start at positions 15 and 35.
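If all that is needed is the simple count the question asks for, the run-length result can be summed directly. The following is a sketch under the same assumption as above (a number "counts" when it is less than 10); count_runs and nums are names invented for this example, and strsplit() is used instead of scan() only to keep the snippet quiet.

```r
x <- "6 12 18 7 17 4 16 11 20 15 3 5 19 10 8 9 1 14 13 19 11 16 18 4 17 7 6 12 14 1 10 13 20 15 3 5 8 9"

# Split the string on spaces and convert to numbers.
nums <- as.numeric(strsplit(x, " ", fixed = TRUE)[[1]])

# Run lengths of the TRUE/FALSE vector "is this number below 10?".
rr <- rle(nums < 10)

# Count the TRUE runs that are at least min_len long (hypothetical helper).
count_runs <- function(r, min_len) sum(r$values & r$lengths >= min_len)

count_runs(rr, 3)
# [1] 2
count_runs(rr, 4)
# [1] 1
```

Note that a run of 4 is counted once here, not as two overlapping runs of 3, which matches the count of 2 given in the question.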

How can I tell a for loop in R to regenerate a sample if the sample contains a certain pair of species?

I am creating 1000 random communities (vectors) from a species pool of 128, applying certain operations to each community and storing the result in a new vector. For simplicity, I have been practicing with 10 random communities from a species pool of 20. The problem is that there are a couple of pairs of species such that, if both members of a pair appear in a random community, I need that community to be thrown out and a new one generated. I have been able to code it so that a community (vector) containing such a pair is set to NA. I also know how to tell the loop to skip that vector using the "next" command. But with both of these options, I do not get all of the communities that I need.
Here is my code using the NA option, but again that leaves me short of communities.
C <- c(1:20)
D <- numeric(10)
X <- numeric(5)
for (i in 1:10) {
  X <- sample(C, size = 5, replace = FALSE)
  if ("10" %in% X & "11" %in% X) X = NA else X = X
  if ("1" %in% X & "2" %in% X) X = NA else X = X
  print(X)
  D[i] <- sum(X)
}
print(D)
This is what my result looks like.
[1] 5 1 7 3 14
[1] 20 8 3 18 17
[1] NA
[1] NA
[1] 4 7 1 5 3
[1] 16 1 11 3 12
[1] 14 3 8 10 15
[1] 7 6 18 3 17
[1] 6 5 7 3 20
[1] 16 14 17 7 9
> print(D)
[1] 30 66 NA NA 20 43 50 51 41 63
Thanks so much!
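One possible approach, sketched here and not taken from the thread, is to wrap the sampling in a repeat loop that only exits once the community contains none of the forbidden pairs. The names pool, forbidden, communities and totals are invented for this illustration, the pairs (10, 11) and (1, 2) are the ones used in the code above, and the seed is set only so the sketch is reproducible.

```r
set.seed(1)  # assumption: fixed seed, for reproducibility only

pool <- 1:20
forbidden <- list(c(10, 11), c(1, 2))  # pairs that must not co-occur
n_comm <- 10

communities <- vector("list", n_comm)
totals <- numeric(n_comm)

for (i in seq_len(n_comm)) {
  repeat {
    comm <- sample(pool, size = 5, replace = FALSE)
    # TRUE if both members of any forbidden pair are in the sample
    has_pair <- any(vapply(forbidden, function(p) all(p %in% comm), logical(1)))
    if (!has_pair) break  # keep this community; otherwise draw again
  }
  communities[[i]] <- comm
  totals[i] <- sum(comm)
}

print(totals)
```

Because a rejected draw is replaced before the loop moves on to the next i, the result always has the full number of communities and no NA entries.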

Merge values of a factor column

Column data$form contains 170 unique values (numbers from 1 to ~800).
I would like to merge nearby values into groups (e.g. with a radius/step of 10).
I need to do this in order to use:
colors = rainbow(length(unique(data$form)))
in a plot and get a better visual result.
Thank you in advance for your help.
You can use %/% (integer division) to group the values, mean to combine each group, and normalize to rescale the result. Note that normalize() is not part of base R.
# if you want specifically 20 groups:
groups <- sort(form) %/% (800/20)
x <- c(by(sort(form), groups, mean))
x <- normalize(x, TRUE) * 19 + 1
0 1 2 3 4
1.000000 1.971781 2.957476 4.103704 4.948560
5 6 7 8 9
5.950617 7.175309 7.996914 8.953086 9.952263
10 11 12 13 14
10.800705 11.901235 12.888889 13.772291 14.888889
15 16 17 18 19
15.927984 16.864198 17.918519 18.860082 20.000000
You could also use cut. If you use the argument labels=FALSE, you get an integer value:
form <- runif(170, min=1,max=800)
> cut(form, breaks=20)
[1] (518,558] (280,320] (240,280] (121,160] (757,797]
[6] (160,200] (320,359] (598,638] (80.8,121] (359,399]
[7] (121,160] (200,240] ...
20 Levels: (1.18,41] (41,80.8] (80.8,121] (121,160] (160,200] (200,240] (240,280] (280,320] (320,359] (359,399] (399,439] ... (757,797]
> cut(form, breaks=20, labels=FALSE)
[1] 14 8 7 4 20 5 9 16 3 10 4 6 5 18 18 6 2 12
[19] 2 19 13 11 13 11 14 12 17 5 ...
On a side note, I'd encourage you to reconsider plotting with rainbow colours, as they distort how the data is read; cf. Rainbow Color Map (Still) Considered Harmful.
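To tie the cut() approach back to the colours in the question, here is a minimal sketch. The data are random stand-ins for data$form, the names n_bins, bin, palette_cols and point_cols are invented for this example, and rainbow() is kept only because the question uses it.

```r
set.seed(42)  # assumption: fixed seed, for reproducibility only

form <- runif(170, min = 1, max = 800)  # stand-in for data$form
n_bins <- 20

# Integer bin index (1..n_bins) for every value.
bin <- cut(form, breaks = n_bins, labels = FALSE)

# One colour per bin, then look up each point's colour by its bin.
palette_cols <- rainbow(n_bins)
point_cols <- palette_cols[bin]

# plot(form, col = point_cols, pch = 19)  # uncomment to draw
```

This gives at most 20 distinct colours instead of 170, and any other palette function can be swapped in for rainbow().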
