I want to repeat the numbers 48, 23, 45, 67 three times in Julia to get a 12-element array like 48, 23, 45, 67, 48, 23, 45, 67, 48, 23, 45, 67. How to get the expected output succinctly without having to type the elements multiple times?
The repeat(array, n) function creates an array with repeated values. Pass the numbers as an array in the first argument and the number of repetitions in the second argument.
julia> array = repeat([48, 23, 45, 67], 3)
12-element Array{Int64,1}:
48
23
45
67
48
23
45
67
48
23
45
67
I am trying to carry out the following operation in R.
I have different series of data:
series 1: 75, 56, 100, 23, 38, 40
series 2: 60, 18, 86, 100, 44
I would like to join these series. To do so, I have to multiply series 1 by 1.5 so that its last value (40) matches the first value of series 2 (60), since 40*1.5=60.
In the same way I would like to match many different series, but each pair will need a different multiplier. For example, with Series 1: ...20 and Series 2: 80..., I would have to multiply series 1 by 4.
How can I carry out such an operation to many series in many data frames?
Thanks in advance,
Given two vectors x and y, the function f(x,y) below will convert x the way you desire.
f <- function(x,y) x*(y[1]/x[length(x)])
Usage:
x = c(75,56,100,23,38,40)
y = c(60,18,86,100,44)
f(x,y)
Output:
[1] 112.5 84.0 150.0 34.5 57.0 60.0
However, how this approach gets applied to "many series in many data frames" depends on the actual structure you have, and what type of output you want.
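As a minimal sketch of one possible structure, assume the series are held in an ordered list of numeric vectors. The list layout and the helper name rescale_chain are assumptions for illustration, not something given in the question. Working backwards from the last series keeps every junction matched, because each series is scaled against the already rescaled series that follows it.

```r
# Same helper as above: scale x so its last value equals the first value of y.
f <- function(x, y) x * (y[1] / x[length(x)])

# Hypothetical helper: rescale an ordered list of series so that each
# series ends where the next one begins. The last series is left unchanged.
rescale_chain <- function(series) {
  n <- length(series)
  # Go from the second-to-last series back to the first, so that each
  # series is matched against the already rescaled neighbour.
  for (i in rev(seq_len(n - 1))) {
    series[[i]] <- f(series[[i]], series[[i + 1]])
  }
  series
}

series <- list(
  c(75, 56, 100, 23, 38, 40),
  c(60, 18, 86, 100, 44)
)
joined <- rescale_chain(series)
```

Calling unlist(joined) would then give one combined vector. If the series live in columns of several data frames instead, the same function could be applied to each data frame after converting its columns to such a list.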
GT  VB  WM
23  34  28
34  27  33
    44  46
    54
I have data like the above in a CSV file. I need an R script that retrieves the values column by column, either with a loop or with a function, when I pass the column name as an argument. For example, when I type GT I should get the relevant values without NA, like:
GT
23 34
This
lapply(df, na.omit)
creates a list of vectors with all NAs removed.
Based on all the information you have given, the following R commands (an "R script") will do this for you. I'm assuming that the CSV file contains 3 columns called GT, VB and WM in the first row and that there are 4 rows of data starting in row 2. I'm also assuming that the file really is in comma-separated format, meaning that the columns (including those in the header row) are separated by commas.
df <- read.csv("myfile.csv")
If you don't want NA values to appear whenever you type the name of the column, you'll have to remove the NA from each element of the data frame, saving the results as a list (since a data frame cannot have columns of unequal length), using either lapply or sapply.
# lapply always returns a list, even when every column ends up the same length
df.list <- lapply(df, FUN=function(x) x[!is.na(x)])
And then attach to it:
attach(df.list)
Typing the names of the columns should return the original values with NA omitted.
GT
[1] 23 34
VB
[1] 34 27 44 54
WM
[1] 28 33 46
When you are finished, detach from this modified R object as it is good practice to do so.
detach(df.list) # Good practice
And this does exactly what you said. No more and no less.
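If you would rather pass the column name as an argument than rely on attach, a small function can do the lookup. This is only a sketch: the function name get_column is made up for illustration, and the data frame below is typed in by hand to stand in for the result of read.csv.

```r
# Stand-in for the data frame that read.csv would return for the file above.
df <- data.frame(
  GT = c(23, 34, NA, NA),
  VB = c(34, 27, 44, 54),
  WM = c(28, 33, 46, NA)
)

# Hypothetical helper: return one column, selected by name, without its NAs.
get_column <- function(data, name) {
  values <- data[[name]]
  values[!is.na(values)]
}

get_column(df, "GT")
```

The same helper can be used in a loop over names(df) if all columns are needed one after another.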
data
library(tibble)
df <- tribble(~GT, ~VB, ~WM,
23, 34, 28,
34, 27, 33,
NA, 44, 46,
NA, 54, NA)
Regarding my time series analysis, I have a very specific question for you and I hope you can help me out! I have already checked Stack Overflow for various approaches, but none of them worked for me.
I have a huge list with 12 elements. Every element of that list is a RasterStack of 16 raster layers. Now I want to reassign the values of every single layer in this list. I do not have a clue how to do that in a for-loop or something similar, since I have to pull every single layer out of the list to reassign the values.
What I have got so far looks like as following:
list_monthly_stack
[[1]]
class : RasterStack
dimensions : 26, 42, 1092, 16 (nrow, ncol, ncell, nlayers)
resolution : 0.04, 0.04 (x, y)
extent : 76.4, 78.08, 51.32, 52.36 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
names : CCS_1m200301, CCS_1m200401, CCS_1m200501, CCS_1m200601, CCS_1m200701, CCS_1m200801, CCS_1m200901, CCS_1m201001, CCS_1m201101, CCS_1m201201, CCS_1m201301, CCS_1m201401, CCS_1m201501, CCS_1m201601, CCS_1m201701, ...
min values : 26, 35, 24, 59, 37, 18, 107, 52, 20, 8, 73, 33, 47, 49, 73, ...
max values : 139, 193, 123, 369, 173, 198, 299, 324, 270, 175, 198, 181, 138, 236, 299, ...
# this is what one list element looks like
To pull one layer out of a RasterStack (list element) and apply the required operations I could do the following:
test <- list_monthly_stack[[1]][[1]]
test[test < 0] <- 666
test[test > 0 & test < 666] <- 0
But since I have to do this 12*16 times, I would like to automate the process in a for-loop or something similar. Do you have any ideas how to solve that? Thanks a lot in advance!
While I respectfully disagree with your list being 'huge', you can use lapply and reclassify to get what you need.
lapply just iterates over the list as any for-loop would, but is wrapped in a nice and tidy function.
First, let's get a reproducible dataset:
library(raster)
r <- raster(nrow=26,ncol=42)
set.seed(42)
list_monthly_stack <- lapply(1:12,function(i) do.call(stack,replicate(16,setValues(r,runif(ncell(r))))))
Now to the reclassification:
list_monthly_stack_rc <- lapply(list_monthly_stack, function(x) reclassify(x,c(-Inf,0,666,0,666,0),right=FALSE))
The second argument of reclassify is the 'matrix' for reclassification, with the values "from", "to", "new value".
In our case, this means all values from -Inf to 0 will be re-coded to 666, and all values between 0 and 666 will be re-coded to 0.
The argument right=FALSE means that the intervals are closed on the left and open on the right, so 0 won't be re-coded to 666 and 666 won't be re-coded to 0.
This is exactly what you do with the logical indexing in your question ... this means that all values already 0, will stay 0. And all values bigger than 666, will keep their original values.
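To make the interval logic concrete, here is a small sketch on a plain numeric vector instead of a raster. It does not call reclassify; it only spells out, with logical indexing, what the two rows of the matrix encode when right=FALSE. The function name recode_like_matrix is made up for illustration.

```r
# Hypothetical base-R illustration of the two reclassification rows:
#   [-Inf, 0)  -> 666
#   [0, 666)   -> 0
# Values of 666 and above fall in neither interval and stay as they are.
recode_like_matrix <- function(v) {
  out <- v
  out[v < 0] <- 666
  out[v >= 0 & v < 666] <- 0
  out
}

values <- c(-5, 0, 3, 666, 700)
recoded <- recode_like_matrix(values)
```

Here -5 becomes 666, both 0 and 3 end up as 0, and 666 and 700 are left untouched.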
Right now I am trying to create a new dummy variable in a dataset out of a variable that has more than two values. More specifically, my dataset has a "State" variable, and I want to make a dummy where 1 = states in the North, and 0 = all other states. Here's a portion of the dataset (it's an extremely large set so I'll only include the essential data):
Year StateICP
1 1940 71
2 1940 21
3 1940 22
4 1940 32
5 1940 18
6 1940 22
7 1940 45
8 1940 40
9 1940 33
So what I would want to do is create a new Column (called "North") where if the StateICP = 21, 22, 40, or 45, then the new variable would = 1, and otherwise would be 0. Like I said, this is a very large dataset (over 1000000 observations), so I can't enter it row by row manually. I tried an ifelse function, but that only gave me errors.
I'm sure this isn't that complicated, but I am fairly new to R. I know how to create a dummy variable normally, but I am getting stuck here. Any help would be greatly appreciated! Thank you!
So, creating a simple dataset to replicate what you have above:
df <- data.frame(Year = rep(1940,500), StateICP = sample(1:100, 500, TRUE))
This will create a data.frame with columns like you describe and 500 records. The StateICP values are randomly generated integers between 1 and 100. If we want to code a boolean we could simply add a new column:
df$boolean <- df$StateICP %in% c(21, 22, 40, 45)
If we want to code them specifically as 0,1 as you describe then you can use ifelse:
df$dummy <- ifelse(df$StateICP %in% c(21, 22, 40, 45), 1, 0)
You have to make sure you are using a vector in the ifelse (since it does not accept a data argument).
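As a quick check against the rows shown in the question, the sketch below rebuilds those nine observations by hand and names the new column North, as requested. It uses as.integer on the logical vector, which is an alternative to ifelse and gives the same 0/1 coding.

```r
# The nine rows from the question, typed in by hand.
df <- data.frame(
  Year = rep(1940, 9),
  StateICP = c(71, 21, 22, 32, 18, 22, 45, 40, 33)
)

# TRUE/FALSE from %in% becomes 1/0 when converted to integer.
df$North <- as.integer(df$StateICP %in% c(21, 22, 40, 45))
```

Because %in% is vectorised, this works the same way on a million rows as on nine.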
I have a program that writes an unevenly spaced time series of vectors to a file (one vector per interval), and the vectors vary in size. I'm wondering what would be the best way of formatting the output so that the file can be read into a list of vectors in R (assuming that is the correct data structure), and what R code I would use to read it.
For example, I imagine the output could look something like this:
1, 24, 5, 211
3, 5
59, 465, 3, 333, 9, 98
or
(1 24 5 211)
(3 5)
(59 465 3 333 9 98)
But what I'm saying is that I want to change the formatting to suit the R read function.
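Use fill = TRUE so that shorter rows are padded to the full width:
data = read.table(file.choose(),sep=",",fill=TRUE)
data[is.na(data)] <- "" # Replace NA values with empty strings (this turns the affected columns into character)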
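If what you want is a list of numeric vectors rather than a padded data frame, one alternative is to read the file line by line and split each line yourself. The sketch below assumes the comma-separated layout from the first example in the question. The function name read_ragged is made up, and the temporary file only stands in for the real output file.

```r
# Write the example lines to a temporary file (stand-in for the program's output).
path <- tempfile(fileext = ".txt")
writeLines(c("1, 24, 5, 211", "3, 5", "59, 465, 3, 333, 9, 98"), path)

# Hypothetical helper: one numeric vector per line of the file.
read_ragged <- function(path) {
  raw <- readLines(path)
  # Split each line on commas, trim the surrounding spaces, convert to numbers.
  lapply(strsplit(raw, ","), function(fields) as.numeric(trimws(fields)))
}

vectors <- read_ragged(path)
```

Each element of vectors keeps its own length, so no NA padding is introduced.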