Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
I'm quite new to R. If I import a .csv file in which rows represent
time and columns represent n variables of interest, how could I construct a
function that returns any given 1xn vector from the table?
P.S. I'm not just interested in constructing a vector, but I will perform
matrix algebra with iterative calculations to estimate parameters, which means
I will need to use a for-loop.
If the data structure contains e.g. m rows and n columns i.e. n variables, you can easily construct the n vectors without much effort.
data<-read.csv(".../file.csv")
class(data)
[1] "data.frame"
class(as.numeric(data[1,]))
[1] "numeric"
So converting a row of the data frame to a 1*n vector, i.e. a numeric vector of length ncol(data), is not a big deal.
In a loop just use
data["required Row Number",]
to access the particular row. In each case this gives a one-row data frame (1*n), which as.numeric() turns into a vector of length n.
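As a minimal sketch of the row access described above, wrapped in a function and used inside a for-loop: the toy data frame, the helper name get_row and the accumulated quantity S are assumptions for illustration, not part of the original question.

```r
# Toy data standing in for the imported CSV (hypothetical values):
# 3 time points (rows) and n = 3 variables (columns)
data <- data.frame(x1 = c(1, 2, 3), x2 = c(4, 5, 6), x3 = c(7, 8, 9))

# Hypothetical helper: return row i as a plain numeric vector of length n.
# Assumes every column is numeric.
get_row <- function(df, i) as.numeric(df[i, ])

# Example of an iterative matrix calculation: accumulate the sum of
# outer products v %*% t(v) over all rows
n <- ncol(data)
S <- matrix(0, nrow = n, ncol = n)
for (i in seq_len(nrow(data))) {
  v <- get_row(data, i)
  S <- S + tcrossprod(v)  # tcrossprod(v) is v %*% t(v)
}
```

For heavy matrix algebra it may be simpler to convert once with as.matrix(data) and index rows of the matrix directly.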
You can use the function melt() from the package reshape2
Or if you want to use the for loop, try something like:
one_col <- data[,1]
for (i in 2:ncol(data)){
one_col <- rbind(one_col, data[,i])
}
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
I have two character vectors and I want to compare them and keep only those elements that contain the same character pattern, here the country code.
a<-c("nutr_sup_AFG.csv", "nutr_sup_ARE.csv", "nutr_sup_ARG.csv", "nutr_sup_AUS.csv")
b<-c("nutr_needs_AFG_pop.csv", "nutr_needs_AGO_pop.csv", "nutr_needs_ARE_pop.csv", "nutr_needs_ARG_pop.csv")
#wished result:
result_a<-c("nutr_sup_AFG.csv", "nutr_sup_ARE.csv", "nutr_sup_ARG.csv")
result_b<-c("nutr_needs_AFG_pop.csv", "nutr_needs_ARE_pop.csv", "nutr_needs_ARG_pop.csv")
I thought about subsetting first and compare the strings then:
library(stringr) # provides str_sub()
a_ISO<-str_sub(a, start=10, end = -5) #subset just ISO name
b_ISO<-str_sub(b, start =12, end = -9 ) #subset just ISO name
dif1<-setdiff(a, b) # get difference (order is important)
dif2<-setdiff(b,a) # get difference
dif<-c(dif1,dif2) # selection which to remove
But I don't know from here how to compare a and b with dif. So basically: how do I compare a character vector by regex with another character vector?
I think you should extract the characters with a more general approach, using a regex rather than positions. I think it is also easier to subset the elements you want to keep with intersect() rather than determining the ones to drop with setdiff():
Extract the three-character code with a regex:
index_a<-stringr::str_extract(a, "[A-Z]{3}")
index_b<-stringr::str_extract(b, "[A-Z]{3}")
Then subset the vectors with intersect() and base indexing:
intersect_ab<-intersect(index_a, index_b)
result_a<-a[index_a %in% intersect_ab]
result_b<-b[index_b %in% intersect_ab]
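That said, your solution does work with an additional final step, provided the differences are computed on the extracted codes (setdiff(a_ISO, b_ISO) and setdiff(b_ISO, a_ISO)) rather than on the full file names:
result_a<-a[!a_ISO %in% dif1]
result_b<-b[!b_ISO %in% dif2]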
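For anyone who wants to check the regex-and-intersect idea end to end without stringr, here is a sketch in base R using regmatches() and regexpr(). The names code_a, code_b and common are made up for this example, and it assumes every file name contains exactly one run of three capital letters (regmatches() silently drops elements with no match, which would misalign the indexing).

```r
a <- c("nutr_sup_AFG.csv", "nutr_sup_ARE.csv", "nutr_sup_ARG.csv", "nutr_sup_AUS.csv")
b <- c("nutr_needs_AFG_pop.csv", "nutr_needs_AGO_pop.csv",
       "nutr_needs_ARE_pop.csv", "nutr_needs_ARG_pop.csv")

# First run of three capital letters in each file name
code_a <- regmatches(a, regexpr("[A-Z]{3}", a))
code_b <- regmatches(b, regexpr("[A-Z]{3}", b))

# Codes present in both vectors, then keep only the matching file names
common <- intersect(code_a, code_b)
result_a <- a[code_a %in% common]
result_b <- b[code_b %in% common]
```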
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
Apologies that my question was unclear. It was my interpretation of the following assignment question:
Create a list (named mylist) that consists of one character vector (“J”,“A”,“G”,”B”,”H”,”E”,”C”,”F”,”D”,”I”), one numeric vector (10 random values from rnorm), and a matrix of size 10 x 10 (containing integer 1 to 100). After that you will provide a way of sorting the rows of all components (character, numeric and matrix) of mylist based on the order of the sorted character list. Finally, do a matrix times vector multiplication of the sorted second component and the third (integer) component (you will need to extract and convert these components to suitable modes).
Based on the code above, write a function that reads one character vector (of size n), one numeric vector (of size n) and one matrix (of size n x n). Then sorts the rows of all components based on the character vector, performs matrix times vector multiplication, combines the output of the multiplication with the input into a data frame that should be the output of the function.
We need to extract the character vector separately, order it, and then use lapply to reorder every element of the list by that ordering:
i1 <- order(lst$vec1) #assuming that the character `vector` is named `vec1`
lst1 <- lapply(lst, function(x) if(is.vector(x)) x[i1] else x[i1,])
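A fuller sketch of the first part of the assignment, built on the same order() plus lapply() idea. The component names vec1, vec2 and mat, the column-wise filling of the matrix, and the set.seed() call are assumptions made here so the example is reproducible; the assignment text does not fix them.

```r
set.seed(1)  # only to make the random vector reproducible

mylist <- list(
  vec1 = c("J", "A", "G", "B", "H", "E", "C", "F", "D", "I"),
  vec2 = rnorm(10),
  mat  = matrix(1:100, nrow = 10)  # integers 1 to 100, filled column-wise
)

# Ordering of the character vector, applied to every component
i1 <- order(mylist$vec1)
sorted <- lapply(mylist, function(x) if (is.matrix(x)) x[i1, ] else x[i1])

# Matrix times vector: (10 x 10) %*% (length 10) gives a 10 x 1 matrix
product <- sorted$mat %*% sorted$vec2

# Combine the sorted inputs and the product into one data frame
out <- data.frame(char = sorted$vec1, num = sorted$vec2,
                  product = as.vector(product))
```

Wrapping these steps in a function that takes the three inputs as arguments would give the second part of the assignment.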
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I'm new (very new) to R. I'm struggling to write a function that takes a matrix (old_matrix) and returns a new matrix (new_matrix) in which every value of old_matrix that is prime is multiplied by 2. So the new matrix should look the same as the old matrix, except that wherever a prime occurs in the old matrix, that element is multiplied by 2.
I'm thinking that I should start out with a for loop, but I'm already struggling with how to make the loop go through all elements of the matrix. I appreciate all the help I can get to get closer to making this function!
The isPrime function in the numbers package could be a big help
# Start by creating an example to work with
old_matrix <- matrix(sample.int(100, 25), 5, 5)
# Create your new matrix and determine which numbers are prime
new_matrix <- old_matrix
primeVals <- numbers::isPrime(old_matrix)
# Index into the matrix using the prime value indicator and multiply by 2
new_matrix[primeVals] <- new_matrix[primeVals]*2
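If you do want the explicit for loop from the question, wrapped in a function and without the numbers package, something like the following sketch should work. The names is_prime and double_primes are made up for this example, and the primality test is plain trial division, which is fine for small integers but slow for large ones.

```r
# Hypothetical helper: trial-division primality test for a single integer
is_prime <- function(k) {
  if (k < 2) return(FALSE)
  if (k < 4) return(TRUE)          # 2 and 3 are prime
  if (k %% 2 == 0) return(FALSE)
  d <- 3
  while (d * d <= k) {
    if (k %% d == 0) return(FALSE)
    d <- d + 2
  }
  TRUE
}

# Loop over every row i and column j; double the element if it is prime
double_primes <- function(old_matrix) {
  new_matrix <- old_matrix
  for (i in seq_len(nrow(old_matrix))) {
    for (j in seq_len(ncol(old_matrix))) {
      if (is_prime(old_matrix[i, j])) {
        new_matrix[i, j] <- 2 * old_matrix[i, j]
      }
    }
  }
  new_matrix
}

m <- matrix(1:6, nrow = 2)
doubled <- double_primes(m)  # 2, 3 and 5 are doubled; 1, 4 and 6 are unchanged
```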
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have a fairly large dataset (6.5 M rows, 8 cols) that I'm summarizing in a time series of aggregate counts of observations by day.
I'm currently summing across the intersection of two vectors that are the axes in my time series matrix. The iterations are taking hours to run, and I'm wondering if I'm overlooking something that might give better performance.
My code:
m<-length(datespace)
sensorlist<-as.vector(unique(sensordata$SOURCE))
n<-length(sensorlist)
y <- matrix(0, nrow=m, ncol=n)
colnames(y) <- sensorlist
for(sensor in 1:n){
for(date in 1:m){
count<-sum(as.vector(sensordata$SOURCE==sensorlist[sensor] & di==datespace[date]))
y[date,sensor] = count
}
}
I know FOR loops are less efficient and are an indicator that there's probably a better way in R to get this done.
The crux of this problem seems to be a fast way to create a sparse matrix that fills in the missing summary data with zeros.
Pretty sure this is a simple tally:
library(dplyr)
sensordata %>%
group_by(SOURCE) %>% # or maybe group_by(SOURCE, di)?
tally()
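If the goal is the full date-by-sensor matrix with zeros for missing combinations, base R's table() on two factors with explicit levels may already be enough. This is only a sketch: the toy sensordata, di and datespace below are made up to mirror the names in the question, and di is assumed to be a vector of dates aligned with the rows of sensordata.

```r
# Toy stand-ins for the objects in the question (hypothetical values)
sensordata <- data.frame(SOURCE = c("s1", "s1", "s2", "s1", "s3"))
di <- as.Date(c("2020-01-01", "2020-01-01", "2020-01-01",
                "2020-01-03", "2020-01-03"))
datespace <- seq(as.Date("2020-01-01"), as.Date("2020-01-03"), by = "day")
sensorlist <- as.vector(unique(sensordata$SOURCE))

# Cross-tabulate in one pass; giving the factor levels explicitly makes
# table() emit a zero for every date/sensor pair that never occurs
counts <- table(
  factor(as.character(di), levels = as.character(datespace)),
  factor(sensordata$SOURCE, levels = sensorlist)
)

# Plain integer matrix: one row per date, one column per sensor
y <- matrix(counts, nrow = nrow(counts),
            dimnames = list(rownames(counts), colnames(counts)))
```

This replaces the m*n scans of the whole dataset with a single pass over it.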
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
I have CSV data as follows:
code, label, value
ABC, len, 10
ABC, count, 20
ABC, data, 102
ABC, data, 212
ABC, data, 443
...
XYZ, len, 11
XYZ, count, 25
XYZ, data, 782
...
The number of data entries is different for each code. (This doesn't matter for my question; I'm just pointing it out.)
I need to analyze the data entries for each code. This would include calculating the median, plotting graphs, etc. This means I should separate out the data for each code and make it numeric?
Is there a better way of doing this than this kind of thing:
x = read.csv('dataFile.csv', header=TRUE)
...
median(as.numeric(subset(x, x$code=='ABC' & x$label=='data')$value))
boxplot(median(as.numeric(subset(x, x$code=='ABC' & x$label=='data')$value)))
split and list2env allow you to separate your data.frame x by code, generating one data.frame for each level of code:
list2env(split(x, x$code), envir=.GlobalEnv)
or just
my.list <- split(x, x$code)
if you prefer to work with lists.
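As a sketch of how the list route can be used for the per-code analysis: the small data frame below just retypes the sample rows shown in the question, and the names d, by_code and medians are made up for this example.

```r
# Sample rows from the question, typed in by hand
x <- data.frame(
  code  = c("ABC", "ABC", "ABC", "ABC", "ABC", "XYZ", "XYZ", "XYZ"),
  label = c("len", "count", "data", "data", "data", "len", "count", "data"),
  value = c(10, 20, 102, 212, 443, 11, 25, 782)
)

# Keep only the rows labelled "data", then split the values by code
d <- subset(x, label == "data")
by_code <- split(d$value, d$code)

# One median per code
medians <- sapply(by_code, median)
```

The same by_code list can be passed straight to boxplot(by_code) to get one box per code.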
I'm not sure I fully understand the final objective of your question. Do you just want some pointers on what you could do? There are a lot of possible solutions.
When you ask: I need to analyze the data entries for each code. This would include calculating the median, plotting graphs, etc. This means I should separate out the data for each code and make it numeric?
The answer would be no, you don't strictly have to. You could use R functions that do this task for you, for example:
x = read.csv('dataFile.csv', header=TRUE, strip.white=TRUE) # strip.white drops the spaces after the commas
#is it numeric?
class(x$value)
# if it is already numeric you shouldn't have to convert it;
# a strictly numeric column should not normally be read as strings,
# but it happens, e.g. when some entry is not a valid number.
d <- subset(x, label=='data') # keep only the data rows, not len/count
aggregate(value~code,data=d,FUN=median)
boxplot(value~code,data=d)
# and you can do ?boxplot to look into its options.