Replace value table with condition in R - r

I have list of dataset :
> data1
[1] /index.php/search?
[2] /tabel/graphic1_.php?
[3] /mod/Layout/variableView2.php?
[4] /table/tblmon-frameee.php?
and a table:
> tes
[1] http://aladdine/index.php/search?
[2] http://aladdine/mod/params/returnParams.php
[3] http://aladdine/mod/Layout/variableView2.php
[4] http://aladdine/index.php/bos/index?
[5] http://aladdine/index.php/Bos
I want to change the value of the test table with an index on dataset which has a matching string values in the dataset.
I have tried this code:
for(i in 1:length(dataset)){
p = data[i]
for(j in 1:length(tes)){
t = tes [j]
if(grepl(p, t)){
tes[j]=i
}
else tes[j] = "-"
}
}
My expectation result like this,
> tes
[1] 1
[2] -
[3] 3
[4] -
[5] -
But, I always get warning message invalid factor level, NA generated. Why?
Thanks before.

The following code does not do exactly what you need, but effectively it should give you the same information.
data1<-c('/index.php/search?',
'/tabel/graphic1_.php?',
'/mod/Layout/variableView2.php?',
'/table/tblmon-frameee.php?')
tes<-c('http://aladdine/index.php/search?',
'http://aladdine/mod/params/returnParams.php',
'http://aladdine/mod/Layout/variableView2.php',
'http://aladdine/index.php/bos/index?',
'http://aladdine/index.php/Bos')
> lapply(data1,FUN = function(x) which(grepl(x,tes)))
[[1]]
[1] 1
[[2]]
integer(0)
[[3]]
[1] 3
[[4]]
integer(0)
For example, the first output in [[1]] tells which element in "tes" match the first element in "data1" etc...

Probably not the fastest one as i use for loop in this code but hope this provides a solution:
require(data.table)
data1<-c("/index.php/search?","/tabel/graphic1_.php?","/mod/Layout/variableView2.php?","/table/tblmon-frameee.php?")
tes<-c("http://aladdine/index.php/search?","http://aladdine/mod/params/returnParams.php" ,"http://aladdine/mod/Layout/variableView2.php","http://aladdine/index.php/bos/index?","http://aladdine/index.php/Bos")
d<-data.table(d=data1,t=tes)
d$id<-seq(1:nrow(d))
for (i in 1:nrow(d))
{
d$index[i]<-lapply(data1,FUN=function(x) {ifelse(length(grep(x,tes[i]))>0,d$id[i],"-")})[i]
}

Related

Is their a way to find the index of a particular value from a data frame in R?

*This is the actual problem
Q. Write a code to print the price for 3bedroom houses (each house not combined).
new_data = read.csv("LR.csv")
count = 0
index = 1
for(x in new_data$bedrooms){
if(x == 3){
count = count+1
print(new_data$price[index])
index = index +1
}
}
print(count)
Result of this code is:
[1] 275000
[1] 565000
[1] 460000
[1] 603500
[1] 490600
[1] 1010000
[1] 5e+05
[1] 249000
[1] 235000
[1] 410000
[1] 370000
[1] 360000
[1] 1410000
[1] 298000
[1] 485000
[1] 4e+05
[1] 580000
[1] 355000
[1] 650000
[1] 261490
[1] 347000
[1] 485000
[1] 601000
And many more like this. But I want to find exactly which property is it by either it's index number or id.
For the file please reply "file" and give your contact method like mail id or SNS id. I will send you the file.
Please help. Thanks in advance.
You could use which() to get the row indices of the rows fullfilling your condition:
idx <- which(new_data$bedrooms == 3)
print(new_data$price[idx]) # just printing the price
print(new_data[idx,] # printing the whole row
Or you could just directly get the results with slicing based on a condition:
print(new_data$price[new_data$bedrooms == 3] # just printing the price
print(new_data[new_data$bedrooms == 3,] # printing the whole row

Apply mutate to data frame in R: iterates one extra step and causes error

I wonder why my get_http_status function iterates once more than necessary causing an exception
I have a data frame like:
> str(df5)
'data.frame': 10 obs. of 3 variables:
$ text : chr "\n" "\n" "\n" "\n" ...
$ enlace: chr "//www.blogger.com| __truncated__ ...
$ Freq : int 1 1 1 1 1 1 1 1 1 r code here
I'm trying to get the http status code for each "enlace"
Using this function:
get_http_status <- function(url){
if (!is.null(url)){
Sys.sleep(3)
print(url)
ret <- HEAD(url)
return(ret$status_code)
}
return("")
}
df44 <- mutate(df5, status = get_http_status(enlace))
but keeps trowing the error:
** Error in parse_url(url) : length(url) == 1 is not TRUE**
i can warp the function with try/catch and it works, but i don't know why the error is happening in first place.
get_http_status_2 <- function(url){
tryCatch(
expr = {
Sys.sleep(3)
print(url)
ret <- HEAD(url)
return(ret$status_code)
},
error = function(e){
return("")
}
)
}
The content of the df5$enlace is:
> df5$enlace
[1] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=Attribution&widgetId=Attribution1&action=editWidget&sectionId=footer-3"
[2] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=BlogArchive&widgetId=BlogArchive1&action=editWidget&sectionId=sidebar-right-1"
[3] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=BlogSearch&widgetId=BlogSearch1&action=editWidget&sectionId=sidebar-right-1"
[4] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=Followers&widgetId=Followers1&action=editWidget&sectionId=sidebar-right-1"
[5] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=PageList&widgetId=PageList1&action=editWidget&sectionId=crosscol"
[6] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=Text&widgetId=Text1&action=editWidget&sectionId=sidebar-right-1"
[7] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=Text&widgetId=Text2&action=editWidget&sectionId=sidebar-right-1"
[8] "http://5d4a.wordpress.com/2010/08/02/smashing-the-stack-in-2010/"
[9] "http://advancedwindowsdebugging.com/ch06.pdf"
[10] "http://beej.us/guide/
I think it iterate one time more because the result of the function is:
> df44 <- mutate(df5, status = get_http_status(enlace))
[1] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=Attribution&widgetId=Attribution1&action=editWidget&sectionId=footer-3"
[2] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=BlogArchive&widgetId=BlogArchive1&action=editWidget&sectionId=sidebar-right-1"
[3] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=BlogSearch&widgetId=BlogSearch1&action=editWidget&sectionId=sidebar-right-1"
[4] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=Followers&widgetId=Followers1&action=editWidget&sectionId=sidebar-right-1"
[5] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=PageList&widgetId=PageList1&action=editWidget&sectionId=crosscol"
[6] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=Text&widgetId=Text1&action=editWidget&sectionId=sidebar-right-1"
[7] "//www.blogger.com/rearrange?blogID=4514563088285989046&widgetType=Text&widgetId=Text2&action=editWidget&sectionId=sidebar-right-1"
[8] "http://5d4a.wordpress.com/2010/08/02/smashing-the-stack-in-2010/"
[9] "http://advancedwindowsdebugging.com/ch06.pdf"
[10] "http://beej.us/guide/bgc/"
Error in parse_url(url) : length(url) == 1 is not TRUE
Since your function contains a function that is not vectored, use the apply family of higher order function to iterate over your vector.
Below, get_http_status will be called on each element of df$enlace.
For each call a length one character vector is expected as the return, character(1):
vapply(df5$enlace, get_http_status, character(1))

How can I use two lists to create a Table (Columns and Rows)

I want to write a script in R that allows me to import MSG files and store the information in a table. The fields may vary by course, so the column names are defined based on the first MSG file being imported.
The import and extraction are already working (special thanks to the user "January")
What does not work is the filling in the table, which consists of two steps. Add column names and fill in rows.
I've tried using unlist to prepare the contents of the lists so that I can add them as colums and rows to a table.
Anmeldung <- gsub("^\\s+", "", Anmeldung) # remove spaces at the beginning and end
Anmeldung <- gsub("\\s+$", "", Anmeldung)
words <- strsplit(Anmeldung, " *[\n\r]+ *")[[1]]
fields <- as.list(words[seq(1, length(words), 2)])
information <- as.list(words[seq(2, length(words), 2)])
resTab1 = data.frame(t(unlist(fields)))
resTab2 = data.frame(t(unlist(information)))
colnames(resTab2) = c(resTab1)
variable.names(resTab2)
When I am trying to create the Table,this error appears:
colnames(resTab2) = c(resTab1)
Error in names(x) <- value :
'names' attribute [22] must be the same length as the vector [21]
This is what the Dataframes Fields and Information look like:
Fields
> fields
[[1]]
[1] "Anrede"
[[2]]
[1] "Vorname"
[[3]]
[1] "Name"
[[4]]
[1] "Email (für Kontaktaufnahme)"
[[5]]
[1] "Telefon/Mobile (geschäftlich)"
[[6]]
[1] "Telefon/Mobile (privat)"
[[7]]
[1] "Strasse/Nr."
Information:
> information
[[1]]
[1] "Herr"
[[2]]
[1] "James"
[[3]]
[1] "Bond"
[[4]]
[1] "james.bond#email.com"
[[5]]
[1] "007 000 77 07"
[[6]]
[1] "007 000 77 07"
[[7]]
[1] "Lampenstrasse 8"
I see you're trying to give names to resTab2 that is shorter than your resTab1
ex:
x <- c(1,2)
y <- c("a","b","c")
names(x) <- y
#Error in names(x) <- y :
#'names' attribute [3] must be the same length as the vector [2]
EDIT:
use unlist to flatten the list
information <- unlist(information)
fields <- unlist(fields)
names(information) <- fields
information
#OUTPUT
#Anrede 'Herr'
#Vorname 'James'
#Name 'Bond'
#Email (für Kontaktaufnahme) 'james.bond#email.com'
#Telefon/Mobile (geschäftlich) '007 000 77 07'
#Telefon/Mobile (privat) '007 000 77 07'
#Strasse/Nr. 'Lampenstrasse 8'

in R:how to store number by order in list

Suppose I have a for loop like following
for(i in 1:2)
{
eigeni=lapply(rho,function(mat) eigen(mat[1:i,1:i])$values)
x=lapply(eigeni,function(x) i-sum(ifelse(x>1,x,1)-1))
}
When i=1,x is
> x
$rs6139074
[1] 0.241097
$rs1418258
[1] 0.241097
When i=2,x is
> x
$rs6139074
[1] 1.241097
$rs1418258
[1] 1.247
How can I store x by column for each iteration like following after this loop?
I would like something like following
for(i in 1:2)
{
eigeni=lapply(rho,function(mat) eigen(mat[1:i,1:i])$values)
x[i]=lapply(eigeni,function(x) i-sum(ifelse(x>1,x,1)-1))
}
> x
$rs6139074
[1] 0.241097 1.241097
$rs1418258
[1] 0.241097 1.247
Following Dave2e answer
I got
> mylist
[[1]]
[[1]]$rs6139074
[1] 0.241097
[[1]]$rs1418258
[1] 0.241097
[[2]]
[[2]]$rs6139074
[1] 1.241097
[[2]]$rs1418258
[1] 1.247
One of the easiest ways is to define a empty list and append values onto it:
mylist<-list()
for(i in 1:2)
{
eigeni=lapply(rho,function(mat) eigen(mat[1:i,1:i])$values)
x=lapply(eigeni,function(x) i-sum(ifelse(x>1,x,1)-1))
mylist[[i]]<-x
}
print(mylist)
Of course if you can use the lapply function, that would be the most condensed and most likely fastest method.

Accessing selected elements of a list of lists in R

I have a list of list subgame[[i]]$Weight of this type:
[[1]]
[1] 0.4720550 0.4858826 0.4990469 0.5115899 0.5235512 0.5349672 0.5458720
[8] 0.5562970 0.5662715 0.5758226 0.5849754 0.5937532 0.6021778 0.6102692
[15] 0.6180462 0.6255260 0.6327250 0.6396582 0.6463397 0.6527826
[[2]]
[1] 0.4639948 0.4779027 0.4911519 0.5037834 0.5158356 0.5273443 0.5383429
[8] 0.5488623 0.5589313 0.5685767 0.5778233 0.5866943 0.5952111 0.6033936
[15] 0.6112605 0.6188291 0.6261153 0.6331344 0.6399002 0.6464260
[[3]]
[1] 0.4629488 0.4768668 0.4901266 0.5027692 0.5148329 0.5263534 0.5373639
[8] 0.5478953 0.5579764 0.5676339 0.5768926 0.5857755 0.5943041 0.6024984
[15] 0.6103768 0.6179568 0.6252543 0.6322844 0.6390611 0.6455976
What I am looking for is to access all the j-th elements of every list. Example if j=1 I must get:
>0.4720550 0.4639948 0.4629488
How can I do it?
I found
sapply(1:length(subgame[[i]]$Weight),function(k) subgame[[i]]$Weight[[k]][1])
But seems too tricky to me.
There is a more elegant way?
If j=1, then you're interested in subgame[[i]]$Weight[[1]][1], subgame[[i]]$Weight[[2]][1], and subgame[[i]]$Weight[[3]][1]. In other words, you want to use [1] on each list element.
But what happens when you subset a vector? For example:
(x <- rnorm(5))
# [1] -1.8965529 0.4688618 0.6588774 0.2749539 0.1829046
x[3]
# [1] 0.6588774
[ is actually a function, and it gets called in this situation. You can read a bit more about it with ?"[", but the point is that you can call it like any other function. Its first argument will be the object to subset, then you can pass it the index (or indices) you're interested in (along with some other arguments that the help page discusses):
x[3]
# [1] 0.6588774
`[`(x, 3)
# [1] 0.6588774
Note the backticks surrounding the name. A bare [ will throw an error, so you need to quote it. The same goes for other functions like +.
So if you want to get the first element of each list element, you can apply [ to each element of the list, passing it 1 or whatever j is:
sapply(subgame[[i]]$Weight, `[`, 1)
I would like to add a solution which returns the result you want for the Weight list of each elements of your subgame list.
> subgame <- list(list(weight = list(c(1, 2), c(3, 4), c(5, 6))), list(weight = list(c(7, 8), c(9, 10), c(11, 12))))
>
> j = 1
>
> do.call(rbind, subgame[[1]]$weight)[,j]
[1] 1 3 5
>
> lapply(subgame, function(x) {do.call(rbind, x$weight)[,j]})
[[1]]
[1] 1 3 5
[[2]]
[1] 7 9 11

Resources