I am trying to assign a vector as an attribute for a vertex, but without any luck:
# assignment of a numeric value (everything is ok)
g<-set.vertex.attribute(g, 'checked', 2, 3)
V(g)$checked
# assignment of a vector (is not working)
g<-set.vertex.attribute(g, 'checked', 2, c(3, 1))
V(g)$checked
Checking the manual, http://igraph.sourceforge.net/doc/R/attributes.html, it looks like this is not possible. Is there any workaround?
So far, the only things I have come up with are:
store this information in another structure
convert the vector to a string with delimiters and store it as a string
This works fine:
## replace c(3,1) by list(c(3,1))
g <- set.vertex.attribute(g, 'checked', 2, list(c(3, 1)))
V(g)[2]$checked
[1] 3 1
EDIT: Why does this work?
When you use:
g<-set.vertex.attribute(g, 'checked', 2, c(3, 1))
you get this warning:
number of items to replace is not a multiple of replacement length
Indeed, you are trying to put c(3,1), which has length 2, into a slot of length 1. So the idea is to replace c(3,1) with something equivalent that has length 1. For example:
> length(list(c(3,1)))
[1] 1
> length(data.frame(c(3,1)))
[1] 1
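The length argument above can be checked in base R alone. The following is a minimal sketch that assumes nothing about igraph's internals; `attr_slots` is a hypothetical stand-in for a per-vertex attribute store, used only to show that a list wrapper of length 1 fits a single slot even when the wrapped vector has length 2.

```r
# Base R only; `attr_slots` is a hypothetical per-vertex store, not igraph's.
attr_slots <- vector("list", 3)   # three empty slots, one per vertex

# A length-1 list fills exactly one slot, whatever it holds inside.
attr_slots[2] <- list(c(3, 1))

stored  <- attr_slots[[2]]        # [[ ]] unwraps the slot again
n_outer <- length(list(c(3, 1)))  # length of the wrapper
n_inner <- length(stored)         # length of the wrapped vector
```

Here the wrapper has length 1 while the vector inside keeps both of its elements, which is why the replacement no longer triggers the length warning.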
I'm trying to extract some characters from a vector called "identhog", which is a column of the table "E". I want to extract the characters according to the length of the text: if the length of the text in the vector is 10, I want to extract some characters; otherwise I want to extract other characters from another position.
if (nchar(E$identhog)==10) {
  E <- mutate(E,prueba2= substr(E$identhog, 2, 6))
} else {
  E <- mutate(E, prueba2=substr(E$identhog, 3,7))
}
I'm using an if/else conditional, but when I run the code the following message shows up:
"Warning message: In if (nchar(E$identhog) == 10) { : the condition has length > 1 and only
the first element will be used"
R then ignores my whole if conditional and just runs:
E <- mutate(E,prueba2= substr(E$identhog, 2, 6))
How can I fix this? I have looked into this problem, and it seems to happen because I'm using if() to check a condition while passing it a whole vector instead of individual elements.
I understand that if() checks only one element of a vector at a time, but I want to check each element individually.
Some users say that ifelse is a solution, but it is not working with my data, given the amount of information it holds.
ifelse((nchar(E$identhog)==10),
E <- mutate(E,prueba2= substr(E$identhog, 2, 6)),
E <- mutate(E, prueba2=substr(E$identhog, 3,7)))
Any solution?
You are using ifelse outside mutate. Here is an example of how to use it inside mutate with dplyr:
library(dplyr)
df <- data.frame(string = c("1234567890","12345678901"))
df %>%
  mutate(
    prueba2 = if_else(
      condition = nchar(string) == 10,
      true = substr(string, 2, 6),
      false = substr(string, 3, 7)
    )
  )
       string prueba2
1  1234567890   23456
2 12345678901   34567
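The same element-wise choice can be sketched without dplyr, using base R's ifelse() on a plain character vector. The vector `ids` below is hypothetical sample data mirroring the strings above; the point is that the condition is evaluated for every element and the result is assigned once, rather than assigning inside each branch.

```r
# Hypothetical sample data: one 10-character and one 11-character identifier.
ids <- c("1234567890", "12345678901")

# ifelse() tests every element and picks the matching branch per element.
prueba2 <- ifelse(nchar(ids) == 10,
                  substr(ids, 2, 6),   # used where the length is 10
                  substr(ids, 3, 7))   # used everywhere else
```

With a data frame, the result can be assigned as a new column in the usual way, for example `E$prueba2 <- ifelse(...)`.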
I am importing a key in which each row is an argument setting for a function I have programmed. The goal is to batch test my function by producing outputs for all sets of arguments. That's not terribly important. What is important is that I import a column that contains, in each row, a value for a range. For instance, "1:5" is meant to be passed to an argument as the value 1:5. I tried to coerce it using as.numeric("1:5"), but R is not happy with this. Is there a way to coerce the character value "1:5" to the vector c(1,2,3,4,5)?
Your text is valid R code, so you can eval(parse(text = ...)) it:
dat$parsed <- lapply(dat$key, function(x) eval(parse(text=x)))
# key parsed
# 1 1:5 1, 2, 3, 4, 5
# 2 1:6 1, 2, 3, 4, 5, 6
# 3 1:4 1, 2, 3, 4
Data
dat <- read.table(text="key
1:5
1:6
1:4", stringsAsFactors=FALSE, header=TRUE)
Reduce(':', strsplit(x,":")[[1]])
[1] 1 2 3 4 5
If x = "1:5", we can use strsplit to separate the two numbers. We can then use Reduce to execute the operator : on the split.
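If the key column might contain anything other than trusted range strings, a variant that avoids evaluating text as code may be preferable. The following is a minimal sketch of that idea; `parse_range` is a hypothetical helper name, and it assumes every key has the form "a:b" with integer bounds.

```r
# Hypothetical helper: turn "a:b" into the integer sequence a, a+1, ..., b
# without calling eval() on the input.
parse_range <- function(x) {
  bounds <- as.integer(strsplit(x, ":", fixed = TRUE)[[1]])
  # Fail early on anything that is not exactly two integers.
  stopifnot(length(bounds) == 2, !anyNA(bounds))
  seq(bounds[1], bounds[2])
}

keys   <- c("1:5", "1:6", "1:4")
parsed <- lapply(keys, parse_range)   # one integer vector per key
```

The result is a list, so it can be stored as a list column in the same way as the lapply() result above.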
I am not sure how to handle NA within Julia DataFrames.
For example with the following DataFrame:
> import DataFrames
> a = DataFrames.@data([1, 2, 3, 4, 5]);
> b = DataFrames.@data([3, 4, 5, 6, NA]);
> ndf = DataFrames.DataFrame(a=a, b=b)
I can successfully execute the following operation on column :a
> ndf[ndf[:a] .== 4, :]
but if I try the same operation on :b I get an error NAException("cannot index an array with a DataArray containing NA values").
> ndf[ndf[:b] .== 4, :]
NAException("cannot index an array with a DataArray containing NA values")
while loading In[108], in expression starting on line 1
in to_index at /Users/abisen/.julia/v0.3/DataArrays/src/indexing.jl:85
in getindex at /Users/abisen/.julia/v0.3/DataArrays/src/indexing.jl:210
in getindex at /Users/abisen/.julia/v0.3/DataFrames/src/dataframe/dataframe.jl:268
This is because of the NA value.
My question is: how should DataFrames with NA typically be handled? I can understand that a > or < operation against NA would be undefined, but == should work (no?).
What's your desired behavior here? If you want to do selections like this, you can make the condition (not NA) AND (equal to 4). If the first test fails, then the second one never happens.
using DataFrames
a = @data([1, 2, 3, 4, 5]);
b = @data([3, 4, 5, 6, NA]);
ndf = DataFrame(a=a, b=b)
ndf[(!isna(ndf[:b]))&(ndf[:b].==4),:]
In some cases you might just want to drop all rows with NAs in certain columns
ndf = ndf[!isna(ndf[:b]),:]
Regarding this question, which I asked before: you can change this NA behavior directly in the module's source code if you want. In the file indexing.jl there is a function named Base.to_index(A::DataArray), beginning at line 75, where you can alter the code to set NAs in the boolean array to false. For example, you can do the following:
# Indexing with NA throws an error
function Base.to_index(A::DataArray)
A[A.na] = false
any(A.na) && throw(NAException("cannot index an array with a DataArray containing NA values"))
Base.to_index(A.data)
end
Filtering out NAs with isna() makes the source code less readable and, in big formulas, costs performance:
@timeit ndf[(!isna(ndf[:b])) & (ndf[:b] .== 4),:] #3.68 µs per loop
@timeit ndf[ndf[:b] .== 4, :] #2.32 µs per loop
## 71x179 2D Array
@timeit dm[(!isna(dm)) & (dm .< 3)] = 1 #14.55 µs per loop
@timeit dm[dm .< 3] = 1 #754.79 ns per loop
In many cases you want to treat NA as a separate value, i.e. assume that everything that is NA is "equal" and everything else is different.
If this is the behaviour you want, the current DataFrames API doesn't help you much, as both (NA == NA) and (NA == 1) return NA instead of the expected boolean results.
This makes DataFrame filters using loops extremely tedious:
function filter(df,c)
    for r in eachrow(df)
        if (isna(c) && isna(r[:c])) || ( !isna(r[:c]) && r[:c] == c )
...
and breaks select-like functionality in DataFramesMeta.jl and Query.jl when NA values are present or requested.
One workaround is to use isequal(a,b) in place of a==b
test = @where(df, isequal.(:a,"cc"), isequal.(:b,NA) ) #from DataFramesMeta.jl
I think the new syntax in Julia is to use ismissing:
# drop NAs
df = DataFrame(col=[0,1,1,missing,0,1])
df = df[.!ismissing.(df[:col]),:]
df <- data.frame(name=c('aa', 'bb', 'cc','dd'),
code=seq(1:4), value= seq(100, 400, by=100))
df
v <- c(1, 2, 2)
v
A <- df[df$code %in% v,]$value
A
str(A)
I tried to obtain the corresponding values based on the code. I was expecting A to be of length 3, but it is actually a vector of length 2. What can I do if I want A to be a vector of length 3, that is c(100,200,200)?
%in% returns a logical vector, the same length as vector 1, that indicates whether each element of vector 1 occurs in vector 2.
In contrast, the match function returns, for each element of vector 1, the position in vector 2 where the element first appears (or NA if it doesn't exist in vector 2). Try the following:
df[match(v, df$code), 'value']
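The difference between the two lookups can be made concrete with the data from the question. This is a small sketch in base R; the intermediate names `keep`, `pos`, `by_in` and `by_match` are introduced here only for illustration.

```r
df <- data.frame(name  = c("aa", "bb", "cc", "dd"),
                 code  = 1:4,
                 value = seq(100, 400, by = 100))
v <- c(1, 2, 2)

keep <- df$code %in% v      # one TRUE/FALSE per row of df
pos  <- match(v, df$code)   # one row position per element of v

by_in    <- df$value[keep]  # each matching row once, in df order
by_match <- df$value[pos]   # one value per element of v, repeats kept
```

Because `keep` has one entry per row of df, a code that appears twice in v still selects its row only once, whereas `pos` has one entry per element of v and so preserves the repetition.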
You could just use v as an argument if those were the lines whose "value"s you wanted:
> df[v,]$value
[1] 100 200 200
df[v,3] # minimum characters :) ("value" is the third column)
I have the output of a function, which I assign like this
function(i,var1,list1)->h
and then the output
value
2.8763
There is a line break in the output, and I only need the numeric part of the result, not the string. Hence I tried h[1], but this gives
value
2.87..
and length(h) is also equal to 1. Is there any way to access only the number in this case?
Thanks,
You are already accessing only the value; the name you see is the name of the element in the vector you return. You can get rid of those names/attributes like this:
> v <- c("a" = 1, "b" = 2)
> v
a b
1 2
> attributes(v) <- NULL
> v
[1] 1 2
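Two narrower alternatives in base R drop just the name rather than all attributes. The sketch below uses a hypothetical named vector `h` as a stand-in for the function's result, since the original function is not shown.

```r
h <- c(value = 2.8763)   # hypothetical stand-in for the function's named result

plain  <- unname(h)      # same number, names dropped, other attributes kept
single <- h[["value"]]   # [[ ]] extraction of one element also drops the name
```

unname() may be the gentler choice when the object carries other attributes that should survive, while `attributes(v) <- NULL` strips everything.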