Insert "" instead of NA when adding rows in gdf [gWidgets2RGtk2] - r

Is it possible to insert "" instead of NA when creating a new row in gdf?
EDIT: Here's some sample code that I tried
require(gWidgets2RGtk2)
df <- data.frame(x=1:5,y=6:10) #Sample data frame
w2 <- gwindow("keyfile editor")
h <- gdf(df,cont=w2)
addHandlerChanged(h, handler = function(h,...){ #Handler to remove NA
h<<-apply(h[1:nrow(h),1:ncol(h)], 2, function(x) gsub("NA","",x))
})

svalue(h$obj, drop = FALSE)
gives you the new value for the updated row. So in theory,
addHandlerChanged(h, handler = function(h,...) {
svalue(h$obj, drop = FALSE)[] <- lapply(
svalue(h$obj, drop = FALSE),
function(x) {
x[is.na(x)] <- ""
x # return the modified column, not the value of the assignment
}
)
})
should replace all the NAs with "". There are two problems:
Firstly, replacing the missing values with an empty string converts the whole column to be a character vector, which you probably don't want, and secondly, there seems to be a problem with svalue<- that means the values aren't updating.
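The first problem can be reproduced without any GUI code. A minimal sketch, using a made-up numeric vector as a stand-in for one column of the gdf data:

```r
# Made-up numeric column with one missing value (illustrative stand-in only).
x <- c(1.5, NA, 3)
class(x)            # "numeric"

# Assigning an empty string forces every element to character.
x[is.na(x)] <- ""
class(x)            # "character"
x                   # "1.5" ""    "3"
```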
I think that the problem is this:
methods(`svalue<-`)
## [1] svalue<-.default* svalue<-.GCheckbox* svalue<-.GFormLayout* svalue<-.GGroup*
## [5] svalue<-.GHtml* svalue<-.GLabel* svalue<-.GMenuBar* svalue<-.GRadio*
## [9] svalue<-.GToolBar* svalue<-.GTree*
shows that there is no GDf-specific method for setting the svalue, so svalue<-.default will be called.
gWidgets2:::`svalue<-.default`
## function (obj, index = NULL, ..., value)
## {
## if (!isExtant(obj)) {
## return(obj)
## }
## if (getWithDefault(index, FALSE))
## obj$set_index(value, ...)
## else obj$set_value(value, ...)
## obj
## }
This calls the object's set_value method.
ls(attr(h, ".xData"))
## [1] "add_cell_popup" "add_popup_to_view_col" "add_to_parent"
## [4] "add_view_columns" "block" "block_editable_column"
## [7] "cell_popup_id" "change_signal" "clear_stack"
## [10] "clear_view_columns" "cmd_coerce_column" "cmd_insert_column"
## [13] "cmd_remove_column" "cmd_replace_column" "cmd_set_column_name"
## [16] "cmd_set_column_names" "cmd_stack" "coerce_with"
## [19] "connected_signals" "default_cell_popup_menu" "default_expand"
## [22] "default_fill" "default_popup_menu" "freeze_attributes"
## [25] "get_column_index" "get_column_value" "get_dim"
## [28] "get_name" "get_view_column" "handler_id"
## [31] "initFields" "initialize" "initialize#GComponent"
## [34] "initialize#GWidget" "invoke_change_handler" "invoke_handler"
## [37] "is_editable" "map_j" "model"
## [40] "not_deleted" "notify_observers" "parent"
## [43] "set_editable" "set_frame" "set_name"
## [46] "set_names" "set_parent" "store"
## [49] "toolkit" "unblock_editable_column" "widget"
but there doesn't seem to be one implemented yet.

Well, Richie did his usual thorough job. The code in the question has a few problems. First, you use the variable h both as a global variable (for the gdf object) and as the argument to the handler, so within the handler h does not refer to the object, but h$obj does. Second, values in a gdf object are set with the [<- method (h[i,j] <- "" calls the h object's set_items method); you tried to overwrite the object rather than call a method on it. As for NA values, the data behind the widget is stored in an RGtk2DataFrame, which, like a data frame in R, will coerce a numeric column to character if you put a character value into it. It is best to use R as it is intended: if you really want to get rid of NA values, do so when you go to use the values that the user has edited, modifying h[,] as you want.
Now, if you really wanted to do this, I think you could at the RGtk2 level by writing an appropriate cell renderer.
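The suggestion to deal with NA values only when the edited data is used could look roughly like this. It is a sketch under assumptions: `edited` is a made-up data frame standing in for what `h[, ]` would return, and `blank_na` is a hypothetical helper, not part of gWidgets2.

```r
# Stand-in for the edited data; with a real widget this would come from h[, ].
edited <- data.frame(x = c(1, 2, NA), y = c("a", NA, "c"),
                     stringsAsFactors = FALSE)

# Hypothetical helper: convert each column to character and blank out the NAs.
blank_na <- function(df) {
  df[] <- lapply(df, function(col) {
    col <- as.character(col)
    col[is.na(col)] <- ""
    col
  })
  df
}

# The widget's data stays numeric; only the exported copy is changed.
cleaned <- blank_na(edited)
```

Working on a copy keeps the numeric columns in the widget intact, which avoids the coercion problem described above.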

Related

Partial Variances at each row of a Matrix

I generated a series of 10,000 random numbers through:
rand_x = rf(10000, 3, 5)
Now I want to produce another series that contains the variances at each point i.e. the column look like this:
[variance(first two numbers)]
[variance(first three numbers)]
[variance(first four numbers)]
[variance(first five numbers)]
.
.
.
.
[variance of 10,000 numbers]
I have written the code as:
c ( var(rand_x[1:1]) : var(rand_x[1:10000])
but I am only getting 157 elements in the column rather than 10,000. Can someone point out what I am doing wrong here?
One option is to loop over the indices from 2 to 10000 with sapply, extract the elements of 'rand_x' from position 1 up to the current index, apply var, and return the results as a vector of variances:
out <- sapply(2:10000, function(i) var(rand_x[1:i]))
Your code creates a sequence incrementing by one with the variance of the first two elements as start value and the variance of the whole vector as limit.
var(rand_x[1:2]):var(rand_x[1:n])
# [1] 0.9026262 1.9026262 2.9026262
## compare:
.9026262:3.33433
# [1] 0.9026262 1.9026262 2.9026262
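With non-integer endpoints, `from:to` starts at `from` and steps by 1 for as long as it stays at or below `to`, so the length of the result depends only on the distance between the two variances. A small sketch using the two endpoints shown above:

```r
from <- 0.9026262
to   <- 3.33433

s <- from:to
length(s)                           # 3
length(s) == floor(to - from) + 1   # TRUE
```

By the same rule, a result of 157 elements suggests the two variances in the question were roughly 156 apart.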
What you want is to loop over the vector indices, using seq_along to get the variances of sequences growing by one. To see what needs to be done, I show you first a (rather slow) for loop.
vars <- numeric() ## initialize numeric vector
for (i in seq_along(rand_x)) {
vars[i] <- var(rand_x[1:i])
}
vars
# [1] NA 0.9026262 1.4786540 1.2771584 1.7877717 1.6095619
# [7] 1.4483273 1.5653797 1.8121144 1.6192175 1.4821020 3.5005254
# [13] 3.3771453 3.1723564 2.9464537 2.7620001 2.7086317 2.5757641
# [19] 2.4330738 2.4073546 2.4242747 2.3149455 2.3192964 2.2544765
# [25] 3.1333738 3.0343781 3.0354998 2.9230927 2.8226541 2.7258979
# [31] 2.6775278 2.6651541 2.5995346 3.1333880 3.0487177 3.0392603
# [37] 3.0483917 4.0446074 4.0463367 4.0465158 3.9473870 3.8537925
# [43] 3.8461463 3.7848464 3.7505158 3.7048694 3.6953796 3.6605357
# [49] 3.6720684 3.6580296
The first element has to be NA because the variance of one element is not defined (division by zero).
However, this for loop is slow, partly because vars grows on every iteration. A function from the *apply family, e.g. vapply, avoids growing the result and is usually somewhat faster. In vapply, the FUN.VALUE argument numeric(1) (or just 0) declares that the result of each iteration is a numeric vector of length one.
vars <- vapply(seq_along(rand_x), function(i) var(rand_x[1:i]), numeric(1))
vars
# [1] NA 0.9026262 1.4786540 1.2771584 1.7877717 1.6095619
# [7] 1.4483273 1.5653797 1.8121144 1.6192175 1.4821020 3.5005254
# [13] 3.3771453 3.1723564 2.9464537 2.7620001 2.7086317 2.5757641
# [19] 2.4330738 2.4073546 2.4242747 2.3149455 2.3192964 2.2544765
# [25] 3.1333738 3.0343781 3.0354998 2.9230927 2.8226541 2.7258979
# [31] 2.6775278 2.6651541 2.5995346 3.1333880 3.0487177 3.0392603
# [37] 3.0483917 4.0446074 4.0463367 4.0465158 3.9473870 3.8537925
# [43] 3.8461463 3.7848464 3.7505158 3.7048694 3.6953796 3.6605357
# [49] 3.6720684 3.6580296
Data:
n <- 50
set.seed(42)
rand_x <- rf(n, 3, 5)
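Both versions above recompute the variance from scratch for every prefix, which becomes expensive for 10,000 values. A possible alternative, not taken from the answers above, is to build the running variance from cumulative sums. This is a sketch that assumes the textbook sum-of-squares formula is numerically adequate for the data; it can lose precision when the values are large relative to their spread.

```r
set.seed(42)
rand_x <- rf(50, 3, 5)

i  <- seq_along(rand_x)
s1 <- cumsum(rand_x)      # running sum
s2 <- cumsum(rand_x^2)    # running sum of squares

# Sample variance of the first i values; i = 1 gives 0/0, which is NaN.
run_var <- (s2 - s1^2 / i) / (i - 1)
run_var[1] <- NA          # match what var() returns for a single value

# Compare against the direct computation.
reference <- vapply(i, function(k) var(rand_x[1:k]), numeric(1))
all.equal(run_var, reference)
```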

Can't append values to a list in R

I got a list with a weird format:
[[1]]
[1] "Freq.2432.40862794099" "Freq.2792.87280096993" "Freq.2955.16577598796"
[4] "Freq.3161.12982491516" "Freq.3194.19720315405" "Freq.3218.83311568825"
[7] "Freq.3265.37951283662" "Freq.3317.86908506493" "Freq.3900.50408838719"
[10] "Freq.4073.33935633108" "Freq.4302.8830598659" "Freq.4404.80065271461"
[13] "Freq.4469.12305573234" "Freq.4567.90688886175" "Freq.4965.4984006347"
[16] "Freq.5854.45161215455" "Freq.5905.64933878776" "Freq.6175.68130655941"
[19] "Freq.6433.22411185796" "Freq.6631.46775487994" "Freq.6958.20015968149"
[22] "Freq.7469.83422424355" "Freq.8602.43342069553" "Freq.8766.14436081853"
[25] "Freq.8811.22677706485" "Freq.8915.90029255773" "Freq.9131.39810096"
[28] "Freq.9378.82122607608"
Never saw that [[1]] in a list before, and the problem is that I can't append things to this list.
How can I solve this?
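This is a vector stored inside a list: [[1]] marks the first element of the list, and the lines below it show the character vector that element holds. Lists that contain other lists are usually called nested lists.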
a <- c(1,2,3)
b <- c(4,5,6)
list <- list(a,b)
In this code snippet we are creating two vectors and put them into a list. Now you can access the nested vectors/lists using the double brackets. Like so:
list[[1]]
> [1] 1 2 3
Now, if you want to change the value (or append to it), you can use the normal assignment syntax, but assign to the nested object only.
list[[1]] <- c(7,8,9)
list[[1]]
> [1] 7 8 9
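Appending works the same way. A minimal sketch with a made-up list of the same shape as in the question (one character vector inside a list); the names `lst`, "Freq.1" and so on are illustrative only:

```r
# Made-up list holding a single character vector.
lst <- list(c("Freq.1", "Freq.2"))

# Append a value to the vector stored in the first element.
lst[[1]] <- c(lst[[1]], "Freq.3")

# Append a whole new element to the list itself.
lst[[length(lst) + 1]] <- c("other.1", "other.2")

length(lst[[1]])   # 3
length(lst)        # 2
```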

How to remove elements of a list in R?

I have an igraph object, which I created with the igraph library. This object is a list. Some of the components of this list have a length of 2. I would like to remove all of them.
IGRAPH clustering walktrap, groups: 114, mod: 0.79
+ groups:
$`1`
[1] "OTU0041" "OTU0016" "OTU0062"
[4] "OTU1362" "UniRef90_A0A075FHQ0" "UniRef90_A0A075FSE2"
[7] "UniRef90_A0A075FTT8" "UniRef90_A0A075FYU2" "UniRef90_A0A075G543"
[10] "UniRef90_A0A075G6B2" "UniRef90_A0A075GIL8" "UniRef90_A0A075GR85"
[13] "UniRef90_A0A075H910" "UniRef90_A0A075HTF5" "UniRef90_A0A075IFG0"
[16] "UniRef90_A0A0C1R539" "UniRef90_A0A0C1R6X4" "UniRef90_A0A0C1R985"
[19] "UniRef90_A0A0C1RCN7" "UniRef90_A0A0C1RE67" "UniRef90_A0A0C1RFI5"
[22] "UniRef90_A0A0C1RFN8" "UniRef90_A0A0C1RGE0" "UniRef90_A0A0C1RGX0"
[25] "UniRef90_A0A0C1RHM1" "UniRef90_A0A0C1RHR5" "UniRef90_A0A0C1RHZ4"
+ ... omitted several groups/vertices
For example, this one :
> a[[91]]
[1] "OTU0099" "UniRef90_UPI0005B28A7E"
I tried this but it does not work :
a[lapply(a,length)>2]
Any help?
Since you didn't provide any reproducible data or example, I had to produce some dummy data:
# create dummy data
a <- list(x = 1, y = 1:4, z = 1:2)
# remove elements in list with lengths greater than 2:
a[which(lapply(a, length) > 2)] <- NULL
In case you wanted to remove the items with lengths exactly equal to 2 (the question is unclear), then the last line should be replaced by:
a[which(lapply(a, length) == 2)] <- NULL
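An equivalent way to express the same idea is to keep the elements you want instead of deleting the others. The sketch below uses base R's `lengths()` on the same dummy data and assumes `a` is (or has been converted to) a plain list:

```r
# Same dummy data as above.
a <- list(x = 1, y = 1:4, z = 1:2)

# Keep every element whose length is not exactly 2.
kept <- a[lengths(a) != 2]
names(kept)   # "x" "y"
```

Because this builds a new list, the original `a` is left unchanged.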

Detecting `package_name::function_name()` with static code analysis

I am trying to dive into the internals of static code analysis packages like codetools and CodeDepends, and my immediate goal is to understand how to detect function calls written as package_name::function_name() or package_name:::function_name(). I would have liked to just use findGlobals() from codetools, but this is not so simple.
Example function to analyze:
f <- function(n){
tmp <- digest::digest(n)
stats::rnorm(n)
}
Desired functionality:
analyze_function(f)
## [1] "digest::digest" "stats::rnorm"
Attempt with codetools:
library(codetools)
f = function(n) stats::rnorm(n)
findGlobals(f, merge = FALSE)
## $functions
## [1] "::"
##
## $variables
## character(0)
CodeDepends comes closer, but I am not sure I can always use the output to match functions to packages. I am looking for an automatic rule that connects rnorm() to stats and digest() to digest.
library(CodeDepends)
getInputs(body(f))
## An object of class "ScriptNodeInfo"
## Slot "files":
## character(0)
##
## Slot "strings":
## character(0)
##
## Slot "libraries":
## [1] "digest" "stats"
##
## Slot "inputs":
## [1] "n"
##
## Slot "outputs":
## [1] "tmp"
##
## Slot "updates":
## character(0)
##
## Slot "functions":
## { :: digest rnorm
## NA NA NA NA
##
## Slot "removes":
## character(0)
##
## Slot "nsevalVars":
## character(0)
##
## Slot "sideEffects":
## character(0)
##
## Slot "code":
## {
## tmp <- digest::digest(n)
## stats::rnorm(n)
## }
EDIT To be fair to CodeDepends, there is so much customizability and power for those who understand the internals. At the moment, I am just trying to wrap my head around collectors, handlers, walkers, etc. Apparently, it is possible to modify the standard :: collector to make special note of each namespaced call. For now, here is a naive attempt at something similar.
col <- inputCollector(`::` = function(e, collector, ...){
collector$call(paste0(e[[2]], "::", e[[3]]))
})
getInputs(quote(stats::rnorm(x)), collector = col)@functions
## stats::rnorm rnorm
## NA NA
If you want to extract namespaced functions from a function, try something like this
find_ns_functions <- function(f, found=c()) {
if( is.function(f) ) {
# function, begin search on body
return(find_ns_functions(body(f), found))
} else if (is.call(f) && deparse(f[[1]]) %in% c("::", ":::")) {
found <- c(found, deparse(f))
} else if (is.recursive(f)) {
# compound object, iterate through sub-parts
v <- lapply(as.list(f), find_ns_functions, found)
found <- unique( c(found, unlist(v) ))
}
found
}
And we can test with
f <- function(n){
tmp <- digest::digest(n)
stats::rnorm(n)
}
find_ns_functions(f)
# [1] "digest::digest" "stats::rnorm"
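If the package and the function name are needed separately, the same kind of walk over the expression tree can return them as a table. This is only a sketch in base R: `find_ns_pairs` is a hypothetical name, and it assumes the code contains no calls with empty arguments (such as `x[, 1]`), which this simple recursion does not handle.

```r
# Hypothetical helper: one row per pkg::fun or pkg:::fun reference.
find_ns_pairs <- function(x) {
  if (is.function(x)) x <- body(x)
  if (!is.call(x)) return(NULL)          # symbols and constants: nothing to find
  head <- x[[1]]
  if (is.symbol(head) && as.character(head) %in% c("::", ":::")) {
    return(data.frame(pkg = as.character(x[[2]]),
                      fun = as.character(x[[3]]),
                      stringsAsFactors = FALSE))
  }
  # Any other call: search all of its parts and combine the results.
  parts <- lapply(as.list(x), find_ns_pairs)
  unique(do.call(rbind, parts))
}

f <- function(n){
  tmp <- digest::digest(n)
  stats::rnorm(n)
}
res <- find_ns_pairs(f)
res$pkg   # "digest" "stats"
res$fun   # "digest" "rnorm"
```

The function body is only inspected, never evaluated, so the packages it mentions do not need to be installed.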
Ok, so this was possible with CodeDepends previously, but a bit harder than it should have been. I've just committed version 0.5-4 to github, which now makes this really "easy". Essentially you just need to modify the default colonshandlers ("::" and/or ":::") as follows:
library(CodeDepends) # version >= 0.5-4
handler = function(e, collector, ..., iscall = FALSE) {
collector$library(asVarName(e[[2]]))
## :: or ::: name, remove if you don't want to count those as functions called
collector$call(asVarName(e[[1]]))
if(iscall)
collector$call(deparse((e))) #whole expr ie stats::rnorm
else
collector$vars(deparse((e)), input=TRUE) #whole expr ie stats::rnorm
}
getInputs(quote(stats::rnorm(x,y,z)), collector = inputCollector("::" = handler))
getInputs(quote(lapply( 1:10, stats::rnorm)), collector = inputCollector("::" = handler))
The first getInputs call above gives the result:
An object of class "ScriptNodeInfo"
Slot "files":
character(0)
Slot "strings":
character(0)
Slot "libraries":
[1] "stats"
Slot "inputs":
[1] "x" "y" "z"
Slot "outputs":
character(0)
Slot "updates":
character(0)
Slot "functions":
:: stats::rnorm
NA NA
Slot "removes":
character(0)
Slot "nsevalVars":
character(0)
Slot "sideEffects":
character(0)
Slot "code":
stats::rnorm(x, y, z)
As, I believe, desired.
One thing to note here is the iscall argument I've added to the colons handler. The default handler and applyhandlerfactory now have special logic so that when they invoke one of the colons handlers in a situation where it is a function being called, that is set to TRUE.
I haven't done extensive testing yet of what will happen when "stats::rnorm" appears in lieu of symbols, particularly in the inputs slot when calculating dependencies, but I'm hopeful that should all continue to work as well. If it doesn't let me know.
~G

R - get values from multiple variables in the environment

I have some variables in my current R environment:
ls()
[1] "clt.list" "commands.list" "dirs.list" "eq" "hurs.list" "mlist" "prec.list" "temp.list" "vars"
[10] "vars.list" "wind.list"
where each one of the variables "clt.list", "hurs.list", "prec.list", "temp.list" and "wind.list" is a (huge) list of strings.
For example:
clt.list[1:20]
[1] "clt_Amon_ACCESS1-0_historical_r1i1p1_185001-200512.nc" "clt_Amon_ACCESS1-3_historical_r1i1p1_185001-200512.nc"
[3] "clt_Amon_bcc-csm1-1_historical_r1i1p1_185001-201212.nc" "clt_Amon_bcc-csm1-1-m_historical_r1i1p1_185001-201212.nc"
[5] "clt_Amon_BNU-ESM_historical_r1i1p1_185001-200512.nc" "clt_Amon_CanESM2_historical_r1i1p1_185001-200512.nc"
[7] "clt_Amon_CCSM4_historical_r1i1p1_185001-200512.nc" "clt_Amon_CESM1-BGC_historical_r1i1p1_185001-200512.nc"
[9] "clt_Amon_CESM1-CAM5_historical_r1i1p1_185001-200512.nc" "clt_Amon_CESM1-CAM5-1-FV2_historical_r1i1p1_185001-200512.nc"
[11] "clt_Amon_CESM1-FASTCHEM_historical_r1i1p1_185001-200512.nc" "clt_Amon_CESM1-WACCM_historical_r1i1p1_185001-200512.nc"
[13] "clt_Amon_CMCC-CESM_historical_r1i1p1_190001-190412.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_190001-200512.nc"
[15] "clt_Amon_CMCC-CESM_historical_r1i1p1_190501-190912.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_191001-191412.nc"
[17] "clt_Amon_CMCC-CESM_historical_r1i1p1_191501-191912.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_192001-192412.nc"
[19] "clt_Amon_CMCC-CESM_historical_r1i1p1_192501-192912.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_193001-193412.nc"
What I need to do is extract the subset of the string that is between "Amon_" and "_historical".
I can do this for a single variable, as shown here:
levels(as.factor(sub(".*?Amon_(.*?)_historical.*", "\\1", clt.list[1:20])))
[1] "ACCESS1-0" "ACCESS1-3" "bcc-csm1-1" "bcc-csm1-1-m" "BNU-ESM" "CanESM2" "CCSM4"
[8] "CESM1-BGC" "CESM1-CAM5" "CESM1-CAM5-1-FV2" "CESM1-FASTCHEM" "CESM1-WACCM" "CMCC-CESM"
However, what I'd like to do is to run the command above for all five variables at once. Instead of using just "clt.list" as the argument in the command above, I'd like to use all the variables "clt.list", "hurs.list", "prec.list", "temp.list" and "wind.list" at once.
How can I do that?
Many thanks in advance!
You can put your operation into a function and then iterate over it:
get_my_substr <- function(vecname)
levels(as.factor(sub(".*?Amon_(.*?)_historical.*", "\\1", get(vecname))))
lapply(my_vecnames,get_my_substr)
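lapply acts like a loop. You can create your vector of variable names beforehand with
my_vecnames <- ls(pattern="\\.list$")
Note that this pattern matches every variable whose name ends in .list (here also commands.list, dirs.list and vars.list), so write out the five names explicitly if only those are wanted.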
It is generally good practice to post a reproducible example in your question. Since none was provided here, I tested this approach with...
# example-maker
prestr <- "grr_Amon_"
posstr <- "_historical_zzz"
make_ex <- function()
replicate(
sample(10,1),
paste0(prestr,paste0(sample(LETTERS,sample(5,1)),collapse=""),posstr)
)
# make a couple examples
set.seed(1)
m01 <- make_ex()
m02 <- make_ex()
# test result
lapply(ls(pattern="^m[0-9][0-9]$"),get_my_substr)
One solution would be to create a vector containing the names of the variables that you want to extract the data from, for example:
var.names <- c("clt.list", "commands.list", "dirs.list")
Then to access the value of each variable from the name:
for (var.name in var.names) {
var.value <- get(var.name) # look the variable up by its name
# Do something with var.value
}
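As a further option, base R's `mget` fetches several variables by name in one call and returns them as a named list, which keeps each result labelled with its source variable. A minimal sketch with two short made-up vectors standing in for the real (huge) ones; the "hurs" file names are invented for illustration:

```r
# Made-up stand-ins for two of the variables in the question.
clt.list  <- c("clt_Amon_ACCESS1-0_historical_r1i1p1_185001-200512.nc",
               "clt_Amon_CanESM2_historical_r1i1p1_185001-200512.nc")
hurs.list <- c("hurs_Amon_CCSM4_historical_r1i1p1_185001-200512.nc",
               "hurs_Amon_CCSM4_historical_r1i1p1_185001-200512.nc")

var_names <- c("clt.list", "hurs.list")

# Fetch all variables at once as a named list.
vecs <- mget(var_names, envir = environment())

# Apply the extraction from the question to each vector.
models <- lapply(vecs, function(v)
  levels(as.factor(sub(".*?Amon_(.*?)_historical.*", "\\1", v))))

models$clt.list    # "ACCESS1-0" "CanESM2"
models$hurs.list   # "CCSM4"
```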
