This question may look like a duplicate of existing SO questions, but I'm asking anyway because the existing answers did not work for me and I'm not sure how to phrase it better: I am new to R and don't entirely grasp the intricacies of its data types.
Time for a minimal example. I am looking for a transformation of target such that targetObject is exactly equal to referenceObject.
reference = '{"airport":[{"name":"brussels","loc":{"lat":"1","lon":"2"}}],"parking":[{"name":"P1"}]}'
target = '{"airport":{"name":"brussels","loc":{"lat":"1","lon":"2"}},"parking":{"name":"P1"}}'
referenceObject = jsonlite::fromJSON(reference)$airport
x = jsonlite::fromJSON(target)$airport
# Transformation
targetObject = do.call(rbind.data.frame, x)
# Currently prints FALSE, should become TRUE
results_same = identical(referenceObject, targetObject)
print(results_same)
I would expect this to be very simple in any language, but R seems to handle the nested loc lists very differently depending on the shape of the outer object airport.
I managed to find a solution by serializing back to JSON. It's not elegant but at least it works.
# Transformation
targetObject = jsonlite::fromJSON(jsonlite::toJSON(list(x), auto_unbox = TRUE))
For now I will not mark this answer as correct because it's more of a workaround than an idiomatic solution.
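To see why the two shapes behave differently, it can help to build them by hand in base R. The sketch below rests on an assumption about what jsonlite's default simplification roughly produces: a JSON array of objects becomes a data frame whose loc column is itself a data frame, while a single JSON object becomes a plain nested list. The names ref and tgt are made up for illustration, and no claim is made that ref is identical to the real referenceObject.

```r
# Hand-built stand-in for the array-of-objects case:
# a one-row data frame with a nested data frame as its "loc" column.
ref <- data.frame(name = "brussels", stringsAsFactors = FALSE)
ref$loc <- data.frame(lat = "1", lon = "2", stringsAsFactors = FALSE)

# Hand-built stand-in for the single-object case: a plain nested list.
tgt <- list(name = "brussels", loc = list(lat = "1", lon = "2"))

str(ref)                 # data frame containing a data frame
str(tgt)                 # list containing a list
is.data.frame(ref$loc)   # TRUE
is.data.frame(tgt)       # FALSE
```

Seen this way, the transformation has to rebuild the nested data frame column, which is something a plain rbind over the list elements does not do.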
Recently I encountered the following problem in my R code. In a function, accepting a data frame as an argument, I needed to add (or replace, if it exists) a column with data calculated based on values of the data frame's original column. I wrote the code, but the testing revealed that data frame extract/replace operations, which I've used, resulted in a loss of the object's special (user-defined) attributes.
After realizing that and confirming that behavior by reading R documentation (http://stat.ethz.ch/R-manual/R-patched/library/base/html/Extract.html), I decided to solve the problem very simply - by saving the attributes before the extract/replace operations and restoring them thereafter:
myTransformationFunction <- function(data) {
  # save the object's attributes
  attrs <- attributes(data)
  # <data frame transformations; involves extract/replace operations on `data`>
  # restore the attributes
  attributes(data) <- attrs
  return(data)
}
This approach worked. However, accidentally, I ran across another piece of R documentation (http://stat.ethz.ch/R-manual/R-patched/library/base/html/Extract.data.frame.html), which offers IMHO an interesting (and, potentially, a more generic?) alternative approach to solving the same problem:
## keeping special attributes: use a class with a
## "as.data.frame" and "[" method:
as.data.frame.avector <- as.data.frame.vector
`[.avector` <- function(x,i,...) {
r <- NextMethod("[")
mostattributes(r) <- attributes(x)
r
}
d <- data.frame(i = 0:7, f = gl(2,4),
u = structure(11:18, unit = "kg", class = "avector"))
str(d[2:4, -1]) # 'u' keeps its "unit"
I would really appreciate if people here could help by:
Comparing the two above-mentioned approaches, if they are comparable (I realize that the second approach as defined is for data frames, but I suspect it can be generalized to any object);
Explaining the syntax and meaning in the function definition in the second approach, especially as.data.frame.avector, as well as what is the purpose of the line as.data.frame.avector <- as.data.frame.vector.
I'm answering my own question, since I have just found an SO question (How to delete a row from a data.frame without losing the attributes), answers to which cover most of my questions posed above. However, additional explanations (for R beginners) for the second approach would still be appreciated.
UPDATE:
Another solution to this problem has been proposed in an answer to the following SO question: indexing operation removes attributes. Personally, however, I prefer the approach based on creating a new class, as it's IMHO semantically cleaner.
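For R beginners, here is a minimal sketch of both approaches applied to a plain vector instead of a data frame (that simplification is an assumption made for brevity). The `[.avector` definition is the one from the documentation example. The line as.data.frame.avector <- as.data.frame.vector from that example appears to exist so that data.frame() knows how to store a column of class avector, by reusing the method for ordinary vectors. It is not needed here because no data frame is built.

```r
# Plain subsetting of a vector drops custom attributes.
x <- structure(11:18, unit = "kg")
attributes(x[2:4])                   # NULL: "unit" is gone

# Approach 1: save the attributes, subset, then restore them by hand.
saved <- attributes(x)
y <- x[2:4]
attributes(y) <- saved
attr(y, "unit")                      # "kg"

# Approach 2: give the vector a class and define a "[" method for it.
# R calls `[.avector` automatically whenever an avector is subset.
`[.avector` <- function(x, i, ...) {
  r <- NextMethod("[")               # perform the ordinary subsetting
  mostattributes(r) <- attributes(x) # copy back the attributes that still fit
  r
}

u <- structure(11:18, unit = "kg", class = "avector")
v <- u[2:4]
attr(v, "unit")                      # "kg", with no manual restore needed
```

The first approach has to be repeated around every operation that drops attributes, while the second attaches the behaviour to the object's class once.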
I've been trying to learn more about environments in R. Through reading, it seemed that I should be able to use functions like with() and transform() to modify variables in a data.frame as if I was operating within that object's environment. So, I thought the following might work:
X <- expand.grid(
Cond=c("baseline","perceptual","semantic"),
Age=c("child","adult"),
Gender=c("male","female")
)
Z <- transform(X,
contrasts(Cond) <- cbind(c(1,0,-1)/2, c(1,-2,1))/4,
contrasts(Age) <- cbind(c(-1,1)/2),
contrasts(Gender) <- cbind(c(-1,1)/2)
)
str(Z)
contrasts(Z$Cond)
But it does not. I was hoping someone could explain why. Of course, I understand that contrasts(X$Cond) <- ... would work, but I'm curious about why this does not.
In fact, this does not work either [EDIT: false, this does work. I tried this quickly before posting originally and did something wrong]:
attach(X)
contrasts(Cond) <- cbind(c(1,0,-1)/2, c(1,-2,1))/4
contrasts(Age) <- cbind(c(-1,1)/2)
contrasts(Gender) <- cbind(c(-1,1)/2)
detach(X)
I apologize if this is a "RTFM" sort of thing... it's not that I haven't looked. I just don't understand. Thank you!
[EDIT: Thank you joran---within() instead of with() or transform() does the trick! The following syntax worked.]
Z <- within(X, {
contrasts(Cond) <- ...
contrasts(Age) <- ...
contrasts(Gender) <- ...
}
)
transform is definitely the wrong tool, I think. And you don't want with, you probably want within, in order to return the entire object:
X <- within(X, {
  contrasts(Cond) <- cbind(c(1,0,-1)/2, c(1,-2,1))/4
  contrasts(Age) <- cbind(c(-1,1)/2)
  contrasts(Gender) <- cbind(c(-1,1)/2)
})
The only tricky part here is to remember the curly braces to enclose multiple lines in a single expression.
Your last example, using attach, works just fine for me.
transform is only set up to evaluate expressions of the form tag = value, and because of the way it evaluates those expressions, it isn't really set up to modify attributes of a column. It is more intended for direct modifications to the columns themselves. (Scaling, taking the log, etc.)
The difference between with and within is nicely summed up by the Value section of ?within:
Value: For with, the value of the evaluated expr. For within, the modified object.
So with only returns the result of the expression. within is for modifying an object and returning the whole thing.
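A small sketch of that difference, using a made-up data frame df (not the X from the question):

```r
df <- data.frame(a = 1:3, b = c(2, 4, 6))

# with() evaluates the expression and returns only its value.
s <- with(df, sum(a * b))
s                       # 28

# within() evaluates the block and returns a modified copy of df.
df2 <- within(df, {
  total <- a + b
})
names(df2)              # "a" "b" "total"
names(df)               # "a" "b": the original is unchanged
```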
While I agree with @joran that within is the best strategy here, I will point out that it is possible to use transform; you just need to do so in a different way:
Z <- transform(X,
Cond = `contrasts<-`(Cond, value=cbind(c(1,0,-1)/2, c(1,-2,1))/4),
Age = `contrasts<-`(Age, value=cbind(c(-1,1)/2)),
Gender= `contrasts<-`(Gender, value=cbind(c(-1,1)/2))
)
Here we are explicitly calling the magic function that is used when you run contrasts(a)=b. This actually returns a value that can be used with the a=b format that transform expects. And of course it leaves X unchanged.
The within solution looks much cleaner of course.
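To illustrate what calling the replacement function explicitly means, here is a minimal sketch with `names<-` on a vector and `contrasts<-` on a small made-up factor f. A statement such as names(x) <- value is shorthand for calling the function `names<-` and assigning its result back to x.

```r
x <- c(1, 2, 3)
y <- x

# The usual replacement syntax ...
names(x) <- c("a", "b", "c")

# ... and the equivalent explicit call to the replacement function.
y <- `names<-`(y, value = c("a", "b", "c"))
identical(x, y)                       # TRUE

# The same idea with contrasts: the call returns a modified copy of f,
# which is the kind of value that transform() can use as tag = value.
f <- factor(c("lo", "hi"))
g <- `contrasts<-`(f, value = cbind(c(-1, 1) / 2))
as.vector(contrasts(g))               # -0.5  0.5
```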
I have a variable a=0.01
I then create a matrix b<-matrix(data=NA,ncol=2,nrow=9)
I would like to rename this matrix by adding the value stored in a to its name.
The results should be b_0.01
I bet there are more elegant ways to achieve what you need, but this seems to work:
assign(x = paste("b", a, sep = "_"), value = b)
Edit following @Roland's comment:
rm(b)
Please note that I address your question in a narrow sense. As pointed out by both @Roland and @Paul Hiemstra, there may be more general aspects of the workflow that could be fruitful to consider as well.
You can use assign to get this done:
a = 0.01
b = matrix(data=NA,ncol=2,nrow=9)
assign(sprintf('b_%s', a), b)
b_0.01
In general, I would avoid creating data objects like this. Instead, I would use lists to create, store and manipulate groups of objects.
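A minimal sketch of the list-based alternative, where the list name mats is made up for illustration:

```r
a <- 0.01
b <- matrix(data = NA, ncol = 2, nrow = 9)

# Store the matrix in a named list instead of creating a new variable.
mats <- list()
mats[[paste("b", a, sep = "_")]] <- b

names(mats)             # "b_0.01"
dim(mats[["b_0.01"]])   # 9 2
```

Further matrices for other values of a can then be added under their own names and processed together, for example with lapply().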
Are there concise (yet fairly thorough) tutorials to get someone who is used to working in MATLAB up to speed with writing R code?
Here is one particular issue I have in mind: From my limited experience with the R documentation and tutorials, I am left with a lot of confusion regarding datatypes in R and how to manipulate them. For example, what is a vector, matrix, list, data frame, etc and how do they relate. I haven't found a source which explains the basic data types clearly, to the point that I am wondering if the language is ambiguous by design.
It's always difficult to learn a new programming language when you are primarily familiar with one that works differently, because you expect to think through a problem the way you always have, and these incorrect expectations cause problems. It would be very difficult to have an introductory guide that is appropriate for students coming from each of the other languages ('you're going to think you should do X, but in R, you should do Y'). However, I can assure you that R was not designed to be ambiguous.
Mostly, you are simply going to have to get an introductory guide and plod through it. At first, it will be a lot of work, and frustrating, but that's the only way. In the end, it will get easier. Perhaps I can tell you a couple of things to jumpstart the process:
a list is just an ordered set of elements. This can be of any length, and contain any old type of thing. For example, x <- list(5, "word", TRUE).
a vector is also an ordered set of elements. Although it can be of any length, the elements must all be of the same type. For example, x <- c(3,5,4), x <- c("letter", "word", "a phrase"), x <- c(TRUE, FALSE, FALSE, TRUE).
a matrix can be thought of as a vector of vectors, where all component vectors are of the same length and type (internally, R stores it as a single vector with a dim attribute). For example, x <- matrix(c("a", "b", "c", "d"), ncol=2).
a data.frame is a list of vectors, where all component vectors are of the same length, but do NOT have to be of the same type. For example, x <- data.frame(category=c("blue", "green"), amount=c(5, 30), condition.met=c(TRUE, FALSE)).
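The four structures above can be created and inspected side by side. This is only a sketch, and the variable names are arbitrary:

```r
l <- list(5, "word", TRUE)                      # mixed types are fine
v <- c(3, 5, 4)                                 # one type only
m <- matrix(c("a", "b", "c", "d"), ncol = 2)    # one type, two dimensions
d <- data.frame(category = c("blue", "green"),
                amount = c(5, 30),
                condition.met = c(TRUE, FALSE))

# A vector silently coerces its elements to a common type.
mixed <- c(1, "a")
is.character(mixed)     # TRUE

# A data frame really is a list of equal-length columns.
is.list(d)              # TRUE
length(d)               # 3 (one element per column)
dim(m)                  # 2 2

str(l)
str(d)
```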
(response to comments:)
The function ?c is for concatenation; c(c("a", "b"), c("c", "d")), will not create a matrix, but a longer vector from two shorter vectors. The function ?cbind (to bind columns together), or rbind() (to bind rows together), will create a matrix.
I don't know of a single function that will output the type of any object. The closest thing is probably ?class, but this will sometimes give, e.g., "integer", where I think you want "vector". There are also mode(), and typeof(), which are related, but aren't quite what you're looking for. Find out more about the distinctions among these here and here. To check whether an object is a specific type you can use is.<specific type>(), e.g., ?is.vector.
To coerce (i.e., 'cast') an object to a specific type, you can use as.vector(), but this will only work if the conditions (e.g., noted above) are met.
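A short sketch of the functions mentioned in these comments, with made-up example values:

```r
# c() concatenates; cbind() and rbind() build a matrix.
long <- c(c("a", "b"), c("c", "d"))
length(long)                  # 4
m <- cbind(c("a", "b"), c("c", "d"))
is.matrix(m)                  # TRUE

# class(), typeof() and mode() describe an object in related ways.
x <- 1:3
class(x)                      # "integer"
typeof(x)                     # "integer"
mode(x)                       # "numeric"
is.vector(x)                  # TRUE

# as.<type>() functions coerce when the conversion makes sense.
as.vector(m)                  # "a" "b" "c" "d"
as.numeric(c("1", "2.5"))     # 1.0 2.5
```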
I have an issue adding properties to nodes in igraph, working with R. I made a text list named journals.txt and I want to give the nodes of my graph a property from it. With other textual or numeric lists I had absolutely no issues, but with this one I do.
With the code below I read the txt file, take just the first column (although there is only one), and convert it to character. I also tried without the conversion, and it doesn't work either.
journalList = read.csv("c:/temp/biblioCoupling/journals.txt", header=FALSE)
journalLR = (journalList[1:303,1])
journalLR = as.character(journalLR)
V(g)$journalName = journalLR
Then, when I save the file with
write.graph(gr,"filename.gml",format=c("gml"), creator="Claudio Biscaro")
I see all the other properties I added to the nodes, but not this one!
Could it be because some entries in journalLR are more than 15 characters long?
I have absolutely no idea why this doesn't work.
Your code is not reproducible, so it is impossible to tell for sure, but my guess is that V(g)$journalName is a complex attribute, i.e. it is not a vector of values, but a list of values.
To check, you can do str(g) and then look at the code letter after the journalName attribute. If it is x, then it is complex, if it is c, then it is character.
If this is the problem and you don't really need a list, then the workaround is to do
g <- remove.vertex.attribute(g, "journalName")
V(g)$journalName <- journalLR
Solved, after a long time trying, by adding the values one at a time. That was weird.
for (i in seq_along(journalLR)) {
  V(g)[i]$journalName = journalLR[i]
}
It is probably not a formally good solution, but it works!