Analysis by subset does not work [duplicate]

I tried using a for loop to print out a few rows. Here is the code.
The weird thing is that it doesn't work with the head() function, but it does work if I replace head() with print().
kw_id=c('a','b')
keyword_text=data.frame(col=c('a','b'), col2=c(1,2), row.names=c('r1','r2'))
for (i in 1:2) {
  plot_data<-subset(keyword_text,col==kw_id[i])
  print(plot_data)
  head(plot_data)
}
Could someone help? I suspect it has something to do with the head() function.

This is a relatively common class of problem that newcomers to R run into. The issue here is that R serves two mistresses: interactive console work and "true programming".
When you type a command at the console that returns a value, the console automatically calls a print method in order to display the results. When running a script, this doesn't happen unless you tell it to.
So if you change it to print(head(plot_data)), it should work.
These issues are discussed in R FAQ 7.16 and 7.22.
Addendum lifted from the comments:
As Josh points out, copying and pasting the for loop directly into the console also fails to print any output. What's going on in that case is that for (like almost everything in R) is actually a function, and its return value (NULL) is returned invisibly, which means no printing. (This is mentioned in ?Control.)
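A minimal sketch of the difference, reusing the data from the question. capture.output() is used here only to count the lines each loop produces; it is not part of the fix itself.

```r
kw_id <- c("a", "b")
keyword_text <- data.frame(col = c("a", "b"), col2 = c(1, 2),
                           row.names = c("r1", "r2"))

# Without print(): head() returns a value, but values computed inside
# a for loop are never auto-printed, so nothing is displayed.
silent <- capture.output(
  for (i in 1:2) head(subset(keyword_text, col == kw_id[i]))
)

# With an explicit print(): each one-row subset is displayed.
shown <- capture.output(
  for (i in 1:2) print(head(subset(keyword_text, col == kw_id[i])))
)

length(silent)  # 0 lines of output
length(shown)   # a header line plus one row for each of the two subsets
```

The same applies to a script run with Rscript or source(): wrap anything you want displayed in print().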

Related

Why to use print() every time inside function in r

I have a function that has nested functions. I call the sub-functions inside the main function body, but the problem is that functions such as head() or ggplot() don't print to the command line. Is there any option in R to let those functions print without wrapping them in print()?
x <- function(sample_dataframe){
  y <- function(df){
    head(df)
    # do more stuff on the dataframe ..
    return(df)
  }
  y(sample_dataframe)
}
x(mtcars)  # any data frame will do; mtcars is a built-in example

No, there isn't. The functions head() and ggplot() never print anything. They just return an object, and R decides whether to print that object or not.
The rule in R is that objects returned at the top level will print (unless they are marked as "invisible"). There's no option to make things automatically print in other circumstances.
The philosophy behind this difference is that R is intended to be used both interactively and programmatically. In a program, if you want something to print, you should call a function to print it. Weird automatic actions just cause trouble. However, this is inconvenient if you are using R more like a calculator than a programming platform. If you want to know the value of 123 times 456, it's a lot easier to type 123*456 than to type print(123*456), so the original interactive console would automatically print things unless you asked it not to do so.
In the years since that decision, things like R Markdown documents that blur the line between programming platform and interactive calculator have come along.
You find the programmatic requirement to specify your actions inconvenient: you'd like 123*456 in a program to print its result. The trouble with that behaviour is that there are functions in R that are called for their side effects (deleting files, opening graphics devices, etc.) and some of them return information that will print when they are called interactively, but will be ignored if they are called in a program. It is already inconvenient to have to use invisible() to suppress printing of function results; it would be even more inconvenient to have to use invisible() on every function call where you didn't save the result in a variable.
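The visibility rule described above can be inspected with withVisible(), which reports whether a value would be auto-printed at top level. This is only an illustrative sketch; f and g are hypothetical helper names.

```r
# f returns its value visibly; g marks its value as invisible.
f <- function() 123 * 456
g <- function() invisible(123 * 456)

withVisible(f())$visible       # TRUE: f() would auto-print at top level
withVisible(g())$visible       # FALSE: g() would not auto-print
withVisible(x <- f())$visible  # FALSE: an assignment returns invisibly

# Inside a function or loop the visibility flag is not consulted at all:
# only an explicit print() (or cat(), message(), ...) displays anything.
```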

How to synchronise output in Julia language?

I'm trying to loop in a cycle, output each value and then print the result of some other function.
But the output looks strange, the output of the loop mixed with the output of the other function.
Is there any way to synchronise it? I'm using Jupyter, not Julia console.

As Gnimuc K. pointed out in comments:
This has already been fixed here, but the fix is not released yet. You should work on master via Pkg.checkout("IJulia").

Is function(){} a true quine?

After poking around on the internet I wasn't able to find anyone who had written a quine in R (Edit: since writing this, have found several on SO, but am still interested in this one). So I figured I'd try my hand at coming up with one myself. My result was the (surprisingly short) code:
function(){}
which will output function(){} when run. This takes advantage of the fact that a function name without parens or arguments after it will return the function's source code.
However, a program that "looks at itself" is not generally considered a true quine. There are two things I realized I don't understand in the course of trying to decide whether I'd written a "real" quine: (1) What constitutes a program "looking at itself" (from a quine standpoint) beyond use of file i/o and (2) the extent to which function(){} (or similar commands like logical(0)) are self referential when they print themselves. The former seems to be too subjective for SO, but I was hoping for some clarification on the latter. So...
When I run function(){}, what exactly is happening that causes it to print its own "source code"? For example, is R loading an empty function into a local environment, evaluating that function, and then looking back at the code that defined it to print? Or, is it just looking at function(){} and echoing its definition right away? Is there a fundamental difference between this and
f<-function(){cat("f<-");print(f);cat("f()")}
f()
in terms of how they both print themselves when run?

You don't completely get what's going on here. Actually, the code
function(){}
doesn't do anything apart from constructing a function without arguments or body, returning it, and discarding it immediately after it's returned. Calling that function would return NULL, so it doesn't "recreate itself".
The output you see in the console is not the output given by function(){} but by print.function. This is the S3 method that takes care of showing a function object in the console. What you actually do is:
a <- function(){}
print(a)
rm(a)
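To check this yourself, here is a small sketch using withVisible(), which evaluates an expression and reports both its value and whether the console would auto-print it. The variable name obj is just for illustration.

```r
obj <- withVisible(function(){})

is.function(obj$value)  # TRUE: the expression evaluates to a function object
obj$visible             # TRUE: so the console hands it to print()
is.null(obj$value())    # TRUE: calling the empty function returns NULL

# The text you see at the console is produced by printing that object.
# With options(keep.source = TRUE), the default in interactive sessions,
# R stores the original source text with the function and shows that;
# otherwise the function is deparsed.
```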
A true R quine would be something like this:
m<-"m<-0;cat(sub(0,deparse(m),m))";cat(sub(0,deparse(m),m))
See also Wikipedia for this and other examples

This is not a true quine, as it does not print anything to stdout. The whole point of a quine is that it can reproduce itself by printing. The program must create a new file, or write output to stdout, containing its exact code.
An example of a JavaScript quine would be:
(function a(){console.log(`(${a}())`)}())

(function(x) print(substitute(x(x))))(function(x) print(substitute(x(x))))

Disabling output has no effect

I noticed that, under circumstances I don't understand, some functions, whether from base R (for example, gc()) or from external packages (for example, getCurlHandle() from RCurl), still produce output even after I explicitly disable it via verbose = FALSE. I am curious about the reasons for this behavior. The only workaround I found on SO is the recommendation to call invisible(), but that worked for me only with gc(), not with getCurlHandle(). I would appreciate any comments and answers.

The command gc(verbose=TRUE):
prints some statistics and percentages,
AND prints the matrix that is returned by the function.
The command x=gc(verbose=TRUE) only prints the statistics.
The command gc(verbose=FALSE) only prints the returned matrix.
The command x=gc(verbose=FALSE) prints nothing.
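A short sketch of the verbose = FALSE cases, again using capture.output() only to count what would have been displayed at top level:

```r
# Result wrapped in invisible(): nothing is displayed.
out_invisible <- capture.output(invisible(gc(verbose = FALSE)))

# Result left visible: the returned matrix is auto-printed.
out_visible <- capture.output(gc(verbose = FALSE))

length(out_invisible)    # 0
length(out_visible) > 0  # TRUE
```

In other words, verbose controls the extra statistics, while the matrix appears only because the return value is auto-printed at top level.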

While preparing a reproducible example, I figured this out. The source of the unwanted output was not getCurlHandle(). It was being produced by the next function called: curlSetOpt(). I disabled its output by wrapping the call in invisible().
It was really not bad compared with the effort it took to figure out my previous R and RCurl problem. But it is always fun and educational.

Switch R script from non-interactive to interactive

I have an R script that takes command-line arguments, where the top line is:
#!/usr/bin/Rscript --slave
I wanted to interrupt execution in a function (so I can interactively use the data variables that have been loaded by that point to work out the next bit of code I need to write). I added this inside the function in question:
browser()
but it gets ignored. A bit of searching suggests it might be because the program is running in non-interactive mode. But even more searching has not tracked down how I switch the script out of non-interactive mode so that browser() will work. Something like a browser_yes_I_really_mean_it() function.
P.S. I want to avoid altering the rest of the script if at all possible. My current approach is to copy and paste the code chunks, needed to prepare the data, into an interactive session; but as the script gets more and more complex this is getting more and more unreasonable.
UPDATE: for anyone else with the same question, it appears the answer to the actual question is that it is impossible. Once you start R in a non-interactive mode the die is cast. The given answers are therefore workarounds: either you hack your code (remembering to unhack it afterwards), or you refactor to make debugging easier. (This comment is not intended as a criticism of the answers; the suggested refactoring makes the code cleaner anyway.)

Can you just fire up R and source the file instead?
R
source("script.R")

Following mdsumner's answer, I edited my script like this:
if(!exists("argv")){
argv=commandArgs(TRUE)
if(length(argv)!=4)usage_and_exit()
}else{
if(length(argv)!=4){
stop("Must set argv as a 4 element vector. E.g. argv=c(...)")
}
}
Then no other change was needed, and I was able to do:
R
> argv=c('a','b','c','d')
> source("script.R")

In addition to the previous answer, I'd create a toplevel function (e.g. doStuff) which performs the analysis you want to perform in batch. The function takes the cmd line options as input. In the batch script you source the script that contains this function and call it. In this way you can easily run the function in interactive mode and use e.g. browser().
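A rough sketch of that layout, assuming the script expects four arguments as in the question. doStuff is the name suggested above; the paste() call is only a placeholder for the real analysis, and the usage message is hypothetical.

```r
# script.R: all of the work lives in one top-level function.
doStuff <- function(argv) {
  if (length(argv) != 4) {
    stop("Expected 4 arguments, got ", length(argv))
  }
  # browser()  # uncomment when calling doStuff() from an interactive session
  paste(argv, collapse = "-")  # placeholder for the real analysis
}

# Batch mode (e.g. Rscript script.R a b c d): take the arguments from the
# command line. When the file is source()d interactively, this block is
# skipped and doStuff() can be called by hand.
if (!interactive()) {
  args <- commandArgs(trailingOnly = TRUE)
  if (length(args) == 4) {
    print(doStuff(args))
  } else {
    message("usage: script.R arg1 arg2 arg3 arg4")
  }
}
```

From an interactive session you would then source("script.R") and call doStuff(c('a','b','c','d')) directly, with browser() enabled wherever you need it.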

In some cases, the suggested workaround may not work, for example when the R code needs to be run as part of an existing bash script. For those cases, I suggest embedding your R script in the bash script using a here document:
#!/bin/bash
R --interactive << EOT
# R code starts here
argv=c('a','b','c','d')
print(interactive())
# Rest of script contents
quit("no")
# R code ends here
EOT
This way, print(interactive()) above will yield TRUE.
Side note: make sure to avoid the $ character in your R code, because the shell expands $ inside an unquoted here document and the code would not be passed to R correctly. For example, retrieve a column from a data.frame by using df[["X1"]] instead of df$X1.
