print text when calling a model - r

I hope this is not a double post. I've been looking for an answer.
I have a function that returns a rather big list. So i would like it to print some text in between all the results of the list. A bit as you know it from lm and other models.
Consider this R script
y<-function(z)
{
l<-list()
print("hello world")
l$answer<-2*z
return(l)
}
x<-y(5)
This is a small example. I tried a solution with print but this is a bad solution , simply because it executes print when i save the variable as x<-fun(5). I just want it to execute text when you ask it explicit, or even better,if you can construct your own "summary" command to a list.
Thanks for your time.

If I understood what you want to do , I think you are looking to implement the S3method print.
set the class attribute :"someclass" of the y function return value
define print.someclass
here the code:
y<-function(z)
{
l<-list()
l$answer<-2*z
## Roland comment : usually better to preserve existing classes:
class(l) <- c('someclass', class(l))
return(l)
}
print.someclass<-
function(x,...){ ## add here what you want to print
print("hello world")
}
x<-y(5)
Now when you type x at console or print(x):
x
[1] "hello world"

Related

Vectorized Operation in R causing problems with custom function

I'm writing out some functions for Inventory management. I've recently wanted to add a "photo url column" to my spreadsheet by using an API I've used successfully while initially building my inventory. My Spreadsheet header looks like the following:
SKU | NAME | OTHER STUFF
I have a getProductInfo function that returns a list of product info from an API I'm calling.
getProductInfo<- function(barcode) {
#Input UPC
#Output List of product info
CallAPI(barcode)
Process API return, remove garbage
return(info)
}
I made a new function that takes my inventory csv as input, and attempts to add a new column with product photo url.
get_photo_url_from_product_info_output <- function(in_list){
#Input GetProductInfo Output. Returns Photo URL, or nothing if
#it doesn't exist
if(in_list$DisplayStockPhotos == TRUE){
return(in_list$StockPhotoURL)
} else {
return("")
}
}
add_Photo_URL <- function(in_csv){
#Input CSV data frame, appends photourl column
#Requires SKU (UPC) assumes no photourl column
out_csv <- mutate(in_csv, photo =
get_photo_url_from_product_info_output(
getProductInfo(SKU)
)
)
}
return (out_csv)
}
#Call it
new <- add_Photo_URL(old)
My thinking was that R would simply input the SKU of the from the row, and put it through the double function call "as is", and the vectorized DPLYR function mutate would just vectorize it. Unfortunately I was running into all sorts of problems I couldn't understand. Eventually I figured out that API call was crashing because the SKU field was all messed up as it was being passed in. I put in a breakpoint and found out that it wasn't just passing in the SKU, but instead an entire list (I think?) of SKUs. Every Row all at once. Something like this:
#Variable 'barcode' inside getProductInfo function contains:
[1] 7.869368e+11 1.438175e+10 1.256983e+10 2.454357e+10 3.139814e+10 1.256983e+10 1.313260e+10 4.339643e+10 2.454328e+10
[10] 1.313243e+10 6.839046e+11 2.454367e+10 2.454363e+10 2.454367e+10 2.454348e+10 8.418870e+11 2.519211e+10 2.454375e+10
[19] 2.454381e+10 2.454381e+10 2.454383e+10 2.454384e+10 7.869368e+11 2.454370e+10 2.454390e+10 1.913290e+11 2.454397e+10
[28] 2.454399e+10 2.519202e+10 2.519205e+10 7.742121e+11 8.839291e+11 8.539116e+10 2.519211e+10 2.519211e+10 2.519211e+10
Obviously my initial getProductInfo function can't handle that, so it'll crash.
How should I modify my code, whether it be in the input or API call to avoid this vectorized operation issue?
Well, it's not totally elegant but it works.
I figured out I need to use lapply, which is usually not my strong suit. Initally I tried to nest them like so:
lapply(SKU, get_photo_url_from_product_info_output(getProductInfo())
But that didn't work. So I just came up with bright idea of making another function
get_photo_url_from_sku <- function(barcode){
return(get_photo_url_from_product_info_output(getProductInfo(barcode)))
}
Call that in the lapply:
out_csv<- mutate(in_csv, photocolumn = lapply(SKU, get_photo_url_from_sku))
And it works great. My speed is only limited by my API calls.

Customize the console printing for S4/RC objects in R

In R we can simply type the variable name in the console, the console will automatically print out the value. I have created a new S4/RC class define, and would like to create a nicer way to automatically "print" in the console. How do I edit the console printing functions for a new class?
Here is my code in the console:
ClassA<-setRefClass("ClassA",fields=list(value="numeric"))
"print.ClassA"<-function(object){
cat("--------\n")
cat(object$value,"\n")
cat("--------\n")
}
classobject<-ClassA$new(value=100)
classobject # it doesn't print nicely in the console.
#Reference class object of class "ClassA"
#Field "value":
#[1] 100
print(classobject) # this works
#--------
#100
#--------
My goal is to avoid typing "print" all the time; just type the object name in the console, it will print out nicely, just like calling print().
Thanks!
You need to define a show method for your RefClass object. Read ?setRefClass for details regarding how to write methods. This works:
#the print function: note the .self to reference the object
s<-function(){
cat("--------\n")
cat(.self$value,"\n")
cat("--------\n")
}
#the class definition
ClassA<-setRefClass("ClassA",fields=list(value="numeric"),methods=list(show=s))
classobject<-ClassA$new(value=100)
classobject
#--------
#100
#--------

Splitting an htmlParse'd HTML document while preserving the class

I'd like to scrap phone numbers from this French public directory. The thing is, it can return multiple answers, and I'd like to get them all, but I have a problem in the splitting of the parsed HTML doc.
Here is my code :
# example url for reproducibility
url_ <- "http://www.pagesjaunes.fr/recherche/departement/zc-de-vignolles-beaune-21/pagot-&-savoie---espace-aubade"
response <- GET(url_)
doc <- content(response, type="text/html", encoding = "UTF-8")
parseddoc <- htmlParse(doc)
# I think the problem lies in this next line, let's call it "line A" :
boxes <- xpathSApply(parseddoc, "//article[#class='bi-bloc blocs clearfix bi-pro']")
foreach(box = boxes) %do% {
# and also in this line, let's call it "line B" :
return_line$PJ_phone_number <- xpathApply(box, "//div[#class='item bi-contact-tel']", xmlValue)
}
}
I've tested line A, the xpathSApply() gets all the nodes with the XPath "//article[#class='bi-bloc blocs clearfix bi-pro']" (which is basically each box of result from the search on the website) and puts them into a list. I'm then going through this list with foreach. (I've tested this)
However, for line B to work, "box" needs to be of class "XMLInternalDocument". (parseddoc has class "HTMLInternalDocument" "HTMLInternalDocument" "XMLInternalDocument" "XMLAbstractDocument" for instance). But in line A, xpathSApply() returns a list of objects of class "XMLInternalElementNode" "XMLInternalNode" "XMLAbstractNode".
So my question is, how can I have line A "split" the parts of parseddoc that I need, also while keeping the same class, XMLInternalDocument ?
I hope I'm clear enough. Thanks.

how to make a loop function with "adonis"?

data(dune)
data(dune.env)
results<-list()
for (i in colnames(dune.env)){
results[[i]]<- adonis(dune ~ i, data=dune.env, permutations=99)
}
When I test each name in colnames(dune.env), it can work.
But it can not work in the loop function above. I think it is due to the i in the loop fuction has " ". How to fix it? Thanks.
I know nothing about adonis, but I do know that formulas are language objects which do not take nicely to being treated as though they were ordinary character objects.
for (i in colnames(dune.env)){
form <- as.formula(paste("dune", i, sep="~"))
results[[i]]<- adonis(form, data=dune.env, permutations=99)
}

Use for loop with if else statement in R

I am trying to create a for loop with an if else statement. My code looks the following:
for(i in 1:length(assignmentlist[,1]))
{if assignmentlist$Approve[i]=="1"
{ApproveAssignment(assignments=assignmentlist$AssignmentId[i],sandbox=T)}
else {RejectAssignment(assignments=assignmentlist$AssignmentId[i],sandbox=T)}}
whereas the "assignmentlist" looks like the following
> assignmentlist
AssignmentId Approve
1 5135 1
2 8963 0
3 6823 0
4 3287 1
Basically I would like to execute the "ApproveAssignment" function for all the entries that have a "1" in the "Approve" collumn. The problem is, that I would like to use the same index (the same i) inside the ApproveAssignment function. Unfortunately, this does not seem to work. Does anyone has a gentle way to avoid this problem?
Edit: The Approve Assignment function is a function that approves a certain assignment of Mechanical Turk over an API and is part of the MTurkR package
Any help yould be appreciated very much!
I don't get the point because the "i" of youy loop can be directly used in your function:
ApproveAssignment <- function(assignments=NULL, sandbox=NULL) cat(i, "was approved\n")
RejectAssignment <- function(assignments=NULL, sandbox=NULL) cat(i, "was rejected\n")
for(i in 1:length(assignmentlist[,1])){
if (assignmentlist$Approve[i]=="1")
ApproveAssignment(assignments=assignmentlist$AssignmentId[i],sandbox=T)
else
RejectAssignment(assignments=assignmentlist$AssignmentId[i],sandbox=T)
}
If, which I assume given I you want to use it inside it, you are the author of ApproveAssignment, you should hand the index over to the function as an additional argument.
ApproveAssignment <- function(assignments, sandbox, index) { ... }
ApproveAssignment and RejectAssignment internally loop through a vector of AssignmentIds to make an assignment-specific call to the API, so all you have to do is feed it a vector of AssignmentIds, no need for the loop or any of the conditionals.
ApproveAssignment(assignments=assignmentlist$AssignmentId[assignmentlist$Approve==1],sandbox=T)
RejectAssignment(assignments=assignmentlist$AssignmentId[!assignmentlist$Approve==1],sandbox=T)

Resources