Analyze code in dataframe with Tidycode in R

I am trying to take R code stored in the cells of the content column of a data frame and analyze the functions it uses with the tidycode package. However, I first need to convert the data to a matahari tibble before applying unnest_calls().
Here is the data:
data <- read.csv("https://github.com/making-data-science-count/TidyTuesday-Analysis/raw/master/db-tmp/cleaned%20database.csv")
I have tried doing this in a number of different ways, including writing each row of the content column out as an R file and then reading it back in with tidycode, for example:
tmp <- data$content2[1]
writeLines(tmp, "tmp.R") # I've also used save() and write()
rfile <- tidycode::read_rfiles("tmp.R")
But I keep getting errors such as:
Error in parse(text = x) : <text>:1:14: unexpected symbol
1: library(here)library
Ultimately, what I would like to do is analyze the different types of code per file, and keep that linked with the other data in the data dataframe, such as date and username.
Any help would be greatly appreciated!
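The error text (library(here)library) suggests that at least some cells hold several statements run together on one line, which R cannot parse. One way to see which rows are affected, and to extract function names from the rows that do parse while keeping them linked to the other columns, is sketched below in base R. This is only an illustration and not the tidycode workflow: the scripts data frame, its username values, and the called_functions() helper are all hypothetical stand-ins.

```r
# Hypothetical stand-in for the real data: one row per script, code stored as text.
scripts <- data.frame(
  username = c("a", "b", "c"),
  content = c(
    "library(here)\nx <- mean(c(1, 2, 3))",
    "library(here)library(dplyr)",  # statements run together: does not parse
    "plot(sqrt(1:10))"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: names of the functions called in one string of R code,
# or NA if the string cannot be parsed.
called_functions <- function(code) {
  parsed <- tryCatch(
    parse(text = code, keep.source = TRUE),
    error = function(e) NULL
  )
  if (is.null(parsed)) {
    return(NA_character_)
  }
  tokens <- utils::getParseData(parsed)
  tokens$text[tokens$token == "SYMBOL_FUNCTION_CALL"]
}

calls <- lapply(scripts$content, called_functions)

# One row per (script, function) pair, still linked to the username column.
result <- data.frame(
  username = rep(scripts$username, lengths(calls)),
  func = unlist(calls),
  stringsAsFactors = FALSE
)
print(result)
```

Rows where func is NA are the ones whose code failed to parse, so they can be inspected or repaired separately.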

Related

Number of Observation and variables are not equal in data frame R

I am running the abc (Approximate Bayesian Computation) library in R, using the human dataset from abc.data. I ran the line of code below from the model selection example, and it works fine.
modsel.it <- postpr(stat.voight["italian",], models, stat.3pops.sim, tol=.05, method="mnlogistic")
summary(modsel.it)
I saved the data frames of the human dataset mentioned above (stat.voight, models, stat.3pops.sim) as .csv files (st, mod, stat3 respectively) and ran the same line of code on the data read back from those files. Reading them works fine, but I get an error when I run the postpr function as shown below.
t <- postpr(st["italian",], mod, stat3, tol=.05, method="mnlogistic")
It gives me this error: Number of summary statistics in 'target' has to be the same as 'sumstat'.
Then I checked the structure (str) of the original data frame and of the one I saved as .csv. The one saved as .csv differs from the original. Below are images of the original (stat.voight) and .csv (st) data frames. I want my st data frame to be the same as the stat.voight data frame. Thanks.

The write.csv() function has a default argument of row.names = TRUE, which writes the row names as the first column in the CSV. If you set row.names = FALSE, the row names will not be written to the file.
That said, the objects have a number of attributes that aren't written to the output files with write.csv(). As such, you're better off using saveRDS() and readRDS() to serialize these objects and reload them into R.
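The difference between the two round trips can be seen with a small sketch. The stats data frame below is a hypothetical toy object, not the real stat.voight; only its row-name layout is meant to resemble it.

```r
# Hypothetical toy data with meaningful row names (not the real stat.voight).
stats <- data.frame(
  pi = c(0.0004, 0.0003, 0.0002),
  TajD = c(0.2, 0.9, 1.1),
  row.names = c("hausa", "italian", "chinese")
)

# CSV round trip: the row names come back as an extra first column.
csv_path <- tempfile(fileext = ".csv")
write.csv(stats, csv_path)
from_csv <- read.csv(csv_path)
ncol(from_csv)  # 3 columns instead of 2

# Telling read.csv that column 1 holds the row names restores the layout.
from_csv_fixed <- read.csv(csv_path, row.names = 1)

# RDS round trip: the object comes back unchanged, attributes included.
rds_path <- tempfile(fileext = ".rds")
saveRDS(stats, rds_path)
from_rds <- readRDS(rds_path)
identical(from_rds, stats)  # TRUE
```

With the plain read.csv() result, indexing by "italian" no longer selects the intended row, and the extra column changes the number of summary statistics, which is consistent with the error in the question.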

No applicable method for 'tidy' applied to an object of class "factor" in Tidytext

I'm just starting out with text mining in R and have run into some problems. I have a CSV of user comments about a page. Each row is a different comment, and there is only one column, which holds the comments. I was trying to use tidy in R, so I imported the file with read.csv and got a data frame with n factor levels.
The next step is to tokenize the rows.
The CSV looks like this:
#load the data
prueba <- read.csv(file="C:/Users/Mr & Mrs Bean/Downloads/Prueba.csv", sep=";")
#trying to tokenize
ty_prueba <- tidy(prueba)
Error in UseMethod("tidy") :
no applicable method for 'tidy' applied to an object of class "factor"
As you can see, I get that error. I've also tried converting that column to character, but I get the same error. Every example I find uses text that is already prepared, so it's difficult to see how raw text should be prepared.
It's a rookie problem, so any advice will be appreciated.

I have found a solution. As someone suggested here, I now use read_excel (from the readxl library) instead of read.csv, and it works for me. I suppose it has something to do with how R reads the file.
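A likely difference between the two readers is the column type: read_excel returns text as character, whereas read.csv can return a factor (this was the default before R 4.0). The sketch below, which assumes a hypothetical column named comment and uses inline text instead of the real file, shows how to get a character column from read.csv directly. The strsplit() step is only a base R stand-in for a real tokenizer.

```r
# Hypothetical two-comment file, supplied inline instead of Prueba.csv.
csv_text <- "comment\nGreat page, easy to use\nToo slow on mobile"

# Reading strings as factors reproduces the situation in the question.
as_factor <- read.csv(text = csv_text, sep = ";", stringsAsFactors = TRUE)
class(as_factor$comment)  # "factor"

# Asking for plain strings gives a character column that can be tokenized.
as_char <- read.csv(text = csv_text, sep = ";", stringsAsFactors = FALSE)
class(as_char$comment)  # "character"

# Base R stand-in for tokenization: lower-case, then split on non-alphanumerics.
tokens <- strsplit(tolower(as_char$comment), "[^[:alnum:]]+")
tokens[[2]]  # "too" "slow" "on" "mobile"
```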

How to export an R dataframe to a Power Query table

I'm using an R script within Power Query to do some data transformations and return a scaled table.
My R code is like this:
# 'dataset'

It does seem odd that this fails to return anything. A quick search online turned up a 3-minute YouTube video that uses the same method you are using. Searching further for a source, one comes across the Microsoft documentation, which gives a possible reason for the issue.
When preparing and running an R script in Power BI Desktop, there are a few limitations:
Only data frames are imported, so make sure the data you want to import to Power BI is represented in a data frame
Columns that are typed as Complex and Vector are not imported, and are replaced with error values in the created table
These seem like the most obvious reasons. Betting that there are no complex columns in your dataset, I'd say the former is the likely cause. A quick recreation of your dataset shows that the scale function turns your data into a matrix. cbind keeps it that way, so output is of class matrix and not data.frame.
> dataset <- as.data.frame(abs(matrix(rnorm(1000), ncol = 4)))
> class(dataset)
[1] "data.frame"
> library(dplyr)
> df_normal <- log(dataset + 1) %>%
+   select(c(2:4)) %>%
+   scale
> class(df_normal)
[1] "matrix"
> df_normal <- cbind(dataset[, 1], df_normal)
> output <- df_normal
> class(output)
[1] "matrix"
A simple fix would then seem to be adding output <- as.data.frame(output), as this is in line with the Power BI documentation. It may also need something like a return statement at the end; adding a final line to the script that simply says output should take care of that.
Edit
For clarification, I believe the following edited script (of yours) should return the data expected
# 'dataset' holds the input data for this script
library(dplyr)
df_normal <- log(dataset + 1) %>%
  select(c(2:4)) %>%
  scale
df_normal <- cbind(dataset[, c(1)], df_normal)
output <- as.data.frame(df_normal)
#output ## This line might be needed, without the leading comment character
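As a variation, the sketch below stays in base R and binds the scaled columns onto a one-column data frame (dataset[1]) rather than a vector (dataset[, 1]), so the result is a data frame from the start and keeps its column names. The dataset here is simulated as in the recreation above; inside Power Query the real dataset would be supplied by the host, and this has not been tested there.

```r
# Simulated stand-in for the 'dataset' that Power Query would supply.
set.seed(1)
dataset <- as.data.frame(abs(matrix(rnorm(40), ncol = 4)))

# scale() returns a matrix, even when given a data frame.
scaled <- scale(log(dataset[, 2:4] + 1))
is.matrix(scaled)  # TRUE

# dataset[1] is a one-column data frame, so cbind() returns a data frame.
output <- cbind(dataset[1], scaled)
is.data.frame(output)  # TRUE
names(output)  # "V1" "V2" "V3" "V4"
```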

How to output multiple pdf files from many data frame in a for loop in R

I used the assign() function to name many data frames, with this script:
for (i in 1:15)
{
assign(paste0('TagIDNum',i),filter(Ordf,Ordf$TagID==i))
}
This gave me 15 data frames.
As the next step, I need to draw a scatterplot matrix of each of these 15 data frames with the pairs() function, using a for loop to write all the PDFs at once.
Here is my script:
for (j in 1:15)
{
pdf(paste('TagPlotNum',j,'.pdf',sep=''))
x<-paste('TagIDNum',j,sep='')
print(pairs(~x[,11]+x[,38]+x[,39]+x[,40]+x[,41]+x[,43]))
dev.off()
}
But I got this error:
Error: incorrect number of dimensions
And I found that x held no data, just a value, as follows:
I need to do further analysis in the next steps, and this problem has bothered me for 2 days.
I'm posting this to ask whether any expert can solve the issue.
In my opinion the paste() function may have something to do with it, but I don't know how to solve it.
Here is my R information:
Thanks.

As per your output, x is the string "TagIDNum11", not the object with that name. You can, however, get that object using get(), i.e.
x<-get(paste('TagIDNum',j,sep=''))
FYI, spaces are free, and your code will be much more readable if you use them, i.e.
x <- get(paste('TagIDNum', j, sep=''))
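Put together, the whole loop might look like the sketch below. It is a minimal illustration under stated assumptions: Ordf is a simulated stand-in with three tags and made-up columns a, b and c, base R subsetting replaces filter(), and the PDFs go to a temporary directory.

```r
# Simulated stand-in for Ordf: three tags, three numeric columns.
set.seed(42)
Ordf <- data.frame(
  TagID = rep(1:3, each = 10),
  a = rnorm(30),
  b = rnorm(30),
  c = rnorm(30)
)

# Create one data frame per tag, as in the question.
for (j in 1:3) {
  assign(paste0("TagIDNum", j), Ordf[Ordf$TagID == j, ])
}

out_dir <- tempdir()
pdf_files <- character(0)
for (j in 1:3) {
  # get() looks up the object whose name is stored in the string.
  x <- get(paste0("TagIDNum", j))
  pdf_file <- file.path(out_dir, paste0("TagPlotNum", j, ".pdf"))
  pdf(pdf_file)
  pairs(x[, c("a", "b", "c")])
  dev.off()
  pdf_files <- c(pdf_files, pdf_file)
}
```

Keeping the pieces in a list, for example split(Ordf, Ordf$TagID), would avoid assign() and get() altogether, since each element can then be reached by index inside the loop.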

Why am I getting different output from the Alteryx R tool

I am using the Alteryx R Tool to sign an Amazon HTTP request. To do so, I need the hmac function that is included in the digest package.
I'm using a text input tool that includes the key and a datestamp.
Key= "foo"
datestamp= "20120215"
Here's the issue. When I run the following script:
the.data <- read.Alteryx("1", mode="data.frame")
write.Alteryx(base64encode(hmac(the.data$key,the.data$datestamp,algo="sha256",raw = TRUE)),1)
I get an incorrect result when compared to when I run the following:
write.Alteryx(base64encode(hmac("foo","20120215",algo="sha256",raw = TRUE)),1)
The difference is that when I hardcode the values for the key and object I get the correct result, but if I use the variables from the R data frame I get incorrect output.
Does the data frame alter the data in some way? Has anyone come across this when working with the R Tool in Alteryx?
Thanks for your input.

The issue appears to be that when creating the data frame, your character variables are converted to factors. The way to fix this with the data.frame constructor function is
the.data <- data.frame(Key="foo", datestamp="20120215", stringsAsFactors=FALSE)
I haven't used read.Alteryx but I assume it has a similar way of achieving this.
Alternatively, if your data frame has already been created, you can convert the factors back into character:
write.Alteryx(base64encode(hmac(
as.character(the.data$Key),
as.character(the.data$datestamp),
algo="sha256",raw = TRUE)),1)
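The sketch below shows why a factor is not interchangeable with the string it displays. It builds the data frame by hand with stringsAsFactors = TRUE to force the situation described above; how read.Alteryx actually types its columns is not verified here.

```r
# Force factor columns to mimic the suspected behaviour.
the.data <- data.frame(Key = "foo", datestamp = "20120215", stringsAsFactors = TRUE)

is.factor(the.data$datestamp)  # TRUE
as.integer(the.data$datestamp)  # 1: the internal level code, not the date
as.character(the.data$datestamp)  # "20120215"

# Convert every factor column back to character in one step.
is_fct <- vapply(the.data, is.factor, logical(1))
the.data[is_fct] <- lapply(the.data[is_fct], as.character)
is.character(the.data$Key)  # TRUE
```

A quick str(the.data) right after reading the input is an easy way to confirm whether the columns arrived as factors before passing them to hmac().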
