PowerBI R code error

Following this post, Gantt charts with R, I stored my data in a data.frame using the same data structure as in the link. The code is below (note that the first 4 rows are auto-generated by Power BI and I couldn't amend them).
df <- data.frame(task = c("task1", "task2", "task3"),
                 status = c("done", "active", "crit"),
                 pos = c("first_1", "first_2", "first_3"),
                 start = c("2014-01-06", "2014-01-09", "after first_2"),
                 end = c("2014-01-08", "3d", "5d"))

# Create dataframe
dataset <- data.frame(end, start, status, task, pos)
# Remove duplicated rows
# dataset <- unique(dataset)
df <- dataset

library(tidyr)
library(DiagrammeR)
library(dplyr)

mermaid(
  paste0(
    # mermaid "header", each component separated with "\n" (line break)
    "gantt", "\n",
    "dateFormat YYYY-MM-DD", "\n",
    "title A Very Nice Gantt Diagram", "\n",
    # unite the first two columns (task & status) and separate them with ":"
    # then, unite the other columns and separate them with ","
    # this will create the required mermaid "body"
    paste(df %>%
            unite(i, task, status, sep = ":") %>%
            unite(j, i, pos, start, end, sep = ",") %>%
            .$j,
          collapse = "\n"), "\n"
  )
)
When running the R script, I got the following error message:
No image was created. The R code did not result in creation of any visuals. Make sure your R script results in a plot to the R default device.
Please advise why this happened.
Thank you,
Peddie
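For reference, a likely cause (not stated in the original post) is that DiagrammeR::mermaid() returns an htmlwidget rather than drawing to the R graphics device, which is what the standard Power BI R visual captures. A minimal sketch of a Gantt drawn on the graphics device with ggplot2 instead, using illustrative dates (mermaid-style values such as "3d" or "after first_2" would first have to be resolved to real dates):
# Sketch only: draw a simple Gantt with ggplot2 so that something is plotted
# to the default graphics device. The dates below are illustrative.
library(ggplot2)

gantt <- data.frame(
  task  = c("task1", "task2", "task3"),
  start = as.Date(c("2014-01-06", "2014-01-09", "2014-01-12")),
  end   = as.Date(c("2014-01-08", "2014-01-12", "2014-01-17"))
)

p <- ggplot(gantt, aes(y = task)) +
  geom_segment(aes(x = start, xend = end, yend = task), linewidth = 4) +
  labs(title = "A Very Nice Gantt Diagram", x = NULL, y = NULL)
print(p)  # explicit print so the plot reaches the default device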

Related

My loop works, but I get 'Error in file(con, "r") : cannot open the connection' when I try to use it on my folder of txt files

I created a loop, but I am having a lot of issues when I try to use it over the files it needs to run on. I have tried a series of things, and I think this is the closest I have got to making it work. The only problem I am facing now is the very last line of code: when I use lapply, it says Error in file(con, "r") : cannot open the connection.
Any help at all is appreciated! I am just starting out.
setwd("C:/Users/k.ganeson/Desktop/accession_commitments")

library(dplyr)
library(stringr)
library(tidyr)

extract_info = function(file) {
  text <- readLines(("C:/Users/k.ganeson/Desktop/accession_commitments"))
  clean_text <- as.data.frame(strsplit(text$text, '\\*'), col.names = "text") %>% # split the text into different lines
    # clean text, getting rid of special text characters, useless whitespace etc
    mutate(text = str_replace_all(text, "\n", " "),
           text = str_replace_all(text, "- ", ""),
           text = str_replace_all(text, "^\\s", "")) %>%
    # take out lines where there is nothing
    filter(!text == " ") %>%
    ## Now we want to separate into two columns the type of commitments and the actual commitment paragraphs
    # First we copy the commitment paragraphs into another column
    mutate(paragraphs = ifelse(grepl("^[[:digit:]]", text) == T, text, NA)) %>%
    # We rename column text as category to avoid confusion
    rename(category = text) %>%
    # In the category column we don't need the paragraphs anymore, so we put NAs instead
    # and then fill the column with the categories
    mutate(category = ifelse(grepl("^[[:digit:]]", category) == T, NA, category)) %>%
    fill(category) %>%
    # Now we can get rid of all lines with NAs in the paragraphs column, as they are useless
    filter(!is.na(paragraphs)) %>%
    # Now separate all the different commitment paragraphs into different rows
    mutate(paragraphs = strsplit(paragraphs, '^[[:digit:]]{1,3}\\.|\\t\\s[[:digit:]]{1,3}\\.')) %>%
    unnest(paragraphs) %>%
    # Make sure to also separate the last bit of javascript (starting from Download as PDF etc) so that
    # it does not show up as one of the commitments
    mutate(paragraphs = strsplit(paragraphs, 'Download as PDF')) %>%
    unnest(paragraphs) %>%
    ## last cleaning
    mutate(paragraphs = str_replace_all(paragraphs, "\t", "")) %>% # get rid of special text characters, as we don't need them anymore
    mutate(paragraphs = ifelse(grepl("javascript", paragraphs), "", paragraphs)) %>% # get rid of any remaining javascript code that is useless
    mutate(paragraphs = str_replace_all(paragraphs, "^\\s+", "")) %>% # some paragraph cells contain only whitespace; replace all of them with ""
    filter(!paragraphs == "") # now get rid of all empty lines
}

# list of all files to go through the loop
my_files = list.files(path = "C:/Users/k.ganeson/Desktop/accession_commitments",
                      pattern = "txt")
my_files

# Apply function to these files
results = lapply(my_files, extract_info)
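One hedged reading of this error (a sketch, not a verified fix): inside the function, readLines() is pointed at the folder itself instead of the file argument that was passed in, and list.files() without full.names = TRUE returns bare file names that lapply() then cannot open from the current working directory. Adjusting those two pieces would look roughly like this:
# Sketch only: use the `file` argument and full paths; the cleaning pipeline
# from the question would go where the placeholder comment is.
extract_info <- function(file) {
  text <- readLines(file)  # read the file that was actually passed in
  # ... cleaning steps as in the question ...
  text
}

my_files <- list.files(path = "C:/Users/k.ganeson/Desktop/accession_commitments",
                       pattern = "\\.txt$",
                       full.names = TRUE)  # return full paths, so lapply() can open them
results <- lapply(my_files, extract_info)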

How to merge tables and format appropriately?

So I have the following in cityzone.txt:
"earth/city/somerset/forest/somerset-test.txt#53497",
"earth/city/nottingham/forest/nighthill.txt#53498",
"earth/city/bury/town/bishop-zone1.mp3#53695",
And the following in areasize.txt:
planet\mars\red\crater.txt;56,
pluto\distant\dwarfmoon.txt;181,
mars\hot\red\redmoon.txt;43,
earth\city\somerset\forest\somerset-test.txt;205,
earth\city\bury\town\bishop-zone1.mp3;499,
So what I need is for a new table to be created and written to an output file.
What should happen is - for each row in cityzone.txt, the title for that row should be looked up in areasize.txt. If the title exists, the areasize number from areasize.txt should be appended to the cityzone row like this:
"title#id#areasize",
With quotes and comma accordingly.
So for cityzone.txt above, the output should be:
"earth/city/somerset/forest/somerset-test.txt#53497#205",
"earth/city/bury/town/bishop-zone1.mp3#53695#499",
And then it should be written to a file with quotes and commas as shown.
So only 2 of the 3 cityzone.txt rows are included in the results because only 2 of the 3 rows exist in areasize.txt.
My starter code for this is really a continuation from this question:
How do I merge partial data and format it in R?
So I will add the code for this to the code in that question.
Thank you.
You can do:
library(dplyr)
library(tidyr)

# Read the text files and keep only the 1st column
cityzone <- read.table('cityzone.txt')[1]
areasize <- read.table('areasize.txt', sep = ';')

# Clean the areasize dataframe, separate the cityzone column on "#" and join
cityzone %>%
  separate(V1, c('V1', 'V2'), sep = '#') %>%
  inner_join(areasize %>%
               mutate(V1 = gsub('\\\\', '/', V1),
                      V2 = sub(',$', '', V2)),
             by = 'V1') -> result

# Combine the output in the required format and write it
cat(sprintf('"%s#%s#%s",', result$V1, result$V2.x, result$V2.y),
    file = 'output.lua', sep = '\n')

Clever way to avoid for loop in R

I have a data file that follows roughly this format:
HEADER:001,v1,v2,v3...,v10
v1,v2,v3,STATUS,v5...v6
.
.
.
HEADER:006,v1,v2,v3...v10
HEADER:012,v1,v2,v3...v10
v1,v2,v3,STATUS,v5...v6
v1,v2,v3,STATUS,v5...v6
.
.
.
etc
where each block or chunk of data leads off with a comma-separated line that includes the header and a unique (not necessarily sequential) number, and then there may be zero or more lines identified by the STATUS keyword in the body of the chunk.
I am reading this block in using readLines and then splitting it into header lines and status lines to be read in as CSV separately, since they have a different number of variables:
datablocks <- readLines(filename, skipNul = TRUE)
headers <- datablocks[grepl("HEADER", datablocks, useBytes = TRUE)]
headers <- read.csv(text = headers, header = FALSE, stringsAsFactors = FALSE)
statuses <- datablocks[grepl("STATUS", datablocks, useBytes = TRUE)]
statuses <- read.csv(text = statuses, header = FALSE, stringsAsFactors = FALSE)
Eventually, I would like to inner join this data, so that the variables from the header are included in each status line:
all <- headers %>% inner_join(statuses, by = c("ID" = "ID"))
But I need a way to add the unique ID of the header to each status line below it, until the next header. The only way I can think of doing this is with a for loop that runs over the initial full text datablock:
header_id <- NA
for (i in seq(1:length(datablocks))) {
  is_header_line <- str_extract(datablocks[i], "HEADER:([^,]*)")
  if (!is.na(is_header_line)) {
    header_id <- is_header_line
  }
  datablocks[i] <- paste(datablocks[i], header_id, sep = ",")
}
This works fine, but it's ugly, and not very... R-ish. I can't think of a way to vectorize this operation, since it needs to keep an external variable.
Am I missing something obvious here?
Edit
If the input looks literally like this
HEADER:001,a0,b0,c0,d0
e0,f0,g0,STATUS,h0,i0,j0,k0,l0,m0
HEADER:006,a1,b1,c1,d1
HEADER:012,a2,b2,c2,d2
e1,f1,g1,STATUS,h1,i1,j1,k1,l1,m1
e2,f2,g2,STATUS,h2,i2,j2,k2,l2,m2
The output should look like this:
e0,f0,g0,h0,i0,j0,k0,l0,m0,a0,b0,c0,d0,001
e1,f1,g1,h1,i1,j1,k1,l1,m1,a2,b2,c2,d2,012
e2,f2,g2,h2,i2,j2,k2,l2,m2,a2,b2,c2,d2,012
So there needs to be a column propagated from the parent (HEADER) to the children (STATUS) to inner join on.
EDIT:
Thanks for the clarification. The specific input and output make it dramatically easier to avoid misunderstandings.
Here I use tidyr::separate to separate out the header label from the "a0,b0,c0,d0" part, and tidyr::fill to propagate header info down into the following status rows.
library(tidyverse)

read_table(col_names = "text",
           "HEADER:001,a0,b0,c0,d0
e0,f0,g0,STATUS,h0,i0,j0,k0,l0,m0
HEADER:006,a1,b1,c1,d1
HEADER:012,a2,b2,c2,d2
e1,f1,g1,STATUS,h1,i1,j1,k1,l1,m1
e2,f2,g2,STATUS,h2,i2,j2,k2,l2,m2") %>%
  mutate(status_row = str_detect(text, "STATUS"),
         header_row = str_detect(text, "HEADER"),
         header = if_else(header_row, str_remove(text, "HEADER:"), NA_character_)) %>%
  separate(header, c("header", "stub"), sep = ",", extra = "merge") %>%
  fill(header, stub) %>%
  filter(status_row) %>%
  mutate(output = paste(str_remove(text, "STATUS,"), stub, header, sep = ",")) %>%
  select(output)
Result
# A tibble: 3 x 1
output
<chr>
1 e0,f0,g0,h0,i0,j0,k0,l0,m0,a0,b0,c0,d0,001
2 e1,f1,g1,h1,i1,j1,k1,l1,m1,a2,b2,c2,d2,012
3 e2,f2,g2,h2,i2,j2,k2,l2,m2,a2,b2,c2,d2,012
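As a side note (not part of the answer above), the same header propagation can also be done in base R without an explicit loop, by indexing the non-NA header labels with a cumulative count. A sketch, assuming datablocks and str_extract() as in the question and a file that starts with a HEADER line:
library(stringr)

# Sketch: extract the header label per line (NA for status lines), then repeat
# each label down over the lines that follow it, instead of looping.
hdr <- str_extract(datablocks, "HEADER:([^,]*)")
filled <- hdr[!is.na(hdr)][cumsum(!is.na(hdr))]
datablocks <- paste(datablocks, filled, sep = ",")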

How to add line breaks within a cell on merged rows in R studio?

I would like to merge the rows into one single cell, so there should be one row with all the data concatenated together. Then I would need the text title "Top coins today!" above it, appearing once only.
I have been able to merge the rows into a single cell using the str_c function with the code below.
xyz_1 <- paste0('Top performing coins in the last 24 hrs!',
                paste(df2$id, " - ",
                      df2$price_change_percentage_24h) %>%
                  str_c(collapse = '\n'),
                sep = "\n")
This has worked. However, I do not know how to create a line break between "Top performing coins" and the data. It needs to look like the attached image; basically, I need to create a line break within a cell.
The dput of the dataframe is below if required.
structure(list(id = c("xdai-stake", "hegic", "keep-network"),
               price_change_percentage_24h = c(26.96, 26.62, 23.93)),
          row.names = c(43L, 36L, 38L), class = "data.frame")
The desired output should look like this: https://ibb.co/tJRTdfr. This is how it should appear and be structured.
Thanks very much!!
I have solved this problem using the tibble function.
df2 <- tibble(top3subset)

xyz_1 <- paste0('Top performing coins in the last 24 hrs!', "
",
                paste(df2$id, " - ",
                      df2$price_change_percentage_24h) %>%
                  str_c(collapse = '\n'),
                sep = "\n")
This solution was the closest I got to the desired output. Frankly, it sufficed.
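For what it's worth, a slightly tidier variant of the same idea (a sketch, not the poster's solution) uses an explicit "\n" after the title instead of a literal line break inside the string:
library(stringr)

# Sketch: an explicit "\n" gives the line break between the title and the
# concatenated coin lines; cat() shows the breaks when printing.
xyz_1 <- paste0("Top performing coins in the last 24 hrs!", "\n",
                str_c(paste(df2$id, "-", df2$price_change_percentage_24h),
                      collapse = "\n"))
cat(xyz_1)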

Power BI - creating custom r visuals relationship error

I desperately need help!
I am trying to predict drug use based on 5 characteristics: Age, Gender, Education, Ethnicity, Country. I have already built a tree model in R with rpart:
DrugTree3 <- rpart(formula = DrugUser ~ Age + Gender + Education + Ethnicity + Country, data = traindata)
a logistic regression model:
DrugLog <- glm(formula = DrugUser ~ Age + Gender + Ethnicity + Education + Country, data = traindata, family = binomial)
and a knn model:
KnnModel <- train(form = DrugUser ~ ., data = ModelData, method = 'knn',
                  tuneGrid = expand.grid(.k = 1:100), metric = 'Accuracy',
                  trControl = trainControl(method = 'repeatedcv', number = 10, repeats = 10))
I saved those as RDS files and uploaded them successfully to Power BI.
I then created tables for each characteristic and created okviz filters for them.
Then I tried to predict whether a customer is classified as a drug user or a non-drug user based on the selections in the okviz filters. This is when everything went horribly wrong:
I created a custom R visual for each model prediction and inserted the following code in each visual:
# The following code to create a dataframe and remove duplicated rows is always executed and acts as a preamble for your script:
# dataset <- data.frame(chunk_id, model_id, model_str, AgeLabel, GenderLabel, CountryLabel, EducationLabel, EthnicityLabel)
# dataset <- unique(dataset)
# Paste or type your script code here:
library(dplyr)
from_byte_string = function(x) {
  xcharvec = strsplit(x, " ")[[1]]
  xhex = as.hexmode(xcharvec)
  xraw = as.raw(xhex)
  unserialize(xraw)
}
# R Visual imports tables with read.csv but no argument for strings_as_factors = F.
# This means some of the chunks are truncated (ie if they had a " " at the end).
# If you convert to a character and add a space if nchar == 9999 the deserialization works.
# (Thanks to Danny Shah)
dataset <- dataset %>%
  mutate(model_str = as.character(model_str)) %>%
  mutate(model_str = ifelse(nchar(model_str) == 9999, paste0(model_str, " "), model_str))
model_vct <- dataset %>%
  filter(model_id == 1) %>%
  distinct(model_id, chunk_id, model_str) %>%
  arrange(model_id, chunk_id) %>%
  pull(model_str)
finalfit.str <- paste(model_vct, collapse = "")
finalfit <- from_byte_string(finalfit.str)
# get the user parameters
userdata <- dataset %>% select(AgeLabel, GenderLabel, CountryLabel, EducationLabel, EthnicityLabel) %>% unique()
# and then use them to make a prediction
myprediction <- predict(finalfit, newdata = data.frame(Age = userdata$AgeLabel, Gender = userdata$GenderLabel, Country = userdata$CountryLabel, Education = userdata$EducationLabel, Ethnicity = userdata$EthnicityLabel))
maxpred <- which(myprediction == max(myprediction))
myclass <- maxpred - 1
myprob <- myprediction[[maxpred]]
plot.new()
text(0.5, 0.5, labels = sprintf("P(class = %s) = %s", myclass, as.character(round(myprob, 2))), cex = 3.5)
Error: Can't determine relationship between fields.
What has gone wrong here?
When I then clicked on the diagonal arrow to open the script in RStudio, this happened: Unable to construct R script data for use in external R IDE.
I need help, as I am literally going crazy over this and don't know how to resolve the issue! I would be really happy if you could help me.
You made an error in line 34 and line 25.
Below is a fixed version of your code.
# The following code to create a dataframe and remove duplicated rows is always executed and acts as a preamble for your script:
# dataset <- data.frame(chunk_id, model_id, model_str, AgeLabel, GenderLabel, CountryLabel, EducationLabel, EthnicityLabel)
# dataset <- unique(dataset)
# Paste or type your script code here:
library(dplyr)
from_byte_string = function(x) {
  xcharvec = strsplit(x, " ")[[1]]
  xhex = as.hexmode(xcharvec)
  xraw = as.raw(xhex)
  unserialize(xraw)
}
# R Visual imports tables with read.csv but no argument for strings_as_factors = F.
# This means some of the chunks are truncated (ie if they had a " " at the end).
# If you convert to a character and add a space if nchar == 9999 the deserialization works.
# (Thanks to Danny Shah)
dataset <- dataset %>%
  mutate(model_str = as.character(model_str)) %>%
  mutate(model_str = ifelse(nchar(model_str) == 9999, paste0(model_str, " "), model_str))
model_vct <- dataset %>%
  filter(model_id == 1) %>%
  distinct(model_id, chunk_id, model_str) %>%
  arrange(model_id, chunk_id) %>%
  pull(model_str)
finalfit.str <- paste(model_vct, collapse = "")
finalfit <- from_byte_string(finalfit.str)
# get the user parameters
userdata <- dataset %>% select(AgeLabel, GenderLabel, CountryLabel, EducationLabel, EthnicityLabel) %>% unique()
# and then use them to make a prediction
myprediction <- predict(finalfit, newdata = data.frame(Age = userdata$AgeLabel, Gender = userdata$GenderLabel, Country = userdata$CountryLabel, Education = userdata$EducationLabel, Ethnicity = userdata$EthnicityLabel))
maxpred <- which(myprediction == max(myprediction))
myclass <- maxpred - 1
myprob <- myprediction[[maxpred]]
plot.new()
text(0.5, 0.5, labels = sprintf("P(class = %s) = %s", myclass, as.character(round(myprob, 2))), cex = 3.5)
Good Luck!
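For context (an assumption, not something stated in the question or answer): the from_byte_string() helper implies that the fitted model was serialized into a space-separated hex string and split into fixed-size chunks before being loaded into Power BI. A rough sketch of that serialization side, with illustrative names and chunk size, might look like this:
# Sketch only: the chunk size and table layout are guesses based on the
# nchar == 9999 workaround in the script above.
model_raw <- serialize(DrugLog, connection = NULL)           # fitted model -> raw vector
model_str <- paste(as.character(model_raw), collapse = " ")  # raw -> space-separated hex
chunk_size <- 10000
starts <- seq(1, nchar(model_str), by = chunk_size)
chunks <- substring(model_str, starts, pmin(starts + chunk_size - 1, nchar(model_str)))
model_table <- data.frame(model_id = 1,
                          chunk_id = seq_along(chunks),
                          model_str = chunks)
# model_table can then be imported into Power BI, where paste(model_str, collapse = "")
# followed by from_byte_string() reverses the process.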
