Sankey chart not rendered in Power BI by R script

I have the following sample CSV data:
date,Data Center,Customer,companyID,source,target,value
6/1/2021,dcA,customer1,companyID1,step1:open_list_view,exit,1
6/1/2021,dcB,customer2,companyID2,step1:open_list_view,exit,1
6/1/2021,dcC,customer3,companyID3,step1:open_list_view,exit,1
6/2/2021,dcD,customer4,companyID4,step1:open_list_view,exit,2
6/2/2021,dcE,customer5,companyID5,step1:open_list_view,step2:switch_display_option,1
.....
In Power BI, I click the 'R' icon and enable R.
Next, I import the CSV data into Power BI and drag the 'source', 'target' and 'value' columns into the "Visualization->Values" section.
Then I run the R script below, which should draw the Sankey chart in Power BI:
# The following code to create a dataframe and remove duplicated rows is always executed and acts as a preamble for your script:
# dataset <- data.frame(source, target, value)
# dataset <- unique(dataset)
# Paste or type your script code here:
library(networkD3)
node_names <- unique(c(dataset$source, dataset$target))
nodes <- data.frame(name = node_names)
links <- data.frame(source = match(dataset$source, node_names) - 1,
                    target = match(dataset$target, node_names) - 1,
                    value = dataset$value)
sankeyNetwork(Links = links, Nodes = nodes,
              Source = "source",
              Target = "target",
              Value = "value")
But I get 'No image was created'.
How can I correct the code so that the Sankey chart is displayed in Power BI?

Every Python and R visual in Power BI is a plot, so the script has to actually plot something. Add a plotting call at the end of your code.
Something like:
plot(source, target, type="n")
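To make the point about plotting concrete: as far as I know, `sankeyNetwork()` returns an HTML widget instead of drawing on the R graphics device, and a Power BI R visual only shows what is plotted to the default device. The sketch below is not a real Sankey chart. It swaps in a plain base-graphics flow diagram (line width proportional to `value`) just to show a script that does produce an image. The small `dataset` data frame is a stand-in for the one Power BI passes in, and the colour and scaling factor are arbitrary choices.

```r
# Stand-in for the `dataset` data frame that Power BI hands to the script
dataset <- data.frame(
  source = c("step1:open_list_view", "step1:open_list_view"),
  target = c("exit", "step2:switch_display_option"),
  value  = c(5, 1),
  stringsAsFactors = FALSE
)

# Total flow for each source/target pair
flows <- aggregate(value ~ source + target, data = dataset, FUN = sum)

sources <- unique(flows$source)
targets <- unique(flows$target)

# Sources sit on the left (x = 0), targets on the right (x = 1)
y_src <- seq_along(sources)
y_tgt <- seq_along(targets)

# Empty canvas on the default graphics device
plot(NULL,
     xlim = c(-0.5, 1.5),
     ylim = c(0.5, max(length(sources), length(targets)) + 0.5),
     axes = FALSE, xlab = "", ylab = "")

# One line per flow, with a width proportional to its value
segments(x0 = 0, y0 = y_src[match(flows$source, sources)],
         x1 = 1, y1 = y_tgt[match(flows$target, targets)],
         lwd = 10 * flows$value / max(flows$value), col = "steelblue")

# Node labels
text(0, y_src, sources, pos = 2)
text(1, y_tgt, targets, pos = 4)
```

If a proper Sankey layout is needed, a package that draws static plots (rather than HTML widgets) would be the thing to look for; which one is available in a given Power BI setup is something to check.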

Related

How to set-up R crosstalk filter_select "group" argument to match unlisted vector of individual character strings from shared dataframe

I am new to using the crosstalk package and I already have a unique case for which I would like to use it. I want to filter point coordinates based on any matching keyword contained in a list of keywords belonging to an individual point. I have generalized my problem to a minimal reproducible example described below.
I have a dataframe where each row is an item with one xy coordinate. Each point has a unique id. Each point also has a column called "tag" that is populated with a list of strings representing keywords associated with that point I would like to be able to search and filter on. Some points may have only one keyword, some may have multiple, and some may share keywords. I want to use the crosstalk::filter_select() function to ultimately be able to search for one keyword and see which points contain that keyword in the rendered html. For example, a search for the keyword "keyword2" should filter points that contain these example lists in the "tag" column:
filtered point 1: "keyword3" "keyword8" "keyword5" "keyword2"
filtered point 2: "keyword2"
filtered point 3: "keyword7" "keyword2"
The search should hide points that contain these example lists in the "tag" column:
hidden point 1: "keyword3" "keyword8" "keyword5" "keyword10"
hidden point 2: "keyword5"
hidden point 3: "keyword7" "keyword4"
I would also like to be able to supply TRUE to the "multiple" argument so that I can enter more than one keyword and see which points have these in the list in the tag column of my dataframe. My example code renders, and you can test it by entering one of the keywords in the "Tag" search box. You will see that the way I have set up filter_select() does not filter the points the way I intend: entering one keyword in the search box filters out points that do have that keyword in their list. I believe the problem is in the "group" argument, where I am supplying the unlisted "tag" column of keywords. I think I need some sort of function there, but I have not been able to figure out how to write it. Any help with this would be greatly appreciated. Thanks.
The code I provided below will produce this html:
This is the rendered html with no filtering. All dummy points are displayed:
This is the (undesired) result of entering a search term, one of the many keywords that may appear in the tag column of my dataframe. Only one point is displayed but there are several other points that contain that keyword in the list of keywords within the tag column.
This is the desired result. When I enter a keyword in the search box, I get all points that have that keyword in the list of associated keywords for that point. I had to manually select these rows in the table to achieve this result, but I want the filter_select() function to do this automatically for me.
title: "Min reproducible Ex"
author: "Me"
output:
  flexdashboard::flex_dashboard:
    theme: paper
#Import libraries
library(dplyr)
library(leaflet)
library(DT)
library(crosstalk)
#Create some dummy data for demonstration
keytags <- 1:10
df <- data.frame(
  "id" = 1:10,
  "x" = seq(-90, -85.5, by = .5),
  "y" = seq(30, 34.5, by = .5))
for (i in 1:nrow(df)) {
  tags <- as.vector(paste0("keyword", sample(keytags, i)))
  df[i, "tag"][[1]] <- list(tags)
}
#Create shared df
sdf <- crosstalk::SharedData$new(df,key =~id, group="shareddata")
Interactives {data-icon="ion-stats-bars"}
Column {data-width=400}
Filters
filter_select(
  id = "tag",
  label = "Tag",
  sharedData = sdf,
  allLevels = TRUE,
  group = ~unlist(tag)
)
Datatable
sdf %>%
  DT::datatable(
    filter = "top", # allows filtering on each column
    extensions = c(
      "Buttons", # add download buttons, etc
      "Scroller" # for scrolling down the rows rather than pagination
    ),
    rownames = FALSE, # remove rownames
    style = "bootstrap",
    class = "compact",
    width = "100%",
    options = list(
      dom = "Blrtip", # specify content (search box, etc)
      deferRender = TRUE,
      scrollY = 300,
      scroller = TRUE,
      buttons = list(
        I("colvis"), # turn columns on and off
        "csv", # download as .csv
        "excel" # download as .xlsx
      )
    )
  )
Column
Interactive map
#Add map
#Make basemap
map <- sdf %>%
  leaflet() %>%
  #Add base layers
  addTiles() %>%
  #Add Markers
  addMarkers(~ x, ~ y, popup = ~as.character(id))
map
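A possible direction, offered as a sketch rather than a confirmed fix: `~unlist(tag)` produces one value per keyword (55 values here) while the shared data has only 10 rows, so the filter values and the row keys no longer line up. Reshaping the data to one row per (id, keyword) pair restores a one-to-one match between filter values and keys. The base R code below only does that reshaping and checks it. Whether a second `SharedData$new(long, key = ~id, group = "shareddata")` passed to `filter_select(..., group = ~tag)` then filters the map and table as hoped is an assumption that would need testing; `long` and `ids_with_keyword2` are names made up for this example.

```r
set.seed(1)  # only so the example is repeatable

# Same shape as the dummy data in the question: a list column of keywords
keytags <- 1:10
df <- data.frame(id = 1:10)
df$tag <- lapply(seq_len(nrow(df)),
                 function(i) paste0("keyword", sample(keytags, i)))

# One row per (id, keyword) pair
long <- data.frame(
  id  = rep(df$id, lengths(df$tag)),
  tag = unlist(df$tag),
  stringsAsFactors = FALSE
)

# Ids of every point whose keyword list contains "keyword2"
ids_with_keyword2 <- sort(unique(long$id[long$tag == "keyword2"]))
ids_with_keyword2
```

In this long form each row carries exactly one keyword, which is the shape a single-valued filter column needs.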

Venn diagram in R completely blank

I am trying to create a Venn diagram for common differentially expressed genes across 3 data sets. I created a list that contains the differentially expressed genes, then I used the venn.diagram() function with the following arguments: x (my list of gene names in the three data sets), filename, category.names and output. However, the Venn diagram turns out completely blank, with no category names and no numbers inside the intersections.
My code looks like this:
venn.diagram(up, filename = 'venn_up.png', category.names = c('up_PC3', 'up_LAPC4', 'up_22Rv1'), output = TRUE)
Has anyone faced a similar problem? Thanks all!
Without a reproducible dataset it is hard to tell, so I created one:
genes <- paste("gene",1:1000,sep="")
x <- list(
  up_PC3 = sample(genes, 300),
  up_LAPC4 = sample(genes, 525),
  up_22Rv1 = sample(genes, 440)
)
You can use the following code to run a Venn diagram:
library(VennDiagram)
venn.diagram(x, filename = "venn_up.png", category.names = c('up_PC3', 'up_LAPC4', 'up_22Rv1'))
Then check your working directory for the output file:
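Before drawing, it can help to confirm that the input list really contains what the diagram should show. The following base R sketch, which reuses the dummy data above (the seed is added only for repeatability), checks that every element is a non-empty character vector and computes the set sizes and the three-way overlap. It does not explain a blank image by itself, but it rules out an empty or malformed list as the cause.

```r
set.seed(42)  # only so the example is repeatable

genes <- paste0("gene", 1:1000)
x <- list(
  up_PC3   = sample(genes, 300),
  up_LAPC4 = sample(genes, 525),
  up_22Rv1 = sample(genes, 440)
)

# Every element should be a non-empty character vector
stopifnot(all(vapply(x, is.character, logical(1))))
stopifnot(all(lengths(x) > 0))

# Size of each set, genes shared by all three sets, and size of the union
set_sizes  <- lengths(x)
common_all <- Reduce(intersect, x)
n_union    <- length(unique(unlist(x)))

set_sizes
length(common_all)
n_union
```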

how to modify multiple cells in a loop in R without losing formatting?

Disclaimer: R noob here!
On a high level, I am trying to convert pdf to xls ;) The pdf is well formatted, no surprises expected. At one point I am trying to modify multiple cells using xlsx package in a loop. I've got a variable list of 3 - 5 elements and want to change the content of 7th column in .xls file, starting with 14th row. The list is coming from a PDF file (src.pdf below).
Here's the code:
library(xlsx)
library(pdftools)
library(stringr)
library(tabulizer)
library(tidyverse)
# for example data, separately download src.xls from https://file-examples-com.github.io/uploads/2017/02/file_example_XLS_100.xls, i. e. using wget -O src.xls https://file-examples-com.github.io/uploads/2017/02/file_example_XLS_100.xls
src <- xlsx::loadWorkbook(file = "src.xls")
sheets <- getSheets(src)
rows <- getRows(sheets$List1)
cc <- getCells(rows)
pdf_path <- "src.pdf"
# dest <- extract_tables("src.pdf", output="data.frame", area = list(c(163, 315, 217, 459)), guess = FALSE, header = FALSE)
dest <- extract_tables("https://unec.edu.az/application/uploads/2014/12/pdf-sample.pdf", output="data.frame", area=list(c(195,103,376,515)), guess = FALSE, header = FALSE)
#use as
#dest[[c(1,1)]][1]
#dest[[c(1,1)]][2]
#...
row = 0
for (i in 1:length(dest[[1]]$V1))
{
row = i+13
setCellValue(paste0("cc$`",row,".7`"), value = dest[[c(1,1)]][i])
}
This returns
Error in .jcall(cell, "V", "setCellValue", value) :
RcallMethod: cannot determine object class
Any ideas how to use setCellValue in a loop? I am open to using different modules as well, as long as they keep the formatting of the source .xls.
Thank you!
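A likely cause of the error above is that `paste0("cc$`", row, ".7`")` builds a character string, so `setCellValue()` receives text instead of a cell object. `getCells()` returns a list whose names have the form "row.column", so the cell can be looked up by name with `[[`. The sketch below uses a plain list as a stand-in for `cc` so it runs without Java or the xlsx package; `values` and `key` are names made up for this example, and the actual `setCellValue()` call is shown only as a comment.

```r
# Stand-in for the named list returned by getCells(rows);
# names follow the "row.column" pattern used in the question
cc <- list(
  "14.7" = "cell at row 14, col 7",
  "15.7" = "cell at row 15, col 7"
)

# Stand-in for the values extracted from the PDF
values <- c("first", "second")

for (i in seq_along(values)) {
  row  <- i + 13
  key  <- paste0(row, ".7")
  cell <- cc[[key]]          # the object itself, not a string naming it

  # A missing cell would come back as NULL and has to be created first
  stopifnot(!is.null(cell))

  # With the xlsx package, the call would go here:
  # setCellValue(cell, value = values[i])
}
```

After the loop, the workbook would still need to be written out (for example with `saveWorkbook()`), and cells that do not yet exist in the sheet would have to be created before they can be set.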

How to keep variable labels after raking in R Studio?

I think my problem is pretty simple to solve but as I'm totally new in R I don't know how to manage it.
So, I had to weight data from an SPSS .sav dataset. First, I imported it into RStudio using foreign. Then I created another dataframe with iterake. However, after raking, all variable labels are gone. I tried to copy the labels from the source dataframe to the weighted dataframe but, surprisingly, R does not seem to recognize the source variable labels, even though I can see them in the RStudio window (below the variable names). That is, after I load the labelled library and run var_label(SourceDF), I get only NULL values...
My goal is to copy new weighting variable into the source dataframe (and export the source back to the SPSS format) or copy variable labels from the source dataframe to the raked dataframe.
So:
How to create a weighted dataframe with variable labels (through iterake)?
OR
How to copy source variable labels to the weighted one?
This is the simplified code I created:
library(foreign)
library(iterake)
library(expss)
library(haven)
#source dataframe
df = read.spss("sourcedataset.sav", use.value.labels=TRUE, to.data.frame=TRUE)
#raking universe
uni = universe(data = df,
               category(name = "q1",
                        buckets = c("a", "b", "c", "d"),
                        targets = c(0.2, 0.5, 0.2, 0.1),
                        sum.1 = TRUE),
               category(name = "q2",
                        buckets = c("e", "f"),
                        targets = c(0.8, 0.2),
                        sum.1 = TRUE),
               N = 1000)
#creation of the raked dataframe
df.wgt = iterake(universe = uni)
If you read through the documentation on variable labels, you will find the necessary insight.
In short do the following:
Save the imported value and variable labels after the import
Reapply the saved labels after operations
The necessary commands are:
var_lab() #reads and sets variable labels
val_lab() #reads and sets value labels
variable_label <- var_lab(some_variable) #saving after import
#do stuff
var_lab(some_variable) <- variable_label #reapply labels
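The save-and-reapply idea can also be sketched in base R. One possible reason `var_label()` returns NULL is that `foreign::read.spss()` keeps the variable labels in a single "variable.labels" attribute on the data frame, not in a per-column "label" attribute, which is where packages such as labelled, expss and haven look as far as I know. The data frames below are stand-ins (the real ones come from `read.spss()` and `iterake()`), and the label texts are invented for the example.

```r
# Stand-in for the data frame returned by read.spss(..., to.data.frame = TRUE)
df <- data.frame(q1 = c("a", "b"), q2 = c("e", "f"))
attr(df, "variable.labels") <- c(q1 = "Age group", q2 = "Region")

# Save the labels right after the import
saved_labels <- attr(df, "variable.labels")

# Stand-in for the raked result: same columns plus a weight
df_wgt <- data.frame(q1 = df$q1, q2 = df$q2, weight = c(1.2, 0.8))

# Reapply each saved label as a per-column "label" attribute
for (nm in intersect(names(saved_labels), names(df_wgt))) {
  attr(df_wgt[[nm]], "label") <- saved_labels[[nm]]
}

attr(df_wgt$q1, "label")
attr(df_wgt$q2, "label")
```

Columns that did not exist in the source, such as the weight, simply stay unlabeled and can be given a label by hand.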

Is there a way to highlight specific nodes and edges using networkDynamic package in R?

I have a data set in csv format. I wanted to create a network object and make an animation out of this data set. I wrote a sample code in R to perform this operation.
What I want is to highlight certain nodes (4 and 7) in the animation and to add a legend to the animation explaining the importance of those nodes. Any suggestions would be appreciated. The dataset is attached: raw_data
#Removing the saved list
rm(list = ls(all.names = T))
#loading libraries
library(networkDynamic)
#loading data
raw_data <- read.csv("arranged_data1.csv",header = TRUE)
timeData<-read.csv("arranged_data1.csv",header = TRUE)
#arranging data--------------------"onset"---beginning time of interaction:::"terminus"-----end of interaction
timeData$onset <- timeData$onset
timeData$terminus <- timeData$terminus
#finding the unique ant ids---------------"tail"--from where the interaction begins:::"head"--to where is the interaction
unique_ant_ids<-unique(c(timeData$tail,timeData$head))
#convert ids
timeData$head<- match(timeData$head,unique_ant_ids)
timeData$tail<- match(timeData$tail,unique_ant_ids)
#converting to network dynamic object
network_obj<-networkDynamic(edge.spells=timeData)
library(ndtv)
compute.animation(network_obj)
saveVideo(render.animation(network_obj))
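One way to approach the highlighting, as a sketch under assumptions: build a colour (and size) vector with one entry per vertex, in the same order as `unique_ant_ids`, since `match()` turns that order into the vertex indices. The ids below are a stand-in for the real data, and the colours and sizes are arbitrary. `vertex.col` and `vertex.cex` are plotting arguments of `plot.network`; that `render.animation()` forwards them is my understanding and should be checked against the ndtv documentation. The legend is not covered here.

```r
# Stand-in for the ids found in the 'tail' and 'head' columns
unique_ant_ids <- c(1, 2, 3, 4, 5, 6, 7, 8)

# Nodes to emphasise, as named in the question
highlight_ids <- c(4, 7)

# One colour and one size per vertex, in vertex-index order
vertex_colors <- ifelse(unique_ant_ids %in% highlight_ids, "red", "grey70")
vertex_sizes  <- ifelse(unique_ant_ids %in% highlight_ids, 2, 1)

vertex_colors
vertex_sizes

# Possible use with ndtv (assumption, not run here):
# render.animation(network_obj, vertex.col = vertex_colors, vertex.cex = vertex_sizes)
```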
