R: hide cells in DT::datatable based on condition

I am trying to create a datatable with child rows: the user will be able to click on a name and see a list of links related to that name. However, the number of items to show is different for each name.
> data1 <- data.frame(name = c("John", "Maria", "Afonso"),
                      a = c("abc", "def", "rty"),
                      b = c("ghj", "lop", NA),
                      c = c("zxc", "cvb", NA),
                      d = c(NA, "mko", NA))
> data1
    name   a    b    c    d
1   John abc  ghj  zxc <NA>
2  Maria def  lop  cvb  mko
3 Afonso rty <NA> <NA> <NA>
I am using varsExplore::datatable2 to hide specific columns:
varsExplore::datatable2(x=data1, vars=c("a","b","c","d"))
and it produces the result below.
Is it possible to modify DT::datatable so that it only renders cells that are not "null"? For example, if someone clicked on "Afonso", the child row would only render "rty", hiding the "null" values in the other columns for that row, while still showing those columns if the user clicked "Maria" (who has no "null" values).
(Should I try a different approach to achieve this behavior?)

A look into the inner workings of varsExplore::datatable2
Following your request, I took a look at the varsExplore::datatable2 source code. I found that varsExplore::datatable2 calls varsExplore:::.callback2 (the ::: means it is not an exported function) to create the JavaScript code. That function in turn calls varsExplore:::.child_row_table2, which returns a JavaScript function format(row_data) that formats the row data into the child table you see.
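For reference, both helpers can be inspected directly from the console; R's ::: operator exposes unexported functions:
# print the source of the unexported helpers that datatable2 relies on
varsExplore:::.callback2          # builds the full JS callback passed to DT::datatable
varsExplore:::.child_row_table2   # builds the JS format(row_data) helper for the child rows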
A proposed solution
I simply used my JS knowledge to change the output of varsExplore:::.child_row_table2 and came up with the following:
.child_row_table2 <- function(x, pos = NULL) {
  names_x <- paste0(names(x), ":")
  text <- "
  var format = function(d) {
    text = '<div><table >' +
  "
  for (i in seq_along(pos)) {
    text <- paste(text, glue::glue(
      " ( d[{pos[i]}]!==null ? ( '<tr>' +
            '<td>' + '{names_x[pos[i]]}' + '</td>' +
            '<td>' + d[{pos[i]}] + '</td>' +
          '</tr>' ) : '' ) + "))
  }
  paste0(text,
         "'</table></div>'
    return text;};"
  )
}
The only change I made was adding the d[{pos[i]}]!==null ? ... : '' ternary, which only shows the column pos[i] when its value d[pos[i]] is not null.
Since loading the package and adding the modified function to the global environment won't do the trick, I forked the package on GitHub and committed the changes. You can now install it by running the following (the GitHub repo is a read-only CRAN mirror, so I can't submit a pull request):
devtools::install_github("moutikabdessabour/varsExplore")
EDIT
If you don't want to re-download the package, I found another solution: basically, you'll need to override the datatable2 function.
First, copy the source code into your R file located at path/to/your/Rfile:
# the data.table way
data.table::fwrite(list(capture.output(varsExplore::datatable2)), quote = FALSE,
                   sep = '\n', file = "path/to/your/Rfile", append = TRUE)
# the base R way
fileConn <- file("path/to/your/Rfile", open = 'a')
writeLines(capture.output(varsExplore::datatable2), fileConn)
close(fileConn)
Then you'll have to substitute the last line
DT::datatable(
  x,
  ...,
  escape = -2,
  options = opts,
  callback = DT::JS(.callback2(x = x, pos = c(0, pos)))
)
with:
DT::datatable(
  x,
  ...,
  escape = -2,
  options = opts,
  callback = DT::JS(gsub("('<tr>.+?(d\\[\\d+\\]).+?</tr>')", "(\\2==null ? '' : \\1)",
                         varsExplore:::.callback2(x = x, pos = c(0, pos))))
)
What this code is basically doing is adding the JS null-check condition via a regular expression applied to the original callback.
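To see concretely what the substitution does, here is a small sketch on a hand-written fragment (not the package's real callback string, just an illustrative piece of the generated JS):
# hypothetical JS fragment, for illustration only
snippet <- "'<tr>' + '<td>' + 'a:' + '</td>' + '<td>' + d[2] + '</td>' + '</tr>'"
cat(gsub("('<tr>.+?(d\\[\\d+\\]).+?</tr>')", "(\\2==null ? '' : \\1)", snippet))
# (d[2]==null ? '' : '<tr>' + '<td>' + 'a:' + '</td>' + '<td>' + d[2] + '</td>' + '</tr>')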
Result

Related

Everything works correctly here. How can we check whether "title" is in the dictionary, given that I did not add or append anything, yet it adds itself?

import csv

titles = {}
with open("sample4.csv", "r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        title = row["GameNumber"].strip().upper()
        if title in titles:
            titles[title] += 1
        else:
            titles[title] = 1

for title in titles:
    print(title, titles[title])
Do you wish to replace your if/else block with something shorter?
You can use:
titles[title] = titles.get(title, 0) + 1

Sample Python code to replace a substring value in an xlsx cell

Sample code snippet tried:
for row in range(1, sheet.max_row + 1):
    for col in range(1, sheet.max_column + 1):
        temp = None
        cell_obj = sheet.cell(row=row, column=col)
        temp = re.search(r"requestor", str(cell_obj.value))
        if temp:
            if 'requestor' in cell_obj.value:
                cell_obj.value.replace('requestor', 'ABC')
I am trying to replace, in an xlsx cell containing the value "Customer name: requestor ", the substring so that the cell reads "Customer name: ABC". How can this be achieved easily?
I found my answer in this post: https://www.edureka.co/community/42935/python-string-replace-not-working
The replace function doesn't store the result in the same variable; it returns a new string. Hence the solution for the above:
mvar = None
for row in range(1, sheet.max_row + 1):
    for col in range(1, sheet.max_column + 1):
        temp = None
        cell_obj = sheet.cell(row=row, column=col)
        temp = re.search(r"requestor", str(cell_obj.value))
        if temp:
            if 'requestor' in cell_obj.value:
                mvar = cell_obj.value.replace('requestor', 'ABC')
                cell_obj.value = mvar
Just keep it simple. Instead of re and replace, search for the given value and override the cell.
The example below also gives you the ability to change 'customer name' if needed:
import openpyxl

wb = openpyxl.load_workbook("example.xlsx")
sheet = wb["Sheet1"]

customer_name = "requestor"
replace_with = "ABC"
search_string = f"Customer name: {customer_name}"
replace_string = f"Customer name: {replace_with}"

for row in range(1, sheet.max_row + 1):
    for col in range(1, sheet.max_column + 1):
        cell_obj = sheet.cell(row=row, column=col)
        if cell_obj.value == search_string:
            cell_obj.value = replace_string

wb.save("example_copy.xlsx")  # remember that you need to save the results to the file

The New York Times API with R

I'm trying to get article information using The New York Times API. The csv file I get doesn't reflect my filter query. For example, I restricted the source to 'The New York Times', but the file I got also contains other sources.
I would like to ask why the filter query doesn't work.
Here's the code.
if (!require("jsonlite")) install.packages("jsonlite")
library(jsonlite)
api = "apikey"
nytime = function () {
url = paste('http://api.nytimes.com/svc/search/v2/articlesearch.json?',
'&fq=source:',("The New York Times"),'AND type_of_material:',("News"),
'AND persons:',("Trump, Donald J"),
'&begin_date=','20160522&end_date=','20161107&api-key=',api,sep="")
#get the total number of search results
initialsearch = fromJSON(url,flatten = T)
maxPages = round((initialsearch$response$meta$hits / 10)-1)
#try with the max page limit at 10
maxPages = ifelse(maxPages >= 10, 10, maxPages)
#creat a empty data frame
df = data.frame(id=as.numeric(),source=character(),type_of_material=character(),
web_url=character())
#save search results into data frame
for(i in 0:maxPages){
#get the search results of each page
nytSearch = fromJSON(paste0(url, "&page=", i), flatten = T)
temp = data.frame(id=1:nrow(nytSearch$response$docs),
source = nytSearch$response$docs$source,
type_of_material = nytSearch$response$docs$type_of_material,
web_url=nytSearch$response$docs$web_url)
df=rbind(df,temp)
Sys.sleep(5) #sleep for 5 second
}
return(df)
}
dt = nytime()
write.csv(dt, "trump.csv")
Here's the csv file I got.
It seems you need to put the () inside the quotes, not outside. Like this:
url = paste('http://api.nytimes.com/svc/search/v2/articlesearch.json?',
            '&fq=source:',"(The New York Times)",'AND type_of_material:',"(News)",
            'AND persons:',"(Trump, Donald J)",
            '&begin_date=','20160522&end_date=','20161107&api-key=',api, sep="")
https://developer.nytimes.com/docs/articlesearch-product/1/overview
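Not part of the original answer, but since the corrected fq clause still contains spaces and commas, it can help to percent-encode the finished URL before requesting it. A small sketch using the answer's query syntax (URLencode with the default reserved = FALSE escapes spaces but leaves the ':', '&' and '=' separators alone):
url_raw = paste0('http://api.nytimes.com/svc/search/v2/articlesearch.json?',
                 '&fq=source:(The New York Times) AND type_of_material:(News)',
                 ' AND persons:(Trump, Donald J)',
                 '&begin_date=20160522&end_date=20161107&api-key=', api)
initialsearch = fromJSON(URLencode(url_raw), flatten = TRUE)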

String recognition in IDL

I have the following strings:
F:\Sheyenne\ROI\SWIR32_subset\SWIR32_2005210_East_A.dat
F:\Sheyenne\ROI\SWIR32_subset\SWIR32_2005210_Froemke-Hoy.dat
and from each I want to extract three variables: 1. SWIR32, 2. the date, and 3. the text following the date. I want to automate this process for about 200 files, so individually selecting the locations won't exactly work for me.
so I want:
variable1=SWIR32
variable2=2005210
variable3=East_A
variable4=SWIR32
variable5=2005210
variable6=Froemke-Hoy
I am going to use these to add titles to graphs later on, but since the position of the text in each string varies, I am unsure how to do this using STRMID.
I think you want to use a combination of STRPOS and STRSPLIT. Something like the following:
s = ['F:\Sheyenne\ROI\SWIR32_subset\SWIR32_2005210_East_A.dat', $
     'F:\Sheyenne\ROI\SWIR32_subset\SWIR32_2005210_Froemke-Hoy.dat']

name = STRARR(s.length)
date = name
txt = name

foreach sub, s, i do begin
  sub = STRMID(sub, 1 + STRPOS(sub, '\', /REVERSE_SEARCH))
  parts = STRSPLIT(sub, '_', /EXTRACT)
  name[i] = parts[0]
  date[i] = parts[1]
  txt[i] = STRJOIN(parts[2:*], '_')
endforeach
You could also do this with a regular expression (using just STRSPLIT) but regular expressions tend to be complicated and error prone.
Hope this helps!

How can I cut large csv files using any R packages like ff or data.table?

I want to cut large csv files (file size greater than RAM size) and use them, or save each piece to disk for later use. Which R package is best for doing this with large files?
I haven't tried it, but using the skip and nrows parameters in read.table or read.csv is worth a try. These are from ?read.table:
skip    integer: the number of lines of the data file to skip before beginning to read data.
nrows   integer: the maximum number of rows to read in. Negative and other invalid values are ignored.
To avoid some troublesome issues at the end, you need to do some error handling. In other words, I don't know what happens when the skip value is greater than the number of rows in your big csv.
P.S. I also don't know whether header=TRUE affects skip or not; you have to check that as well.
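To make the idea concrete, here is a rough sketch of a chunked read using skip and nrows; the file name and chunk size are placeholders, and it assumes a plain csv without embedded newlines:
chunk_size <- 100000
header <- read.csv("big.csv", nrows = 1)   # read one row just to capture the column names
skip <- 1                                  # the data starts after the header line
repeat {
  chunk <- read.csv("big.csv", skip = skip, nrows = chunk_size,
                    header = FALSE, col.names = names(header))
  # process or save the chunk here, e.g. write.csv(chunk, paste0("chunk_", skip, ".csv"))
  # stop on the last (partial) chunk; note read.csv would error on the next pass
  # if the row count happened to be an exact multiple of chunk_size
  if (nrow(chunk) < chunk_size) break
  skip <- skip + chunk_size
}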
The answer given by @berkorbay is OK and I can confirm that header can be used with skip. However, if your file is really large it gets painfully slow, as each subsequent read after the first must skip over all previously read lines.
I had to do something similar and, after wasting quite a bit of time, I wrote a short script in Perl which fragments the original file into chunks that you can read one after the other. It is much faster. I enclose the source here, translating some parts so that the intent is clear:
#!/usr/bin/perl
system("cls");
print("Fragment .csv file keeping header in each chunk\n");
print("\nEnter input file name = ");
$entrada = <STDIN>;
print("\nEnter maximum number of lines in each fragment = ");
$nlineas = <STDIN>;
print("\nEnter output file name stem = ");
$salida = <STDIN>;
chop($salida);

open(IN, $entrada) || die "Cannot open input file: $!\n";

$cabecera  = <IN>;
$leidas    = 0;
$fragmento = 1;
$fichero   = $salida.$fragmento;

open(OUT, ">$fichero") || die "Cannot open output file: $!\n";
print OUT $cabecera;

while (<IN>) {
    if ($leidas > $nlineas) {
        close(OUT);
        $fragmento++;
        $fichero = $salida.$fragmento;
        open(OUT, ">$fichero") || die "Cannot open output file: $!\n";
        print OUT $cabecera;
        $leidas = 0;
    }
    $leidas++;
    print OUT $_;
}
close(OUT);
Just save it with whatever name and execute it. The first line might have to be changed if you have Perl in a different place (and, if you are on Windows, you might have to invoke the script as "perl name-of-script").
One should use read.csv.ffdf from the ff package with specific parameters like this to read a big file:
library(ff)
a <- read.csv.ffdf(file = "big.csv", header = TRUE, VERBOSE = TRUE,
                   first.rows = 1000000, next.rows = 1000000, colClasses = NA)
Once the big file is read into an ff object, subsetting the ff object into data frames can be done using:
a[1000:1000000,]
The rest of the code, for subsetting and saving the broken-up data frames:
totalrows = dim(a)[1]
row.size = as.integer(object.size(a[1:10000, ])) / 10000  # in bytes
block.size = 200000000  # in bytes, i.e. 200 MB
# rows.block is the number of rows per block
rows.block = ceiling(block.size / row.size)
# nmaps is the number of chunks/maps of the big dataframe (ff); nmaps = number of maps - 1
nmaps = floor(totalrows / rows.block)
for (i in 0:nmaps) {
  if (i == nmaps) {
    df = a[(i * rows.block + 1):totalrows, ]
  } else {
    df = a[(i * rows.block + 1):((i + 1) * rows.block), ]
  }
  # process df or save it
  write.csv(df, paste0("M", i + 1, ".csv"))
  # remove df
  rm(df)
}
Alternatively, you can first read the file into MySQL using dbWriteTable and then use the read.dbi.ffdf function from the ETLUtils package to read it back into R. Consider the function below:
read.csv.sql.ffdf <- function(file, name, overwrite = TRUE, header = TRUE, drv = MySQL(),
                              dbname = "new", username = "root", host = 'localhost',
                              password = "1234") {
  conn = dbConnect(drv, user = username, password = password, host = host, dbname = dbname)
  dbWriteTable(conn, name, file, header = header, overwrite = overwrite)
  on.exit(dbRemoveTable(conn, name))
  command = paste0("select * from ", name)
  ret = read.dbi.ffdf(command, dbConnect.args = list(drv = drv, dbname = dbname,
                                                     username = username, password = password))
  return(ret)
}
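A hypothetical usage sketch, assuming a local MySQL server reachable with the credentials hard-coded above and the RMySQL, ETLUtils and ff packages installed:
library(RMySQL)    # provides MySQL() and the DBI methods used above
library(ETLUtils)  # provides read.dbi.ffdf()
library(ff)

big_ffdf <- read.csv.sql.ffdf(file = "big.csv", name = "big_table")
dim(big_ffdf)                      # dimensions of the resulting ffdf object
df_chunk <- big_ffdf[1:100000, ]   # pull a slice back into an ordinary data frame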
