Progress of Intrinsic Dimension Calculation in R - r

I am using the R packages "ider" and "intrinsicDimension". Only one function in the "intrinsicDimension" package: pcaLocalDimEst, has a verbose option, none of the functions in the "ider" has a verbose option.
Is there any way to get the progress of the calculations?
For instance, if I use the kernel version of the correlation dimension estimator for determining the intrinsic dimension:
estconvU <- convU(x=df, maxDim=20)
How do I obtain the progress of the calculation?

Type the following:
fix(convU)
Modify the first line, by adding "verbose=FALSE" to the end of the function call:
# -- - - - - - - - - - - - - - - - -vvvvvvvvvvvvv don't add this line
function (x, maxDim = 5, DM = FALSE, verbose=FALSE) # <- add this "verbose=FALSE"
# -- - - - - - - - - - - - - - - - -^^^^^^^^^^^^^ don't add this line
Then, scroll down to line 19 and add the following AFTER the for loop initialisation:
19: for (l in 1:maxDim) {
20: if(verbose) cat(paste("Working...", l, "\n")) # Add this line.
Then click the Save button at the bottom. If you made a mistake, R will complain.
If not, call the convU function but add verbose=TRUE and you should see some progress messages appear. For example, from the help page of convU:
x <- gendata(DataName='SwissRoll', n=1200)
estconvU <- convU(x=x, verbose = TRUE)
Working... 1
Working... 2
Working... 3
Working... 4
Working... 5

Related

Retrieve print function code for an S4 class/object

I am working with an S4 object (PairwiseAlignmentsSingleSubject). When I type the name of an instance of this class into the console and hit enter, I get something like the following:
Global PairwiseAlignmentsSingleSubject (1 of 1)
pattern: ATCGATCGATCGATCG
subject: -TCGATCG-TCGATC-
score: -16.23717
The actual class is much larger than just this, so I assume some print function for the class is being called. I really want to see the code used in this print function, but I cannot figure out how to pull it up. Could someone please offer some insights into this?
Thank you
Edit based on JDL's answer:
I was able to use selectMethod (I guess this class inherits its "print" function). However, the result is just:
> selectMethod("print",signature(x="BioStrings"))
Method Definition (Class "derivedDefaultMethod"):
function (x, ...)
UseMethod("print")
<bytecode: 0x5572111b6d58>
<environment: namespace:base>
Signatures:
x
target "BioStrings"
defined "ANY"
Which is still not very informative and definitely not the full code for the function. Does anyone know how I can take this further?
To get the code associated with a method, the function to use is getMethod. If you know the class you are interested in then the call is of the form below
getMethod("print",signature(x="numeric"))
which in your specific case is
getMethod("print",signature(x="PairwiseAlignmentsSingleSubject")).
If you have an object obj and want to see what code will be applied to it when you call print, then a slight modification allows this:
getMethod("print",signature(x=class(obj)))
This also works for any generic function, you can substitute "plot" etc. in place of "print".
Note that this only works for methods defined for the exact class you specified. If there isn't such a method (i.e. one based on inheritance was used instead), then replace getMethod with selectMethod.
For further information, consult the help page at ?getMethod which documents these and some similar functions.
Looking through their github is usually useful :)
They don't use 'print' but 'show'.
If you look at the source code for the class PairwiseAlignmentsSingleSubject you can see a show method, but it's only for the class PairwiseAlignmentsSingleSubjectSummary.
But in the last method for 'PairwiseAlignmentsSingleSubject' you see it creates a 'newPairwiseAlignments' and if you look a the source code for the class PairwiseAlignments you'll find the "show" method starting at line 398.
### - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
### The "show" method
###
### TODO: Maybe make the "show" method format the alignment in a SGD fashion
### i.e. split in 60-letter blocks and use the "|" character to highlight
### exact matches.
###
.show_PairwiseAlignments <- function(x)
{
x_len <- length(x)
if (x_len == 0L)
cat("Empty ")
cat(switch(type(x), "global"="Global", "overlap"="Overlap",
"local"="Local", "global-local" = "Global-Local",
"local-global"="Local-Global"),
" ", class(x), sep="")
if (x_len == 0L) {
cat("\n")
return()
}
cat(" (1 of ", x_len, ")\n", sep="")
x1 <- x[1L]
x_type <- type(x)
global.pattern <- x_type %in% c("global", "global-local")
global.subject <- x_type %in% c("global", "local-global")
p1start <- if (global.pattern)
""
else
paste0("[", start(x1#pattern#range), "]")
s1start <- if (global.subject)
""
else
paste0("[", start(x1#subject#range), "]")
width <- max(nchar(p1start), nchar(s1start))
if (width != 0L) {
width <- width + 1L
p1start <- format(p1start, justify="right", width=width)
s1start <- format(s1start, justify="right", width=width)
}
width <- getOption("width") - 9L - width
pattern1 <- toSeqSnippet(alignedPattern(x1)[[1L]], width)
subject1 <- toSeqSnippet(alignedSubject(x1)[[1L]], width)
cat("pattern:", p1start, " ", add_colors(pattern1), "\n", sep="")
cat("subject:", s1start, " ", add_colors(subject1), "\n", sep="")
cat("score:", score(x1), "\n")
}
setMethod("show", "PairwiseAlignments",
function(object) .show_PairwiseAlignments(object)
)

Error when checking partial derivatives of slinear structured metamodel

I am using the MetaModelStructuredComp Component to perform an interpolation in a 2D grid.
When checking the option to compute it with the 'slinear' method, the interpolation appears to work correctly, but when checking partial derivatives with a complex step, it returns a large error (order of 10^-1) for the derivatives with respect to the second dimension and second node (they belong to a grid point, but also the first node).
This does not happen when checking with all the other methods (cubic returning order of 10-15, and scipy_slinear with a finite difference check in the order of 10-10). The scipy_slinear check returns an analytical and numerical finite difference of that component which is almost identical to the one returned by the numerical finite difference of the slinear method (around -0.03338752, but note that the analytical returns -0.05107948)
I am not sure if its something I am missing, or if there is an error in the analytical partials for the slinear.
In my code, the first dimension is alpha with trained data of shape (12,), and second dimension mach (5,). I am checking with two outputs (C_D (12, 5) and C_L (12, 5), both having the same large error)
The LiftDragCoefficientsMetaModelPretrimmedGroup code is:
class LiftDragCoefficientsMetaModelPretrimmedGroup(om.Group):
def initialize(self):
self.options.declare('num_nodes', types=int,
desc='Number of nodes to be evaluated in the RHS')
self.options.declare('machs', default=np.arange(10),
desc='Vector of machs defining grid')
self.options.declare('alphas', default=np.arange(10),
desc='Vector of alphas defining grid')
self.options.declare('C_D_grid', default=np.zeros(10),
desc='Drag coefficients from grid')
self.options.declare('C_L_grid', default=np.zeros(10),
desc='Lift coefficients from grid')
self.options.declare('extrapolate', types=bool,
desc='Allow extrapolation if true',default=True)
self.options.declare('interp_method', types=str,
desc='Interlopation Method', default='slinear')
def setup(self):
comp=om.MetaModelStructuredComp(method=self.options['interp_method'],
extrapolate=self.options['extrapolate'] ,
vec_size=self.options['num_nodes'] )
comp.add_output('C_L', self.options['C_L_grid'].mean(), self.options['C_L_grid'])
comp.add_output('C_D', self.options['C_D_grid'].mean(), self.options['C_D_grid'])
comp.add_input('alpha', self.options['alphas'].mean(), self.options['alphas'])
comp.add_input('mach', self.options['machs'].mean(), self.options['machs'])
self.add_subsystem('comp', comp, promotes=["*"])
self.comp._no_check_partials = False # override skipping of check_partials
The code routine used is:
model = om.Group()
model.add_subsystem('InterpSubsystem',
LiftDragCoefficientsMetaModelPretrimmedGroup(num_nodes=3,
machs=rw.machs,
alphas=rw.alphas*np.pi/180.0,
C_L_grid=rw.c_Lt_grid,
C_D_grid=rw.c_Dt_grid,
interp_method='slinear',
extrapolate=False))
p = om.Problem(model)
p.setup(force_alloc_complex=True)
p.set_val('InterpSubsystem.alpha', np.array([35 * np.pi / 180, 10 * np.pi / 180, 8.5 * np.pi / 180]))
p.set_val('InterpSubsystem.mach', np.array([5, 7 , 7.5]))
p.run_model()
print(p['InterpSubsystem.C_L'])
print(p['InterpSubsystem.C_L']-np.array([rw.c_Lt_grid[8,1],rw.c_Lt_grid[3,3],0]))
print(p['InterpSubsystem.C_D'])
print(p['InterpSubsystem.C_D']-np.array([rw.c_Dt_grid[8,1],rw.c_Dt_grid[3,3],0]))
cpd = p.check_partials(compact_print=False, method='cs')
assert_check_partials(cpd, atol=1.0E-7, rtol=1.0E-7)
The error code is the following:
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
InterpSubsystem.comp: 'C_L' wrt 'alpha'
Analytic Magnitude : 1.313053e+01
Fd Magnitude : 1.313053e+01 (cs:None)
Absolute Error (Jan - Jfd) : 8.881784e-16
Relative Error (Jan - Jfd) / Jfd : 6.764226e-17
Raw Analytic Derivative (Jfor)
[[7.45312355 0. 0. ]
[0. 8.300603 0. ]
[0. 0. 6.92543442]]
Raw FD Derivative (Jfd)
[[7.45312355 0. 0. ]
[0. 8.300603 0. ]
[0. 0. 6.92543442]]
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
InterpSubsystem.comp: 'C_L' wrt 'mach'
Analytic Magnitude : 1.331524e-01
Fd Magnitude : 1.274173e-01 (cs:None)
Absolute Error (Jan - Jfd) : 1.769196e-02 *
Relative Error (Jan - Jfd) / Jfd : 1.388505e-01 *
Raw Analytic Derivative (Jfor)
[[-0.11945724 0. 0. ]
[ 0. -0.05107948 0. ]
[ 0. 0. -0.02916183]]
Raw FD Derivative (Jfd)
[[-0.11945724 0. 0. ]
[ 0. -0.03338752 0. ]
[ 0. 0. -0.02916183]]
--------------------------------
Component: InterpSubsystem.comp
--------------------------------
< output > wrt < variable > | abs/rel | norm | norm value
--------------------------- | ------- | ------ | --------------------
C_L wrt mach | abs | fwd-fd | 0.017691955836769413
C_L wrt mach | rel | fwd-fd | 0.13885048495831204
C_D wrt mach | abs | fwd-fd | 0.004933779456698817
C_D wrt mach | rel | fwd-fd | 0.05287403976054622
So, what I think is happening here (and what I stumbled upon with my random table) is that 7.0 is one of the points on which your Mach data is defined. When you use the 'slinear' method, the derivatives are discontinuous at that point. It's just one of the disadvantages of the method.
The discrepancy in the check happens because the bracketing algorithm, which figures out which bin you are interpolating, chooses the "left" bin when you are on a grid point, but the finite difference or complex step chooses the "right" bin because the default check direction is "forward".
To alleviate future confusion, we're going to set the check direction to "backward" for the structured meta model component whenever the 'slinear' method is chosen. The derivatives "matched" when I made this change. This change should make it into OpenMDAO 3.11.

How to create julia color scheme for displaying Ct scan Makie.jl

I use makie.jl with slicesNumb for visualization of PET/CT scans, I have 3d array of attenuation values and I display heatmap with changing slices using slider - this works well I have two problems
I do not know how to be able to define custom colormaps (basically I need to be able to specify that all above some threshold value will be black and all below white and values between will have grey values proportional to attenuation value).
2)I would like to be able to display to display over my image (tachnically heatmap) another ones where I would be able to controll transparency - alpha value of pixels - in order to display some annotations/ PET ...
code that works but without those 2 functionalities and how it looks
using GLMakie
```#doc
simple display of single image - only in transverse plane
```
function singleCtScanDisplay(arr ::Array{Number, 3})
fig = Figure()
sl_x = Slider(fig[2, 1], range = 1:1:size(arr)[3], startvalue = 40)
ax = Axis(fig[1, 1])
hm = heatmap!(ax, lift(idx-> arr[:,:, floor(idx)], sl_x.value) ,colormap = :grays)
Colorbar(fig[1, 2], hm)
fig
end
Thanks for help !
You can use Colors and ColorSchemeTools, but you will need to add the top and bottom of the scheme according to your thresholds.
using Colors, ColorSchemeTools
truemin = 0
truemax = 600
max_shown_black = 20
min_shown_white = 500
data = rand(truemin:truemax, (500, 500, 20))
grayscheme = [fill(colorant"black", max_shown_black - truemin + 1);
collect(make_colorscheme(identity, identity, identity,
length = min_shown_white - max_shown_black - 1));
fill(colorant"white", truemax - min_shown_white + 1)]
For controlling alpha, I would add a popup window with an alpha slider. Take a look at some of the distributable DICOM tools for examples.
I finally managed it basically I load 3 dimensional data stored in hdf5 (I loaded it into hdf5 from raw using python)
It enables viewing transverse slices and annotate 3d pathes in a mask that will be displayed over main image
exmpleH = #spawnat persistenceWorker Main.h5manag.getExample()
minimumm = -1000
maximumm = 2000
arrr= fetch(exmpleH)
imageDim = size(arrr)
using GLMakie
maskArr = Observable(BitArray(undef, imageDim))
MyImgeViewer.singleCtScanDisplay(arrr, maskArr,minimumm, maximumm)
Now definition of the required modules
```#doc
functions responsible for displaying medical image Data
```
using DrWatson
#quickactivate "Probabilistic medical segmentation"
module MyImgeViewer
using GLMakie
using Makie
#using GeometryBasics
using GeometricalPredicates
using ColorTypes
using Distributed
using GLMakie
using Main.imageViewerHelper
using Main.workerNumbers
## getting id of workers
```#doc
simple display of single image - only in transverse plane we are adding also a mask that
arrr - main 3 dimensional data representing medical image for example in case of CT each voxel represents value of X ray attenuation
minimumm, maximumm - approximately minimum and maximum values we can have in our image
```
function singleCtScanDisplay(arrr ::Array{Number, 3}, maskArr , minimumm, maximumm)
#we modify 2 pixels just in order to make the color range constant so slices will be displayed in the same windows
arrr[1,1,:].= minimumm
arrr[2,1,:].= maximumm
imageDim = size(arrr) # dimenstion of the primary image for example CT scan
slicesNumb =imageDim[3] # number of slices
#defining layout variables
scene, layout = GLMakie.layoutscene(resolution = (600, 400))
ax1 = layout[1, 1] = GLMakie.Axis(scene, backgroundcolor = :transparent)
ax2 = layout[1, 1] = GLMakie.Axis(scene, backgroundcolor = :transparent)
#control widgets
sl_x =layout[2, 1]= GLMakie.Slider(scene, range = 1:1: slicesNumb , startvalue = slicesNumb/2 )
sliderXVal = sl_x.value
#color maps
cmwhite = cgrad(range(RGBA(10,10,10,0.01), stop=RGBA(0,0,255,0.4), length=10000));
greyss = createMedicalImageColorSchemeB(200,-200,maximumm, minimumm )
####heatmaps
#main heatmap that holds for example Ct scan
currentSliceMain = GLMakie.#lift(arrr[:,:, convert(Int32,$sliderXVal)])
hm = GLMakie.heatmap!(ax1, currentSliceMain ,colormap = greyss)
#helper heatmap designed to respond to both changes in slider and changes in the bit matrix
currentSliceMask = GLMakie.#lift($maskArr[:,:, convert(Int32,$sliderXVal)])
hmB = GLMakie.heatmap!(ax1, currentSliceMask ,colormap = cmwhite)
#adding ability to be able to add information to mask where we clicked so in casse of mit matrix we will set the point where we clicked to 1
indicatorC(ax1,imageDim,scene,maskArr,sliderXVal)
#displaying
colorB = layout[1,2]= Colorbar(scene, hm)
GLMakie.translate!(hmB, Vec3f0(0,0,5))
scene
end
```#doc
inspired by https://github.com/JuliaPlots/Makie.jl/issues/810
Generaly thanks to this function the viewer is able to respond to clicking on the slices and records it in the supplied 3 dimensional AbstractArray
ax - Axis which store our heatmap slices which we want to observe wheather user clicked on them and where
dims - dimensions of main image for example CT
sc - Scene where our axis is
maskArr - the 3 dimensional bit array that has exactly the same dimensions as main Array storing image
sliceNumb - represents on what slide we are on currently on - ussually it just give information from slider
```
function indicatorC(ax::Axis,dims::Tuple{Int64, Int64, Int64},sc::Scene,maskArr,sliceNumb::Observable{Any})
register_interaction!(ax, :indicator) do event::GLMakie.MouseEvent, axis
if event.type === MouseEventTypes.leftclick
println("clicked")
##async begin
#appropriately modyfing wanted pixels in mask array
#async calculateMouseAndSetmaskWrap(maskArr, event,sc,dims,sliceNumb)
#
#
# println("fetched" + fetch(maskA))
# finalize(maskA)
#end
return true
#print("xMouse: $(xMouse) yMouse: $(yMouse) compBoxWidth: $(compBoxWidth) compBoxHeight: $(compBoxHeight) calculatedXpixel: $(calculatedXpixel) calculatedYpixel: $(calculatedYpixel) pixelsNumbInX $(pixelsNumbInX) ")
end
end
end
```#doc
wrapper for calculateMouseAndSetmask - from imageViewerHelper module
given mouse event modifies mask accordingly
maskArr - the 3 dimensional bit array that has exactly the same dimensions as main Array storing image
event - mouse event passed from Makie
sc - scene we are using in Makie
```
function calculateMouseAndSetmaskWrap(maskArr, event,sc,dims,sliceNumb)
maskArr[] = calculateMouseAndSetmask(maskArr, event,sc,dims,sliceNumb)
end
end #module
and helper methods
```#doc
functions responsible for helping in image viewer - those functions are meant to be invoked on separate process
- in parallel
```
using DrWatson
#quickactivate "Probabilistic medical segmentation"
module imageViewerHelper
using Documenter
using ColorTypes
using Colors, ColorSchemeTools
using Makie
export calculateMouseAndSetmask
export createMedicalImageColorSchemeB
# using AbstractPlotting
```#doc
given mouse event modifies mask accordingly
maskArr - the 3 dimensional bit array that has exactly the same dimensions as main Array storing image
event - mouse event passed from Makie
sc - scene we are using in Makie
```
function calculateMouseAndSetmask(maskArr, event,sc,dims,sliceNumb)
#position from top left corner
xMouse= Makie.to_world(sc,event.data)[1]
yMouse= Makie.to_world(sc,event.data)[2]
#data about height and width in layout
compBoxWidth = 510
compBoxHeight = 510
#image dimensions - number of pixels from medical image for example ct scan
pixelsNumbInX =dims[1]
pixelsNumbInY =dims[2]
#calculating over which image pixel we are
calculatedXpixel =convert(Int32, round( (xMouse/compBoxWidth)*pixelsNumbInX) )
calculatedYpixel = convert(Int32,round( (yMouse/compBoxHeight)*pixelsNumbInY ))
sliceNumbConv =convert(Int32,round( sliceNumb[] ))
#appropriately modyfing wanted pixels in mask array
return markMaskArrayPatch( maskArr ,CartesianIndex(calculatedXpixel, calculatedYpixel, sliceNumbConv ),2)
end
```#doc
maskArr - the 3 dimensional bit array that has exactly the same dimensions as main Array storing image
point - cartesian coordinates of point around which we want to modify the 3 dimensional array from 0 to 1
```
function markMaskArrayPatch(maskArr, pointCart::CartesianIndex{3}, patchSize ::Int64)
ones = CartesianIndex(patchSize,patchSize,patchSize) # cartesian 3 dimensional index used for calculations to get range of the cartesian indicis to analyze
maskArrB = maskArr[]
for J in (pointCart-ones):(pointCart+ones)
diff = J - pointCart # diffrence between dimensions relative to point of origin
if cartesianTolinear(diff) <= patchSize
maskArrB[J]=1
end
end
return maskArrB
end
```#doc
works only for 3d cartesian coordinates
cart - cartesian coordinates of point where we will add the dimensions ...
```
function cartesianTolinear(pointCart::CartesianIndex{3}) :: Int16
abs(pointCart[1])+ abs(pointCart[2])+abs(pointCart[3])
end
```#doc
creating grey scheme colors for proper display of medical image mainly CT scan
min_shown_white - max_shown_black range over which the gradint of greys will be shown
truemax - truemin the range of values in the image for which we are creating the scale
```
#taken from https://stackoverflow.com/questions/67727977/how-to-create-julia-color-scheme-for-displaying-ct-scan-makie-jl/67756158#67756158
function createMedicalImageColorSchemeB(min_shown_white,max_shown_black,truemax,truemin ) ::Vector{Any}
# println("max_shown_black - truemin + 1")
# println(max_shown_black - truemin + 1)
# println(" min_shown_white - max_shown_black - 1")
# println( min_shown_white - max_shown_black - 1)
# println("truemax - min_shown_white + 1")
# println(truemax - min_shown_white + 1)
return [fill(colorant"black", max_shown_black - truemin + 1);
collect(make_colorscheme(identity, identity, identity,
length = min_shown_white - max_shown_black - 1));
fill(colorant"white", truemax - min_shown_white + 1)]
end
end #module

Convert Firefox bookmarks JSON file to markdown

Background
I want to show part of my bookmarks on my Hugo website. The bookmarks from Firefox can be saved in JSON format, this is the source. The result should represent the nested structure somehow, in a format of a nested list, treeview or accordion. The source files of contents on the website are written in markdown. I want to generate a markdown file from the JSON input.
As I searched for possible solutions:
treeview or accordion: HTML, CSS and Javascript needed. I could not nest accordions with the <details> tag. Also, seems like overkill at the moment.
unordered list: can be done with bare markdown.
I chose to generate an unordered nested list from JSON. I would like to do this with R.
Input/output
Input sample: https://gist.github.com/hermanp/c01365b8f4931ea7ff9d1aee1cbbc391
Preferred output (indentation with two spaces):
- Info
- Python
- [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
- [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
- [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
- [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
- Frontend
- [CodePen](https://codepen.io/)
- [JavaScript](https://www.javascript.com/)
- [CSS-Tricks](https://css-tricks.com/)
- [Butterick’s Practical Typography](https://practicaltypography.com/)
- [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
- [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
- [Client-Side Web Development](https://info340.github.io/)
- [Stack Overflow](https://stackoverflow.com/)
- [HUP](https://hup.hu/)
- [Hope in Source](https://hopeinsource.com/)
Bonus preferred output: show favicons before links, like below (other suggestion welcomed, like loading them from the website's server instead of linking):
- ![https://cdn.sstatic.net/Sites/stackoverflow/Img/apple-touch-icon.png?v=c78bd457575a][Stack Overflow](https://stackoverflow.com/)
Attempt
generate_md <- function (file) {
# Encoding problem with tidyjson::read_json
bmarks_json_lite <- jsonlite::fromJSON(
txt = paste0("https://gist.githubusercontent.com/hermanp/",
"c01365b8f4931ea7ff9d1aee1cbbc391/raw/",
"33c21c88dad35145e2792b6258ede9c882c580ec/",
"bookmarks-example.json"))
# This is the start point, a data frame
level1 <- bmarks_json_lite$children$children[[2]]
# Get the name of the variable to modify it.
# Just felt that some abstraction needed.
varname <- deparse(substitute(level1))
varlevel <- as.integer(substr(varname, nchar(varname), nchar(varname)))
# Get through the data frame by its rows.
for (i in seq_len(nrow(get(varname)))) {
# If the type of the element in the row is "text/x-moz-place",
# then get its title and create a markdown list element from it.
if (get(varname)["type"][i] == "text/x-moz-place"){
# The two space indentation shall be multiplied as many times
# as deeply nested in the lists (minus one).
md_title <- paste0(strrep(" ", varlevel - 1),
"- ",
get(varname)["title"][i],
"\n")
# Otherwise do this and also get inside the next level.
} else if (get(varname)["type"][i] == "text/x-moz-place-container") {
md_title <- paste0(strrep(" ", varlevel - 1),
"- ",
get(varname)["title"][i],
"\n")
# I know this is not good, just want to express my thought.
# Create the next, deeper level's variable, whoose name shall
# represent the depth in the nest.
# Otherwise how can I multiply the indentation for the markdown
# list elements? It depends on the name of this variable.
varname <- paste0(regmatches(varname, regexpr("[[:alpha:]]+", varname)),
varlevel + 1L)
varlevel <- varlevel + 1L
assign(varname, get(varname)["children"][[i]])
# The same goes on as seen at the higher level.
for (j in seq_len(nrow(get(varname)))){
if (get(varname)["type"][i] == "text/x-moz-place"){
md_title <- paste0(strrep(" ", varlevel - 1),
"- ",
get(varname)["title"][i],
"\n")
} else if (get(varname)["type"][i] == "text/x-moz-place-container") {
md_title <- paste0(strrep(" ", varlevel - 1),
"- ",
get(varname)["title"][i],
"\n")
varname <- paste0(regmatches(varname, regexpr("[[:alpha:]]+", varname)),
varlevel + 1L)
varlevel <- varlevel + 1L
assign(varname, get(varname)["children"][[i]])
for (k in seq_len(nrow(get(varname)))){
# I don't know where this goes...
# Also I need to paste somewhere the md_title strings to get the
# final markdown output...
}
}
}
}
}
}
Question
How can I recursively grab and paste strings from this JSON file? I tried to search for tips in recursion, but it's quite a hard topic. Any suggestion, package, function, link will be welcomed!
I know you asked for a solution in R.
Just as a suggestion, here is a solution using jq, as it is very suitable for json transformations.
#!/bin/bash
BOOKMARKS='FirefoxBookmarks.json'
jq -r '
def bookmark($iconuri; $title; $uri):
if $iconuri != null then "![\($iconuri)]" else "" end +
"[\($title)](\($uri))";
def bookmarks:
(objects | to_entries[]
| if .value | type == "array" then (.value | bookmarks)
else .value end ) //
(arrays[] | [bookmarks] | " - \(.[0])", " \(.[1:][])" );
(.. | .children? | arrays)
|= map(if .uri != null then {bookmark: bookmark(.iconuri; .title; .uri)}
else {title} end +
{children})
| del(..| select(length == 0)) # remove empty children and empty titles
| del(..| select(length == 0)) # remove objects that got empty because of previous deletion
| del(..| objects | select(has("title") and (.children | length == 0))) # remove objects with title but no children
| .children # remove root level
| bookmarks
' < "$BOOKMARKS"
output:
- Könyvjelzők eszköztár
- Info
- Python
- ![fake-favicon-uri:https://www.freecodecamp.org/news/the-python-guide-for-beginners/][The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
- [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
- [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
- ![https://github.githubassets.com/favicons/favicon.svg][Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
- Frontend
- ![https://static.codepen.io/assets/favicon/favicon-touch-de50acbf5d634ec6791894eba4ba9cf490f709b3d742597c6fc4b734e6492a5a.png][CodePen](https://codepen.io/)
- ![https://www.javascript.com/etc/clientlibs/pluralsight/main/images/favicons/android-chrome-192x192.png][JavaScript](https://www.javascript.com/)
- ![https://css-tricks.com/apple-touch-icon.png][CSS-Tricks](https://css-tricks.com/)
- [Butterick’s Practical Typography](https://practicaltypography.com/)
- [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
- ![https://www.smashingmagazine.com/images/favicon/app-icon-512x512.png][Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
- ![https://info340.github.io/img/busy-spider-icon.png][Client-Side Web Development](https://info340.github.io/)
- ![https://cdn.sstatic.net/Sites/stackoverflow/Img/apple-touch-icon.png?v=c78bd457575a][Stack Overflow](https://stackoverflow.com/)
- ![https://hup.hu/profiles/hupper/themes/hup_theme/favicon.ico][HUP](https://hup.hu/)
- [Hope in Source](https://hopeinsource.com/)
After I watched a few videos on recursion and saw a few code examples, I tried, manually stepped through the code and somehow managed to do it with recursion. This solution is independent on the nestedness of the bookmarks, therefore a generalized solution for everyone.
Note: all the bookmarks were in the Bookmarks Toolbar in Firefox. This is highlighted in the generate_md function. You can tackle with it there. If I improve the answer later, I will make it more general.
library(jsonlite)
# This function recursively converts the bookmark titles to unordered
# list items.
recursive_func <- function (level) {
md_result <- character()
# Iterate through the current data frame, which may have a children
# column nested with other data frames.
for (i in seq_len(nrow(level))) {
# If this element is a bookmark and not a folder, then grab
# the title and construct a list item from it.
if (level[i, "type"] == "text/x-moz-place"){
md_title <- level[i, "title"]
md_uri <- level[i, "uri"]
md_iconuri <- level[i, "iconuri"]
# Condition: the URLs all have schema (http or https) part.
# If not, filname will be a zero length character vector.
host_url <- regmatches(x = md_uri,
m = regexpr(pattern = "(?<=://)[[:alnum:].-]+",
text = md_uri,
perl = T))
md_link <- paste0("[", md_title, "]", "(", md_uri, ")")
md_listitem <- paste0("- ", md_link, "\n")
# If this element is a folder, then get into it, call this
# function over it. Insert two space (for indentation) in
# the generated sting before every list item. Paste this
# list of items to the folder list item.
} else if (level[i, "type"] == "text/x-moz-place-container") {
md_title <- level[i, "title"]
md_listitem <- paste0("- ", md_title, "\n")
md_recurs <- recursive_func(level = level[i, "children"][[1]])
md_recurs <- gsub("(?<!(\\w ))-(?= )", " -", md_recurs, perl = T)
md_listitem <- paste0(md_listitem, md_recurs)
}
# Collect and paste the list items of the current data frame.
md_result <- paste0(md_result, md_listitem)
}
# Return the (sub)list of the data frame.
return(md_result)
}
generate_md <- function (jsonfile) {
# Encoding problem with tidyjson::read_json
bmarks_json_lite <- fromJSON(txt = jsonfile)
# This is the start point, a data frame. It represents the
# elements inside the Bookmarks Toolbar in Firefox.
level1 <- bmarks_json_lite$children$children[[2]]
# Do not know how to make it prettier, but it works.
markdown_result <- recursive_func(level = level1)
return(markdown_result)
}
You can run the generate_md function with the example.
generate_md(paste0("https://gist.githubusercontent.com/hermanp/",
"c01365b8f4931ea7ff9d1aee1cbbc391/raw/",
"33c21c88dad35145e2792b6258ede9c882c580ec/",
"bookmarks-example.json"))
# Output
[1] "- Info\n - Python\n - [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)\n - [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)\n - [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)\n - [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)\n - Frontend\n - [CodePen](https://codepen.io/)\n - [JavaScript](https://www.javascript.com/)\n - [CSS-Tricks](https://css-tricks.com/)\n - [Butterick’s Practical Typography](https://practicaltypography.com/)\n - [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)\n - [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)\n - [Client-Side Web Development](https://info340.github.io/)\n - [Stack Overflow](https://stackoverflow.com/)\n - [HUP](https://hup.hu/)\n - [Hope in Source](https://hopeinsource.com/)\n"
You can cat it and write it to a file also with writeLines. But bevare! In Windows environments, you probably need to turn useBytes = TRUE to get the correct characters in the file. Reference: UTF-8 file output in R
cat(generate_md(paste0("https://gist.githubusercontent.com/hermanp/",
"c01365b8f4931ea7ff9d1aee1cbbc391/raw/",
"33c21c88dad35145e2792b6258ede9c882c580ec/",
"bookmarks-example.json")))
# Output
- Info
- Python
- [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
- [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
- [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
- [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
- Frontend
- [CodePen](https://codepen.io/)
- [JavaScript](https://www.javascript.com/)
- [CSS-Tricks](https://css-tricks.com/)
- [Butterick’s Practical Typography](https://practicaltypography.com/)
- [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
- [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
- [Client-Side Web Development](https://info340.github.io/)
- [Stack Overflow](https://stackoverflow.com/)
- [HUP](https://hup.hu/)
- [Hope in Source](https://hopeinsource.com/)
There was a problem with the regex part. If there are bookmarks with some - title (space, hyphen, space) characters in their titles, these hyphens will also be "indented" as the list items.
# Input JSON
https://gist.github.com/hermanp/381eaf9f2bf5f2b9cdf22f5295e73eb5
cat(generate_md(paste0("https://gist.githubusercontent.com/hermanp/",
"381eaf9f2bf5f2b9cdf22f5295e73eb5/raw/",
"76b74b2c3b5e34c2410e99a3f1b6ef06977b2ec7/",
"bookmarks-example-hyphen.json")))
# Output (two space indentation) markdown:
- Info
- Python
- [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
- [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
- [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
- [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
- Frontend
- [CodePen](https://codepen.io/)
- [JavaScript - Wikipedia](https://en.wikipedia.org/wiki/JavaScript) # correct
- [CSS-Tricks](https://css-tricks.com/)
- [Butterick’s Practical Typography](https://practicaltypography.com/)
- [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
- [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
- [Client-Side Web Development](https://info340.github.io/)
- [Stack Overflow](https://stackoverflow.com/)
- [HUP](https://hup.hu/)
- [Hope in Source](https://hopeinsource.com/)
I posted another question about this problem. After some hint and try I answered my own question.

RCurl: Display progress meter in Rgui

Using R.exe or Rterm.exe, this gives an excellent progress meter.
page=getURL(url="ftp.wcc.nrcs.usda.gov", noprogress=FALSE)
In Rgui I am limited to:
page=getURL(url="ftp.wcc.nrcs.usda.gov",
noprogress=FALSE, progressfunction=function(down,up) print(down))
which gives a very limited set of download information.
Is there a way to improve this?
I start doubting that with standard R commands it is possible to reprint overwriting the current line, which is what RCurl does in non-GUI mode.
I am glad to tell that I was wrong. At least for a single line, \r can do the trick. In fact:
conc=function(){
cat(" abcd")
cat(" ABCD", '\n')
}
conc()
# abcd ABCD
But:
over=function(){
cat(" abcd")
cat("\r ABCD", "\n")
}
over()
# ABCD
That given, I wrote this progressDown function, which can monitor download status rewriting always on the same same line:
library(RCurl) # Don't forget
### Callback function for curlPerform
progressDown=function(down, up, pcur, width){
total=as.numeric(down[1]) # Total size as passed from curlPerform
cur=as.numeric(down[2]) # Current size as passed from curlPerform
x=cur/total
px= round(100 * x)
## if(!is.nan(x) && px>60) return(pcur) # Just to debug at 60%
if(!is.nan(x) && px!=pcur){
x= round(width * x)
sc=rev(which(total> c(1024^0, 1024^1, 1024^2, 1024^3)))[1]-1
lb=c('B', 'KB', 'MB', 'GB')[sc+1]
cat(paste(c(
"\r |", rep.int(".", x), rep.int(" ", width - x),
sprintf("| %g%s of %g%s %3d%%",round(cur/1024^sc, 2), lb, round(total/1024^sc, 2), lb, px)),
collapse = ""))
flush.console() # if the outptut is buffered, it will go immediately to console
return(px)
}
return(pcur)
}
Now we can use the callback with curlPerform
curlProgress=function(url, fname){
f = CFILE(fname, mode="wb")
width= getOption("width") - 25 # you can make here your line shorter/longer
pcur=0
ret=curlPerform(url=url, writedata=f#ref, noprogress=FALSE,
progressfunction=function(down,up) pcur<<-progressDown(down, up, pcur, width),
followlocation=T)
close(f)
cat('\n Download', names(ret), '- Ret', ret, '\n') # is success?
}
Running it with a small sample binary:
curlProgress("http://www.nirsoft.net/utils/websitesniffer-x64.zip", "test.zip")
the intermediate output at 60% is (no # protection):
|................................. | 133.74KB of 222.75KB 60%
where KB, will be adjusted to B, KB, MB, GB, based on total size.
Final output with success status, is:
|.......................................................| 222.61KB of 222.75KB 100%
Download OK - Ret 0
Note, the output line width is relative to R width option (which controls the maximum number of columns on a line) and can be customised changing the curlProgress line:
width= getOption("width") - 25
This is enough for my needs and solves my own question.
Here's a simple example using txtProgressBar. Basically, just do a HEAD request first to get the file size of the file you want to retrieve, then setup a txtProgressBar with that as its max size. Then you use the progressfunction argument to curlPerform to call setTxtProgressBar. It all works very nicely (unless there is no "content-length" header, in which case this code works by just not printing a progress bar).
url <- 'http://stackoverflow.com/questions/21731548/rcurl-display-progress-meter-in-rgui'
h <- basicTextGatherer()
curlPerform(url=url, customrequest='HEAD',
header=1L, nobody=1L, headerfunction=h$update)
if(grepl('Transfer-Encoding: chunked', h$value())) {
size <- 1
} else {
size <- as.numeric(strsplit(strsplit(h$value(),'\r\nContent-Type')[[1]][1],
'Content-Length: ')[[1]][2])
}
bar <- txtProgressBar(0, size)
h2 <- basicTextGatherer()
get <- curlPerform(url=url, noprogress=0L,
writefunction=h2$update,
progressfunction=function(down,up)
setTxtProgressBar(bar, down[2]))
h2$value() # return contents of page
The output is just a bunch of ====== across the console.
What about:
curlProgress=function(url, fname){
f = CFILE(fname, mode="wb")
prev=0
ret=curlPerform(url=url, writedata=f#ref, noprogress=FALSE,
progressfunction=function(a,b){
x=round(100*as.numeric(a[2])/as.numeric(a[1]))
if(!is.nan(x) && x!=prev &&round(x/10)==x/10) prev<<-x else x='.'
cat(x)
}, followlocation=T)
close(f)
cat(' Download', names(ret), '- Ret', ret, '\n')
}
?
It prints dots or percent download divisible by 10 and breaks line on 50%.
And with a small 223 KB file:
curlProgress("http://www.nirsoft.net/utils/websitesniffer-x64.zip", "test.zip")
it sounds like this:
................10...............20................30...............40...............50
..............................70...............80...............90...............100... Download OK - Ret 0
I start doubting that with standard R commands it is possible to reprint overwriting the current line, which is what RCurl does in non-GUI mode.

Resources