I want to save my Spark DataFrame to a directory using a spark_write_* function, like this:
spark_write_csv(df, "file:///home/me/dir/")
but if the directory already exists, I get this error:
ERROR: org.apache.spark.sql.AnalysisException: path file:/home/me/dir/ already exists.;
Since I keep working with the same data, I want to overwrite this directory. How can I achieve this? The documentation lists one relevant parameter:
mode Specifies the behavior when data or table already exists.
but it doesn't say which value to use.
Set the mode parameter to "overwrite":
spark_write_csv(df, "file:///home/me/dir/", mode = "overwrite")
I tried using the Set Global Variable keyword, but I am not sure if I am doing this correctly. I was hoping to give 2 files access to this variable. Here is my variables section. I am using VS Code with the Robot plugin for syntax highlighting. The error shown when I run this is:
Invalid variable name 'Set Global Variable'.robotcode.diagnostics(ModelError)
*** Variables ***
Set Global Variable ${VERBOSE} 0
${SERIAL_PORT} None
Is there a special library I need to import to use Set Global Variable?
My use case is that I have 2 robot files, and both of them need to know whether verbose mode is enabled. I pass verbose to file_1.robot via the command line. I was hoping I could also pass the verbose variable to a Resource file_2.robot, but I am unable to do this because in my second file there is no "command line argument to pass in to it".
Is there a way to set/update a variable in file_2.robot from file_1.robot?
For file 1 I can do this via the command line, but for file 2 I was hoping something like this would exist:
Resource ../resources/Serial.robot -v Verbose:0
(in this case Serial.robot is the infamous file 2)
To make things even simpler, I don't need Verbose in file 1; I just need to pass it on to the resource file somehow.
Set Global Variable is a keyword that can be used inside test cases or custom keywords. What you need is to define a variable in the Variables section.
I pass verbose to file_1.robot file via command line, I was hoping I could also pass the verbose variable to a Resource file_2.robot, but I am unable to do this because in my second file there is no "command line argument to pass in to it"
No, a variable passed on the command line is global and visible everywhere, including in resource files. Take a look at the documentation about variable scopes and priorities.
If you want to define it once and use it in multiple places, you could create a file with the common variables and import it in both files. Example:
Here you define:
# BaseVariables.robot
*** Variables ***
${VERBOSE} 0
And use:
# file_1.robot
*** Settings ***
Resource BaseVariables.robot
and in the second file:
# file_2.robot
*** Settings ***
Resource BaseVariables.robot
I'd like to create a log file named after the originally run Julia file; for example, for julia foo.jl I'd want foo.jl. From within a Julia session, how can I get this information?
The global constant PROGRAM_FILE is set to the script name.
This can be done by inspecting the stack:
# first get the top of the stack
f = stacktrace()[1]
# then get the file's name as a string; note that the path is absolute.
abs_filename = String(f.file)
println(abs_filename)
# to get only the filename use
println(basename(abs_filename))
When I run zarr.open('result.zarr', mode='r') I get the following error:
FSPathExistNotDir: path exists but is not a directory: %r
According to the example in the Zarr documentation located at https://zarr.readthedocs.io/en/stable/tutorial.html#persistent-arrays, this zarr.open() function should return a zarr.core.Array:
z2 = zarr.open('data/example.zarr', mode='r')
np.all(z1[:] == z2[:])
How come the zarr.open() function is looking for a directory in my case?
I see my confusion. For me, example.zarr is the name of the file (it seems I wrongly named it), and not a directory.
I was also confused because I expected zarr.open() to only open an existing Zarr array, as the function name implies, but it can also create a new one.
From kaggle.com/kneroma/zarr-files-and-l5kit-data-for-dummies:
z1 = zarr.open('data/example.zarr', mode='w', shape=(10000, 10000), chunks=(1000, 1000), dtype='i4')
z1
The array above will store its configuration metadata and all
compressed chunk data in a directory called ‘data/example.zarr’
relative to the current working directory.
The zarr.convenience.open() function provides a convenient way to
create a new persistent array or continue working with an existing
array.
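The diagnosis above (the path names a regular file, while the store expects a directory) can be checked before calling zarr.open(). This is a minimal sketch using only the standard library; the helper name describe_store_path is hypothetical and not part of Zarr:

```python
import os


def describe_store_path(path):
    """Classify a path the way a directory-backed store would see it.

    Hypothetical helper for illustration; it is not part of the Zarr API.
    """
    if not os.path.exists(path):
        return "missing"      # nothing there yet; mode='r' cannot open it
    if os.path.isdir(path):
        return "directory"    # what a directory store expects
    return "file"             # exists but is not a directory


if __name__ == "__main__":
    # 'result.zarr' is the path from the question.
    print(describe_store_path("result.zarr"))
```

If this prints "file", the name ends in .zarr but the path is not a Zarr directory store, which matches the error in the question.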
I want to simulate a specific case of the UNIX rename() function when renaming directories. Its man page specifies:
oldpath can specify a directory. In this case, newpath must either not exist, or it must specify an empty directory.
I want to simulate the last case:
newpath exists and is empty.
So I have created the directories foo_dir and bar_dir and called MoveFileEx() in order to rename foo_dir to bar_dir. Here is the code without error management:
mkdir("foo_dir");
mkdir("bar_dir");
MoveFileEx("foo_dir", "bar_dir", MOVEFILE_REPLACE_EXISTING);
But MoveFileEx() always fails with error 5 (access denied). I have tried with other flags for MoveFileEx() without success.
Must I manually remove bar_dir if it exists and is empty before calling MoveFileEx()? Or is there another solution?
I have not tried ReplaceFile() yet.
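For context, the MoveFileEx() documentation states that MOVEFILE_REPLACE_EXISTING cannot be used when either name refers to a directory, so removing the empty target first is one plausible workaround. Below is a minimal sketch of that idea in Python rather than C; the function name rename_dir_replacing_empty is hypothetical, and unlike POSIX rename() the two steps are not atomic:

```python
import os


def rename_dir_replacing_empty(old_path, new_path):
    """Rename directory old_path to new_path.

    If new_path is an existing empty directory, it is removed first.
    Hypothetical helper for illustration; remove-then-rename is not atomic.
    """
    if not os.path.isdir(old_path):
        raise NotADirectoryError(old_path)
    if os.path.isdir(new_path):
        # os.rmdir only removes empty directories and raises OSError
        # otherwise, so a non-empty target is never deleted.
        os.rmdir(new_path)
    os.rename(old_path, new_path)


if __name__ == "__main__":
    os.mkdir("foo_dir")
    os.mkdir("bar_dir")
    rename_dir_replacing_empty("foo_dir", "bar_dir")
```

The same order of operations (remove the empty directory, then move) would apply with RemoveDirectory() followed by MoveFileEx() in C, under the same non-atomicity caveat.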
I want to use a relative file path as a command line argument, but as the example and assessment below demonstrate, the variable passes \..\ along as a literal string; it doesn't evaluate it.
Can I force the command line to parse and expand the variable?
For example, I have an R script file I want to launch from the command line:
Set RPath=C:\Program Files\R\R-3.1.0\bin\Rscript.exe
SET RScript=%CD%\..\..\HCF_v9.R
SET SourceFile=%CD%\..\Source\
ECHO String used for Source Location - %SourceFile%
"%RPath%" "%RScript%" %SourceFile%
The inclusion of \..\ works in the call to R as an external program because the batch file can resolve its own commands.
The SourceFile variable, however, doesn't work because \..\ hasn't been expanded; it has just been included as part of the string, and R can't process \..\
You can use the modifiers of the for replaceable parameter to resolve it to the real path:
for %%a in ("..\..\HCF_v9.R") do set "RScript=%%~fa"
@MC ND has provided the batch file approach; an R-centric approach would be to pass the current directory to R and modify it there.
REM batch file
Set RPath=C:\Program Files\R\R-3.1.0\bin\Rscript.exe
SET RScript=%CD%\..\..\HCF_v9.R
"%RPath%" "%RScript%" %CD%
# in R
srcpath <- commandArgs(TRUE)[1]
srcpath <- normalizePath(file.path(srcpath, "../Source"))