mpirun: token slots not supported

I am trying to run a program with:
~/mpich3/bin/mpirun --hostfile hosts_8_12.txt python simulation.py
but I get this error:
[mpiexec#pomegranate] HYDU_process_mfile_token (utils/args/args.c:296): token slots not supported at this time
[mpiexec#pomegranate] HYDU_parse_hostfile (utils/args/args.c:343): unable to process token
[mpiexec#pomegranate] mfile_fn (ui/mpich/utils.c:336): error parsing hostfile
[mpiexec#pomegranate] match_arg (utils/args/args.c:152): match handler returned error
[mpiexec#pomegranate] HYDU_parse_array (utils/args/args.c:174): argument matching returned error
[mpiexec#pomegranate] parse_args (ui/mpich/utils.c:1596): error parsing input array
[mpiexec#pomegranate] HYD_uii_mpx_get_parameters (ui/mpich/utils.c:1648): unable to parse user arguments
[mpiexec#pomegranate] main (ui/mpich/mpiexec.c:153): error parsing parameters
Here is my hostfile:
c00 slots=12
c01 slots=12
c02 slots=12
c03 slots=12
c04 slots=12
c05 slots=12
c06 slots=12
c07 slots=12
I am using mpich-3.1.3. When I run the program without specifying slots in my hostfile, it works fine. Do you have an idea where the problem could come from?

I believe the slots keyword is used by Open MPI, not MPICH. Hostfiles are not standardized; each implementation specifies its own format. For MPICH, you can see the details here, but the short version is that your file should look like this:
c00:12
c01:12
c02:12
c03:12
c04:12
c05:12
c06:12
c07:12
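With the hosts listed as name:count, the same command line should then work; note that you still tell mpirun how many processes to start (a hedged example, assuming you want to fill all 8 × 12 = 96 slots):
~/mpich3/bin/mpirun -n 96 --hostfile hosts_8_12.txt python simulation.py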

Related

Is there an alternative to save_kable() in R?

I cannot use save_kable() to save tables created with knitr::kable() and kableExtra as images. It looks like this is due to PhantomJS being blocked by our admins. Is there another function that could help me save them? This is the error I get:
Error in process_initialize(self, private, command, args, stdin, stdout, …: ! Native call to processx_exec failed Caused by error in
chain_call(c_processx_exec, command, c(command, args), pty, pty_options, …: ! create process
'C:\Users\user1\AppData\Roaming/PhantomJS/phantomjs.exe' (system error
1260, This program is blocked due to group policy. Contact the systems responsible person for more information. )
#win/processx.c:1040 (processx_exec)
PS: I have translated the message that comes after "system error 1260..." into English.

Files not being parsed by Gnatchop

I am trying to run several files through gnatchop and I am getting 3 error messages for every file. I originally thought the file permissions were simply wrong, but I changed the permissions and I still get the errors.
file.a: parse errors detected
file.a: chop may not be successful
file.a: error parsing offset info
Is there something I need to do to the files before I run them through Gnatchop?

PHP CodeSniffer: ERROR: The specified sniff code "Generic.Files.LineEndings.InvalidEOLChar" is invalid

My attempt to exclude the check for the EOL char on my Windows machine always results in this error message:
>vendor\bin\phpcs.bat --standard=PSR2 --exclude=Generic.Files.LineEndings.InvalidEOLChar src\version.php
ERROR: The specified sniff code "Generic.Files.LineEndings.InvalidEOLChar" is invalid
Run "phpcs --help" for usage information
Can't figure out what I'm doing wrong. I have installed PHP CodeSniffer via composer and am running version 3.4.0.
The --exclude CLI argument accepts 3-part sniff codes, but you've passed in a 4-part error code.
In your case, the sniff code is Generic.Files.LineEndings and that sniff only generates a single error code, so you'll be fine ignoring the entire sniff:
vendor\bin\phpcs.bat --standard=PSR2 --exclude=Generic.Files.LineEndings src\version.php
If you want to exclude individual error codes, or if you just want to lock down a standard for your project, you'll need to use a ruleset.xml file: https://github.com/squizlabs/PHP_CodeSniffer/wiki/Annotated-Ruleset
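For example, a minimal ruleset.xml along these lines (file name and location are your choice) keeps PSR2 as the base and silences just that one error code:
<?xml version="1.0"?>
<ruleset name="MyProject">
    <!-- Use PSR2 as the base standard. -->
    <rule ref="PSR2"/>
    <!-- Silence only this 4-part error code, not the whole sniff. -->
    <rule ref="Generic.Files.LineEndings.InvalidEOLChar">
        <severity>0</severity>
    </rule>
</ruleset>
Then run it with vendor\bin\phpcs.bat --standard=ruleset.xml src\version.php.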

Get data from OpenDap server that requires authentication using R

I'm trying to get data from an OPeNDAP server using R and the ncdf4 package. However, the NASA EOSDIS server requires a username and password. How can I pass this information using R?
Here is what I'm trying to do:
require(ncdf4)
f1 <- nc_open('https://disc2.gesdisc.eosdis.nasa.gov/opendap/TRMM_L3/TRMM_3B42.7/2018/020/3B42.20180120.15.7.HDF')
And the error message:
Error in Rsx_nc4_get_vara_double: NetCDF: Authorization failure syntax
error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or
SCAN_ERROR context: HTTP^ Basic: Access denied. Var: nlat Ndims: 1
Start: 0 Count: 400 Error in ncvar_get_inner(d$dimvarid$group_id,
d$dimvarid$id, default_missval_ncdf4(), : C function
R_nc4_get_vara_double returned error
I tried the URL https://username:password#disc2.... but that did not work either.
Daniel,
The service you are accessing is using third-party redirection to authenticate users. Therefore the simple way of providing credentials in the URL doesn't work.
You need to create two files.
A .dodsrc file (an RC file for the netcdf-c library) with the following content:
HTTP.COOKIEFILE=.cookies
HTTP.NETRC=.netrc
A .netrc file, in the location referenced in the .dodsrc, with your credentials:
machine urs.earthdata.nasa.gov
login YOURUSERNAMEHERE
password YOURPASSWORDHERE
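For example, from a shell (a minimal sketch, assuming both files live in your home directory, one of the places the netcdf-c library searches for its RC file):
cd ~
printf 'HTTP.COOKIEFILE=.cookies\nHTTP.NETRC=.netrc\n' > .dodsrc
printf 'machine urs.earthdata.nasa.gov\nlogin YOURUSERNAMEHERE\npassword YOURPASSWORDHERE\n' > .netrc
chmod 600 .netrc   # keep the credentials file private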
You can find more details at
https://www.unidata.ucar.edu/software/netcdf/docs/md__Users_wfisher_Desktop_v4_86_81-prep_netcdf-c_docs_auth.html
Regards
Antonio
Unfortunately, even after defining the credentials and their location,
ncdf4::nc_open("https://gpm1.gesdisc.eosdis.nasa.gov/opendap/GPM_L3/GPM_3IMERGDE.06/2020/08/3B-DAY-E.MS.MRG.3IMERG.20200814-S000000-E235959.V06.nc4")
still returns
Error in Rsx_nc4_get_vara_double: NetCDF: Authorization failure
The same happens when using ncdump from a terminal:
$ ncdump https://gpm1.gesdisc.eosdis.nasa.gov/opendap/GPM_L3/GPM_3IMERGDE.06/2020/08/3B-DAY-E.MS.MRG.3IMERG.20200814-S000000-E235959.V06.nc4
returns
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or
SCAN_ERROR context: HTTP^ Basic: Access denied. NetCDF: Authorization
failure Location: file
/build/netcdf-KQb2aQ/netcdf-4.6.0/ncdump/vardata.c; line 473

StreamSets HTTP Client

I'm working with StreamSets on a Cloudera Distribution, trying to ingest some data from this website http://files.data.gouv.fr/sirene/
I've encountered some issues choosing the parameters of both the HTTP Client and the Hadoop FS Destination.
https://image.noelshack.com/fichiers/2017/44/2/1509457504-streamsets-f.jpg
I get this error: HTTP_00 - Cannot parse record: java.io.IOException: org.apache.commons.compress.archivers.ArchiveException: No Archiver found for the stream signature
Here is my configuration.
HTTP Client :
General
Name : HTTP Client INSEE
Description : Client HTTP SIRENE
On Record Error : Send to Error
HTTP
Resource URL : http://files.data.gouv.fr/sirene/
Headers : sirene_ : sirene_
Mode : Streaming
Per-Status Actions
HTTP Status Code : 500 | Action for Status : Retry with exponential backoff |
Base Backoff Interval (ms) : 1000 | Max Retries : 10
HTTP Method : GET
Body Time Zone : UTC (UTC)
Request Transfer Encoding : BUFFERED
HTTP Compression : None
Connect Timeout : 0
Read Timeout : 0
Authentication Type : None
Use OAuth 2
Use Proxy
Max Batch Size (records) : 1000
Batch Wait Time (ms) : 2000
Pagination
Pagination Mode : None
TLS
UseTLS
Timeout Handling
Action for timeout : Retry immediately
Max Retries : 10
Data Format
Data Format : Delimited
Compression Format : Archive
File Name Pattern within Compressed Directory : *.csv
Delimiter Format Type : Custom
Header Line : With Header Line
Max Record Length (chars) : 1024
Allow Extra Columns
Delimiter Character : Semicolon
Escape Character : Other \
Quote Character : Other "
Root Field Type : List-Map
Lines to Skip : 0
Parse NULLs
Charset : UTF-8
Ignore Control Characters
Hadoop FS Destination :
General
Name : Hadoop FS 1
Description : Writing into HDFS
Stage Library : CDH 5.7.6
Produce Events
Required Fields
Preconditions
On Record Error : Send to Error
Output Files
File Type : Whole File
Files Prefix
Directory in Header
Directory Template : /user/pap/StreamSets/sirene/
Data Time Zone : UTC (UTC)
Time Basis : ${time:now()}
Use Roll Attribute
Validate HDFS Permissions : ON
Skip file recovery : ON
Late Records
Late Record Time Limit (secs) : ${1 * HOURS}
Late Record Handling : Send to error
Data Format
Data Format : Whole File
File Name Expression : ${record:value('/fileInfo/filename')}
Permissions Expression : 777
File Exists : Overwrite
Include Checksum in Events
... so what am I doing wrong? :(
It looks like http://files.data.gouv.fr/sirene/ is returning a file listing, rather than a compressed archive. This is a tricky one, since there isn't a standard way to iterate through such a listing. You might be able to read http://files.data.gouv.fr/sirene/ as text, then use the Jython evaluator to parse out the zip file URLs, retrieve, decompress and parse them, adding the parsed records to the batch. I think you'd have problems with this method, though, as all the records would end up in the same batch, blowing out memory.
Another idea might be to use two pipelines - the first would use HTTP client origin and a script evaluator to download the zipped files and write them to a local directory. The second pipeline would then read in the zipped CSV via the Directory origin as normal.
If you do decide to have a go, please engage with the StreamSets community via one of our channels - see https://streamsets.com/community
I'm writing the Jython evaluator. I'm not familiar with the available constants/objects/records as presented in the comments. I tried to adapt this Python script into the Jython evaluator:
import re
import itertools
import urllib2

# Collect the sirene_*.zip names from each line of the saved listing.
data = [re.findall(r'(sirene\w+.zip)', line) for line in open('/home/user/Desktop/filesdatatest.txt')]
data_list = filter(None, data)                      # drop lines with no match
data_brackets = list(itertools.chain(*data_list))   # flatten the nested lists
data_clean = ["http://files.data.gouv.fr/sirene/" + url for url in data_brackets]
for url in data_clean:
    urllib2.urlopen(url)                            # opens each URL (response is discarded)
records = [re.findall(r'(sirene\w+.zip)', record) for record in records]
gave me this error message:
SCRIPTING_05 - Script error while processing record: javax.script.ScriptException: TypeError: expected string or buffer, but got in at line number 50
filesdatatest.txt contains things like:
Listing of /v1/AUTH_6032cb4c2159474684c8df1da2e2b642/storage/sirene/
Name Size Date
../
README.txt 2Ki 2017-10-11 03:31:57
sirene_201612_L_M.zip 1Gi 2017-01-05 00:12:08
sirene_2017002_E_Q.zip 444Ki 2017-01-05 00:44:58
sirene_2017003_E_Q.zip 6Mi 2017-01-05 00:45:01
sirene_2017004_E_Q.zip 2Mi 2017-01-05 03:37:42
sirene_2017005_E_Q.zip 2Mi 2017-01-06 03:40:47
sirene_2017006_E_Q.zip 2Mi 2017-01-07 05:04:04
so I know how to parse records.
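The TypeError comes from the fact that inside the Jython evaluator each element of records is an SDC Record object, not a string, so re.findall() can't scan it directly. Here is a hedged sketch of what the evaluator body could look like, assuming the origin reads the listing with the Text data format (which puts each line in a text field) and that /tmp/sirene is a writable local directory for the second pipeline's Directory origin to watch:
import re
import urllib2

for record in records:
    # record.value is a dict-like structure; with the Text data format
    # the raw line is in the 'text' field (assumption), not the Record itself.
    line = record.value['text']
    for name in re.findall(r'(sirene\w+\.zip)', line):
        url = 'http://files.data.gouv.fr/sirene/' + name
        # Download the archive locally so the second pipeline can pick it
        # up with the Directory origin (the target path is illustrative).
        data = urllib2.urlopen(url).read()
        out = open('/tmp/sirene/' + name, 'wb')
        out.write(data)
        out.close()
    # Pass the record downstream unchanged.
    output.write(record)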
