Get a name with a specific gender using the faker npm package - faker

I'm using the faker npm package (Node v10.8.0, npm v6.3.0) to generate users with a gender, but it doesn't work for me :c
See the code below:
const faker = require('faker');

const genders = ['male', 'female'];
let gender = faker.random.arrayElement(genders);
let name = faker.name.firstName(gender);
And the result is something like this:
gender : male
name : Lourdes

According to the comments/docs in the faker lib, firstName falls back to a random pick when gender-specific name data doesn't exist in your locale. I would strongly suggest not making your tests or code depend on gender-specific names, since that behavior is fuzzy and subject to change.
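That said, if your locale does provide gendered name lists, the classic faker npm package keys firstName off a numeric gender index rather than a string (0 = male, 1 = female); a string like 'male' falls through to the random pick. A minimal sketch, assuming that classic API:

const faker = require('faker');

// 0 = male, 1 = female; string arguments fall through to a random pick
const genders = ['male', 'female'];
const index = faker.random.arrayElement([0, 1]);
const gender = genders[index];
const name = faker.name.firstName(index);
console.log(`gender : ${gender}`);
console.log(`name : ${name}`);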

Not able to access certain JSON properties in Autoloader

I have a JSON file that is loaded by two different Autoloaders.
One uses schema evolution and, besides replacing spaces in the JSON property names, writes the JSON directly to a Delta table, and I can see all the values are there properly.
In the second one I am mapping to a defined schema and only use a subset of properties, so I use a lot of withColumn calls and then a select that narrows to my defined column list.
Autoloader definition:
df = (spark
      .readStream
      .format('cloudFiles')
      .option('cloudFiles.format', 'json')
      .option('multiLine', 'true')
      .option('cloudFiles.schemaEvolutionMode', 'rescue')
      .option('cloudFiles.includeExistingFiles', 'true')
      .option('cloudFiles.schemaLocation', bronze_schema)
      .option('cloudFiles.inferColumnTypes', 'true')
      .option('pathGlobFilter', '*.json')
      .load(upload_path)
      .transform(lambda df: remove_spaces_from_columns(df))
      .withColumn(...
Writer:
df.writeStream.format('delta') \
  .queryName(al_stream_name) \
  .outputMode('append') \
  .option('checkpointLocation', checkpoint_path) \
  .option('mergeSchema', 'true') \
  .trigger(once=True) \
  .table(bronze_table)
The issue is that some of the source columns load fine and I get their values, while others are constantly null in the output table.
For example:
.withColumn('vl_rating', col('risk_severity.value')) # works
.withColumn('status', col('status.name')) # always null
...
.select(
'rating',
'status',
...
The JSON is quite simple; these are all string values, and they are always populated. The same code works against another similar JSON file in another Autoloader without issue.
I have run out of ideas for fault-finding this. My imports are minimal, and outside of Autoloader the JSON loads fine, e.g.:
%python
import pyspark.sql.functions as psf
jsontest = spark.read.option('inferSchema','true').json('dbfs:....json')
df = jsontest.withColumn('status', psf.col('status.name')).select('status')
display(df)
This results in the values of the status.name property of the JSON file.
Any ideas would be greatly appreciated.
I have found what is generally causing this. Interesting cause!
I am scanning a whole directory of JSON files, and the schema evolves over time (as expected). But when I clear out the Autoloader schema and checkpoint directories and only scan the latest JSON file, it all works correctly.
So what I surmise is that something in schema evolution with the older JSON files causes Autoloader to get into a state where it will not put certain properties into the stream to the writer.
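For anyone wanting to try the same reset, a minimal sketch using the paths from the definitions above (dbutils is the Databricks filesystem utility; this wipes the inferred-schema history and the stream state, so the next run starts from scratch):

# Remove the stored schema versions and the stream checkpoint so
# Autoloader re-infers the schema on the next run.
dbutils.fs.rm(bronze_schema, True)
dbutils.fs.rm(checkpoint_path, True)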
If anyone has any recommendation on how to implement some data quality analysis in an Autoloader I would be most appreciative if you would share.
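One lightweight check, a sketch rather than a full data-quality framework: since the stream runs with schemaEvolutionMode 'rescue', properties that fail to parse into the inferred schema land in the _rescued_data column, so filtering on non-null values there surfaces exactly the rows Autoloader could not map:

import pyspark.sql.functions as psf

# Rows where Autoloader rescued data instead of parsing it into columns
bronze = spark.table(bronze_table)
rescued = bronze.filter(psf.col('_rescued_data').isNotNull())
print(rescued.count())
display(rescued.select('_rescued_data').limit(10))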

None of the keys entered are valid keys - R

I am trying to learn how to manipulate microarrays for differential expression analysis. While trying to add some annotation, I cannot find the keytype related to:
select(hugene10sttranscriptcluster.db,
keys = my_keys,
columns = c("GENENAME", "SYMBOL"),
keytype = "PROBEID")
-------------------------------------------------------
Error in .testForValidKeys(x, keys, keytype, fks) :
None of the keys entered are valid keys for 'PROBEID'. Please use the keys method to see a listing of valid arguments.
The keys being:
my_keys
---------------------------------------------------------------------
[1] "16650045" "16650047" "16650049" "16650051" "16650053" "16650055" "16650057" "16650059"
I tried every possible type from keytypes(hugene10sttranscriptcluster.db) with no successful result:
"16650045" %in% keys(hugene10sttranscriptcluster.db, "GENEID")
------------------------------------------------------------------
[1] FALSE
Is there any documentation or alternative where I can find it? I have been looking through the documentation (ArrayExpress) but it did not help me. I am also not sure: is it possible that I require a different package than hugene10sttranscriptcluster.db?
Indeed, I did have a problem with the package. If anyone has the same problem, just look up the annotation of the microarray in the platform documentation (pd.hugene.2.0.st in my case) to install and use the proper package (hugene20sttranscriptcluster.db).
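For reference, a minimal sketch of that check against the matching package (assuming the Bioconductor package is installed; keys and columns as above):

library(hugene20sttranscriptcluster.db)

# The probe IDs should now validate against PROBEID
"16650045" %in% keys(hugene20sttranscriptcluster.db, keytype = "PROBEID")

# ...and the original select() call can run unchanged
select(hugene20sttranscriptcluster.db,
       keys = my_keys,
       columns = c("GENENAME", "SYMBOL"),
       keytype = "PROBEID")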

Create groups of functions with the same name within a single module, without classes

I am writing a card game simulator. During play, I want different players to have different strategies. My idea is to have 2 functions specified for a given agent that can be imported. These functions would have the same signatures. I know how to do this with classes and inheritance, but I'm trying to code this project entirely functionally. Here is what I have if I were to do it with classes:
class Agent:
    def __init__(self, position):
        self.pos = position

    def flip_two(self, gs):
        pass

    def regular_move(self, gs):
        pass

class Random_Agent(Agent):
    def flip_two(self, gs):
        pass  # some code that alters gs randomly

    def regular_move(self, gs):
        pass  # some code that alters gs randomly

class etc_Agents(Agent):
    ...
The best answer I can think of so far is to put each agent in a new file, since modules would be a way to group the functions.
Thanks for any insight!
If you wish to group related functions, you could place them in a dictionary like so:
options = {
    'a': {'sameName': lambda x: x * 2},
    'b': {'sameName': lambda x: x ** 2},
}

print(options['a']['sameName'](5))
print(options['b']['sameName'](5))
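Applied to the card-game example, each agent becomes a dictionary of same-signature functions and the game loop looks strategies up by key, which mirrors the module-per-agent idea without classes. A sketch with hypothetical names (random_agent, and gs as a plain dict game state):

import random

def _random_flip_two(gs):
    # Hypothetical strategy: flip two random cards in the game state
    gs['flipped'] = random.sample(range(len(gs['cards'])), 2)

def _random_regular_move(gs):
    # Hypothetical strategy: play a random card
    gs['last_move'] = random.choice(gs['cards'])

random_agent = {
    'flip_two': _random_flip_two,
    'regular_move': _random_regular_move,
}

gs = {'cards': ['2H', '9S', 'KD'], 'flipped': None, 'last_move': None}
random_agent['flip_two'](gs)
random_agent['regular_move'](gs)
print(gs)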

quotes around author name in package created from devtools

So, I'm creating a script to extend the functionality of devtools::create() and I'm noticing some slightly odd behavior when I double-check things with utils::maintainer. Here's an MWE where I set the Authors@R section of the DESCRIPTION file through the devtools.desc.author option:
options(devtools.desc.license = "AGPL-3")
options(devtools.desc.author = "'Joe Dirt <joe@durt.ee> [aut, cre]'")
descArgs <- list(Package = "testPkg",
                 Title = "testPkg",
                 Description = "some desc.")
options(devtools.desc = descArgs)
devtools::create(path = "testPkg", check = TRUE)
Now, if you go ahead and run devtools::install("testPkg", quiet = TRUE), and then maintainer("testPkg") you get
> maintainer("testPkg")
[1] "'Joe Dirt' <joe@durt.ee>"
So my question is: why is the maintainer's name quoted, here?
This seems to be an issue with how the Maintainer field is auto-generated from Authors@R. See: http://cran.r-project.org/doc/manuals/r-release/R-exts.html
Both ‘Author’ and ‘Maintainer’ fields can be omitted if a suitable ‘Authors@R’ field is given. This field can be used to provide a refined and machine-readable description of the package “authors” (in particular specifying their precise roles), via suitable R code. The roles can include ‘"aut"’ (author) for full authors, ‘"cre"’ (creator) for the package maintainer, and ‘"ctb"’ (contributor) for other contributors, ‘"cph"’ (copyright holder), among others. See ?person for more information. Note that no role is assumed by default. Auto-generated package citation information takes advantage of this specification. The ‘Author’ and ‘Maintainer’ fields are auto-generated from it if needed when building or installing.
Therefore, you should use the person function to specify the author list as follows:
options(devtools.desc.author = "c(person('Joe', 'Dirt', email = 'joe@durt.ee', role = c('aut', 'cre')))")
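A quick way to verify, reusing the MWE from the question (untested sketch):

# Recreate and reinstall with the person()-based author option, then
# check the auto-generated Maintainer field.
devtools::install("testPkg", quiet = TRUE)
maintainer("testPkg")
# Expected: "Joe Dirt <joe@durt.ee>" (no stray quotes around the name)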

Bug in zc.recipe.cmmi?

If I provide a variable with an embedded space in the environment as follows:
environment =
    CPPFLAGS="-D_GNU_SOURCE -I${openssl:location}/include"
I get this error:
ValueError: dictionary update sequence element #1 has length 1; 2 is required
Is this a bug? Is there a workaround?
It's a shortcoming in zc.recipe.cmmi; it cannot handle environment variables whose values contain spaces. There is a patch available in the bug tracker for the recipe.
I am not currently aware of a workaround for this other than applying the patch. You can apply the patch on existing eggs using the collective.recipe.patch recipe, which should work in this case too (untried):
[buildout]
parts =
    patch-z.r.cmmi
    yourcmmipart

[patch-z.r.cmmi]
recipe = collective.recipe.patch
egg = zc.recipe.cmmi <= 1.3.4
patch = patches/environ_section_trunk_r101308.patch
This assumes you have a patches subdirectory with the patch from the bug downloaded. The part needs to be listed before your cmmi part so it is executed before that part (or you can fabricate a dependency).
An alternative solution is to just abuse the recipe's 'configure-command' like so:
[buildthis]
recipe = zc.recipe.cmmi
...
configure-command =
    export CPPFLAGS="-D_GNU_SOURCE -I${openssl:location}/include";
    ./configure
