Suppose I have 2 dictionaries:
Dict #1:
statedict = {'Alaska': '02', 'Alabama': '01', 'Arkansas': '05', 'Arizona': '04', 'California':'06', 'Colorado': '08', 'Connecticut': '09','DistrictOfColumbia': '11', 'Delaware': '10', 'Florida': '12', 'Georgia': '13', 'Hawaii': '15', 'Iowa': '19', 'Idaho': '16', 'Illinois': '17', 'Indiana': '18', 'Kansas': '20', 'Kentucky': '21', 'Louisiana': '22', 'Massachusetts': '25', 'Maryland': '24', 'Maine': '23', 'Michigan': '26', 'Minnesota': '27', 'Missouri': '29', 'Mississippi': '28', 'Montana': '30', 'NorthCarolina': '37', 'NorthDakota': '38', 'Nebraska': '31', 'NewHampshire': '33', 'NewJersey': '34', 'NewMexico': '35', 'Nevada': '32', 'NewYork': '36', 'Ohio': '39', 'Oklahoma': '40', 'Oregon': '41', 'Pennsylvania': '42', 'PuertoRico': '72', 'RhodeIsland': '44', 'SouthCarolina': '45', 'SouthDakota': '46', 'Tennessee': '47', 'Texas': '48', 'Utah': '49', 'Virginia': '51', 'Vermont': '50', 'Washington': '53', 'Wisconsin': '55', 'WestVirginia': '54', 'Wyoming': '56'}
Dict #2:
master_dict = {'01': ['01034', '01112'], '06': ['06245', '06025', '06007'], '13': ['13145']}
*The actual master_dict is much longer.
Basically, I want to replace the two-digit keys in master_dict with the full state names from statedict. How do I do this? I am trying to use the following, but it doesn't quite work:
for k, v in master_dict.items():
    for state, fip in statedict.items():
        if k == fip:
            master_dict[k] = statedict[state]
You can use a dictionary comprehension to make a lookup table mapping values to keys. A second dictionary comprehension performs the lookups to replace numbers with words:
lookup = {v: k for k, v in statedict.items()}
result = {lookup[k]: v for k, v in master_dict.items()}
print(result)
Output:
{'Alabama': ['01034', '01112'],
'California': ['06245', '06025', '06007'],
'Georgia': ['13145']}
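If master_dict can contain codes that are missing from statedict (plausible, since the real master_dict is much longer), a variant with dict.get keeps the original code instead of raising KeyError; a minimal sketch:
lookup = {v: k for k, v in statedict.items()}
# Fall back to the original key when no state name is known for the code.
result = {lookup.get(k, k): v for k, v in master_dict.items()}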
I want to select all 'l'-labelled vertices along with their 't'- and 'n'-labelled vertices, grouped per 'l' vertex. I also want to apply a limit on the number of neighbours of each type. For example, for neighbour limit = 2, the output should be something like this:
[
{"l1",{[t1,t2]}, {}},
{"l2",{[t3]}, {[n1]}},
{"l3",{[]}, {[n2,n3]}}
]
For neighbour limit = 1, the output should be something like this:
[
{"l1",{[t1,t2]}, {}},
{"l2",{[t3]}, {[n1]}},
{"l3",{[]}, {[n2]}}
]
Gremlify link: https://gremlify.com/xun4v83y54/2
g.addV('Vertex').as('1').property(single, 'name', 'l1').property(single, 'label', 'l').
addV('Vertex').as('2').property(single, 'name', 'l2').property(single, 'label', 'l').
addV('Vertex').as('3').property(single, 'name', 'l3').property(single, 'label', 'l').
addV('Tag').as('4').property(single, 'name', 't1').property(single, 'label', 't').
addV('Tag').as('5').property(single, 'name', 't2').property(single, 'label', 't').
addV('Tag').as('6').property(single, 'name', 't3').property(single, 'label', 't').
addV('neighbour1').as('7').property(single, 'name', 'n1').property(single, 'label', 'n').
addV('neighbour2').as('8').property(single, 'name', 'n2').property(single, 'label', 'n').
addV('neighbour3').as('9').property(single, 'name', 'n3').property(single, 'label', 'n').
addE('connected').from('1').to('4').
addE('connected').from('1').to('5').
addE('connected').from('2').to('6').
addE('connected').from('2').to('7').
addE('connected').from('3').to('8').
addE('connected').from('3').to('9')
For this output you can try doing something like this:
g.V().hasLabel('l').
  group().by('name').
    by(out('connected').fold().
       project('t', 'n').
         by(unfold().hasLabel('t').limit(1).values('name').fold()).
         by(unfold().hasLabel('n').limit(1).values('name').fold()).
       unfold().select(values).fold())
You can change the limit to be the max number of neighbors you want from each type.
example: https://gremlify.com/ecw6j2ajh5p
I am trying to request ERA5 data. Requests are limited by size, and the system automatically rejects any request bigger than the limit. However, one wants to be as close to the limit as possible, since each request takes a few hours to be processed by the Climate Data Store (CDS).
For example, I have a vector of years years <- seq(from = 1981, to = 2019, by = 1) and a vector of variables variables <- c("a", "b", "c", "d", "e", ..., "z"). The max request size is 11, which means length(years) * length(variables) must be less than or equal to 11.
For each request, I have to provide a list containing character vectors for the years and variables. For example:
req.list <- list(year = c("1981", "1982", ..., "1991"), variable = c("a")) will work, since there are 11 years and 1 variable.
I thought about using expand.grid(), then taking rows 1-11, rows 12-22, and so on, and applying unique() to each column to get the years and variables for a request. But this approach sometimes leads to a request that is too big:
req.list <- list(year = c("2013", "2014", ..., "2018"), variable = c("a", "b")) is rejected, since length(year) * length(variable) = 12 > 11.
I am also using foreach() and doParallel to create multiple requests (max 15 requests at a time).
If anyone has a better solution (minimize the number of requests while obeying the request size limit), please share. Thank you very much.
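A minimal sketch of the batching arithmetic, in Python since the answer below uses Python as well: keeping one variable per request makes the size constraint trivial to satisfy, at the cost of more requests. The names max_fields and requests are mine, not from the question.
# Split the year x variable grid into batches with
# len(years) * len(variables) <= max_fields for each request.
years = [str(y) for y in range(1981, 2020)]
variables = ["a", "b", "c", "d", "e"]  # placeholder names from the question
max_fields = 11                        # the request size limit

requests = []
for var in variables:
    # One variable per request, so each batch may hold up to max_fields years.
    for i in range(0, len(years), max_fields):
        requests.append({"year": years[i:i + max_fields], "variable": [var]})

print(len(requests), "requests, all within the limit")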
The limit is set in terms of the number of fields, which one can think of as the number of "records" in the GRIB sense. The usual suggestion is to keep the list of variables and the shorter timescales in the retrieval command and then loop over the years (the longer timescale). For ERA5 this is a matter of choice, though, as the data is all on cache, not on tape drives; with tape-based requests it is important to retrieve data on the same tape in a single request (e.g. if you use the CDS to retrieve seasonal forecasts or other datasets that are not ERA5).
Here is a simple looped example:
import cdsapi

c = cdsapi.Client()

yearlist = [str(s) for s in range(1979, 2019)]

for year in yearlist:
    c.retrieve(
        'reanalysis-era5-single-levels',
        {
            'product_type': 'reanalysis',
            'format': 'netcdf',
            'variable': [
                '10m_u_component_of_wind', '10m_v_component_of_wind',
                '2m_dewpoint_temperature', '2m_temperature',
            ],
            'year': year,
            'month': [
                '01', '02', '03', '04', '05', '06',
                '07', '08', '09', '10', '11', '12',
            ],
            'day': [
                '01', '02', '03', '04', '05', '06',
                '07', '08', '09', '10', '11', '12',
                '13', '14', '15', '16', '17', '18',
                '19', '20', '21', '22', '23', '24',
                '25', '26', '27', '28', '29', '30',
                '31',
            ],
            'time': [
                '00:00', '01:00', '02:00', '03:00', '04:00', '05:00',
                '06:00', '07:00', '08:00', '09:00', '10:00', '11:00',
                '12:00', '13:00', '14:00', '15:00', '16:00', '17:00',
                '18:00', '19:00', '20:00', '21:00', '22:00', '23:00',
            ],
        },
        'data' + year + '.nc')
I presume you can parallelize this with foreach, although I've never tried. I suspect it won't help much, as there is a per-user job limit which is set quite low, so you will just end up with a large number of jobs in the queue.
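For what it's worth, here is a minimal Python sketch of capped parallel submission, standing in for the question's foreach/doParallel. The max_workers value of 15 comes from the question; the fetch helper and the abbreviated request body are mine, and it is an assumption that a single cdsapi.Client can be shared across threads.
from concurrent.futures import ThreadPoolExecutor

import cdsapi

c = cdsapi.Client()  # assumption: safe to share; otherwise create one per worker
yearlist = [str(s) for s in range(1979, 2019)]

def fetch(year):
    # Same shape of request as the looped example above, abbreviated here.
    c.retrieve(
        'reanalysis-era5-single-levels',
        {
            'product_type': 'reanalysis',
            'format': 'netcdf',
            'variable': ['2m_temperature'],
            'year': year,
            'month': ['01'],
            'day': ['01'],
            'time': ['00:00'],
        },
        'data' + year + '.nc')

# At most 15 requests in flight at a time, per the question.
with ThreadPoolExecutor(max_workers=15) as pool:
    list(pool.map(fetch, yearlist))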
For example: a collection contains 10 or more documents, and I want to create a list per document in that collection, according to the document ID, where the list contains only the values, not the keys.
Document 1-->
'Num1':'1',
'Num2':'2',
'Num3': '3',
'Num4': '4',
'Num5': '5',
'Num6': '6',
'Num7': '7',
'Num8': '8',
'Num9': '9',
'Num10': '10',
Document 2-->
'Num1':'1',
'Num2':'2',
'Num3': '3',
'Num4': '4',
'Num5': '5',
'Num6': '6',
'Num7': '7',
'Num8': '8',
'Num9': '9',
'Num10': '10',
The list should look like this for every document:
List document1 = [1,2,3,4,5,6,7,8,9,10]
List document2 = [11,12,13,14,15,16,17,18,19,20]
And each list's name is the document ID.
Assuming the 'Collection' is a simple map object, you can do:
Map temp = {
  'Num1': '1',
  'Num2': '2',
  'Num3': '3',
  'Num4': '4',
  'Num5': '5',
  'Num6': '6',
  'Num7': '7',
  'Num8': '8',
  'Num9': '9',
  'Num10': '10',
};
var newList = temp.values.toList();
If it is a custom object, you can use the map method:
myobjectList.map((object) => object.id).toList();
It will iterate over all the elements in the collection and 'map' each one to the property you need.
EDIT:
You cannot change a variable's name dynamically, but you can change a variable's attributes.
What you want can be achieved like this:
List<List<int>> documents = [];
documents.add([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]); // document1
// Access each document by its index:
documents[0]; // equal to accessing document1
OR use a map:
Map map = {'document1': [1, 2, 3, 4, 5]};
map['document2'] = [6, 7, 8, 9, 10];
I think it is pretty straightforward. All I am trying to do is update the original dictionary's 'code' with the value from another dictionary. I have a feeling the two for loops and the if test can be shortened further. In my actual problem, I have a few thousand dicts to update. Thanks, guys!
Python:
referencedict = {'A': 'abc', 'B': 'xyz'}
mylistofdict = [{'name': 'John', 'code': 'A', 'age': 28}, {'name': 'Mary', 'code': 'B', 'age': 32}, {'name': 'Joe', 'code': 'A', 'age': 43}]
for eachdict in mylistofdict:
    for key, value in eachdict.items():
        if key == 'code':
            eachdict[key] = referencedict[value]
print(mylistofdict)
Output:
[{'age': 28, 'code': 'abc', 'name': 'John'}, {'age': 32, 'code': 'xyz', 'name': 'Mary'}, {'age': 43, 'code': 'abc', 'name': 'Joe'}]
There is no need to loop over all the items of eachdict; just look up 'code' directly:
for eachdict in mylistofdict:
    if 'code' not in eachdict:
        continue
    eachdict['code'] = referencedict[eachdict['code']]
You can probably omit the test for 'code' being present, since your example list always contains a 'code' entry, but I thought it better to be safe. Looking up the code in the referencedict structure assumes that all possible codes are available.
I used if 'code' not in eachdict: continue here; the opposite (if 'code' in eachdict) is just as valid, but this way you can more easily remove the line if you do not need it, and you save yourself an indentation level.
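Similarly, if referencedict might not contain every code, a sketch using dict.get with the original value as the fallback lets unknown codes pass through unchanged:
for eachdict in mylistofdict:
    if 'code' not in eachdict:
        continue
    # Keep the original code when referencedict has no mapping for it.
    eachdict['code'] = referencedict.get(eachdict['code'], eachdict['code'])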
referencedict = {'A': 'abc', 'B': 'xyz'}
mylistofdict = [{'name': 'John', 'code': 'A', 'age': 28}, {'name': 'Mary', 'code': 'B', 'age': 32}, {'name': 'Joe', 'code': 'A', 'age': 43}]
for x in mylistofdict:
    try:
        # KeyError covers both a missing 'code' key and an unknown code;
        # indexing referencedict (rather than .get) avoids silently storing None.
        x['code'] = referencedict[x['code']]
    except KeyError:
        pass
print(mylistofdict)
Is there a way to tell if a column is an auto-increment (aka serial, aka identity) column when using the ODBC catalog functions (like SQLColumns)? I'm particularly interested in MySQL as a source.
I don't know about MySQL, but SQLColumns for some drivers returns additional fields which can tell you whether a column is an identity column. For example:
SQL> create table mje(a int identity);
$ perl -e 'use DBI;use Data::Dumper;my $h = DBI->connect("dbi:ODBC:xx","xx","xx");my $s = $h->column_info(undef, undef, "mje", q/%/); print Dumper($s->{NAME});print $s->dump_results;'
$VAR1 = [
'TABLE_CAT',
'TABLE_SCHEM',
'TABLE_NAME',
'COLUMN_NAME',
'DATA_TYPE',
'TYPE_NAME',
'COLUMN_SIZE',
'BUFFER_LENGTH',
'DECIMAL_DIGITS',
'RADIX',
'NULLABLE',
'REMARKS',
'COLUMN_DEF',
'SQL_DATA_TYPE',
'SQL_DATETIME_SUB',
'CHAR_OCTET_LENGTH',
'ORDINAL_POSITION',
'IS_NULLABLE',
'SS_IS_SPARSE',
'SS_IS_COLUMN_SET',
'SS_IS_COMPUTED',
'SS_IS_IDENTITY',
'SS_UDT_CATALOG_NAME',
'SS_UDT_SCHEMA_NAME',
'SS_UDT_ASSEMBLY_TYPE_NAME',
'SS_XML_SCHEMACOLLECTION_CATALOG_NAME',
'SS_XML_SCHEMACOLLECTION_SCHEMA_NAME',
'SS_XML_SCHEMACOLLECTION_NAME',
'SS_DATA_TYPE'
];
'master', 'dbo', 'mje', 'a', '4', 'int identity', 10, 4, '0', '10', '0', undef, undef, '4', undef, undef, 1, 'NO', '0', '0', '0', '1', undef, undef, undef, undef, undef, undef, '56'
1 rows
The above is for the Easysoft MS SQL Server ODBC driver, which a) supports an SS_IS_IDENTITY field and b) shows 'int identity' for the type name.
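For comparison, the same inspection can be done in Python with pyodbc; this is just a sketch (the 'xx' connection values are placeholders, and which extra fields appear, if any, is entirely driver-specific):
import pyodbc

# Placeholder DSN and credentials; substitute your own.
conn = pyodbc.connect('DSN=xx;UID=xx;PWD=xx')
cur = conn.cursor()

# SQLColumns via pyodbc; driver-specific extras such as SS_IS_IDENTITY
# show up as additional columns in the result set.
cur.columns(table='mje')
print([d[0] for d in cur.description])
for row in cur.fetchall():
    print(list(row))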