Use dictionary keys and associated values as wildcards in snakemake

I have a large number of analyses that need to be done in one go, so I thought I could make a dictionary and pass its keys and values as wildcards (every snakemake run needs two wildcards).
My dict will look like this:
myDict = {
    "Apple": ["fruity", "red", "green"],
    "Banana": ["fruity", "yellow"]
}
The first key in the dictionary would be wildcard1, here {Apple}, with its first value as wildcard2, here {fruity}, and snakemake would run with these two until the final rule has been run.
Then the same key is used again ({Apple} as wildcard1) with the second associated value, here {red}, as wildcard2, and snakemake runs until the last rule has been run.
After the final value belonging to {Apple} has been used as wildcard2, it switches over to {Banana} as wildcard1 with its first value, {fruity}, as wildcard2.
This goes on until all keys and their associated values have been used as wildcards, at which point snakemake stops (that is, keys as wildcard1 and their values as wildcard2).
My question is whether this is possible and, if so, how I can achieve it.

I bet there is a way to do it with a single expand, but you can use a more verbose list comprehension. I'll take the files to be {wc1}_{wc2}.out for wildcards 1 and 2. Then you have
myDict = {
    "Apple": ["fruity", "red", "green"],
    "Banana": ["fruity", "yellow"]
}
inputs = [expand('{wc1}_{wc2}.out',
                 wc1=key, wc2=value)
          for key, value in myDict.items()]
# inputs = [['Apple_fruity.out', 'Apple_red.out', 'Apple_green.out'], ['Banana_fruity.out', 'Banana_yellow.out']]
rule all:
    input: inputs
Edited to address comment:
To make two lists, keys and values, you can use
keys = []
values = []
for key, value in myDict.items():
    for v in value:
        keys.append(key)
        values.append(v)
print(keys) # ['Apple', 'Apple', 'Apple', 'Banana', 'Banana']
print(values) # ['fruity', 'red', 'green', 'fruity', 'yellow']
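As a small sketch of how the key/value pairing maps back to target files, assuming the {wc1}_{wc2}.out naming used above (the names `pairs` and `targets` are made up for illustration), plain Python can build one flat list of file names:

```python
my_dict = {
    "Apple": ["fruity", "red", "green"],
    "Banana": ["fruity", "yellow"],
}

# Flatten the dictionary into (key, value) pairs, one pair per analysis.
pairs = [(key, v) for key, values in my_dict.items() for v in values]

# Build one target file name per pair; the naming pattern is an assumption.
targets = [f"{wc1}_{wc2}.out" for wc1, wc2 in pairs]
print(targets)
```

Since a Snakefile accepts ordinary Python, a flat list like `targets` could presumably be passed as the input of rule all in place of the nested list.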

Related

Create a perl hash from a db select

Having some trouble understanding how to create a Perl hash from a DB select statement.
$sth = $dbh->prepare(qq{select authorid,titleid,title,pubyear from books});
$sth->execute() or die DBI->errstr;
while (@records = $sth->fetchrow_array()) {
    %Books = (%Books, AuthorID => $records[0]);
    %Books = (%Books, TitleID => $records[1]);
    %Books = (%Books, Title => $records[2]);
    %Books = (%Books, PubYear => $records[3]);
    print qq{$records[0]\n};
    print qq{\t$records[1]\n};
    print qq{\t$records[2]\n};
    print qq{\t$records[3]\n};
}
$sth->finish();
while (($key, $value) = each(%Books)) {
    print qq{$key --> $value\n};
}
The print statements work in the first while loop, but I only get the last result in the second key,value loop.
What am I doing wrong here? I'm sure it's something simple. Many thanks.
The OP needs to specify the question better and do some reading on the DBI module.
The DBI module has a fetchall_hashref method; perhaps the OP could put it to use.
In the shown code an assignment of a record to a hash with the same keys overwrites the previous one, row after row, and the last one remains. Instead, they should be accumulated in a suitable data structure.
Since there are a fair number of rows (351 we are told) one option is a top-level array, with hashrefs for each book
my @all_books;
while (my @records = $sth->fetchrow_array()) {
    my %book;
    @book{qw(AuthorID TitleID Title PubYear)} = @records;
    push @all_books, \%book;
}
Now we have an array of books, each indexed by the four parameters.
This uses a hash slice to assign multiple key-value pairs to a hash.
Another option is a top-level hash with keys for the four book-related parameters, each having for a value an arrayref with entries from all records
my %books;
while (my @records = $sth->fetchrow_array()) {
    push @{$books{AuthorID}}, $records[0];
    push @{$books{TitleID}}, $records[1];
    ...
}
Now one can go through authors/titles/etc, and readily recover the other parameters for each.
Adding some checks is always a good idea when reading from a database.

How to filter on dictionary with multiples keys

I have a dictionary with two-part (tuple) keys, which looks like this:
{('Year', 'prix'): 130546.87449454193,
('Year', 'departement'): 11591.47409694357,
('Year', 'annee'): 34.28496633835407,
('Year', 'kilometrage'): 414330.13854019763,
('price', 'prix'): 324162.66684322944,
('price', 'departement'): 466290.81724082783,
('price', 'annee'): 454736.63137143303,
('price', 'kilometrage'): 117557.09720242623}
I want to filter only on the first part of my key which is a tuple. In other words I want to get this result if I specify in my code 'Year':
{('Year', 'prix'): 130546.87449454193,
('Year', 'departement'): 11591.47409694357,
('Year', 'annee'): 34.28496633835407,
('Year', 'kilometrage'): 414330.13854019763}
I reached my result, but only after multiple lines of code. I am wondering if there is a way to do it smoothly.
Thanks in advance
You can use a dict comprehension:
# Suppose you have your dict stored in dct
# The line below returns a filtered dictionary, including keys with first tuple element 'Year'
{key: dct[key] for key in dct.keys() if key[0]=='Year'}
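For a self-contained illustration of the same idea, here is a minimal sketch that wraps the comprehension in a helper and iterates over items() so each value is read directly. The function name `filter_by_first` is hypothetical, and the dictionary is a shortened copy of the one in the question:

```python
dct = {
    ('Year', 'prix'): 130546.87449454193,
    ('Year', 'annee'): 34.28496633835407,
    ('price', 'prix'): 324162.66684322944,
    ('price', 'annee'): 454736.63137143303,
}

def filter_by_first(d, first):
    """Return a new dict with only the items whose key tuple starts with `first`."""
    return {key: value for key, value in d.items() if key[0] == first}

year_only = filter_by_first(dct, 'Year')
print(year_only)
```

The original dictionary is left unchanged; the comprehension builds a new one.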

Merging Value of one dictionary as key of another

I have 2 Dictionaries:
StatePops={'AL':4887871, 'AK':737438, 'AZ':7278717, 'AR':3013825}
StateNames={'AL':'Alabama', 'AK':'Alaska', 'AZ':'Arizona', 'AR':'Arkansas'}
I am trying to merge so the Value of StateNames is the Key for StatePops.
Ex.
{'Alabama': 4887871, 'Alaska': 737438, ...
I also have to display the names of states with a population over 4 million.
Any help is appreciated!!!
You have not specified in what programming language you want this problem to be solved.
Nonetheless, here is a solution in Python.
state_pops = {
    'AL': 4887871,
    'AK': 737438,
    'AZ': 7278717,
    'AR': 3013825
}
state_names = {
    'AL': 'Alabama',
    'AK': 'Alaska',
    'AZ': 'Arizona',
    'AR': 'Arkansas'
}
states = dict([([state_names[k],state_pops[k]]) for k in state_pops])
final = {k:v for k, v in states.items() if v > 4000000}
print(states)
print(final)
First, the built-in dict function builds states from a list of [name, population] pairs. Here, k is the loop variable that takes each key (state abbreviation) of state_pops, and it is used to index both state_names and state_pops.
Then the filtered dictionary is stored in final: states.items() gives access to the keys and values in states, and only the entries whose value is above 4,000,000 are kept.
There may be simpler solutions, but this is as far as I can optimize the problem.
Hope this helps.
Dictionary Keys cannot be changed in Python. You need to either add the modified key-value pair to the dictionary and then delete the old key, or you can create a new dictionary. I'd opt for the second option, i.e., creating a new dictionary.
myDict = {}
for i in StatePops:
    myDict.update({StateNames[i] : StatePops[i]})
This outputs myDict as
{'Alabama': 4887871, 'Alaska': 737438, 'Arizona': 7278717, 'Arkansas': 3013825}
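The same "create a new dictionary" approach can also be written as a dict comprehension. The following is only a sketch using the data from the question; `pops_by_name` and `large_states` are names chosen here for illustration:

```python
state_pops = {'AL': 4887871, 'AK': 737438, 'AZ': 7278717, 'AR': 3013825}
state_names = {'AL': 'Alabama', 'AK': 'Alaska', 'AZ': 'Arizona', 'AR': 'Arkansas'}

# New dictionary keyed by full state name instead of abbreviation.
pops_by_name = {state_names[abbr]: pop for abbr, pop in state_pops.items()}

# Names of states whose population exceeds the 4 million threshold from the question.
large_states = [name for name, pop in pops_by_name.items() if pop > 4_000_000]

print(pops_by_name)
print(large_states)
```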

Update dictionary key inside list using map function -Python

I have a dictionary of phone numbers where the number is the key and the country is the value. I want to update each key by adding a country code based on the country value. I tried to use the map function for this:
print('**Example: Update phone book to add Country code using map function** ')
user=[{'952-201-3787':'US'},{'952-201-5984':'US'},{'9871299':'BD'},{'01632 960513':'UK'}]
#A function that takes a dictionary as arg, not list. List is the outer part
def add_Country_Code(aDict):
    for k,v in aDict.items():
        if(v == 'US'):
            aDict[( '1+'+k)]=aDict.pop(k)
        if(v == 'UK'):
            aDict[( '044+'+k)]=aDict.pop(k)
        if (v == 'BD'):
            aDict[('001+'+k)] =aDict.pop(k)
    return aDict
new_user=list(map(add_Country_Code,user))
print(new_user)
This works partially when I run it; the output is below:
[{'1+952-201-3787': 'US'}, {'1+1+1+952-201-5984': 'US'}, {'001+9871299': 'BD'}, {'044+01632 960513': 'UK'}]
Notice the 2nd US number has 2 additional 1s. What is causing that? How can I fix it? Thanks a lot.
Issue
You are mutating a dict while iterating over it. Don't do this. The Pythonic convention would be:
Make a new_dict = {}
While iterating over the input a_dict, assign new items to new_dict.
Return the new_dict
IOW, create new things rather than change old things; changing the dict in place is likely the source of your woes.
Some notes
Use lowercase with underscores when defining variable names (see PEP 8).
Look up values rather than change the input dict, e.g. a_dict[k] vs. a_dict.pop(k).
Indent the correct number of spaces (see PEP 8).
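Putting the steps and notes above together, one possible sketch looks like this. The prefixes are copied from the question's code; the `COUNTRY_PREFIXES` table and the choice to leave unknown countries unprefixed are assumptions made for illustration:

```python
# Country prefixes copied from the question's code; the table name is hypothetical.
COUNTRY_PREFIXES = {'US': '1+', 'UK': '044+', 'BD': '001+'}

def add_country_code(a_dict):
    """Return a new dict with prefixed keys; the input dict is left untouched."""
    new_dict = {}
    for number, country in a_dict.items():
        # Unknown countries keep their number unchanged (an assumption).
        prefix = COUNTRY_PREFIXES.get(country, '')
        new_dict[prefix + number] = country
    return new_dict

user = [{'952-201-3787': 'US'}, {'952-201-5984': 'US'},
        {'9871299': 'BD'}, {'01632 960513': 'UK'}]
new_user = list(map(add_country_code, user))
print(new_user)
```

Because nothing is added to or removed from the dict being iterated, each number is visited exactly once and receives a single prefix.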

In Biztalk mapper how to use split array concept

I need a suggestion on the following; could anyone please offer a solution?
We have a mapping from 850 to FlatFile.
X12/PO1Loop1/PO1/PO109 needs to be mapped to the field VALUE, which is under the record Option, which is unbounded.
Split PO109 into substrings delimited by '.'; for each substring after the first, create a new Option with value=substring.
So in the input sample we have a value like 147895632qwerqtyuui.789456123321456987
Similarly, the field repeats under POLoop1.
So I need to split the value on (.) and then pass each value to the value field under the Option record (unbounded).
I tried using below code snippet
public string SplitValues(string strValue)
{
    string[] arrValue = strValue.Split(".".ToCharArray());
    foreach (string strDisplay in arrValue)
    {
        return strDisplay;
    }
}
But it doesn't work. I am not really familiar with the String methods, and I am not sure if there's an easy way to do this. I have a String which contains a couple of values delimited with ".".
So I need to separate the values based on the delimiter (.) and pass each value to the field.
How can I do this?
As I mentioned, it is not too clear what your objective is, but I think you want to split a node that has some kind of delimiter into multiple nodes. If so, try this: https://seroter.wordpress.com/2008/10/07/splitting-delimited-values-in-biztalk-maps/
He is doing exactly that: given a node with a|b|c|d as its value, output multiple nodes, each containing one of the values obtained by splitting on |, so node1 = a, node2 = b, node3 = c, node4 = d.
