I want to do update a map doing the following:
def updateInfo() do
person = %{ person | name : "new name", age : person.age +1, observation : changeObs(person.age)}
end
def changeObs(age), when age >= 18, do: "Adult"
def changeObs(age), do: "kid"
If I call updateInfo() and the age is 17, I would expect the observation to change to "Adult". But it is not working: I thought the map updates were applied sequentially, but apparently they are not, so I cannot rely on the age already being 18. It works if I split the update like so:
person = %{person | name: "new name", age: person.age + 1}
person = %{person | observation: changeObs(person.age)}
Is there a way to keep the whole update in one expression, relying on the previous updates of the map's attributes?
Well, as far as I can tell, you can just cache the new value in a variable:
new_age = person.age + 1
person = %{ person | name: "new name", age: new_age, observation: changeObs(new_age)}
Or you could pipe the changes like this:
person
|> Map.put(:name, "new name")
|> Map.put(:age, new_age)
|> Map.put(:observation, changeObs(new_age))
First of all, the map update syntax uses key: value pairs with no space before the colon. As written, your code raises a SyntaxError.
There are plenty of ways to accomplish the task as a one-liner:
person = with age <- person.age + 1,
do: %{person | age: age,
observation: (if age >= 18, do: "Adult", else: "Kid")}
Or:
person = (age = person.age + 1;
%{person | age: age,
observation: (if age >= 18, do: "Adult", else: "Kid")})
Or a pipe chain:
person =
  person
  |> Map.update!(:age, &(&1 + 1))
  # call an anonymous function in the pipe so the *updated* age is visible
  |> (fn p -> %{p | observation: if(p.age >= 18, do: "Adult", else: "Kid")} end).()
The most idiomatic would be the last one; my favorite is the first one.
I am absolutely in love with ADX time series capabilities, having worked a lot on sensor data with Python. Below are the requirements for my case:
Handle sensor data tags at different frequencies -- bring them all to a 1-second frequency (if in milliseconds, aggregate over a 1-second interval).
Convert stacked data to unstacked data.
Join with another dataset which has multiple "string-labels" by timestamp, after unstack.
Do linear interpolation on some columns, and forward fill in others (around 10-12 in all).
I think the query below gets the first three done, but I'm unable to use series_fill_linear directly on a column. The docs say this function requires a dynamic type as input. The error message is helpful:
series_fill_linear(): argument #1 was not of an expected data type: dynamic
Is it possible to apply series_fill_linear where I'm already using pack, instead of using pack again? How can I apply this function selectively by tag, and make my overall query more readable? Note that only the sensor_data table requires both series_fill_linear and series_fill_forward; label_data only requires series_fill_forward.
sensor_data
| where timestamp > datetime(2020-11-24 00:59:59) and timestamp < datetime(2020-11-24 12:00:00)
| where device_number == 'PRESSURE_599'
| where tag_name in ("tag1", "tag2", "tag3", "tag4")
| make-series agg_value = avg(value) default = double(null) on timestamp in range (datetime(2020-11-24 00:59:59), datetime(2020-11-24 12:00:00), 1s) by tag_name
| extend series_fill_linear(agg_value, double(null), false) //EDIT
| mv-expand timestamp to typeof(datetime), agg_value to typeof(double)
| summarize b = make_bag(pack(tag_name, agg_value)) by timestamp
| evaluate bag_unpack(b)
| join kind = leftouter (label_data
    | where timestamp > datetime(2020-11-24 00:58:59) and timestamp < datetime(2020-11-24 12:00:01)
    | where device_number == 'PRESSURE_599'
    | where tag != "PRESSURE_599_label_Raw"
    | summarize x = make_bag(pack(tag, value)) by timestamp
    | evaluate bag_unpack(x)) on timestamp
| project timestamp,
    MY_LINEAR_COL_1 = series_fill_linear(tag1, double(null), false),
    MY_LINEAR_COL_2 = series_fill_forward(tag2),
    MY_LABEL_1 = series_fill_forward(PRESSURE_599_label_level1),
    MY_LABEL_2 = series_fill_forward(PRESSURE_599_label_level2)
EDIT: I ended up using extend with case to handle different cases of interpolation.
// let forward_tags = dynamic({"tags": ["tag2","tag4"]}); unable to use this in query as "forward_tags.tags"
sensor_data
| where timestamp > datetime(2020-11-24 00:59:59) and timestamp < datetime(2020-11-24 12:00:00)
| where device_number == "PRESSURE_599"
| where tag_name in ("tag1", "tag2", "tag3", "tag4") // use a variable here instead?
| make-series agg_value = avg(value)
    default = double(null)
    on timestamp
    in range (datetime(2020-11-24 00:59:59), datetime(2020-11-24 12:00:00), 1s)
    by tag_name
| extend agg_value = case(tag_name in ("tag2", "tag3"), // use a variable here instead?
    series_fill_forward(agg_value, double(null)),
    series_fill_linear(agg_value, double(null), false)
  )
| mv-expand timestamp to typeof(datetime), agg_value to typeof(double)
| summarize b = make_bag(pack(tag_name, agg_value)) by timestamp
| evaluate bag_unpack(b)
| join kind = leftouter (
    label_data // don't want to use make-series here; it would be unnecessary data generation since this is already at 1-second granularity
    | where timestamp > datetime(2020-11-24 00:58:59) and timestamp < datetime(2020-11-24 12:00:01)
    | where tag != "PRESSURE_599_label_Raw"
    | summarize x = make_bag(pack(tag, value)) by timestamp
    | evaluate bag_unpack(x)
  )
  on timestamp
I was wondering if it is possible in KQL to pass a list of strings inside a query/function, to use as shown above. I have commented where I think a list of strings could be passed to make the code more readable.
Now I just need to fill forward the label columns (MY_LABEL_1, MY_LABEL_2), which are a result of the query above. I would prefer that the code is added on to the main query, so the final result is a table with all columns. Here is a sample table based on my case's result:
datatable (timestamp:datetime, tag1:double, tag2:double, tag3:double, tag4:double, MY_LABEL_1:string, MY_LABEL_2:string)
[
    datetime(2020-11-24T00:01:00Z), 1, 3, 6, 9, "x", "foo",
    datetime(2020-11-24T00:01:01Z), 1, 3, 6, 9, "", "",
    datetime(2020-11-24T00:01:02Z), 1, 3, 6, 9, "", "",
    datetime(2020-11-24T00:01:03Z), 1, 3, 6, 9, "y", "bar",
    datetime(2020-11-24T00:01:04Z), 1, 3, 6, 9, "", "",
    datetime(2020-11-24T00:01:05Z), 1, 3, 6, 9, "", "",
]
Series functions in ADX only work on dynamic arrays. You can apply a selective fill function using the case() function, by replacing this line:
| extend series_fill_linear(agg_value, double(null), false) //EDIT
With something like the following:
| extend agg_value = case(
tag_name == "tag1", series_fill_linear(agg_value, double(null), false),
tag_name == "tag2", series_fill_forward(agg_value),
series_fill_forward(agg_value)
)
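Regarding passing a list of strings inside the query: you can bind the tag lists to names with let statements, since the in() operator accepts dynamic arrays. A minimal sketch against your query (the exact split of tags between the two fill modes is assumed):
let linear_tags = dynamic(["tag1", "tag4"]);
let forward_tags = dynamic(["tag2", "tag3"]);
sensor_data
| where tag_name in (linear_tags) or tag_name in (forward_tags) // 'in' accepts a dynamic array
| make-series agg_value = avg(value) default = double(null) on timestamp in range (datetime(2020-11-24 00:59:59), datetime(2020-11-24 12:00:00), 1s) by tag_name
| extend agg_value = case(
    tag_name in (forward_tags), series_fill_forward(agg_value),
    series_fill_linear(agg_value, double(null), false)
  )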
Edit:
Here is an example of string column fill-forward workaround:
let T = datatable ( Timestamp: datetime, Employee: string )
[ datetime(2021-01-01), "Bob",
  datetime(2021-01-02), "",
  datetime(2021-01-03), "Alice",
  datetime(2021-01-04), "",
  datetime(2021-01-05), "",
  datetime(2021-01-06), "Alan",
  datetime(2021-01-07), "",
  datetime(2021-01-08), "" ]
| sort by Timestamp asc;
let employeeLookup = toscalar(T | where isnotempty(Employee) | summarize make_list(Employee));
T
| extend idx = row_cumsum(tolong(isnotempty(Employee))) // running count of non-empty values so far
| extend EmployeeFilled = employeeLookup[idx - 1]        // 0-based index into the list of non-empty names
| project-away idx
Timestamp                     Employee   EmployeeFilled
2021-01-01 00:00:00.0000000   Bob        Bob
2021-01-02 00:00:00.0000000              Bob
2021-01-03 00:00:00.0000000   Alice      Alice
2021-01-04 00:00:00.0000000              Alice
2021-01-05 00:00:00.0000000              Alice
2021-01-06 00:00:00.0000000   Alan       Alan
2021-01-07 00:00:00.0000000              Alan
2021-01-08 00:00:00.0000000              Alan
Regarding your requirement to convert time series at different frequencies to a common one, have a look at the series_downsample_fl() function from the functions library.
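For example, once series_downsample_fl() has been created from the definition in the docs (it is a library function, not a built-in), a sketch of downsampling the aggregated series by a factor of 10 could look like this; the parameter order (t_col, y_col, ds_factor) is taken from the library docs and should be double-checked there:
sensor_data
| where tag_name in ("tag1", "tag2", "tag3", "tag4")
| make-series agg_value = avg(value) default = double(null) on timestamp in range (datetime(2020-11-24 00:59:59), datetime(2020-11-24 12:00:00), 1s) by tag_name
| invoke series_downsample_fl('timestamp', 'agg_value', 10) // 1s -> 10s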
In Azure Data Explorer, I am trying to use both the 'project' and 'distinct' keywords.
The table records have 4 fields I want to use the 'project' on:
CowName
CowType
CowNum
CowLabel
But there are many other fields in the table such as Date, Measurement, etc, that I do not want to return.
Cows
| project CowName, CowType, CowNum, CowLabel
However, I want to avoid duplicate records of CowName and CowNum, so I included
Cows
| project CowName, CowType, CowNum, CowLabel
| distinct CowName, CowNum
But when I do this, the only columns that are returned are CowName and CowNum. I am now missing CowType and CowLabel entirely.
Is there a way to use both 'project' and 'distinct' without them interfering with each other?
Is there a different approach I should take?
You can do:
Cows
| distinct CowName, CowType, CowNum
or, if you don't want to have distinct values of CowType - and just have any value of it:
Cows
| summarize any(CowType) by CowName, CowNum
References:
Summarize operator: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/summarizeoperator
Distinct operator: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/distinctoperator
any() aggregation function: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/any-aggfunction
You can use this:
| summarize any(CowType, CowLabel) by CowName, CowNum
To visualize how this will work, take the following sample table/query:
let CowTable = datatable(CowNum:int, CowName:string, CowType:string, CowLabel:string, DontWantThis:int)
[
1, "Bob", "Bull", "label1", 99,
2, "Tipsy", "Heifer", "label1", 98,
3, "Milly", "Heifer", "label2", 99,
4, "Bob", "Bull", "label2", 87,
4, "Bob", "Bull", "label2", 77,
2, "Hanna", "Heifer", "label1", 98,
];
CowTable
| summarize any(CowType, CowLabel) by CowName, CowNum
Results:
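(For CowNum 4 the two duplicate rows carry identical CowType/CowLabel values, so any() returns the same thing either way.)
CowName   CowNum   any_CowType   any_CowLabel
Bob       1        Bull          label1
Tipsy     2        Heifer        label1
Milly     3        Heifer        label2
Bob       4        Bull          label2
Hanna     2        Heifer        label1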
Note that we do not see CowNum 4 listed twice, but we do see CowNum 2 listed twice; this is because distinctness is judged on the CowName & CowNum pair. We also see Bob listed twice (not 3 times): two of the Bob entries are unique in regard to CowName/CowNum, while the two CowNum-4 Bob entries are duplicates of each other.
If you truly only want results where the CowName is unique and the CowNum is also distinct, you can do this with a 2-step summarize:
CowTable
| summarize any(CowName, CowType, CowLabel) by CowNum
| summarize any(CowNum, any_CowType, any_CowLabel) by any_CowName
//normalize column names
| project CowNum = any_CowNum, CowName = any_CowName, CowType = any_any_CowType, CowLabel = any_any_CowLabel
Results:
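(One possible output; any() is free to pick either Bob row for the CowNum, and either Tipsy or Hanna as the surviving name for CowNum 2.)
CowNum   CowName   CowType   CowLabel
1        Bob       Bull      label1
2        Tipsy     Heifer    label1
3        Milly     Heifer    label2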
I have written this code
import scrapy

class YellowPages(scrapy.Spider):
    name = 'yp'
    start_urls = [
        "https://www.yellowpages.com/search?search_terms=agent&geo_location_terms=Los%20Angeles%2C%20CA&page=1",
    ]

    def parse(self, response):
        agent_name = response.xpath("//a[@class='business-name']/span/text()").extract()
        phone_number = response.xpath("//div[@class='phones phone primary']/text()").extract()
        address = response.xpath("//div[@class='street-address']/text()").extract()
        locality = response.xpath("//div[@class='locality']/text()").extract()
        data = zip(agent_name, phone_number, address, locality)
        for item in data:
            info = {
                # 'page': response.url,
                'Agent name': item[0],
                'Phone number': item[1],
                'Address': item[2],
                'Locality': item[3],
            }
            yield info
        # extract_first() returns None on the last page instead of raising IndexError
        next_page_href = response.xpath('//a[@class="next ajax-page"]/@href').extract_first()
        if next_page_href is not None:
            yield scrapy.Request(response.urljoin(next_page_href), callback=self.parse)
But now I want to add ratings to my CSV file, but the rating number is written as a word, like this:
<div class="result-rating three ">
On the webpage the rating is shown as stars, and the total number of stars is written as a word in the markup. I want to get that rating as a number. Does anyone know how I can convert the words into numbers?
Assuming the rating is from one to five, you can maintain a mapping of these words (one to five) and detect them in the string.
Something like this:
word_number_mapping = {'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5}
rating_value = None
# extract_first() returns the class string itself, so substring checks work
rating_text = response.css('.result-rating::attr(class)').extract_first()
if rating_text:
    for k, v in word_number_mapping.items():
        if k in rating_text:
            rating_value = v + 0.5 if 'half' in rating_text else v
            break
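Since you zip the other fields per listing, you will probably want the rating per result rather than once for the whole page. A sketch of that as a helper function; the div.result container selector is an assumption about the page markup:
def rating_from_class(class_attr):
    """Map a class string like 'result-rating three half' to 3.5."""
    words = {'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5}
    for word, value in words.items():
        if class_attr and word in class_attr:
            return value + 0.5 if 'half' in class_attr else value
    return None

# Inside parse(), iterate listings so each rating lines up with its agent;
# 'div.result' is an assumed selector for one listing's container:
# for result in response.css('div.result'):
#     rating = rating_from_class(result.css('.result-rating::attr(class)').extract_first())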
Hope it helps.
I'm trying to dynamically create a nested map like the one below in code.
def people = [
[name: 'Ash', age: '21', gender: 'm'],
[name: 'Jo', age: '22', gender: 'f'],
[name: 'etc.', age: '42', gender: 'f']
]
So I can search it like below
person = people.findAll {item ->
item.gender == 'm' &&
item.age == '21'}
My problem is that whilst I can dynamically create one-dimensional maps in code, I don't know how to combine maps dynamically to create a nested structure. E.g. let's assume in code I have created two maps, name1 and name2. How do I add them to the people collection so they are nested like in the example above?
def people = [:]
def name1 = [name:'ash', age:'21', gender:'m']
def name2 = [name:'Jo', age:'22', gender:'f']
I've searched / tried so many posts without success. Below is close, but does not work :(
people.put((),(name1))
people.put((),(name2))
In your example, people is a list of maps, not a nested map.
So you can simply do:
def people = []
def name1 = [name:'ash', age:'21', gender:'m']
def name2 = [name:'Jo', age:'22', gender:'f']
Then:
people += name1
people += name2
Or define it in one line:
def people = [name1, name2]
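If you really do want a map rather than a list, a sketch of that, keying each entry by its name field (the choice of key is an assumption):
def people = [:]
// keying by name is just one option; any unique field would do
[name1, name2].each { p -> people[p.name] = p }
assert people['ash'].age == '21'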
I am very new to Lua and my plan is to create a table. This table (I call it test) has 200 entries, and each entry has the same subentries (in this example, the subentries money and age):
This is a sort of pseudocode:
table test = {
Entry 1: money=5 age=32
Entry 2: money=-5 age=14
...
Entry 200: money=999 age=72
}
How can I write this in Lua? Is it possible? The alternative would be to write each subentry as a separate table:
money = {}
age = {}
But for me, this isn't a nice way, so maybe you can help me.
Edit:
This question Table inside a table is related, but I cannot write this 200x.
Try this syntax:
test = {
{ money = 5, age = 32 },
{ money = -5, age = 14 },
...
{ money = 999, age = 72 }
}
Examples of use:
-- money of the second entry:
print(test[2].money) -- prints "-5"
-- age of the last entry:
print(test[200].age) -- prints "72"
You can also turn the problem on its side, and have 2 sequences in test: money and age, where each entry has the same index in both arrays.
test = {
money ={1000,100,0,50},
age={40,30,20,25}
}
This will have better performance since you only have the overhead of 3 tables instead of n+1 tables, where n is the number of entries.
Anyway, you have to enter your data one way or another. What you'd typically do is use some easily parsed format like CSV, XML, etc. and convert that to a table. Like this:
s = [[
1000 40
100 30
0 20
50 25]]
test = { money = {}, age = {} }
n = 1
for balance, age in s:gmatch('([%d.]+)%s+([%d.]+)') do
  -- gmatch captures are strings, so convert them to numbers
  test.money[n], test.age[n] = tonumber(balance), tonumber(age)
  n = n + 1
end
You mean you do not want to write "money" and "age" 200x?
There are several solutions but you could write something like:
local test0 = {
  5, 32,
  -5, 14,
  ...
}
local test = {}
for i = 1, #test0 / 2 do
  test[i] = { money = test0[2*i - 1], age = test0[2*i] }
end
Otherwise you could always use metatables and create a class that behaves exactly like you want.
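For completeness, a minimal sketch of the metatable approach; the Entry name and its method are just illustrative:
-- a tiny "class" whose instances share methods via a metatable
local Entry = {}
Entry.__index = Entry

function Entry.new(money, age)
  return setmetatable({ money = money, age = age }, Entry)
end

function Entry:isAdult()
  return self.age >= 18
end

local test = {}
test[1] = Entry.new(5, 32)
test[2] = Entry.new(-5, 14)
print(test[1].money, test[1]:isAdult()) -- 5  true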