I setup a constraint that does not constraint the solver in pyomo.
The constraint is the following:
def revenue_positive(model,t):
for t in model.T:
return (model.D[t] * model.P[t]) >= 0
model.positive_revenue = Constraint(model.T, rule=revenue_positive)
while the model parameters are:
model = ConcreteModel()
model.T = Set(doc='quarter of year', initialize=df.index.tolist(), ordered=True)
model.P = Param(model.T, initialize=df['price'].to_dict(), within=Any, doc='Price for each quarter')
model.C = Var(model.T, domain=NonNegativeReals)
model.D = Var(model.T, domain=NonNegativeReals)
income = sum(df.loc[t, 'price'] * model.D[t] for t in model.T)
expenses = sum(df.loc[t, 'price'] * model.C[t] for t in model.T)
profit = income - expenses
model.objective = Objective(expr=profit, sense=maximize)
# Solve the model
solver = SolverFactory('cbc')
solver.solve(model)
df dataframe is:
df time_stamp price Status imbalance Difference Situation ... week month hour_of_day day_of_week day_of_year yearly_quarter
quarter ...
0 2021-01-01 00:00:00 64.84 Final 16 -3 Deficit ... 00 1 0 4 1 1
1 2021-01-01 00:15:00 13.96 Final 38 2 Surplus ... 00 1 0 4 1 1
2 2021-01-01 00:30:00 12.40 Final 46 1 Surplus ... 00 1 0 4 1 1
3 2021-01-01 00:45:00 7.70 Final 65 14 Surplus ... 00 1 0 4 1 1
4 2021-01-01 01:00:00 64.25 Final 3 -9 Deficit ... 00 1 1 4 1 1
The objective is to constraint the solver not to accept a negative revenue. As such it does not work as the solver passes 6 negative revenue values through. Looking at the indices with negative revenue, it appears the system chooses to sell at a negative price to buy later at a price even "more" negative, so from an optimization standpoint, it is ok. I would like to check the difference in results if we prohibit the solver to do that. Any input is welcome as after many searches on the web, still not the right way to write it correctly.
I did a pprint() of the constraint that returned:
positive_revenue : Size=35040, Index=T, Active=True
UPDATE following new constraint code:
def revenue_positive(model,t):
return model.D[t] * model.P[t] >= 0
model.positive_revenue = Constraint(model.T, rule=revenue_positive)
Return the following error:
ERROR: Rule failed when generating expression for constraint positive_revenue
with index 283: ValueError: Invalid constraint expression. The constraint
expression resolved to a trivial Boolean (True) instead of a Pyomo object.
Please modify your rule to return Constraint.Feasible instead of True.
Error thrown for Constraint 'positive_revenue[283]'
ERROR: Constructing component 'positive_revenue' from data=None failed:
ValueError: Invalid constraint expression. The constraint expression
resolved to a trivial Boolean (True) instead of a Pyomo object. Please
modify your rule to return Constraint.Feasible instead of True.
Error thrown for Constraint 'positive_revenue[283]'
Traceback (most recent call last):
File "/home/olivier/Desktop/Elia - BESS/run_imbalance.py", line 25, in <module>
results_df = optimize_year(df)
File "/home/olivier/Desktop/Elia - BESS/battery_model_imbalance.py", line 122, in optimize_year
model.positive_revenue = Constraint(model.T, rule=revenue_positive)
File "/home/olivier/anaconda3/lib/python3.9/site-packages/pyomo/core/base/block.py", line 542, in __setattr__
self.add_component(name, val)
File "/home/olivier/anaconda3/lib/python3.9/site-packages/pyomo/core/base/block.py", line 1087, in add_component
val.construct(data)
File "/home/olivier/anaconda3/lib/python3.9/site-packages/pyomo/core/base/constraint.py", line 781, in construct
self._setitem_when_not_present(
File "/home/olivier/anaconda3/lib/python3.9/site-packages/pyomo/core/base/indexed_component.py", line 778, in _setitem_when_not_present
obj.set_value(value)
File "/home/olivier/anaconda3/lib/python3.9/site-packages/pyomo/core/base/constraint.py", line 506, in set_value
raise ValueError(
ValueError: Invalid constraint expression. The constraint expression resolved to a trivial Boolean (True) instead of a Pyomo object. Please modify your rule to return Constraint.Feasible instead of True.
Error thrown for Constraint 'positive_revenue[283]'
So there are 2 issues w/ your constraint. It isn't clear if one is a cut & paste issue or not.
The function call to make the constraint appears to be indented and inside of your function after the return statement, making it unreachable code. Could be just the spacing in your post.
You are incorrectly adding a loop inside of your function. You are passing in the parameter t as a function argument and then you are blowing it away with the for loop, which only executes for the first value of t in T then hits the return statement. Remove the loop. When you use the rule= structure in pyomo it will call the rule for each instance of the set that you are using in the Constraint(xx, rule=) structure.
So I think you should have:
def revenue_positive(model, t):
return model.D[t] * model.P[t] >= 0
model.positive_revenue = Constraint(model.T, rule=revenue_positive)
Updated re: the error you added.
The error cites the 283rd index. My bet is that price[283] is zero, so you are multiplying by a zero and killing your variable.
You could add a check within the function that checks if the price is zero, and in that case, just return pyo.Constraint.Feasible, which is the trivial return that doesn't influence the model (or crash)
Related
working on an economic optimization problem with pyomo, I would like to add a constraint to prevent the product of the commodity quantity and its price to go below zero (<0), avoiding a negative revenue. It appears that all the data are in a dataframe and I can't setup a constraint like:
def positive_revenue(model, t)
return model.P * model.C >=0
model.positive_rev = Constraint(model.T, rule=positive_revenue)
The system returns the error that the price is a scalar and it cannot process it. Indeed the price is set as such in the model:
model.T = Set(doc='quarter of year', initialize=df.quarter.tolist(), ordered=True)
model.P = Param(initialize=df.price.tolist(), doc='Price for each quarter')
##while the commodity is:
model.C = Var(model.T, domain=NonNegativeReals)
I just would like to apply that for each timestep (quarter of hour here) that:
price(t) * model.C(t) >=0
Can someone help me to spot the issue ? Thanks
Here are more information:
df dataframe:
df time_stamp price Status imbalance
quarter
0 2021-01-01 00:00:00 64.84 Final 16
1 2021-01-01 00:15:00 13.96 Final 38
2 2021-01-01 00:30:00 12.40 Final 46
index = quarter from 0 till 35049, so it is ok
Here is the df.info()
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 time_stamp 35040 non-null datetime64[ns]
1 price 35040 non-null float64
2 Status 35040 non-null object
3 imbalance 35040 non-null int64
I modified the to_list() > to_dict() in model.T but still facing the same issue:
KeyError: "Cannot treat the scalar component 'P' as an indexed component" at the time model.T is defined in the model parameter, set and variables.
Here is the constraint where the system issues the error:
def revenue_positive(model,t):
for t in model.T:
return (model.C[t] * model.P[t]) >= 0
model.positive_revenue = Constraint(model.T,rule=revenue_positive)
Can't figure it out...any idea ?
UPDATE
Model works after dropping an unfortunate 'quarter' column somewhere...after I renamed the index as quarter.
It runs but i still get negative revenues, so the constraints seems not working at present, here is how it is written:
def revenue_positive(model,t):
for t in model.T:
return (model.C[t] * model.P[t]) >= 0
model.positive_revenue = Constraint(model.T,rule=revenue_positive)
What am I missing here ? Thanks for help, just beginning
Welcome to the site.
The problem you appear to be having is that you are not building your model parameter model.P as an indexed component. I believe you likely want it to be indexed by your set model.T.
When you make indexed params in pyomo you need to initialize it with some key:value pairing, like a python dictionary. You can make that from your data frame by re-indexing your data frame so that the quarter labels are the index values.
Caution: The construction you have for model.T and this assume there are no duplicates in the quarter names.
If you have duplicates (or get a warning) then you'll need to do something else. If the quarter labels are unique you can do this:
import pandas as pd
import pyomo.environ as pyo
df = pd.DataFrame({'qtr':['Q5', 'Q6', 'Q7'], 'price':[12.80, 11.50, 8.12]})
df.set_index('qtr', inplace=True)
print(df)
m = pyo.ConcreteModel()
m.T = pyo.Set(initialize=df.index.to_list())
m.price = pyo.Param(m.T, initialize=df['price'].to_dict())
m.pprint()
which should get you:
price
qtr
Q5 12.80
Q6 11.50
Q7 8.12
1 Set Declarations
T : Size=1, Index=None, Ordered=Insertion
Key : Dimen : Domain : Size : Members
None : 1 : Any : 3 : {'Q5', 'Q6', 'Q7'}
1 Param Declarations
price : Size=3, Index=T, Domain=Any, Default=None, Mutable=False
Key : Value
Q5 : 12.8
Q6 : 11.5
Q7 : 8.12
2 Declarations: T price
edit for clarity...
NOTE:
The first argument when you create a pyomo parameter is the indexing set. If this is not provided, pyomo assumes that it is a scalar. You are missing the set as shown in my example and highlighted with arrow here: :)
|
|
|
V
m.price = pyo.Param(m.T, initialize=df['price'].to_dict())
Also note, you will need to initialize model.P with a dictionary as I have in the example, not a list.
I am trying to make a program to prompt a user for input until they enter a number within a specific range.
What is the best approach to make sure the code does not error out when I enter a letter, a symbol, or a number outside of the specified range?
In alternative to parse, you can use tryparse:
tryparse(type, str; base)
Like parse, but returns either a value of the requested type, or
nothing if the string does not contain a valid number.
The advantage over parse is that you can have a cleaner error handling without resorting to try/catch, which would hide all exceptions raised within the block.
For example you can do:
while true
print("Please enter a whole number between 1 and 5: ")
input = readline(stdin)
value = tryparse(Int, input)
if value !== nothing && 1 <= value <= 5
println("You entered $(input)")
break
else
#warn "Enter a whole number between 1 and 5"
end
end
Sample run:
Please enter a whole number between 1 and 5: 42
┌ Warning: Enter a whole number between 1 and 5
└ # Main myscript.jl:9
Please enter a whole number between 1 and 5: abcde
┌ Warning: Enter a whole number between 1 and 5
└ # Main myscript.jl:9
Please enter a whole number between 1 and 5: 3
You entered 3
This is one possible way to achieve this sort of thing:
while true
print("Please enter a whole number between 1 and 5: ")
input = readline(stdin)
try
if parse(Int, input) <= 5 || parse(Int, input) >= 1
print("You entered $(input)")
break
end
catch
#warn "Enter a whole number between 1 and 5"
end
end
Sample Run:
Please enter a whole number between 1 and 5: 2
You entered 2
See this link for how to parse the user input into an int.
I can only give you picture of data I'm working with or the character that creates my problems in .csv file. I don't know how to get that character.
This pillar character is stopping fread working. Is there away to escape it? readr read_csv works through them with no problem. I have tried to drop, make it character column, use comment.char = "", but nothing seems to work.
Here what I'm hoping to get out (what I get out with read_csv)
# A tibble: 5 x 4
X1 trade date trade_condition
<dbl> <dbl> <date> <chr>
1 2902 28.3 2019-01-14 -12------P----
2 2903 28.0 2019-01-14 P
3 2904 28.0 2019-01-14 P
4 2905 28.0 2019-01-14 P
5 2906 28.1 2019-01-14 P
I'm using data.table_1.12.0
Here is Verbose = T
omp_get_max_threads() = 8
omp_get_thread_limit() = 2147483647
DTthreads = 0
RestoreAfterFork = true
Input contains no \n. Taking this to be a filename to open
[01] Check arguments
Using 8 threads (omp_get_max_threads()=8, nth=8)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
show progress = 1
0/1 column will be read as integer
[02] Opening the file
Opening file C:/Users/Markku/Desktop/KONECRANES_2019.01.14/trades.csv
File opened, size = 592KB (606768 bytes).
Memory mapped ok
[03] Detect and skip BOM
[04] Arrange mmap to be \0 terminated
\n has been found in the input and different lines can end with different line endings (e.g. mixed \n and \r\n in one file). This is common and ideal.
[05] Skipping initial rows if needed
Positioned on line 1 starting: <<,trade,date,trade_condition,sy>>
[06] Detect separator, quoting rule, and ncolumns
Detecting sep automatically ...
sep=',' with 100 lines of 9 fields using quote rule 0
Detected 9 columns on line 1. This line is either column names or first data row. Line starts as: <<,trade,date,trade_condition,sy>>
Quote rule picked = 0
fill=false and the most number of columns found is 9
[07] Detect column types, good nrow estimate and whether first row is column names
Number of sampling jump points = 10 because (606767 bytes from row 1 to eof) / (2 * 27623 jump0size) == 10
Type codes (jump 000) : 57AAAA5AA Quote rule 0
A line with too-few fields (4/9) was found on line 4 of sample jump 7. Most likely this jump landed awkwardly so type bumps here will be skipped.
A line with too-few fields (4/9) was found on line 13 of sample jump 9. Most likely this jump landed awkwardly so type bumps here will be skipped.
Type codes (jump 010) : 57AAAA5AA Quote rule 0
'header' determined to be true due to column 2 containing a string on row 1 and a lower type (float64) in the rest of the 858 sample rows
=====
Sampled 858 rows (handled \n inside quoted fields) at 11 jump points
Bytes from first data row on line 2 to the end of last row: 606683
Line length: mean=213.01 sd=86.78 min=59 max=372
Estimated number of rows: 606683 / 213.01 = 2849
Initial alloc = 5698 rows (2849 + 100%) using bytes/max(mean-2*sd,min) clamped between [1.1*estn, 2.0*estn]
=====
[08] Assign column names
[09] Apply user overrides on column types
After 0 type and 0 drop user overrides : 57AAAA5AA
[10] Allocate memory for the datatable
Allocating 9 column slots (9 - 0 dropped) with 5698 rows
[11] Read the data
jumps=[0..1), chunk_size=606683, total_size=606683
Restarting team from jump 0. nSwept==0 quoteRule==1
jumps=[0..1), chunk_size=606683, total_size=606683
Restarting team from jump 0. nSwept==0 quoteRule==2
jumps=[0..1), chunk_size=606683, total_size=606683
Restarting team from jump 0. nSwept==0 quoteRule==3
jumps=[0..1), chunk_size=606683, total_size=606683
Read 2903 rows x 9 columns from 592KB (606768 bytes) file in 00:00.014 wall clock time
[12] Finalizing the datatable
Type counts:
2 : int32 '5'
1 : float64 '7'
6 : string 'A'
=============================
0.003s ( 21%) Memory map 0.001GB file
0.007s ( 50%) sep=',' ncol=9 and header detection
0.000s ( 0%) Column type detection using 858 sample rows
0.000s ( 0%) Allocation of 5698 rows x 9 cols (0.000GB) of which 2903 ( 51%) rows used
0.004s ( 29%) Reading 1 chunks (0 swept) of 0.579MB (each chunk 2903 rows) using 1 threads
+ 0.000s ( 0%) Parse to row-major thread buffers (grown 0 times)
+ 0.002s ( 14%) Transpose
+ 0.002s ( 14%) Waiting
0.000s ( 0%) Rereading 0 columns due to out-of-sample type exceptions
0.014s Total
Warning message:
In fread(trades_file, verbose = T) :
Stopped early on line 2905. Expected 9 fields but found 4. Consider fill=TRUE and comment.char=. First discarded non-empty line: <<2903,28.04,2019-01-14,"P>>
I would like to remove the duplicate records from my large .xdf file trans.xdf.
Here is the file details:
File name: /poc/revor/data/trans.xdf
Number of observations: 1000000000
Number of variables: 5
Number of blocks: 40
Compression type: zlib
Variable information:
Var 1: CARD_ID, Type: character
Var 2: SE_NO, Type: character
Var 3: r12m_cv, Type: numeric, Low/High: (-2348.7600, 40587.3900)
Var 4: r12m_roc, Type: numeric, Low/High: (0.0000, 231.0000)
Var 5: PROD_GRP_CD, Type: character
Also below is the sample data of the file:
CARD_ID SE_NO r12m_cv r12m_roc PROD_GRP_CD
900000999000000000 1045815024 110 1 1
900000999000000000 1052487253 247.52 2 1
900000999000000000 9999999999 38.72 1 1
900000999000000000 1090389768 1679.96 16 1
900000999000000000 1091226035 0 1 1
900000999000000000 1091241208 538.68 4 1
900000999000000000 9999999999 83 1 1
900000999000000000 1091468041 148.4 3 1
900000999000000000 1092640358 3.13 1 1
900000999000000000 1093468692 546.29 1 1
I have tried using rxDataStep function to use its transform parameter to call to unique() function over the .xdf file. Below is the code for the same:
uniq_dat <- function( dataList )
{
datalist <- unique(datalist)
return(datalist)
}
rxDataStepXdf(inFile = "/poc/revor/data/trans.xdf",outFile = "/poc/revor/data/trans.xdf",transformFunc = uniq_dat,overwrite = TRUE)
But was getting below error:
Error in unique(datalist) : object 'datalist' not found
Error in transformation function: Error in unique(datalist) : object 'datalist' not found
Error in rxCall("RxDataStep", params) :
So anybody could point out the mistake that I am doing here or if there is a better way to remove the duplicate records from the .Xdf file. I am avoiding loading the data into inmemory dataframe as the data is pretty huge.
I am running the above code in Revolution R Environment over HDFS.
If the same can be obtained by any other approach then the example for the same would be appreciated.
Thanks for the help in advance :)
Cheers,
Amit
you can remove the duplicate values providing removeDupKeys=TRUE parameter for rxSort() function. For example for your case:
XdfFilePath <- file.path("<your file's fully qualified path>/trans.xdf")
rxSort(inData = XdfFilePath,sortByVars=c("CARD_ID","SE_NO","r12m_cv","r12m_roc","PROD_GRP_CD"), removeDupKeys=TRUE)
if you want to remove duplicate records based on a specific key column, for example, based on SE_NO column
set the key value as sortByVars="SE_NO"
I'm pretty new in statistics:
fisher = function(idxToTest, idxATI){
idxDependent=c()
dependent=c()
p = c()
for(i in c(1:length(idxToTest)))
{
tbl = table(data[[idxToTest[i]]], data[[idxATI]])
rez = fisher.test(tbl, workspace = 20000000000)
if(rez$p.value<0.1){
dependent=c(dependent, TRUE)
if(rez$p.value<0.1){
idxDependent = c(idxDependent, idxToTest[i])
}
}
else{
dependent = c(dependent, FALSE)
}
p = c(p, rez$p.value)
}
}
This is the function I use. It seems to work.
What I understood until now is that I have to pass as first parameter data like:
Men Women
Dieting 10 30
Non-dieting 5 60
My data comes from a CSV:
data = read.csv('***.csv', header = TRUE, sep=',');
My first problem is that I don't know how to converse from:
Loan.Purpose Home.Ownership
lp_value_1 ho_value_2
lp_value_1 ho_value_2
lp_value_2 ho_value_1
lp_value_3 ho_value_2
lp_value_2 ho_value_3
lp_value_4 ho_value_2
lp_value_3 ho_value_3
to:
ho_value_1 ho_value_2 ho_value_3
lp_value1 0 2 0
lp_value2 1 0 1
lp_value3 0 1 1
lp_value4 0 1 0
The second issue is that I don't know what the second parameter should be
POST UPDATE: This is what I get using fisher.test(myTable):
Error in fisher.test(test) : FEXACT error 501.
The hash table key cannot be computed because the largest key
is larger than the largest representable int.
The algorithm cannot proceed.
Reduce the workspace size or use another algorithm.
where myTable is:
MORTGAGE NONE OTHER OWN RENT
car 18 0 0 5 27
credit_card 190 0 2 38 214
debt_consolidation 620 0 2 87 598
educational 5 0 0 3 7
...
Basically, fisher tests only work on smallish data sets because they require alot of memory. But all is good because chi-square tests make minimal additional assumptions and are easier on the computer. Just do:
chisq.test(Loan.Purpose,Home.Ownership)
to get your p-values.
Make sure you read through and understand the help page for chisq.test, especially the examples at the bottom.
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/chisq.test.html
Then look at a mosaicplot to see the quantities like:
mosaicplot(Loan.Purpose,Home.Ownership)
this reference explains how mosaicplots work.
http://alumni.media.mit.edu/~tpminka/courses/36-350.2001/lectures/day12/