I am working on association rules that are considered outliers. I noticed that arules does not show results for rules with a support below 0.10. Is there any way I could view rules that have a support of 0.1 (10%) or less?
I tried the following code to filter for rules with a support below 0.1. I suspect rules with a support below 0.1 do not show up because there would be too many? In any case, here's the code I'm using to look for rules with a support below 0.1. By the way, this code does work when I want to see rules with a support above 0.1.
rulesb <- rulesa[quality(rulesa)$support < 0.1]
From the examples in ?apriori:
data("Adult")
## Mine association rules.
rules <- apriori(Adult,
                 parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
summary(rules)
Set supp to the value you like. apriori only generates rules whose support is at or above the supp threshold used at mining time (the default is 0.1), so rules below 0.1 can never be filtered out of the result afterwards unless you mine with a lower threshold first.
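A minimal sketch of the full workflow on the same Adult data (the 0.05 mining threshold is only an illustrative choice and may still produce a large rule set):
library(arules)
data("Adult")

## Mine with a minimum support below 0.1 so that low-support rules are generated at all
rules_low <- apriori(Adult,
                     parameter = list(supp = 0.05, conf = 0.9, target = "rules"))

## Now the subsetting from the question works, because rules below 0.1 actually exist
rules_below <- rules_low[quality(rules_low)$support < 0.1]
inspect(head(rules_below, 5))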
I want to use Simulated Annealing. My objective function consists of multiple variables, and for some of them only a few values are possible. I saw the same question here on Stack Overflow:
How to use simulated annealing for a function with discrete paremeters?, but there was no answer, only a reference to: How to put mathematical constraints with GenSA function in R.
I don't understand how to apply the advice from the second link to my situation (but I think the answer can be found there).
For example:
v <- c(50, 50, 25, 25)
lower <- c(0,0,0,20)
upper <- c(100,100,50,40)
out <- GenSA(v, lower = lower, upper = upper, fn = efficientFunction)
Assume that the fourth parameter, v[4], can only take values in {20, 25, 30, 35, 40}. They suggested the use of Lagrange multipliers, hence I was thinking of something like: lambda * ceil(v[4] / 5). Is this a good idea?
But what can I do if the sample space of a variable does not have a nice pattern, for example when the third parameter, v[3], can only take values in {0, 21, 33, 89, 100}? I don't understand why a Lagrange multiplier helps in this situation. Do I need to reparameterize my variables so that they follow a pattern, or is there another option?
In case Lagrange multipliers are the only option, I'll end up with 8 of these formulations in my objective. It seems to me that there is another option, but I don't know how!
With kind regards and thanks in advance,
Roos
With SA, you could start with a very simple neighbourhood scheme:
pick one of the parameters and change it by selecting a new valid setting, one above or one below the current one (we assume the settings have an order, which seems to be your case).
There are no Lagrange multipliers involved in SA as far as I know, but there are many variations, and maybe some constrained variants make use of them.
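A minimal sketch of such a neighbourhood move in R, assuming the allowed value sets per parameter are known (the sets below are illustrative); base R's optim() with method = "SANN" accepts a proposal function like this through its gr argument:
## Allowed values per parameter (illustrative; replace with your own sets)
allowed <- list(0:100, 0:100, c(0, 21, 33, 89, 100), c(20, 25, 30, 35, 40))

neighbour <- function(v, ...) {
  i <- sample(length(v), 1)                 # pick one parameter at random
  vals <- allowed[[i]]
  j <- which.min(abs(vals - v[i]))          # position of the current setting
  j <- min(max(j + sample(c(-1, 1), 1), 1), length(vals))  # one step up or down, clipped
  v[i] <- vals[j]
  v
}

## e.g.: out <- optim(c(50, 50, 33, 25), efficientFunction, gr = neighbour, method = "SANN")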
I'm currently using SLSQP, and defining my design variables like so:
p.model.add_design_var('indeps.upperWeights', lower=np.array([1E-3, 1E-3, 1E-3]))
p.model.add_design_var('indeps.lowerWeights', upper=np.array([-1E-3, -1E-3, -1E-3]))
p.model.add_constraint('cl', equals=1)
p.model.add_objective('cd')
p.driver = om.ScipyOptimizeDriver()
However, it insists on trying [1, 1, 1] for both variables. I can't override with val=[...] in the component because of how the program is structured.
Is it possible to get the optimiser to accept some initial values instead of trying to set anything without a default value to 1?
By default, OpenMDAO initializes variables to a value of 1.0 (this tends to avoid unintentional divide-by-zero if things were initialized to zero).
Specifying shape=... on an input or output results in the variable values being populated with 1.0.
Specifying val=... uses the given value as the default value.
But that's only the default values. Typically, when you run an optimization, you need to specify initial values of the variables for the given problem at hand. This is done after setup, through the problem object.
The set_val and get_val methods on the problem allow the user to work in specified units (Newtons are used here just as an example):
p.set_val('indeps.upperWeights', np.array([1E-3, 1E-3, 1E-3]), units='N')
p.set_val('indeps.lowerWeights', np.array([-1E-3, -1E-3, -1E-3]), units='N')
There's a corresponding get_val method to retrieve values in the desired units after optimization.
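For example (the variable name and units are carried over from the snippet above):
upper_weights_N = p.get_val('indeps.upperWeights', units='N')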
You can also access the problem object as though it were a dictionary, although doing so removes the ability to specify units (you get the variable values in their native units).
p['indeps.upperWeights'] = np.array([1E-3, 1E-3, 1E-3])
p['indeps.lowerWeights'] = np.array([-1E-3, -1E-3, -1E-3])
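Putting it together, a minimal sketch of the ordering, assuming an IndepVarComp named indeps as in the question; the cl/cd constraint and objective are omitted, so run_driver is left commented out:
import numpy as np
import openmdao.api as om

p = om.Problem()
indeps = p.model.add_subsystem('indeps', om.IndepVarComp())
indeps.add_output('upperWeights', shape=3)   # defaults to 1.0 per entry
indeps.add_output('lowerWeights', shape=3)

p.model.add_design_var('indeps.upperWeights', lower=np.array([1E-3, 1E-3, 1E-3]))
p.model.add_design_var('indeps.lowerWeights', upper=np.array([-1E-3, -1E-3, -1E-3]))
p.driver = om.ScipyOptimizeDriver()

p.setup()

# Problem-specific initial values go in after setup(), through the problem object
p.set_val('indeps.upperWeights', np.array([1E-3, 1E-3, 1E-3]))
p.set_val('indeps.lowerWeights', np.array([-1E-3, -1E-3, -1E-3]))

# p.run_driver()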
If I have a design variable with lower and upper bounds of 0 and 1e6 and an initial value of 1e5,
it is surely very insensitive to the default finite difference step of 1e-6.
Is the correct way of overcoming this problem:
a) change the FD step size, e.g. to 5e4, or
b) scale the design variable with a 'scaler' of 1e6 and set the lower and upper bounds to 0 and 1, while keeping the default FD step?
I think "a" is your best bet if you are using the latest (OpenMDAO 2.x).
When you call declare_partials for a specific derivative in a component, or when you call approx_totals on a group, you can pass in an optional argument called "step", which contains the desired stepsize. Since your variable spans [0, 1e6], I think maybe a step size between 1e1 and 1e3 would work for you.
Idea "b" wouldn't actually work at present for fixing the FD problem. The step size you declare is applied to the unscaled value of the input, so you would still have the same precision problem. This is true for both kinds of scaling (1. specified on add_output, and 2. specified on add_design_var). Note though that you may still want to scale this problem anyway because the optimizer may work better on a scaled problem. If you do this then, you should still declare the larger "step" size mentioned above.
BTW, another option is to use a relative stepsize in the 'fd' calculation by setting the "step_calc" argument to "rel". This turns the absolute stepsize into a relative stepsize. However, I don't recommend this here because your range includes zero, and when it is close to zero, the stepsize falls back to an absolute one to prevent it from being too tiny.
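A minimal sketch of where these arguments go; the component, variable names, and step values are made up for illustration:
import openmdao.api as om

class MyComp(om.ExplicitComponent):
    """Illustrative component; the names and the function are made up."""
    def setup(self):
        self.add_input('x', val=1e5)
        self.add_output('y', val=0.0)
        # Absolute step sized for a variable that spans [0, 1e6]
        self.declare_partials('y', 'x', method='fd', step=1e2)
        # Alternatively, a relative step:
        # self.declare_partials('y', 'x', method='fd', step=1e-4, step_calc='rel')

    def compute(self, inputs, outputs):
        outputs['y'] = (inputs['x'] - 5e5) ** 2

# The same options exist at the group level, e.g.:
# p.model.approx_totals(method='fd', step=1e2)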
I am using the Apriori algorithm to identify frequent item sets of customers. Based on the identified frequent item sets, I want to suggest items to a customer when the customer adds a new item to his shopping list. Assume one of my identified frequent sets is [2,3,5]. My question is:
if the user has already added item 2 and item 5, I want to check the confidence of the rule in order to suggest item 3. For that, is it
confidence = support(2,3,5) / support(3)
OR
confidence = support(2,3,5) / support(2,5)?
Which equation is correct? Please help!
If the association rule is (2,5) -> (3), then X = (2,5) and Y = (3). The confidence of an association rule is the support of (X U Y) divided by the support of X. Therefore, the confidence of this association rule is the support of (2,5,3) divided by the support of (2,5).
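To make the division concrete, here is a tiny R sketch with made-up baskets (the numbers are purely illustrative):
## Five made-up shopping baskets
baskets <- list(c(2, 3, 5), c(2, 5), c(2, 3, 5), c(1, 2, 5), c(3, 5))

contains <- function(basket, items) all(items %in% basket)

sup_25  <- mean(sapply(baskets, contains, items = c(2, 5)))     # support of the antecedent {2,5}: 4/5
sup_235 <- mean(sapply(baskets, contains, items = c(2, 3, 5)))  # support of {2,3,5}: 2/5

sup_235 / sup_25   # confidence of (2,5) -> (3): 0.4 / 0.8 = 0.5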
Suppose the rule is A ^ B -> C. Then
confidence = support(A ^ B ^ C) / support(A ^ B)
i.e. the number of transactions in which all three items are present, divided by
the number of transactions in which both A and B are present.
So the answer is confidence = support(2,5,3) / support(2,5).
If you just want the answer without any explanation:
confidence = support(2,3,5) / support(2,5), the second option in your question, is the answer.
What is your antecedent?
Stop treating equations as black boxes that you need to look up. Understand them or you will fail.
The apriori algorithm produces a large number of rules. Is there any way to query/filter the resulting rule set, for example looking for rules with specific items appearing in the antecedent, or rules of a specific size?
I can help you only partially.
When using the apriori algorithm in R, you can add an option to specify the maximum size of the output rules. The parameter to do that is maxlen.
In this case, for example, I want to obtain only rules of size 2.
rules <- apriori(orders, parameter = list(supp = 0.01, conf = 0.5, maxlen = 2))
With this parameter we will obtain rules such as:
[item1] -> [item2]
Daniele
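For the other part of the question, filtering by items in the antecedent, arules also provides a subset() method for rule sets. A minimal sketch, assuming a mined rules object named rules and using "item1" as a placeholder for one of your actual item labels:
library(arules)

## Rules whose antecedent (lhs) contains a specific item
rules_item1 <- subset(rules, subset = lhs %in% "item1")

## Rules with exactly one item on each side (size 2 overall)
rules_len2 <- rules[size(lhs(rules)) == 1 & size(rhs(rules)) == 1]

inspect(rules_item1)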