Problems with convergence when using solnp function in R - r

When solving a portfolio optimization problem with an additional 1-norm constraint on the portfolio weights, I ran into convergence problems.
Description of the exercise:
Given N assets with T observations of their returns, find the value of the 1-norm constraint Theta such that the last-period portfolio return (the T-th one) is maximized. That is, solve the problem: min_w w'COVw s.t. w1+w2+...+wN=1 and |w1|+|w2|+...+|wN|<=Theta and, out of all values of Theta, choose the one with the maximum value of w'r_T, where r_T is the vector of asset returns in the last period and COV is the variance-covariance matrix of asset returns.
Description of the problem:
First I tried the "naive" approach: on a grid with step 0.001 for Theta from 1 to 6, I solve the portfolio optimization problem and compute the last-period portfolio return for each Theta. The idea was then to choose the value of Theta with the largest corresponding last-period return. However, I noticed that for quite a few values of Theta the solnp function did not converge. The problem occurred mostly for small values of Theta, from 1 to 3. For larger values no convergence problems were detected.
The second approach was to use the solnp function twice: first as the function to find Theta, and second as the inner part of the objective function. However, I could not find reliable estimates this way: the values I got did not deliver the optimal solution. Apparently the objective function is not smooth, but the gosolnp function does not find the solution either.
The code with data (6 assets with 120 return observations) is provided below. Any suggestions are welcome.
> exp_d
[,1] [,2] [,3] [,4] [,5] [,6]
1 1.3724 0.9081 -0.0695 5.7168 1.9642 1.4222
2 0.6095 1.5075 5.3842 2.7154 2.6838 6.3154
3 -2.6779 -0.1359 -0.4374 1.4287 0.0709 -0.7967
4 -3.5365 -4.3572 -2.0112 -3.5898 -2.3460 -4.0970
5 3.1210 3.6608 2.0944 3.1292 2.8965 3.4614
6 2.7364 1.8411 3.2639 2.9678 2.6067 2.3950
7 -1.0001 -0.3782 3.9316 -0.2621 0.0347 4.4635
8 3.9022 6.3784 6.6192 5.0044 3.5568 8.6305
9 -1.6000 -0.9889 -3.1676 1.3025 0.2071 -2.4764
10 -1.3184 0.8741 3.4796 3.0510 -0.7634 -0.5452
11 5.5482 3.3467 13.3256 5.4076 5.1017 7.4921
12 -1.5484 1.3040 -3.9474 -0.9628 -3.0156 -1.6326
13 4.3331 5.2347 3.9846 9.2369 6.7429 7.2603
14 2.3503 -2.2044 0.8600 3.9191 1.2181 -1.9651
15 2.3981 2.2316 0.3990 5.3864 4.3919 5.9674
16 -0.1633 -2.1458 -5.8357 -3.6349 -4.2840 -6.6219
17 10.4346 8.0620 10.2275 7.0560 6.7676 6.6346
18 5.5505 2.6016 2.4506 2.4954 1.8547 3.4755
19 3.2031 2.7804 3.5948 -0.4774 -0.3667 -2.3168
20 -4.7913 -1.7203 -4.1271 -0.6762 -1.1395 -2.7296
21 7.3930 8.6229 9.4570 12.2800 6.1327 7.8254
22 4.2158 10.6845 9.9723 2.9145 6.0000 4.4979
23 7.5326 1.9540 2.5740 2.6065 -0.1128 0.6388
24 -8.5131 -8.3044 -6.8294 -3.6094 -4.1224 -5.4164
25 -0.4048 -0.4017 -0.8867 1.3590 0.1098 0.9017
26 6.1240 5.0517 3.6990 8.7368 5.3867 6.9468
27 5.7317 4.4538 6.1762 3.4108 1.9153 4.4896
28 6.8299 3.1244 1.6621 1.3590 1.4325 2.0067
29 8.6705 11.1936 12.2831 11.1602 13.2781 13.1497
30 0.9055 -0.5953 -0.6462 0.9332 -0.0008 1.2917
31 -0.0340 1.9379 1.4480 6.5262 4.8373 2.6307
32 0.7414 1.1014 0.3820 -0.5791 0.8306 3.1476
33 -5.9533 -3.4602 -4.1597 -1.3835 2.2098 -0.0642
34 -0.0822 2.7549 0.5136 2.2172 1.1145 2.8362
35 -10.2009 -9.3603 -12.8907 -5.7297 -4.1622 -6.1709
36 6.9673 9.4356 6.4064 12.4593 9.3500 7.5052
37 -0.0681 -0.7029 3.8633 3.5471 6.2018 4.9529
38 -0.8885 -0.4641 -0.4544 7.2353 12.5430 6.9497
39 -4.2569 -4.3105 -3.6218 -4.4211 -4.9685 -6.3175
40 -21.7261 -19.3351 -19.8329 -24.1865 -14.0943 -10.3723
41 -16.4096 -9.8425 -12.3366 -14.2535 -10.9314 -7.6677
42 -2.7511 -2.7249 -1.8488 3.3988 0.5515 1.3170
43 6.2347 9.4395 8.4631 7.1735 3.3113 2.9221
44 1.3502 1.2075 4.3779 3.4639 1.8818 1.2221
45 9.0009 11.0661 10.7647 6.8160 8.3348 5.3573
46 -6.0956 -1.7406 -3.4814 -2.3610 -1.7939 -6.6523
47 -3.7611 -2.3132 -2.9578 0.7061 -2.2643 -1.3650
48 -16.9330 -16.6497 -18.9287 -17.8490 -13.1565 -11.8439
49 6.5208 3.7280 2.5406 4.4017 4.2198 5.2644
50 -3.7646 -1.6135 -2.5073 1.1969 -0.8000 -1.6232
51 -13.6823 -14.6721 -19.1619 -12.0325 -11.7394 -17.0858
52 -10.1007 -7.6801 -10.7443 -9.4278 -7.3264 -11.4835
53 0.2234 -2.7838 -2.6023 -2.3616 -2.8339 -6.6002
54 -10.4236 -11.0266 -17.2866 -5.8057 -9.1776 -9.7245
55 9.9121 11.3602 15.4627 4.7170 6.9823 13.5911
56 13.2032 11.6577 17.0900 12.8706 6.7357 12.1756
57 -4.3440 -2.9204 -6.9601 -4.1971 -10.3577 -8.8826
58 -12.8949 -14.0212 -18.2229 -8.8412 -11.1070 -11.3490
59 -9.6709 -8.7006 -13.8440 -11.3932 -15.9343 -20.3630
60 10.1262 7.7295 20.3174 12.8509 16.1956 25.2657
61 -8.5867 -8.9411 -5.7363 -5.1668 -10.1775 -12.2143
62 -1.6114 -1.1797 -3.7504 0.8986 -1.5948 0.0531
63 -26.5722 -30.0521 -33.8286 -28.8671 -28.1493 -35.1131
64 4.2319 9.2071 5.4593 9.5401 3.2588 11.7214
65 -7.7335 -7.2915 -9.3902 -6.1288 -16.6307 -14.5665
66 -13.7987 -16.5088 -22.4192 -11.5667 -18.8221 -20.6686
67 2.4924 3.9264 10.5116 -3.4347 1.9132 6.6353
68 -2.4754 1.9822 1.3833 7.1142 1.7366 0.3507
69 -10.5259 -12.1366 -10.6244 -10.2031 -15.1403 -14.7460
70 -19.3072 -17.1449 -16.0430 -18.1275 -20.1476 -18.5418
71 -17.1387 -24.8400 -17.4741 -19.2184 -25.9531 -25.4360
72 0.9540 6.2481 1.0364 -1.7110 0.0545 8.8476
73 31.9086 36.1125 63.2475 28.1955 48.6209 67.7847
74 48.7266 55.3725 83.5754 31.3211 50.9661 62.5277
75 -2.8054 -4.5841 -12.4737 -1.0366 -6.5149 -5.0087
76 -16.2535 -19.6450 -23.8672 -10.7514 -17.5126 -23.2272
77 -0.2972 -8.2786 -13.2053 -2.7223 -8.9288 -16.3816
78 -5.1581 -4.2529 -11.0631 6.8503 1.0438 -3.5633
79 4.3896 1.3914 7.8950 -0.0120 3.0297 8.7685
80 -17.8468 -18.4137 -20.8881 -14.9493 -16.5816 -17.3738
81 7.0790 6.1577 16.0332 0.9569 9.8608 6.7577
82 45.4453 54.8024 56.5610 33.7390 51.8933 57.4905
83 59.9378 62.1965 73.3394 16.9717 26.7781 41.6260
84 32.9373 23.3035 18.5882 11.3683 15.5089 21.9496
85 -14.1385 -12.7712 -7.2418 -8.8583 -13.0573 -9.2208
86 10.5585 10.1102 8.2144 10.5958 15.7784 18.8615
87 -7.6121 -13.3352 -20.9042 -9.1454 -12.3603 -19.3866
88 -8.4949 -12.4200 -13.8362 -6.1614 -10.5922 -17.7448
89 4.4492 4.6618 6.1086 9.6900 12.2772 12.6924
90 4.7702 4.0567 0.4565 2.1093 1.8286 3.3536
91 21.5203 28.5643 38.3403 9.8669 16.8246 24.1368
92 -0.2935 1.1796 5.3723 -2.3815 -3.1241 -3.8100
93 4.7722 2.0674 -0.0727 -0.0029 -0.1362 -0.6200
94 2.8356 -1.3170 -1.8303 -1.7780 -2.3152 -4.6160
95 -7.3381 -9.8907 -12.0504 -6.4033 -8.5879 -13.5106
96 3.7672 0.0811 -2.3100 2.6280 2.4811 2.8615
97 -18.0226 -21.5042 -24.4760 -8.3110 -11.7051 -23.1790
98 10.9235 9.4125 12.2230 6.0718 4.4464 5.7691
99 -0.9255 -0.5806 -3.5672 -0.2597 -0.1487 -0.1050
100 -0.9250 -1.6380 -4.1937 -0.4115 -2.7080 -7.3690
101 17.4218 15.4433 12.5284 10.2402 5.0137 10.6573
102 4.9409 1.7554 1.5981 0.7507 0.5333 -2.2372
103 -5.5523 -3.3988 -3.0325 -3.0099 -2.8838 -9.3128
104 -3.5225 -4.8589 -5.8780 -0.8581 -1.6838 -12.9875
105 -5.9234 -7.4692 -11.2668 -2.9715 -3.3289 -7.7499
106 6.7174 9.3484 10.2238 8.0311 9.8871 13.0598
107 -2.4801 2.3317 1.7345 3.1370 4.3390 3.9570
108 -1.7502 5.9262 0.8071 6.3821 5.6136 8.0813
109 9.1528 12.9358 12.7537 6.9882 6.3196 17.1148
110 4.2227 8.6796 14.4462 2.5694 2.2001 3.7475
111 5.0917 5.5331 0.5575 4.6757 0.6768 1.1198
112 10.9064 10.8504 6.8068 6.6016 7.9769 6.0328
113 6.6969 10.4475 18.9743 3.0389 5.6087 14.5739
114 5.7713 10.1556 2.2152 2.9835 5.8064 8.6858
115 10.3194 7.6727 22.3771 3.2881 9.5299 12.2406
116 1.9010 6.5601 6.7824 1.9137 2.8060 7.1039
117 0.5096 2.3380 0.8324 2.6378 0.0632 -1.0088
118 -14.3931 -13.9743 -15.4640 -7.1563 -8.3505 -10.2494
119 4.9011 5.5856 8.6767 5.0578 5.1107 6.5508
120 -2.2080 -0.2588 -1.2498 3.6325 2.1402 0.2676
library(Rsolnp)  # provides solnp() and gosolnp()
# define equality constraint function
equal <- function(x) c(sum(x))
# define inequality constraint function
in_inequal <- function(x) c(sum(abs(x)))
# define objective function 1: portfolio variance
obj_f <- function(x) {
  int_r <- t(x) %*% V_C_M %*% x
  c(as.numeric(int_r))
}
# define objective function 2: negative last-period return for a given Theta
ex_obj_f <- function(x) {
  tteta <- x
  port_w <- solnp(rep(1/n, n), fun = obj_f, eqfun = equal, eqB = 1,
                  ineqfun = in_inequal, ineqLB = 0, ineqUB = tteta, control = list(trace = 0))
  lp_ret <- exp_d[nrow(exp_d), ] %*% port_w$pars
  -lp_ret
}
# First "naive" attempt
exp_d <- as.matrix(exp_d)
n <- 6
V_C_M <- cov(exp_d)
res <- matrix(0, nrow = 5000, ncol = 3)
for (i in 1:5000) {
  tteta <- 1 + i * 0.001
  port_w <- solnp(rep(1/n, n), fun = obj_f, eqfun = equal, eqB = 1,
                  ineqfun = in_inequal, ineqLB = 0, ineqUB = tteta, control = list(trace = 0))
  lp_ret <- exp_d[nrow(exp_d), ] %*% port_w$pars
  res[i, 1] <- tteta
  res[i, 2] <- lp_ret
  res[i, 3] <- port_w$convergence
}
# Second approach (the result really depends on the starting value of the parameter)
tt_op1 <- solnp(pars = 1.5, fun = ex_obj_f, LB = 1, UB = 10, control = list(trace = 1))
tt_op2 <- gosolnp(pars = 1.5, fun = ex_obj_f, LB = 1, UB = 10)
P.S. I have read posts about similar problems here, but could not find a solution to my question.

Your model can be formulated as a pure QP (quadratic programming) problem instead of a difficult nonlinear problem with a nonlinear non-differentiable constraint.
The constraint
sum(i, |x(i)|) <= Theta
can be linearized in different ways. One possible reformulation is
-y(i) <= x(i) <= y(i)
sum(i, y(i)) <= Theta
non-negative (or free) variable y(i)
Now you can solve the model with a QP solver instead of a general purpose NLP solver.
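As a rough illustration of this reformulation, here is a minimal sketch using solve.QP from the quadprog package (one possible QP solver; the answer above does not name one). The helper name min_var_l1, the ridge_scale parameter and the synthetic returns are assumptions made for the example. Because solve.QP requires a positive definite matrix, the sketch puts a tiny ridge on the auxiliary y variables, which perturbs the solution very slightly.

```r
library(quadprog)

# Minimum-variance weights subject to sum(w) = 1 and sum(|w|) <= theta.
# The QP variables are z = (w, y) with -y <= w <= y and sum(y) <= theta.
# min_var_l1 and ridge_scale are hypothetical names for this sketch.
min_var_l1 <- function(cov_mat, theta, ridge_scale = 1e-6) {
  n <- ncol(cov_mat)
  # solve.QP needs a positive definite Dmat, so the y-block gets a small
  # ridge (an assumption of this sketch, not part of the original model).
  ridge <- ridge_scale * mean(diag(cov_mat))
  Dmat <- rbind(cbind(2 * cov_mat, matrix(0, n, n)),
                cbind(matrix(0, n, n), ridge * diag(n)))
  dvec <- rep(0, 2 * n)
  # Each column a of Amat encodes one constraint t(a) %*% z >= b.
  Amat <- cbind(
    c(rep(1, n), rep(0, n)),   # sum(w) = 1 (equality, because meq = 1)
    rbind(-diag(n), diag(n)),  # y(i) - w(i) >= 0
    rbind(diag(n), diag(n)),   # y(i) + w(i) >= 0
    c(rep(0, n), rep(-1, n))   # -sum(y) >= -theta
  )
  bvec <- c(1, rep(0, 2 * n), -theta)
  sol <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)
  sol$solution[seq_len(n)]
}

# Synthetic returns stand in for the data in the question.
set.seed(1)
returns <- matrix(rnorm(120 * 6), nrow = 120)
w <- min_var_l1(cov(returns), theta = 1.5)
print(round(w, 4))
```

With the question's data, the call inside the grid loop would be min_var_l1(V_C_M, tteta) in place of the solnp call. Values of Theta below 1 are infeasible, since the weights must sum to 1.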

Related

How to get the last elements of the values of selected keys of a dictionary in DolphinDB?

dt=dict(STRING,ANY)
dt[`AAPL]=10 11 12
dt[`AMZN]=61 62 63
dt[`NFLX]=34 35 36 37;
For this dictionary in DolphinDB, how can I take the last element of the value for AAPL and NFLX to get the vector [12,37]?
Try this:
def f(d, symbols){
return each(x->d[x].tail(), symbols)
}
f(dt, `AAPL`NFLX);

Plain text to Hexadecimal manually

How do I manually convert plain text to hexadecimal?
E.g. the hexadecimal form of Hello.
P.S. I do not need code, just the manual way to convert.
-- Convert each character of the string to its ASCII code
-- Convert each ASCII code (decimal) to hex
E.g. Hello in ASCII is
H is 72, e is 101, l is 108, o is 111
And the hex value of
72 is 48
101 is 65
108 is 6c
111 is 6f
So the hex representation of Hello is 48656c6c6f
For example, take the string Hello character by character. H = 72 (integer value); to convert it to hexadecimal:
DIVISION = 72 / 16, RESULT = 4, REMAINDER (in hex) = 8 (4.5 - 4 = 0.5, 0.5 * 16 = 8)
DIVISION = 4 / 16, RESULT = 0, REMAINDER (in hex) = 4
Repeat until the result becomes zero, then read the remainders from last to first.
ANSWER: H = 48 (hex)
Do likewise for all the other characters.
Finally, Hello = 48656c6c6f
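Although the question asks for the manual method, the same steps can be checked with a short R sketch. The helper to_hex is a hypothetical name. It repeats the division by 16 described above and collects the remainders from last to first. Character codes below 16 would need a leading zero, which this sketch does not add.

```r
# Convert one positive integer to hex by repeated division by 16.
to_hex <- function(n) {
  digits <- c(0:9, letters[1:6])        # "0".."9", "a".."f"
  out <- character(0)
  while (n > 0) {
    out <- c(digits[n %% 16 + 1], out)  # remainder becomes the next digit to the left
    n <- n %/% 16                       # integer quotient feeds the next step
  }
  paste(out, collapse = "")
}

codes <- utf8ToInt("Hello")             # character codes: 72 101 108 108 111
hex <- paste(vapply(codes, to_hex, character(1)), collapse = "")
print(hex)
```

Base R's charToRaw("Hello") gives the same bytes directly and can serve as a cross-check.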

Wrong number of rows when reading a CSV file

I have a CSV file that I wrote with a Perl script. The file opens fine in Excel or in Simple Text and looks fine. It has 9 rows. However, when I count the rows in R with nrow() or dim(), I get 8 rows. This is causing downstream problems. Headers are 'a' to 'j'. Thanks.
a 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
b 0.401374 0.467736 0.582949 0.751601 0.860567 0.967758 0.965143 0.961866 0.863406 0.914746 0.984586 0.950531 0.935572 0.949083 0.968802 0.958067 0.980222 0.9917 1.009155 1.013709 1.008558 0.99945 0.988164 0.976623 0.973183 0.96519 0.968162 0.966721 0.962864 0.965214 0.968562 0.97235 0.981299 0.99698 1.013786 1.033542 1.050533 1.060338 1.072083 1.067729 1.057589 1.053562 1.030205 1.013217 0.994013 0.986159 0.981776 0.974559 0.975097 0.969779 0.969334 0.960085 0.963134 0.963621 0.961985 0.963223 0.957363 0.980404 0.962947 0.974328 0.969675 0.976323 0.974097 0.966781 0.972603 0.962981 0.975821 0.958069 0.980906 0.975684 0.943835 0.948154 0.94311 0.942586 1.022319 1.009415 1.021423 1.059047 1.085726 1.010326 1.036282 1.057417 1.046533 1.159883 1.204652 1.151679 1.244229 1.301202 1.490301 1.304381 1.712297 1.348033 0.736757 0.640583 1.474143 5.664327 2.547607 9.543845 9.572942 4.721692 0
c 0.483217 0.29612 0.31702 0.543388 0.691817 0.734183 0.772058 0.881707 0.942905 0.921662 0.970798 0.953243 0.945404 1.019665 0.938993 0.971219 0.959108 0.987285 0.991304 1.027208 0.994463 0.984487 0.998657 0.978592 0.96603 0.961446 0.957071 0.955184 0.957707 0.954644 0.970809 0.962456 0.973713 0.991673 1.012059 1.029588 1.042737 1.06048 1.065989 1.07043 1.060842 1.046754 1.035313 1.01837 0.998625 0.985907 0.981162 0.979541 0.977763 0.976078 0.968934 0.968159 0.967233 0.97003 0.969417 0.973832 0.973617 0.984223 0.976866 0.979505 0.985046 0.977616 0.987978 0.976532 0.974292 0.982313 0.975786 0.972815 1.004171 0.974393 0.977434 0.9359 0.960213 0.985705 1.020929 1.011589 1.006536 0.988384 1.037618 1.004525 1.0499 1.075382 1.126694 1.097262 1.145451 1.138151 1.268054 1.364637 1.548332 1.784365 1.66168 1.857999 1.281119 0.714744 1.409833 5.417217 2.436466 15.516732 14.648507 4.515705 0
d 0.54739 0.417737 0.560592 0.762408 0.840282 0.906248 0.970471 0.949707 0.933483 0.934403 1.07911 0.96818 1.019784 0.984101 0.96848 0.962378 0.981269 1.010261 1.036639 1.020298 0.996359 1.002746 0.986174 0.987546 0.975991 0.963343 0.967528 0.968886 0.967459 0.962992 0.966011 0.973625 0.982147 0.995917 1.010114 1.029 1.04789 1.059755 1.07154 1.072574 1.060199 1.052861 1.040088 1.017165 0.996716 0.989731 0.970404 0.974642 0.970293 0.967025 0.964511 0.962078 0.966636 0.960035 0.957345 0.967206 0.964344 0.972463 0.970353 0.971953 0.965436 0.968887 0.979595 0.967244 0.978083 0.956349 0.976509 0.990198 0.967315 0.965619 0.937825 0.963115 0.937972 0.940783 0.950582 0.999596 0.964397 1.073948 1.011812 0.992207 0.968892 1.019393 1.036893 1.040682 1.136172 1.175936 1.370799 1.626169 1.540309 1.521391 1.696523 1.335615 1.526301 0.740462 2.008275 3.367288 3.155172 17.020668 5.690853 9.356391 0
e 0.534257 0.623387 0.658379 1.021547 1.086113 1.10879 1.092341 1.047527 0.978066 1.113138 1.097839 1.081836 1.061449 1.01633 0.977861 1.064722 0.993365 1.099759 1.082891 1.097126 1.068604 1.050802 1.035536 1.020507 1.005109 1.010964 1.00586 0.999783 0.998753 0.995041 0.991949 0.991496 0.99574 1.009774 1.028723 1.05391 1.05445 1.076628 1.073423 1.073404 1.061363 1.047512 1.033238 1.004406 0.996751 0.969424 0.958341 0.960392 0.945673 0.947249 0.95037 0.943966 0.933722 0.930457 0.930735 0.925822 0.932857 0.93624 0.926209 0.926974 0.921879 0.92411 0.938514 0.936167 0.946051 0.938521 0.91792 0.927171 0.927905 0.930573 0.941126 0.905906 0.885595 0.890934 0.956747 0.993943 1.004912 0.966991 1.029596 0.934891 0.902882 1.005912 1.055131 1.060036 1.210456 1.204307 1.363757 1.383982 1.301273 1.834122 2.071989 1.468085 2.066713 1.317749 1.516236 7.76809 3.930527 17.669452 11.815548 0 0
f 0.580985 0.297751 0.444868 0.651545 0.850767 0.88177 1.045047 1.069708 1.007082 0.970515 0.995463 1.077379 0.956378 0.9633 0.963782 0.972252 1.001651 1.012825 1.024981 1.039174 1.018857 1.014041 1.004354 0.982018 0.985064 0.985249 0.976443 0.974376 0.972889 0.971678 0.975724 0.976216 0.984206 0.996217 1.017416 1.036441 1.049075 1.06143 1.068939 1.070938 1.062538 1.047006 1.032694 1.014813 0.99588 0.979417 0.969517 0.974295 0.968714 0.964257 0.964937 0.959818 0.956047 0.95799 0.95596 0.950887 0.954934 0.958854 0.961795 0.962765 0.970278 0.96375 0.963601 0.961951 0.956463 0.963495 0.957578 0.955705 0.988666 0.975476 0.967505 0.956554 0.927677 0.955343 0.973189 1.010146 1.057061 0.998942 1.042087 1.069688 1.010457 1.050207 1.037386 1.131603 1.180845 1.164758 1.302756 1.670756 1.413374 1.596161 1.643926 1.543092 0.756803 0.536158 1.850752 6.163239 0.799615 9.58566 14.422327 0 0
g 0.368286 0.133209 0.189854 0.329541 0.312233 0.371553 0.55966 0.663372 0.678283 0.811317 0.896647 0.887872 0.919798 0.945003 0.895565 0.968837 0.991214 0.987583 1.00316 1.020707 1.015521 0.985502 0.998961 0.976184 0.98338 0.973889 0.9674 0.968549 0.966232 0.966105 0.966297 0.965883 0.976568 0.999395 1.011924 1.03344 1.051626 1.059014 1.063355 1.068936 1.053488 1.045903 1.031994 1.008174 0.991796 0.972343 0.973369 0.969431 0.967085 0.963154 0.966865 0.962234 0.95759 0.96642 0.966713 0.974709 0.973966 0.981547 0.984093 0.991954 0.985628 0.996822 0.991295 0.98659 0.989936 0.978239 0.977446 1.025974 1.042636 1.040808 0.982495 0.991225 1.015466 1.008242 1.030642 1.004306 1.086892 1.097275 1.120253 1.138095 1.135337 1.209962 1.225443 1.224011 1.338381 1.450842 1.727673 1.719172 1.82727 2.074713 1.709345 2.290568 2.321692 1.542005 0.798422 8.181066 2.759658 5.513723 22.122134 13.63921 2.675538
h 0.30497 0.32974 0.424478 0.455078 0.523571 0.606559 0.660406 0.703971 0.729999 0.915297 0.981265 0.96674 0.925204 1.036524 0.953261 0.978409 0.987847 1.01834 1.019895 1.038827 1.024035 1.012836 0.994345 0.994459 0.972257 0.97309 0.978206 0.968312 0.964948 0.962225 0.96529 0.974432 0.975632 0.994073 1.010742 1.030179 1.042012 1.056884 1.058926 1.060281 1.059781 1.038339 1.029365 1.009022 0.986078 0.978875 0.9716 0.969215 0.958117 0.972496 0.968037 0.97107 0.958519 0.970863 0.969962 0.975005 0.978711 0.984085 0.984683 0.984162 0.996244 0.997889 0.994661 1.001441 0.985552 1.021569 1.000549 1.002552 0.997683 1.033186 1.013344 1.019947 1.057587 1.033291 1.069199 1.036226 1.168877 1.175308 1.22975 1.11576 1.122753 1.106146 1.224346 1.258167 1.290459 1.477277 1.427201 1.742816 1.558004 1.386269 1.910887 1.920479 1.08143 1.206672 2.082642 3.048554 3.085038 18.491473 12.365233 0 0
j 0.354463 0.327358 0.398802 0.473 0.602168 0.764142 0.819362 0.914898 0.823412 1.010715 0.854421 0.892255 0.981967 0.966507 1.021983 0.975027 0.961088 0.960516 0.971975 1.01222 0.979767 0.987716 0.98707 0.970561 0.963265 0.962699 0.958097 0.961291 0.952577 0.958112 0.9596 0.967041 0.97311 0.991703 1.006086 1.025563 1.040921 1.055595 1.05902 1.068992 1.060288 1.046614 1.037744 1.019254 0.996458 0.984552 0.986514 0.984358 0.977773 0.980087 0.972711 0.969122 0.975125 0.968046 0.968058 0.979439 0.97843 0.978518 0.98551 0.979352 0.983617 0.984822 0.986629 0.986932 0.991861 1.002382 0.999269 0.99465 0.994519 0.987402 1.000541 0.977929 0.976282 0.964102 1.032155 1.04334 1.063832 1.096302 1.105991 1.065358 1.106644 1.068104 1.064264 1.167453 1.278531 1.383359 1.417057 1.672739 1.39427 1.396529 1.94346 1.50906 1.274638 1.467406 2.337833 6.387922 0 11.099379 16.193772 4.992065 0
Functions in R can have default values for some parameters. For read.csv:
read.csv(file, header = TRUE, sep = ",", quote = "\"",
dec = ".", fill = TRUE, comment.char = "", ...)
header=TRUE means that the first row is assumed to be the header of the file, so R does not count it as a data row. If you read your file with header=FALSE
read.csv(file, header=FALSE)
you will get the 9 rows.
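The effect is easy to reproduce on a small inline sample. The three-line CSV text below is made up for illustration and is passed through the text argument of read.csv instead of a file.

```r
# A made-up three-line CSV whose first line is data, not a header.
csv_text <- "a,0,1,2\nb,0.4,0.5,0.6\nc,0.5,0.3,0.3"

with_header <- read.csv(text = csv_text, header = TRUE)   # first line becomes column names
no_header   <- read.csv(text = csv_text, header = FALSE)  # first line is kept as data

print(nrow(with_header))
print(nrow(no_header))
```

With header = TRUE the first line is consumed as column names, so the data frame has one row fewer than the file has lines.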

How can I apply fisher test on this set of data (nominal variables)

I'm pretty new in statistics:
fisher = function(idxToTest, idxATI){
  idxDependent=c()
  dependent=c()
  p = c()
  for(i in c(1:length(idxToTest)))
  {
    tbl = table(data[[idxToTest[i]]], data[[idxATI]])
    rez = fisher.test(tbl, workspace = 20000000000)
    if(rez$p.value<0.1){
      dependent=c(dependent, TRUE)
      if(rez$p.value<0.1){
        idxDependent = c(idxDependent, idxToTest[i])
      }
    }
    else{
      dependent = c(dependent, FALSE)
    }
    p = c(p, rez$p.value)
  }
}
This is the function I use. It seems to work.
What I understood until now is that I have to pass as first parameter data like:
Men Women
Dieting 10 30
Non-dieting 5 60
My data comes from a CSV:
data = read.csv('***.csv', header = TRUE, sep=',');
My first problem is that I don't know how to convert from:
Loan.Purpose Home.Ownership
lp_value_1 ho_value_2
lp_value_1 ho_value_2
lp_value_2 ho_value_1
lp_value_3 ho_value_2
lp_value_2 ho_value_3
lp_value_4 ho_value_2
lp_value_3 ho_value_3
to:
           ho_value_1 ho_value_2 ho_value_3
lp_value_1          0          2          0
lp_value_2          1          0          1
lp_value_3          0          1          1
lp_value_4          0          1          0
The second issue is that I don't know what the second parameter should be
POST UPDATE: This is what I get using fisher.test(myTable):
Error in fisher.test(test) : FEXACT error 501.
The hash table key cannot be computed because the largest key
is larger than the largest representable int.
The algorithm cannot proceed.
Reduce the workspace size or use another algorithm.
where myTable is:
MORTGAGE NONE OTHER OWN RENT
car 18 0 0 5 27
credit_card 190 0 2 38 214
debt_consolidation 620 0 2 87 598
educational 5 0 0 3 7
...
Basically, Fisher's exact test only works on smallish data sets because it requires a lot of memory. But all is good, because the chi-square test makes minimal additional assumptions and is easier on the computer. Just do:
chisq.test(data$Loan.Purpose, data$Home.Ownership)
to get your p-value.
Make sure you read through and understand the help page for chisq.test, especially the examples at the bottom.
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/chisq.test.html
Then look at a mosaic plot to see the quantities, like:
mosaicplot(table(data$Loan.Purpose, data$Home.Ownership))
This reference explains how mosaic plots work.
http://alumni.media.mit.edu/~tpminka/courses/36-350.2001/lectures/day12/
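For the first part of the question (turning two columns into a contingency table), a minimal sketch with table() on the seven example rows could look like this. The data frame name loans is an assumption. The simulate.p.value option of fisher.test is shown as one way around the workspace error; it is not something the answer above proposes.

```r
# The seven example rows from the question (column names as in the CSV).
loans <- data.frame(
  Loan.Purpose   = c("lp_value_1", "lp_value_1", "lp_value_2", "lp_value_3",
                     "lp_value_2", "lp_value_4", "lp_value_3"),
  Home.Ownership = c("ho_value_2", "ho_value_2", "ho_value_1", "ho_value_2",
                     "ho_value_3", "ho_value_2", "ho_value_3")
)

# table() cross-tabulates the two columns into a contingency table.
tbl <- table(loans$Loan.Purpose, loans$Home.Ownership)
print(tbl)

# Chi-square test on the table; the small-count warning is suppressed
# only because this toy sample is tiny.
chi <- suppressWarnings(chisq.test(tbl))

# Fisher's test with a Monte Carlo p-value avoids the exact enumeration
# that exhausts the workspace on large tables.
fis <- fisher.test(tbl, simulate.p.value = TRUE, B = 2000)
print(c(chi_square = chi$p.value, fisher_mc = fis$p.value))
```

With only seven rows the p-values carry no meaning; the point is the shape of the calls. On the real data the same table() call applies to data$Loan.Purpose and data$Home.Ownership.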

How to find 5 closest number from matrix having attributes?

I have a matrix as follows
> y
1 2 3
1 0.8802216 1.2277843 0.6875047
2 0.9381081 1.3189847 0.2046542
3 1.3245534 0.8221709 0.4630722
4 1.2006974 0.8890464 0.6710844
5 1.2344071 0.8354292 0.7259998
6 1.1670665 0.9214787 0.6826173
7 0.9670581 1.1070461 0.7742342
8 0.8867365 1.2160533 0.7024281
9 0.8235792 1.4424190 0.2030302
10 0.8821301 1.0541099 1.2279813
11 1.1958634 0.9708839 0.4297043
12 1.3542734 0.7747481 0.5119648
13 0.4385487 0.3588158 4.9167998
14 0.8530141 1.3578511 0.3698620
15 0.9651803 0.8426226 1.6132899
16 0.8854192 1.2272616 0.6715839
17 0.7779642 0.8132233 2.3386331
18 0.9936722 1.1629110 0.5083558
19 1.1235897 1.0018480 0.5764672
20 0.7887222 1.3101684 0.7373181
21 2.2276176 0.0000000 0.0000000
I found one clue, but it only gives a single position for the whole matrix:
n<-read.table(file.choose(),header=T)
y<-n[,c("1","2","3")]
my.number=1.12270420185886
z<-abs(y-my.number)==min(abs(y-my.number))
which(z)
[1] 19
I want to find at least the 5 closest values together with their row and column numbers. In other words, I want the 5 single values from the matrix that are closest to my.number, with their positions.
I don't know what language it is; is it R?
In a procedural language, I would save all values to a map (val, (pos)) = (val (row, col); example (0.880..-> (1, 1)), then sort by value.
Then iterate over i<-pos (1 to map.size-5), and get the diff (pos (i), pos (i+5)), search for the minimum (diff), get the values and their position then.
Here is a solution in Scala:
val matrix = """1 0.8802216 1.2277843 0.6875047
2 0.9381081 1.3189847 0.2046542
3 1.3245534 0.8221709 0.4630722
4 1.2006974 0.8890464 0.6710844
5 1.2344071 0.8354292 0.7259998
6 1.1670665 0.9214787 0.6826173
7 0.9670581 1.1070461 0.7742342
8 0.8867365 1.2160533 0.7024281
9 0.8235792 1.4424190 0.2030302
10 0.8821301 1.0541099 1.2279813
11 1.1958634 0.9708839 0.4297043
12 1.3542734 0.7747481 0.5119648
13 0.4385487 0.3588158 4.9167998
14 0.8530141 1.3578511 0.3698620
15 0.9651803 0.8426226 1.6132899
16 0.8854192 1.2272616 0.6715839
17 0.7779642 0.8132233 2.3386331
18 0.9936722 1.1629110 0.5083558
19 1.1235897 1.0018480 0.5764672
20 0.7887222 1.3101684 0.7373181
21 2.2276176 0.0000000 0.0000000"""
// split block of text into lines
val lines = matrix.split("\n")
// split lines into words (on runs of whitespace)
val rows = lines.map(l => l.trim.split("\\s+"))
// remove the index from the beginning (1, 2, ... 21) and
// transform values from Strings to double numbers
// triples is: Array(Array(0.8802216, 1.2277843, 0.6875047), Array(0.9381081, 1.3189847, 0.2046542),
val triples = rows.map(_.tail).map(triple => triple.map(_.toDouble))
// generate an own index for the rows and columns
// elems is: elems: Array[Array[(Double, (Int, Int))]] = Array(Array((0.8802216,(0,0)), (1.2277843,(0,1)), (0.6875047,(0,2))), Array((0.9381081,(1,0)), ...
val elems = triples.zipWithIndex.map { t => t._1.zipWithIndex.map(vc => (vc._1 -> (t._2, vc._2))) }
// flatten to a single array of (value, (row, col)) pairs, then sort by value
// sorted = Array((0.0,(20,1)), (0.0,(20,2)), (0.2030302,(8,2)), (0.2046542,(1,2)),
val sorted = elems.flatten.sortBy(e => e._1)
// width of every window of 5 consecutive sorted values
// delta5 = List(0.3588158, 0.369862, 0.2266741, 0.2338945, 0.10425639999999997, 0.1384938,
val delta5 = sorted.sliding(5, 1).map(q => q(4)._1 - q(0)._1).toList
val minindex = delta5.indexOf(delta5.min) // minindex: Int = 29, delta5.min = 0.008824799999999966
// we found the smallest interval of 5 values, beginning at 29:
(minindex until minindex + 5).map(sorted(_))
res568: scala.collection.immutable.IndexedSeq[(Double, (Int, Int))] =
Vector( (0.8802216,(0,0)),
(0.8821301,(9,0)),
(0.8854192,(15,0)),
(0.8867365,(7,0)),
(0.8890464,(3,1)))
Since Scala counts rows from 0 to 20 and columns from 0 to 2, whereas your indices run from 1 to 21 and 1 to 3 respectively, you have to add (1,1) to each of the positions: (1,1), (10,1), and so on.
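The Scala answer looks for the five values closest to each other. If the question is instead read as "the five entries closest to my.number", a base R sketch could look like the following. The helper name closest_n is hypothetical, and only the first six rows of the matrix are used to keep the example short.

```r
# Return the n entries of matrix m closest to target, with their positions.
closest_n <- function(m, target, n = 5) {
  d <- abs(m - target)
  idx <- order(d)[seq_len(n)]   # linear indices of the n smallest distances
  pos <- arrayInd(idx, dim(m))  # convert linear indices to (row, col)
  data.frame(row = pos[, 1], col = pos[, 2],
             value = m[idx], distance = d[idx])
}

# First six rows of the matrix from the question.
y <- matrix(c(0.8802216, 1.2277843, 0.6875047,
              0.9381081, 1.3189847, 0.2046542,
              1.3245534, 0.8221709, 0.4630722,
              1.2006974, 0.8890464, 0.6710844,
              1.2344071, 0.8354292, 0.7259998,
              1.1670665, 0.9214787, 0.6826173),
            ncol = 3, byrow = TRUE)

my.number <- 1.12270420185886
res <- closest_n(y, my.number, n = 5)
print(res)
```

If y is a data frame, as in the question's read.table code, pass as.matrix(y) so that the linear indexing works as intended.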
