'max.print' option in R

I have a data.frame with 178 rows and 14 columns. When I print it in the R console, it only shows me 71 rows, despite the max.print option being set to 1000.
Could anyone explain why the max.print option doesn't print the full dataset in the R console, and how I can print it?
I use R 3.4.1 on macOS.
Here is a data example:
1 1 14.23 1.71 2.43 15.6 127 2.80 3.06 0.28 2.29 5.640000 1.040 3.92 1065
2 1 13.20 1.78 2.14 11.2 100 2.65 2.76 0.26 1.28 4.380000 1.050 3.40 1050
3 1 13.16 2.36 2.67 18.6 101 2.80 3.24 0.30 2.81 5.680000 1.030 3.17 1185
4 1 14.37 1.95 2.50 16.8 113 3.85 3.49 0.24 2.18 7.800000 0.860 3.45 1480
5 1 13.24 2.59 2.87 21.0 118 2.80 2.69 0.39 1.82 4.320000 1.040 2.93 735
6 1 14.20 1.76 2.45 15.2 112 3.27 3.39 0.34 1.97 6.750000 1.050 2.85 1450
7 1 14.39 1.87 2.45 14.6 96 2.50 2.52 0.30 1.98 5.250000 1.020 3.58 1290
8 1 14.06 2.15 2.61 17.6 121 2.60 2.51 0.31 1.25 5.050000 1.060 3.58 1295
9 1 14.83 1.64 2.17 14.0 97 2.80 2.98 0.29 1.98 5.200000 1.080 2.85 1045
10 1 13.86 1.35 2.27 16.0 98 2.98 3.15 0.22 1.85 7.220000 1.010 3.55 1045
11 1 14.10 2.16 2.30 18.0 105 2.95 3.32 0.22 2.38 5.750000 1.250 3.17 1510
12 1 14.12 1.48 2.32 16.8 95 2.20 2.43 0.26 1.57 5.000000 1.170 2.82 1280
13 1 13.75 1.73 2.41 16.0 89 2.60 2.76 0.29 1.81 5.600000 1.150 2.90 1320
14 1 14.75 1.73 2.39 11.4 91 3.10 3.69 0.43 2.81 5.400000 1.250 2.73 1150
15 1 14.38 1.87 2.38 12.0 102 3.30 3.64 0.29 2.96 7.500000 1.200 3.00 1547
16 1 13.63 1.81 2.70 17.2 112 2.85 2.91 0.30 1.46 7.300000 1.280 2.88 1310
17 1 14.30 1.92 2.72 20.0 120 2.80 3.14 0.33 1.97 6.200000 1.070 2.65 1280
18 1 13.83 1.57 2.62 20.0 115 2.95 3.40 0.40 1.72 6.600000 1.130 2.57 1130
19 1 14.19 1.59 2.48 16.5 108 3.30 3.93 0.32 1.86 8.700000 1.230 2.82 1680
20 1 13.64 3.10 2.56 15.2 116 2.70 3.03 0.17 1.66 5.100000 0.960 3.36 845
21 1 14.06 1.63 2.28 16.0 126 3.00 3.17 0.24 2.10 5.650000 1.090 3.71 780
22 1 12.93 3.80 2.65 18.6 102 2.41 2.41 0.25 1.98 4.500000 1.030 3.52 770
23 1 13.71 1.86 2.36 16.6 101 2.61 2.88 0.27 1.69 3.800000 1.110 4.00 1035
24 1 12.85 1.60 2.52 17.8 95 2.48 2.37 0.26 1.46 3.930000 1.090 3.63 1015
25 1 13.50 1.81 2.61 20.0 96 2.53 2.61 0.28 1.66 3.520000 1.120 3.82 845
26 1 13.05 2.05 3.22 25.0 124 2.63 2.68 0.47 1.92 3.580000 1.130 3.20 830
27 1 13.39 1.77 2.62 16.1 93 2.85 2.94 0.34 1.45 4.800000 0.920 3.22 1195
28 1 13.30 1.72 2.14 17.0 94 2.40 2.19 0.27 1.35 3.950000 1.020 2.77 1285
29 1 13.87 1.90 2.80 19.4 107 2.95 2.97 0.37 1.76 4.500000 1.250 3.40 915
30 1 14.02 1.68 2.21 16.0 96 2.65 2.33 0.26 1.98 4.700000 1.040 3.59 1035
31 1 13.73 1.50 2.70 22.5 101 3.00 3.25 0.29 2.38 5.700000 1.190 2.71 1285
32 1 13.58 1.66 2.36 19.1 106 2.86 3.19 0.22 1.95 6.900000 1.090 2.88 1515
33 1 13.68 1.83 2.36 17.2 104 2.42 2.69 0.42 1.97 3.840000 1.230 2.87 990
34 1 13.76 1.53 2.70 19.5 132 2.95 2.74 0.50 1.35 5.400000 1.250 3.00 1235
35 1 13.51 1.80 2.65 19.0 110 2.35 2.53 0.29 1.54 4.200000 1.100 2.87 1095
36 1 13.48 1.81 2.41 20.5 100 2.70 2.98 0.26 1.86 5.100000 1.040 3.47 920
37 1 13.28 1.64 2.84 15.5 110 2.60 2.68 0.34 1.36 4.600000 1.090 2.78 880
38 1 13.05 1.65 2.55 18.0 98 2.45 2.43 0.29 1.44 4.250000 1.120 2.51 1105
39 1 13.07 1.50 2.10 15.5 98 2.40 2.64 0.28 1.37 3.700000 1.180 2.69 1020
40 1 14.22 3.99 2.51 13.2 128 3.00 3.04 0.20 2.08 5.100000 0.890 3.53 760
41 1 13.56 1.71 2.31 16.2 117 3.15 3.29 0.34 2.34 6.130000 0.950 3.38 795
42 1 13.41 3.84 2.12 18.8 90 2.45 2.68 0.27 1.48 4.280000 0.910 3.00 1035
43 1 13.88 1.89 2.59 15.0 101 3.25 3.56 0.17 1.70 5.430000 0.880 3.56 1095
44 1 13.24 3.98 2.29 17.5 103 2.64 2.63 0.32 1.66 4.360000 0.820 3.00 680
45 1 13.05 1.77 2.10 17.0 107 3.00 3.00 0.28 2.03 5.040000 0.880 3.35 885
46 1 14.21 4.04 2.44 18.9 111 2.85 2.65 0.30 1.25 5.240000 0.870 3.33 1080
47 1 14.38 3.59 2.28 16.0 102 3.25 3.17 0.27 2.19 4.900000 1.040 3.44 1065
48 1 13.90 1.68 2.12 16.0 101 3.10 3.39 0.21 2.14 6.100000 0.910 3.33 985
49 1 14.10 2.02 2.40 18.8 103 2.75 2.92 0.32 2.38 6.200000 1.070 2.75 1060
50 1 13.94 1.73 2.27 17.4 108 2.88 3.54 0.32 2.08 8.900000 1.120 3.10 1260
51 1 13.05 1.73 2.04 12.4 92 2.72 3.27 0.17 2.91 7.200000 1.120 2.91 1150
52 1 13.83 1.65 2.60 17.2 94 2.45 2.99 0.22 2.29 5.600000 1.240 3.37 1265
53 1 13.82 1.75 2.42 14.0 111 3.88 3.74 0.32 1.87 7.050000 1.010 3.26 1190
54 1 13.77 1.90 2.68 17.1 115 3.00 2.79 0.39 1.68 6.300000 1.130 2.93 1375
55 1 13.74 1.67 2.25 16.4 118 2.60 2.90 0.21 1.62 5.850000 0.920 3.20 1060
56 1 13.56 1.73 2.46 20.5 116 2.96 2.78 0.20 2.45 6.250000 0.980 3.03 1120
57 1 14.22 1.70 2.30 16.3 118 3.20 3.00 0.26 2.03 6.380000 0.940 3.31 970
58 1 13.29 1.97 2.68 16.8 102 3.00 3.23 0.31 1.66 6.000000 1.070 2.84 1270
59 1 13.72 1.43 2.50 16.7 108 3.40 3.67 0.19 2.04 6.800000 0.890 2.87 1285
60 2 12.37 0.94 1.36 10.6 88 1.98 0.57 0.28 0.42 1.950000 1.050 1.82 520
61 2 12.33 1.10 2.28 16.0 101 2.05 1.09 0.63 0.41 3.270000 1.250 1.67 680
62 2 12.64 1.36 2.02 16.8 100 2.02 1.41 0.53 0.62 5.750000 0.980 1.59 450
63 2 13.67 1.25 1.92 18.0 94 2.10 1.79 0.32 0.73 3.800000 1.230 2.46 630
64 2 12.37 1.13 2.16 19.0 87 3.50 3.10 0.19 1.87 4.450000 1.220 2.87 420
65 2 12.17 1.45 2.53 19.0 104 1.89 1.75 0.45 1.03 2.950000 1.450 2.23 355
66 2 12.37 1.21 2.56 18.1 98 2.42 2.65 0.37 2.08 4.600000 1.190 2.30 678
67 2 13.11 1.01 1.70 15.0 78 2.98 3.18 0.26 2.28 5.300000 1.120 3.18 502
68 2 12.37 1.17 1.92 19.6 78 2.11 2.00 0.27 1.04 4.680000 1.120 3.48 510
69 2 13.34 0.94 2.36 17.0 110 2.53 1.30 0.55 0.42 3.170000 1.020 1.93 750
70 2 12.21 1.19 1.75 16.8 151 1.85 1.28 0.14 2.50 2.850000 1.280 3.07 718
71 2 12.29 1.61 2.21 20.4 103 1.10 1.02 0.37 1.46 3.050000 0.906 1.82 870
[ reached getOption("max.print") -- omitted 107 rows ]

Try this command:
options(max.print = 99999)

Put this line at the start of your R script; it worked for me:
options(max.print = .Machine$integer.max)
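For what it's worth, the reason exactly 71 rows appeared: max.print limits the total number of printed entries (cells), not rows. With 14 columns, a limit of 1000 entries fits only 71 complete rows. A minimal sketch:

```r
# max.print caps the number of printed *entries* (cells), not rows.
# With 14 columns, a 1000-entry limit fits only 71 complete rows:
1000 %/% 14                      # integer division: 71

# Raising the limit past nrow * ncol prints the whole data.frame:
options(max.print = 178 * 14)    # 2492 entries, enough for all 178 rows
```

So any value of at least nrow(df) * ncol(df) suffices; the larger values in the answers above are simply generous upper bounds.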

Related

How can I subset my list to have only the last day of the month?

DGS1MO DGS3MO DGS1 DGS2 DGS3 DGS5 DGS7 DGS10 DGS20 DGS30
2001-07-31 3.67 3.54 3.53 3.79 4.06 4.57 4.86 5.07 5.61 5.51
2001-08-01 3.65 3.53 3.56 3.83 4.09 4.62 4.90 5.11 5.63 5.53
2001-08-02 3.65 3.53 3.57 3.89 4.17 4.69 4.97 5.17 5.68 5.57
2001-08-03 3.63 3.52 3.57 3.91 4.22 4.72 4.99 5.20 5.70 5.59
2001-08-06 3.62 3.52 3.56 3.88 4.17 4.71 4.99 5.19 5.70 5.59
2001-08-07 3.63 3.52 3.56 3.90 4.19 4.72 5.00 5.20 5.71 5.60
2001-08-08 3.61 3.49 3.46 3.77 4.05 4.61 4.87 4.99 5.61 5.52
2001-08-09 3.61 3.45 3.48 3.77 4.07 4.66 4.93 5.04 5.64 5.54
2001-08-10 3.58 3.43 3.45 3.73 4.03 4.61 4.88 4.99 5.61 5.52
2001-08-13 3.57 3.45 3.43 3.70 4.00 4.57 4.86 4.97 5.60 5.52
2001-08-14 3.54 3.43 3.46 3.74 4.03 4.59 4.87 4.97 5.61 5.51
2001-08-15 3.52 3.43 3.47 3.80 4.11 4.62 4.90 5.00 5.62 5.52
2001-08-16 3.48 3.39 3.43 3.75 4.04 4.58 4.84 4.95 5.58 5.48
2001-08-17 3.46 3.36 3.39 3.67 3.95 4.49 4.75 4.84 5.51 5.43
2001-08-20 3.48 3.42 3.44 3.74 4.02 4.55 4.81 4.91 5.55 5.46
2001-08-21 3.46 3.39 3.41 3.69 3.96 4.50 4.79 4.87 5.54 5.44
2001-08-22 3.46 3.38 3.44 3.76 4.03 4.53 4.81 4.91 5.53 5.44
2001-08-23 3.49 3.40 3.46 3.72 3.99 4.52 4.79 4.89 5.50 5.41
2001-08-24 3.49 3.42 3.48 3.76 4.03 4.55 4.82 4.93 5.54 5.45
2001-08-27 3.52 3.45 3.51 3.78 4.04 4.57 4.83 4.94 5.56 5.47
2001-08-28 3.53 3.41 3.46 3.71 3.97 4.48 4.73 4.85 5.49 5.41
2001-08-29 3.48 3.42 3.44 3.67 3.92 4.43 4.67 4.78 5.44 5.36
2001-08-30 3.41 3.36 3.38 3.61 3.88 4.42 4.68 4.79 5.45 5.37
2001-08-31 3.40 3.37 3.41 3.64 3.91 4.46 4.72 4.85 5.47 5.39
2001-09-04 3.43 3.44 3.55 3.83 4.10 4.63 4.88 4.99 5.59 5.50
2001-09-05 3.49 3.41 3.47 3.79 4.07 4.61 4.86 4.97 5.57 5.48
2001-09-06 3.44 3.34 3.40 3.65 3.93 4.48 4.73 4.86 5.50 5.41
2001-09-07 3.40 3.27 3.29 3.53 3.82 4.39 4.67 4.80 5.45 5.39
2001-09-10 3.40 3.26 3.31 3.53 3.82 4.41 4.69 4.84 5.50 5.43
2001-09-13 2.73 2.74 2.81 2.99 3.32 4.03 4.41 4.64 5.41 5.39
2001-09-14 2.54 2.64 2.73 2.87 3.17 3.92 4.31 4.57 5.38 5.35
2001-09-17 2.47 2.59 2.72 2.96 3.30 3.99 4.38 4.63 5.44 5.41
2001-09-18 2.34 2.48 2.69 2.96 3.31 4.01 4.46 4.72 5.59 5.55
2001-09-19 2.00 2.19 2.49 2.81 3.18 3.90 4.41 4.69 5.59 5.56
2001-09-20 2.04 2.22 2.56 2.91 3.27 3.97 4.47 4.75 5.67 5.62
2001-09-21 2.12 2.25 2.53 2.91 3.27 3.94 4.43 4.70 5.62 5.59
2001-09-24 2.38 2.38 2.56 2.94 3.30 4.00 4.47 4.73 5.61 5.58
2001-09-25 2.58 2.40 2.51 2.88 3.25 3.97 4.45 4.72 5.60 5.58
2001-09-26 2.51 2.38 2.48 2.82 3.18 3.91 4.39 4.65 5.52 5.50
2001-09-27 2.34 2.38 2.43 2.78 3.15 3.87 4.33 4.58 5.46 5.45
2001-09-28 2.28 2.40 2.49 2.86 3.22 3.93 4.37 4.60 5.45 5.42
2001-10-01 2.26 2.37 2.47 2.82 3.18 3.90 4.33 4.55 5.39 5.38
2001-10-02 2.27 2.26 2.43 2.77 3.14 3.87 4.31 4.53 5.36 5.34
2001-10-03 2.21 2.23 2.38 2.77 3.14 3.86 4.29 4.50 5.34 5.32
2001-10-04 2.22 2.21 2.37 2.75 3.14 3.88 4.29 4.53 5.33 5.31
2001-10-05 2.21 2.19 2.33 2.71 3.10 3.87 4.26 4.52 5.34 5.31
2001-10-09 2.24 2.22 2.35 2.74 3.16 3.96 4.35 4.62 5.42 5.39
Above is the dataset I am trying to subset. My goal is to subset the df to include only the last day of each month listed, for example 2001-08-31, 2001-09-28, and so on through the data.
I have been able to select specific dates, but I need something dynamic.
Thanks in advance
With lubridate:
library(lubridate)
df$DAY <- as.Date(df$DAY)
df[df$DAY == ceiling_date(df$DAY,'month') - days(1),]
DAY DGS1MO DGS3MO DGS1 DGS2 DGS3 DGS5 DGS7 DGS10 DGS20 DGS30
1 2001-07-31 3.67 3.54 3.53 3.79 4.06 4.57 4.86 5.07 5.61 5.51
24 2001-08-31 3.40 3.37 3.41 3.64 3.91 4.46 4.72 4.85 5.47 5.39
df:
df <- read.table(text='
DAY DGS1MO DGS3MO DGS1 DGS2 DGS3 DGS5 DGS7 DGS10 DGS20 DGS30
2001-07-31 3.67 3.54 3.53 3.79 4.06 4.57 4.86 5.07 5.61 5.51
2001-08-01 3.65 3.53 3.56 3.83 4.09 4.62 4.90 5.11 5.63 5.53
2001-08-02 3.65 3.53 3.57 3.89 4.17 4.69 4.97 5.17 5.68 5.57
2001-08-03 3.63 3.52 3.57 3.91 4.22 4.72 4.99 5.20 5.70 5.59
2001-08-06 3.62 3.52 3.56 3.88 4.17 4.71 4.99 5.19 5.70 5.59
2001-08-07 3.63 3.52 3.56 3.90 4.19 4.72 5.00 5.20 5.71 5.60
2001-08-08 3.61 3.49 3.46 3.77 4.05 4.61 4.87 4.99 5.61 5.52
2001-08-09 3.61 3.45 3.48 3.77 4.07 4.66 4.93 5.04 5.64 5.54
2001-08-10 3.58 3.43 3.45 3.73 4.03 4.61 4.88 4.99 5.61 5.52
2001-08-13 3.57 3.45 3.43 3.70 4.00 4.57 4.86 4.97 5.60 5.52
2001-08-14 3.54 3.43 3.46 3.74 4.03 4.59 4.87 4.97 5.61 5.51
2001-08-15 3.52 3.43 3.47 3.80 4.11 4.62 4.90 5.00 5.62 5.52
2001-08-16 3.48 3.39 3.43 3.75 4.04 4.58 4.84 4.95 5.58 5.48
2001-08-17 3.46 3.36 3.39 3.67 3.95 4.49 4.75 4.84 5.51 5.43
2001-08-20 3.48 3.42 3.44 3.74 4.02 4.55 4.81 4.91 5.55 5.46
2001-08-21 3.46 3.39 3.41 3.69 3.96 4.50 4.79 4.87 5.54 5.44
2001-08-22 3.46 3.38 3.44 3.76 4.03 4.53 4.81 4.91 5.53 5.44
2001-08-23 3.49 3.40 3.46 3.72 3.99 4.52 4.79 4.89 5.50 5.41
2001-08-24 3.49 3.42 3.48 3.76 4.03 4.55 4.82 4.93 5.54 5.45
2001-08-27 3.52 3.45 3.51 3.78 4.04 4.57 4.83 4.94 5.56 5.47
2001-08-28 3.53 3.41 3.46 3.71 3.97 4.48 4.73 4.85 5.49 5.41
2001-08-29 3.48 3.42 3.44 3.67 3.92 4.43 4.67 4.78 5.44 5.36
2001-08-30 3.41 3.36 3.38 3.61 3.88 4.42 4.68 4.79 5.45 5.37
2001-08-31 3.40 3.37 3.41 3.64 3.91 4.46 4.72 4.85 5.47 5.39
2001-09-04 3.43 3.44 3.55 3.83 4.10 4.63 4.88 4.99 5.59 5.50
2001-09-05 3.49 3.41 3.47 3.79 4.07 4.61 4.86 4.97 5.57 5.48
2001-09-06 3.44 3.34 3.40 3.65 3.93 4.48 4.73 4.86 5.50 5.41
2001-09-07 3.40 3.27 3.29 3.53 3.82 4.39 4.67 4.80 5.45 5.39
2001-09-10 3.40 3.26 3.31 3.53 3.82 4.41 4.69 4.84 5.50 5.43
2001-09-13 2.73 2.74 2.81 2.99 3.32 4.03 4.41 4.64 5.41 5.39
2001-09-14 2.54 2.64 2.73 2.87 3.17 3.92 4.31 4.57 5.38 5.35
2001-09-17 2.47 2.59 2.72 2.96 3.30 3.99 4.38 4.63 5.44 5.41
2001-09-18 2.34 2.48 2.69 2.96 3.31 4.01 4.46 4.72 5.59 5.55
2001-09-19 2.00 2.19 2.49 2.81 3.18 3.90 4.41 4.69 5.59 5.56
2001-09-20 2.04 2.22 2.56 2.91 3.27 3.97 4.47 4.75 5.67 5.62
2001-09-21 2.12 2.25 2.53 2.91 3.27 3.94 4.43 4.70 5.62 5.59
2001-09-24 2.38 2.38 2.56 2.94 3.30 4.00 4.47 4.73 5.61 5.58
2001-09-25 2.58 2.40 2.51 2.88 3.25 3.97 4.45 4.72 5.60 5.58
2001-09-26 2.51 2.38 2.48 2.82 3.18 3.91 4.39 4.65 5.52 5.50
2001-09-27 2.34 2.38 2.43 2.78 3.15 3.87 4.33 4.58 5.46 5.45
2001-09-28 2.28 2.40 2.49 2.86 3.22 3.93 4.37 4.60 5.45 5.42
2001-10-01 2.26 2.37 2.47 2.82 3.18 3.90 4.33 4.55 5.39 5.38
2001-10-02 2.27 2.26 2.43 2.77 3.14 3.87 4.31 4.53 5.36 5.34
2001-10-03 2.21 2.23 2.38 2.77 3.14 3.86 4.29 4.50 5.34 5.32
2001-10-04 2.22 2.21 2.37 2.75 3.14 3.88 4.29 4.53 5.33 5.31
2001-10-05 2.21 2.19 2.33 2.71 3.10 3.87 4.26 4.52 5.34 5.31
2001-10-09 2.24 2.22 2.35 2.74 3.16 3.96 4.35 4.62 5.42 5.39',header=T)
A possible solution:
library(tidyverse)
library(lubridate)
df %>%
mutate(d = day(ymd(Date)), m = month(ymd(Date)), y = year(ymd(Date))) %>%
group_by(m, y) %>%
slice_max(d) %>%
ungroup %>%
select(-d, -m, -y)
#> # A tibble: 4 × 11
#> Date DGS1MO DGS3MO DGS1 DGS2 DGS3 DGS5 DGS7 DGS10 DGS20 DGS30
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2001-07-31 3.67 3.54 3.53 3.79 4.06 4.57 4.86 5.07 5.61 5.51
#> 2 2001-08-31 3.4 3.37 3.41 3.64 3.91 4.46 4.72 4.85 5.47 5.39
#> 3 2001-09-28 2.28 2.4 2.49 2.86 3.22 3.93 4.37 4.6 5.45 5.42
#> 4 2001-10-09 2.24 2.22 2.35 2.74 3.16 3.96 4.35 4.62 5.42 5.39
A date is the last of a month if and only if it is one day prior to the first of the following month. You can index the elements of a Date vector x satisfying this condition like so:
is_last_of_month <- function(x) {
x <- trunc(x)
x == as.Date(round(as.POSIXlt(x), units = "months")) - 1
}
x <- seq(as.Date("2022-02-24"), as.Date("2022-03-05"), by = 1)
x
## [1] "2022-02-24" "2022-02-25" "2022-02-26" "2022-02-27" "2022-02-28"
## [6] "2022-03-01" "2022-03-02" "2022-03-03" "2022-03-04" "2022-03-05"
is_last_of_month(x)
## [1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
is_last_of_month operates on trunc(x) instead of x to defend against the possibility of fractional days, so that—for example—is_last_of_month(as.Date("2022-02-28") + u) is TRUE for all u greater than 0 but less than 1:
y <- as.Date("2022-02-28") + 0.5
y
## [1] "2022-02-28"
y == as.Date(round(as.POSIXlt(y), units = "months")) - 1
## [1] FALSE
is_last_of_month(y)
## [1] TRUE
(Well, "... for all u greater than 0 but less than 1" is not quite true in floating point arithmetic, but hopefully my meaning is clear.)

How to run a Shapiro test on multiple columns of a data.frame, avoiding two errors: "all 'x' values are identical" and "missing value where TRUE/FALSE needed"

I have a dataframe like this:
head(Betula, 10)
year start Start_DayOfYear end End_DayOfYear duration DateMax Max_DayOfYear BetulaPollenMax SPI Jan.NAO Jan.AO
1 1997 <NA> NA <NA> NA NA <NA> NA NA NA -0.49 -0.46
2 1998 <NA> 143 <NA> 184 41 <NA> 146 42 361 0.39 -2.08
3 1999 <NA> 148 <NA> 188 40 <NA> 158 32 149 0.77 0.11
4 2000 <NA> 135 <NA> 197 62 <NA> 156 173 917 0.60 1.27
5 2001 <NA> 143 <NA> 175 32 <NA> 154 113 457 0.25 -0.96
Jan.SO Feb.NAO Feb.AO Feb.SO Mar.NAO Mar.AO Mar.SO Apr.NAO Apr.AO Apr.SO DecJanFebMarApr.NAO DecJanFebMar.NAO
1 0.5 1.70 1.89 1.7 1.46 1.09 -0.4 -1.02 0.32 -0.6 0.14 0.43
2 -2.7 -0.11 -0.18 -2.0 0.87 -0.25 -2.4 -0.68 -0.04 -1.4 0.27 0.51
3 1.8 0.29 0.48 1.0 0.23 -1.49 1.3 -0.95 0.28 1.4 0.39 0.73
4 0.7 1.70 1.08 1.7 0.77 -0.45 1.3 -0.03 -0.28 1.2 0.49 0.62
5 1.0 0.45 -0.62 1.7 -1.26 -1.69 0.9 0.00 0.91 0.2 -0.28 -0.35
DecJanFeb.NAO DecJan.NAO JanFebMarApr.NAO JanFebMar.NAO JanFeb.NAO FebMarApr.NAO FebMar.NAO MarApr.NAO
1 0.08 -0.73 0.41 0.89 0.61 0.71 1.58 0.22
2 0.38 0.63 0.12 0.38 0.14 0.03 0.38 0.10
3 0.89 1.19 0.09 0.43 0.53 -0.14 0.26 -0.36
4 0.57 0.01 0.76 1.02 1.15 0.81 1.24 0.37
5 -0.04 -0.29 -0.14 -0.19 0.35 -0.27 -0.41 -0.63
DecJanFebMarApr.AO DecJanFebMar.AO DecJanFeb.AO DecJan.AO JanFebMarApr.AO JanFebMar.AO JanFeb.AO FebMarApr.AO
1 0.55 0.61 0.45 -0.27 0.71 0.84 0.72 1.10
2 -0.24 -0.29 -0.30 -0.37 -0.64 -0.84 -1.13 -0.16
3 0.08 0.04 0.54 0.58 -0.16 -0.30 0.30 -0.24
4 -0.15 -0.11 0.00 -0.54 0.41 0.63 1.18 0.12
5 -0.74 -1.15 -0.97 -1.14 -0.59 -1.09 -0.79 -0.47
FebMar.AO MarApr.AO DecJanFebMarApr.SO DecJanFebMar.SO DecJanFeb.SO DecJan.SO JanFebMarApr.SO JanFebMar.SO
1 1.49 0.71 0.04 0.20 0.40 -0.25 0.30 0.60
2 -0.22 -0.15 -1.42 -1.43 -1.10 -0.65 -2.13 -2.37
3 -0.51 -0.61 1.38 1.38 1.40 1.60 1.38 1.37
4 0.32 -0.37 1.14 1.13 1.07 0.75 1.23 1.23
5 -1.16 -0.39 0.60 0.70 0.63 0.10 0.95 1.20
JanFeb.SO FebMarApr.SO FebMar.SO MarApr.SO TmaxAprI TminAprI TmeanAprI RainfallAprI HumidityAprI SunshineAprI
1 1.10 0.23 0.65 -0.50 3.27 -3.86 -0.44 0.82 76.3 3.45
2 -2.35 -1.93 -2.20 -1.90 4.52 -3.28 -0.15 0.12 73.5 7.12
3 1.40 1.23 1.15 1.35 4.11 -3.86 -0.34 1.32 78.4 4.85
4 1.20 1.40 1.50 1.25 6.11 -1.31 1.93 0.80 71.9 4.20
5 1.35 0.93 1.30 0.55 1.46 -2.37 -1.04 2.83 84.4 1.21
CloudAprI WindAprI SeeLevelPressureAprI TmaxAprII TminAprII TmeanAprII RainfallAprII HumidityAprII
1 6.30 5.26 1008.63 12.12 2.11 6.17 0.23 76.5
2 3.93 3.86 1022.39 5.57 -0.44 1.82 0.83 77.9
3 5.02 3.23 1007.09 0.20 -6.36 -3.23 2.63 82.5
4 6.15 5.13 1012.21 2.74 -4.88 -2.35 0.34 76.0
5 7.50 3.90 1009.50 6.75 -3.22 1.16 0.32 71.5
SunshineAprII CloudAprII WindAprII SeeLevelPressureAprII TmaxAprIII TminAprIII TmeanAprIII RainfallAprIII
1 3.12 6.53 5.19 1024.31 7.35 0.33 3.37 0.33
2 2.41 6.85 3.70 1012.01 6.34 0.76 2.69 2.01
3 4.99 5.87 6.23 1019.66 8.65 0.73 4.23 0.70
4 6.63 5.17 5.84 1022.62 5.84 -1.81 2.02 0.00
5 6.11 4.82 3.92 1018.81 8.47 1.02 4.17 1.09
HumidityAprIII SunshineAprIII CloudAprIII WindAprIII SeeLevelPressureAprIII TmaxDecI TminDecI TmeanDecI
1 75.0 3.73 6.40 4.08 1009.91 -0.90 -5.88 -3.67
2 83.5 1.52 7.31 4.66 1008.33 5.33 0.01 2.46
3 73.4 6.62 5.12 3.16 1017.01 -0.24 -6.93 -3.64
4 69.0 8.80 4.80 4.99 1021.18 4.67 1.86 2.79
5 72.7 5.33 5.41 4.27 1005.48 3.69 -1.43 1.65
RainfallDecI HumidityDecI SunshineDecI CloudDecI WindDecI SeeLevelPressureDecI TmaxDecII TminDecII TmeanDecII
1 0.12 77.3 0.22 5.08 3.49 1003.15 7.99 0.77 4.10
2 1.10 73.5 0.04 6.29 5.21 999.94 0.24 -4.74 -2.67
3 2.41 82.3 0.00 6.70 4.92 998.64 1.22 -5.90 -2.05
4 3.13 88.1 0.00 7.97 4.00 997.82 2.76 -3.89 -0.54
5 1.60 79.1 0.07 5.44 5.76 996.35 10.82 4.36 6.90
RainfallDecII HumidityDecII SunshineDecII CloudDecII WindDecII SeeLevelPressureDecII TmaxDecIII TminDecIII
1 1.90 71.3 0 4.96 5.55 1007.16 4.78 -2.12
2 4.34 82.2 0 7.03 6.06 998.02 2.07 -4.60
3 1.94 78.6 0 6.53 5.82 1008.33 2.09 -2.48
4 1.45 77.2 0 6.57 5.26 1005.11 -1.49 -8.37
5 1.15 66.6 0 5.74 5.47 1030.02 1.40 -7.34
TmeanDecIII RainfallDecIII HumidityDecIII SunshineDecIII CloudDecIII WindDecIII SeeLevelPressureDecIII TmaxFebI
1 1.15 3.96 82.36 0 6.01 4.02 991.60 -0.23
2 -0.51 4.10 81.18 0 6.67 3.91 986.52 0.79
3 -0.61 1.97 81.27 0 6.21 5.53 982.13 2.19
4 -5.28 1.26 79.64 0 6.11 4.22 1019.63 3.27
5 -3.45 1.19 82.18 0 6.20 4.77 1015.53 2.42
TminFebI TmeanFebI RainfallFebI HumidityFebI SunshineFebI CloudFebI WindFebI SeeLevelPressureFebI TmaxFebII
1 -6.67 -3.57 0.84 84.3 1.11 6.81 5.35 990.51 2.97
2 -7.79 -4.49 2.31 72.2 1.88 4.73 4.53 990.39 3.31
3 -4.14 -1.77 0.42 73.3 1.29 6.02 5.57 1007.67 1.55
4 -2.48 0.04 2.28 77.0 0.46 6.84 4.29 982.97 -1.24
5 -3.52 -0.74 1.98 81.5 0.76 5.78 4.93 1008.29 6.71
TminFebII TmeanFebII RainfallFebII HumidityFebII SunshineFebII CloudFebII WindFebII SeeLevelPressureFebII
1 -2.31 -0.10 1.44 82.2 1.07 6.45 4.42 980.59
2 -4.85 -0.99 3.84 75.0 2.54 5.91 5.05 999.98
3 -5.76 -2.44 2.89 75.3 0.40 6.95 5.82 990.44
4 -8.47 -4.65 3.33 83.1 0.63 6.55 4.95 1000.10
5 -0.25 3.01 1.38 66.1 1.16 6.18 6.28 1001.46
TmaxFebIII TminFebIII TmeanFebIII RainfallFebIII HumidityFebIII SunshineFebIII CloudFebIII WindFebIII
1 0.05 -6.01 -3.35 4.60 83.50 1.29 6.58 4.71
2 -0.45 -7.43 -4.51 2.93 78.38 1.00 6.91 5.99
3 2.13 -4.51 -1.21 2.90 79.38 2.51 5.76 5.46
4 0.59 -3.79 -1.92 5.94 88.33 1.40 6.86 6.70
5 -2.68 -7.23 -5.05 1.39 83.88 1.13 7.41 5.69
SeeLevelPressureFebIII TmaxJanI TminJanI TmeanJanI RainfallJanI HumidityJanI SunshineJanI CloudJanI WindJanI
1 980.25 0.38 -5.57 -3.36 0.01 82.9 0.27 3.45 2.97
2 997.71 4.29 -0.03 2.08 3.70 82.9 0.00 7.39 5.01
3 988.45 1.02 -4.47 -1.87 2.22 82.3 0.00 6.94 4.29
4 987.21 0.04 -6.28 -3.03 4.99 85.8 0.00 5.84 4.75
5 1023.84 -0.33 -5.11 -3.17 0.66 81.2 0.00 7.08 3.88
SeeLevelPressureJanI TmaxJanII TminJanII TmeanJanII RainfallJanII HumidityJanII SunshineJanII CloudJanII
1 1023.71 0.09 -6.48 -2.50 4.29 86.5 0.01 7.23
2 984.57 -0.34 -6.49 -3.61 2.74 80.2 0.23 6.99
3 1004.06 0.32 -5.59 -3.03 5.28 83.3 0.00 6.68
4 983.42 8.38 1.46 4.97 0.64 69.3 0.10 6.13
5 1010.31 7.35 3.00 5.09 1.27 66.3 0.03 6.19
WindJanII SeeLevelPressureJanII TmaxJanIII TminJanIII TmeanJanIII RainfallJanIII HumidityJanIII SunshineJanIII
1 5.42 998.88 5.66 -2.39 1.97 1.03 74.27 0.65
2 6.38 1011.44 3.84 -3.32 -0.37 0.70 73.55 0.55
3 6.24 980.15 4.33 -5.19 -0.59 2.23 76.64 0.69
4 6.44 1019.41 4.09 -2.67 0.05 2.18 71.73 0.42
5 6.74 1006.10 4.43 -0.86 1.58 1.91 80.09 0.20
CloudJanIII WindJanIII SeeLevelPressureJanIII TmaxMarI TminMarI TmeanMarI RainfallMarI HumidityMarI
1 6.47 7.59 1004.59 2.83 -3.60 -0.72 2.14 79.9
2 5.25 4.72 1019.95 -5.31 -12.52 -9.52 2.28 72.6
3 5.34 4.65 1001.66 -0.70 -6.67 -4.47 1.39 81.0
4 5.85 4.83 1007.23 0.10 -7.91 -3.98 2.36 80.2
5 6.53 3.63 992.53 -0.38 -4.59 -2.27 3.00 86.4
SunshineMarI CloudMarI WindMarI SeeLevelPressureMarI TmaxMarII TminMarII TmeanMarII RainfallMarII HumidityMarII
1 0.85 6.77 6.64 986.96 -1.48 -8.43 -5.58 1.09 81.0
2 2.92 5.91 4.68 1013.17 6.53 -1.81 2.56 0.43 65.5
3 2.40 5.71 4.02 1014.62 0.53 -5.17 -2.90 5.20 82.8
4 0.91 7.02 5.87 1006.64 5.32 -0.94 1.23 1.11 74.4
5 0.19 7.82 4.49 999.35 1.60 -4.29 -1.89 0.95 79.3
SunshineMarII CloudMarII WindMarII SeeLevelPressureMarII TmaxMarIII TminMarIII TmeanMarIII RainfallMarIII
1 2.12 5.51 3.93 1021.57 3.88 -1.95 0.55 1.42
2 2.25 6.29 6.11 1008.31 3.95 -2.46 -0.15 1.30
3 1.00 6.61 5.77 1006.63 -0.68 -6.60 -4.07 0.70
4 2.16 6.61 6.45 1003.23 5.49 -0.68 1.65 1.58
5 4.07 5.21 3.14 1017.24 -0.66 -7.21 -4.00 1.37
HumidityMarIII SunshineMarIII CloudMarIII WindMarIII SeeLevelPressureMarIII
1 80.45 2.80 6.13 4.03 995.31
2 72.09 3.98 5.99 5.14 1000.32
3 78.73 2.34 6.46 3.81 1005.67
4 74.64 2.85 6.54 6.34 1013.45
5 79.45 4.71 5.65 4.95 1010.47
[ reached 'max' / getOption("max.print") -- omitted 5 rows ]
I would like to run the normality test on all columns at once. I tried
apply(x, shapiro.test)
Betula_shapiro <- apply(Betula, shapiro.test)
Error in FUN(X[[i]], ...) : is.numeric(x) is not TRUE
and it didn't work. I also tried this:
Betula <- apply(Betula[which(sapply(Betula, is.numeric))], 2, shapiro.test)
Error in FUN(newX[, i], ...) : all 'x' values are identical
f<-function(x){if(diff(range(x))==0)list()else shapiro.test(x)}
Betula <- apply(Betula[which(sapply(Betula, is.numeric))], 2, f)
Error in if (diff(range(x)) == 0) list() else shapiro.test(x) :
missing value where TRUE/FALSE needed
So I did:
Betula_numerics_only <- Betula[which(sapply(Betula, is.numeric))]
Selecting columns with at least 3 non-missing values and applying shapiro.test to them:
Betula_numerics_only_filled_columns <- Betula_numerics_only[which(apply(Betula_numerics_only, 2, function(f) sum(!is.na(f))>=3 ))]
Betula_shapiro<-apply(Betula_numerics_only_filled_columns, 2, shapiro.test)
Error in FUN(newX[, i], ...) : all 'x' values are identical
Could you please help me with this problem?
Since I was talking about readability in my comment, I felt I should provide something more readable as an answer.
Let's make some dummy data:
data_test <- data.frame(matrix(rnorm(100, 10, 1), ncol = 5, byrow = T), stringsAsFactors = F)
Let's apply shapiro.test to each column:
apply(data_test, 2, shapiro.test)
In case there are non-numeric columns:
Let's add a dummy character column for testing purposes
data_test$non_numeric <- sample(c("hello", "hi", "good morning"), NROW(data_test), replace = T)
and try to apply the test again
apply(data_test, 2, shapiro.test)
which results in:
> apply(data_test, 2, shapiro.test)
Error: is.numeric(x) is not TRUE
To solve this, we select only the numeric columns using sapply:
data_test[which(sapply(data_test, is.numeric))]
and combine it with the apply:
apply(data_test[which(sapply(data_test, is.numeric))], 2, shapiro.test)
Removing columns that are all NA:
data_test_numerics_only <- data_test[which(sapply(data_test, is.numeric))]
Selecting columns with at least 3 non-missing values and applying shapiro.test to them:
data_test_numerics_only_filled_colums = data_test_numerics_only[which(apply(data_test_numerics_only, 2, function(f) sum(!is.na(f)) >= 3))]
apply(data_test_numerics_only_filled_colums, 2, shapiro.test)
We will get this running; let's try once more :)
Remove non-numeric columns:
Betula_numerics <- Betula[which(sapply(Betula, is.numeric))]
Remove columns with fewer than 3 non-missing values:
Betula_numerics_filled <- Betula_numerics[which(apply(Betula_numerics, 2, function(f) sum(!is.na(f)) >= 3))]
Remove columns with zero variance:
Betula_numerics_filled_not_constant <- Betula_numerics_filled [apply(Betula_numerics_filled , 2, function(f) var(f, na.rm = T) != 0)]
Run shapiro.test and hope for the best :)
apply(Betula_numerics_filled_not_constant, 2, shapiro.test)
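The three filtering steps above can also be wrapped into one helper. This is just a sketch restating the answer's logic; safe_shapiro is a name I made up:

```r
# Sketch: run shapiro.test only on columns that are numeric,
# have at least 3 non-missing values, and are not constant.
safe_shapiro <- function(df) {
  keep <- vapply(df, function(col) {
    is.numeric(col) &&
      sum(!is.na(col)) >= 3 &&
      isTRUE(var(col, na.rm = TRUE) > 0)
  }, logical(1))
  lapply(df[keep], shapiro.test)
}

set.seed(1)
d <- data.frame(ok       = rnorm(20),     # passes all checks
                constant = rep(1, 20),    # zero variance -> dropped
                label    = letters[1:20], # non-numeric -> dropped
                stringsAsFactors = FALSE)
names(safe_shapiro(d))  # "ok"
```

Because && short-circuits, var() is never called on non-numeric columns, which avoids both errors from the question in one pass.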

Retrieve information of winning unit in a self organizing map plot

I would like to figure out which winning unit each observation maps to in the kohonen plot:
library(kohonen)
set.seed(0)
data("wines")
wines <- scale(wines)
som_grid <- somgrid(8, 6, "hexagonal")
som_model <- som(wines, som_grid)
plot(som_model)
The plot will look like this (plot image omitted):
You can see which unit each observation is assigned to with
head(data.frame(cbind(wines,unit= som_model$unit.classif)))
alcohol malic.acid ash ash.alkalinity magnesium tot..phenols flavonoids non.flav..phenols proanth col..int. col..hue
1 13.20 1.78 2.14 11.2 100 2.65 2.76 0.26 1.28 4.38 1.05
2 13.16 2.36 2.67 18.6 101 2.80 3.24 0.30 2.81 5.68 1.03
3 14.37 1.95 2.50 16.8 113 3.85 3.49 0.24 2.18 7.80 0.86
4 13.24 2.59 2.87 21.0 118 2.80 2.69 0.39 1.82 4.32 1.04
5 14.20 1.76 2.45 15.2 112 3.27 3.39 0.34 1.97 6.75 1.05
6 14.39 1.87 2.45 14.6 96 2.50 2.52 0.30 1.98 5.25 1.02
OD.ratio proline unit
1 3.40 1050 24
2 3.17 1185 46
3 3.45 1480 48
4 2.93 735 4
5 2.85 1450 48
6 3.58 1290 47
But I would like to show this unit information in the plot itself, e.g. by labelling each node with its unit number, the way the identify() function does but automatically. Thanks in advance!
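One possible approach (a sketch, not verified against every kohonen version): plot.kohonen lays the units out at the coordinates stored in som_model$grid$pts, so the unit numbers can be overlaid with text() after plotting:

```r
library(kohonen)
set.seed(0)
data("wines")
som_model <- som(scale(wines), somgrid(8, 6, "hexagonal"))

plot(som_model)
# grid$pts holds one (x, y) coordinate per unit; label each unit
# with its number (1..48 for an 8 x 6 grid):
text(som_model$grid$pts[, 1], som_model$grid$pts[, 2],
     labels = seq_len(nrow(som_model$grid$pts)), font = 2)
```

These labels then match the unit numbers in som_model$unit.classif shown above.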

Assign columns to grouping variable for use with ordihull plotting

I have a dataframe consisting of dissimilarity ratings for pairs of 12 nations, and what I wish to do is essentially divide the columns (the nations) into three different groups (not combining their scores).
I am performing a non-metric multidimensional scaling, so I would like to plot the dissimilarity ratings with convex hulls according to these three groups.
I know the code for making the plots; all I am missing is the required grouping variable, and I cannot for the life of me figure out how to create it.
Brazil Congo Cuba Egypt France India Israel Japan China UdSSR USA Yugoslavia
1 0.00 4.83 5.28 3.44 4.72 4.50 3.83 3.50 2.39 3.06 5.39 3.17
2 4.83 0.00 4.56 5.00 4.00 4.83 3.33 3.39 4.00 3.39 2.39 3.50
3 5.28 4.56 0.00 5.17 4.11 4.00 3.61 2.94 5.50 5.44 3.17 5.11
4 3.44 5.00 5.17 0.00 4.78 5.83 4.67 3.83 4.39 4.39 3.33 4.28
5 4.72 4.00 4.11 4.78 0.00 3.44 4.00 4.22 3.67 5.06 5.94 4.72
6 4.50 4.83 4.00 5.83 3.44 0.00 4.11 4.50 4.11 4.50 4.28 4.00
7 3.83 3.33 3.61 4.67 4.00 4.11 0.00 4.83 3.00 4.17 5.94 4.44
8 3.50 3.39 2.94 3.83 4.22 4.50 4.83 0.00 4.17 4.61 6.06 4.28
9 2.39 4.00 5.50 4.39 3.67 4.11 3.00 4.17 0.00 5.72 2.56 5.06
10 3.06 3.39 5.44 4.39 5.06 4.50 4.17 4.61 5.72 0.00 5.00 6.67
11 5.39 2.39 3.17 3.33 5.94 4.28 5.94 6.06 2.56 5.00 0.00 3.56
12 3.17 3.50 5.11 4.28 4.72 4.00 4.44 4.28 5.06 6.67 3.56 0.00
This is probably a frustratingly simple command, but I am truly lost.
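A sketch of what such a grouping variable can look like: a factor with one entry per nation (column). The three-way split below is purely a placeholder; substitute your own grouping:

```r
nations <- c("Brazil", "Congo", "Cuba", "Egypt", "France", "India",
             "Israel", "Japan", "China", "UdSSR", "USA", "Yugoslavia")

# Hypothetical assignment of the 12 nations to three groups --
# replace with whatever grouping your analysis calls for:
groups <- factor(c("A", "A", "A", "A", "B", "B",
                   "B", "B", "C", "C", "C", "C"),
                 levels = c("A", "B", "C"))
names(groups) <- nations

# With vegan, the hulls would then be drawn along the lines of
# (nmds_result is a placeholder for your metaMDS output):
# ordihull(nmds_result, groups = groups, draw = "polygon")
```

The only requirement is that the factor has one element per point being plotted, in the same order as the ordination scores.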

Extract the column number which has the last data point from a table

I have the following inverted data frame:
z
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
14 -8.70 0.28 18.66 4.81 -34.33 40.39 3.09 7.89 49.41
13 -6.10 9.51 -1.09 -0.01 7.89 -7.37 -0.61 -9.79 31.75 40.67 5.41 -10.53
12 -5.21 7.49 -7.92 3.54 11.19 -6.66 23.64 13.21 9.64 14.44 59.95 -20.96
11 -12.68 11.04 -11.10 -6.18 -5.61 8.93 94.99 30.15 14.37 31.08 -9.02 -14.77
10 5.07 -2.04 22.77 12.05 0.38 -3.28 -2.73 11.26 5.30 4.61 13.80 3.68
9 -0.82 0.86 3.18 1.06 6.47 1.57 2.25 -9.34 5.27 7.25 2.85 0.42
8 10.48 1.17 10.97 -0.13 0.32 -5.89 -2.26 -7.28 -1.39 3.35 14.81 3.40
7 -5.22 3.09 -7.75 -3.41 -0.09 12.37 -17.38 1.41 8.57 10.48 -1.20 7.45
6 13.85 7.22 3.14 -2.92 -7.12 0.45 3.51 -2.30 7.07 -2.83 -2.27 -1.52
5 -0.57 0.58 -2.59 3.29 -6.07 0.37 1.32 -0.58 4.07 -4.85 -0.48 1.66
4 0.46 -0.41 3.01 0.60 2.20 -2.39 0.22 3.99 5.50 16.07 -4.51 0.50
3 1.28 5.10 -3.61 5.02 3.04 -4.05 -2.64 1.88 -2.44 3.27 -2.71 2.02
2 -1.28 0.99 2.38 0.16 1.03 10.93 5.07 0.26 0.84 -0.05 -0.88 -3.71
1 2.33 -1.71 -0.41 -0.58 -2.19 1.26 1.88 -4.03 0.54 0.34 0.22 -0.50
I would like to find out which column contains the last data point (here -0.50) and extract that column's name, Dec, as a number (12), without referencing the -0.50 value itself. My attempts below failed:
which( colnames(z)==-0.50)
integer(0)
which( colnames(z)==z[length(z)])
integer(0)
Second example
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
18 -12.97 9.96 8.14
17 1.50 3.27 7.38 -1.63 8.53 2.97 1.51 10.99 4.51 -5.70 1.15 9.50
16 -1.38 3.61 -3.98 10.51 -8.39 5.29 -2.01 -3.47 -0.17 -6.20 13.93 9.04
15 -3.96 1.72 -3.28 2.06 -0.26 -1.27 -4.58 3.23 -7.76 2.09 7.33 16.81
14 4.38 0.56 7.09 -5.31 -2.61 -2.66 0.66 0.56 4.64 13.75 -7.10 -5.15
13 -10.13 -6.04 12.62 -3.76 -3.96 7.95 4.71 6.04 7.63 -7.96 -0.69 14.16
12 5.95 11.95 -10.80 2.45 10.19 -5.20 -0.68 0.62 0.26 4.72 -2.48 10.27
11 2.72 11.56 -0.80 -8.62 0.28 -2.96 1.33 3.09 5.14 4.03 6.37 -0.19
10 -5.38 6.58 4.64 -4.21 6.62 3.13 -1.85 7.63 -6.17 -2.95 7.32 -4.37
9 4.20 -2.58 4.01 5.66 -2.94 -1.17 -0.47 4.54 -1.10 1.48 3.24 2.14
8 3.86 -5.93 -3.95 6.46 5.05 1.91 -1.18 -0.88 6.99 2.52 2.42 0.24
7 3.85 7.95 -0.66 -0.99 1.99 5.06 -4.63 -3.00 -0.41 3.73 4.97 2.10
6 0.99 -0.21 -1.64 -3.01 -2.03 -1.26 -1.52 0.32 2.85 -1.59 5.12 -2.45
5 -2.64 2.33 4.91 1.75 -1.01 1.47 -2.78 4.78 0.94 2.51 -2.01 3.75
4 0.08 1.51 0.25 3.00 -2.16 -2.51 4.59 1.43 0.16 -2.59 0.97 1.65
3 0.63 -0.83 -0.68 0.12 -0.22 -3.17 4.41 -1.29 -2.18 -2.54 1.00 1.36
2 2.51 0.17 2.66 3.41 -2.40 -1.77 -0.63 -3.80 3.47 3.20 2.20 0.37
1 -2.37
Here the last data point is -2.37, in Jan.
Thanks
My answer builds on @BrodieG's.
You could try nchar to test for "empty cells":
tail(which(nchar(as.matrix(z)) == 0, arr.ind=TRUE), 1)
col <- max(which(!is.na(t(as.matrix(z))))) %% ncol(z)
if(!col) col <- ncol(z)
names(z)[[col]]
# [1] "Dec"
This assumes "empty" values are NA, and that z is a data frame. I tested this by removing some values from the end, and it worked as well.
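To see the mechanics on something small, here is a toy version of the same trick (my own 2 x 4 example): transposing the matrix makes the flattened index run row by row, so the largest non-NA index corresponds to the last filled cell in reading order.

```r
# Toy 2 x 4 frame; NA marks the empty cells after the last observation:
z <- data.frame(Jan = c(1, 4), Feb = c(2, NA), Mar = c(3, NA), Apr = c(NA, NA))

# Flattened row-by-row via the transpose; the largest non-NA position
# modulo ncol gives the column of the last data point:
col <- max(which(!is.na(t(as.matrix(z))))) %% ncol(z)
if (col == 0) col <- ncol(z)    # an exact multiple means the last column
names(z)[[col]]                 # "Jan" -- the last value (4) is in row 2, Jan
```

The `col == 0` branch handles the case where the last data point sits in the final column, since the modulo then wraps to zero.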
