R dataset not found? - r

I'm trying to load the dataset life.expectancy.1971, but seem to have trouble loading it. I'm inputting
data(life.expectancy.1971)
life.expectancy.1971
and keep getting the following error:
data set 'life.expectancy.1971' not found
Error: object 'life.expectancy.1971' not found
I'm still pretty new to R so it could be a simple error on my part, but I haven't been able to figure out what's wrong since that has worked for loading other datasets. Can anyone help me figure out what I'm missing?

Pasting the answers from the comments to an answer so that the question can be closed.
Install the cluster.datasets package and its dependencies:
install.packages(c("cluster.datasets"), dependencies = TRUE)
Load cluster.datasets:
library(cluster.datasets)
Load the dataset life.expectancy.1971:
data(life.expectancy.1971)
Look at the dataset life.expectancy.1971:
life.expectancy.1971
#> country year m0 m25 m50 m75 f0 f25 f50 f75
#> 1 Algeria 1965 63 51 30 13 67 54 34 15
#> 2 Cameroon 1964 34 29 13 5 38 32 17 6
#> 3 Madagascar 1966 38 30 17 7 38 34 20 7
#> 4 Mauritius 1966 59 42 20 6 64 46 25 8
#> 5 Reunion 1963 56 38 18 7 62 46 25 10
#> 6 Seychelles 1960 62 44 24 7 69 50 28 14
#> 7 South Africa (Nonwhite) 1961 50 39 20 7 55 43 23 8
#> 8 South Africa (White) 1961 65 44 22 7 72 50 27 9
#> 9 Tunisia 1960 56 46 24 11 63 54 33 19
#> 10 Canada 1966 69 47 24 8 75 53 29 10
#> 11 Costa Rica 1966 65 48 26 9 68 50 27 10
#> 12 Dominican Republic 1966 64 50 28 11 66 51 29 11
#> 13 El Salvador 1961 56 44 25 10 61 48 27 12
#> 14 Greenland 1960 60 44 22 6 65 45 25 9
#> 15 Grenada 1961 61 45 22 8 65 49 27 10
#> 16 Guatemala 1964 49 40 22 9 51 41 23 8
#> 17 Honduras 1966 59 42 22 6 61 43 22 7
#> 18 Jamaica 1963 63 44 23 8 67 48 26 9
#> 19 Mexico 1966 59 44 24 8 63 46 25 8
#> 20 Nicaragua 1965 65 48 28 14 68 51 29 13
#> 21 Panama 1966 65 48 26 9 67 49 27 10
#> 22 Trinidad 1962 64 43 21 7 68 47 25 9
#> 23 Trinidad 1967 64 43 21 6 68 47 24 8
#> 24 US 1966 67 45 23 8 74 51 28 10
#> 25 US (Nonwhite) 1966 61 40 21 10 67 46 25 11
#> 26 US (White) 1966 68 46 23 8 75 52 29 10
#> 27 US 1967 67 45 23 8 74 51 28 10
#> 28 Argentina 1964 65 46 24 9 71 51 28 10
#> 29 Chile 1967 59 43 23 10 66 49 27 12
#> 30 Columbia 1965 58 44 24 9 62 47 25 10
#> 31 Ecuador 1965 57 46 25 9 60 49 28 11
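For readers hitting the same warning, here is a small sketch of how to check which datasets a package exposes before calling data(). It uses the built-in datasets package so that it runs anywhere. The cluster.datasets call is guarded because it assumes that package has already been installed.

```r
# By default data() only searches packages on the search path, so a dataset
# from a package that is not attached is reported as "not found".

# List the datasets that a given package provides; "datasets" ships with R.
available <- utils::data(package = "datasets")$results
has_mtcars <- "mtcars" %in% available[, "Item"]
has_mtcars

# Load a dataset without attaching its package, if the package is installed.
# Assumption: cluster.datasets was installed beforehand with install.packages().
if (requireNamespace("cluster.datasets", quietly = TRUE)) {
  data("life.expectancy.1971", package = "cluster.datasets")
  print(head(life.expectancy.1971))
}
```

Running the same listing with package = "cluster.datasets" should show whether life.expectancy.1971 is among that package's items.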

Related

Converting month column table to chronological order in R

I have a table of the following format:
Initial Table Formatting
And I'm seeking an output resembling the following:
Date            Value
January 1659    Value 1
February 1659   Value 2
March 1659      Value 3
April 1659      Value 4
and so on (numerical representations of the month and year are perfectly fine too).
I've attempted using merge operations but I'm thinking there must be an easier way (possibly using packages). I've found somewhat similar questions asked but none obviously applicable yet.
You can use pivot_longer and unite, both from the tidyr package:
library(tidyr)
pivot_longer(df, -Year) |>
  unite(date, name, Year, sep = " ")
#> # A tibble: 120 x 2
#> date value
#> <chr> <int>
#> 1 Jan 1659 68
#> 2 Feb 1659 97
#> 3 Mar 1659 89
#> 4 Apr 1659 74
#> 5 May 1659 44
#> 6 Jun 1659 2
#> 7 Jul 1659 81
#> 8 Aug 1659 22
#> 9 Sep 1659 87
#> 10 Oct 1659 1
#> # ... with 110 more rows
Data used
set.seed(1)
df <- cbind(1659:1668, replicate(12, sample(99, 10))) |>
  as.data.frame() |>
  setNames(c("Year", month.abb))
df
#> Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
#> 1 1659 68 97 89 74 44 2 81 22 87 1 76 43
#> 2 1660 39 85 37 42 25 45 13 93 83 43 39 1
#> 3 1661 1 21 34 38 70 18 40 28 90 59 24 29
#> 4 1662 34 54 99 20 39 22 89 48 48 26 53 78
#> 5 1663 87 74 44 28 51 78 48 33 64 15 92 22
#> 6 1664 43 7 79 96 42 65 96 45 94 58 86 70
#> 7 1665 14 73 33 44 6 70 23 21 60 29 40 28
#> 8 1666 82 79 84 87 24 87 84 31 51 24 83 37
#> 9 1667 59 98 35 70 32 93 29 17 34 42 90 61
#> 10 1668 51 37 70 40 14 75 98 73 10 48 35 46
Created on 2022-11-29 with reprex v2.0.2
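If the long table ever ends up out of order, the rows can be sorted chronologically from the "Mon YYYY" labels. The following is a minimal base R sketch, assuming month abbreviations that match month.abb. The data frame long and its values are hypothetical stand-ins for the pivot_longer output above.

```r
# Hypothetical long-format table with deliberately shuffled rows.
long <- data.frame(
  date  = c("Mar 1659", "Jan 1660", "Jan 1659", "Feb 1659"),
  value = c(89, 39, 68, 97)
)

# Split "Mon YYYY" into its month and year parts.
parts     <- strsplit(long$date, " ", fixed = TRUE)
month_num <- match(vapply(parts, `[`, character(1), 1), month.abb)
year_num  <- as.integer(vapply(parts, `[`, character(1), 2))

# Order by year first, then by month number within each year.
sorted <- long[order(year_num, month_num), ]
sorted$date
#> [1] "Jan 1659" "Feb 1659" "Mar 1659" "Jan 1660"
```

Matching against month.abb rather than parsing with a date format keeps the sketch independent of the session locale.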

How to change name on column within a function? [duplicate]

This question already has answers here:
How to dplyr rename a column, by column index?
(4 answers)
Closed 7 months ago.
This is a generic question related to functions:
Let's say I have the following function (the code inside the braces is only an example). I found this code in an earlier thread from today: Add a column to function with fixed variable
read_prem_league <- function(year) {
  "https://en.wikipedia.org/wiki/" %>%
    paste0(year - 1, "-", substr(as.character(year), 3, 4), "_Premier_League") %>%
    read_html() %>%
    html_table() %>%
    getElement(5)
}
read_prem_league(2015)
Which generates the following tibble:
#> # A tibble: 20 x 11
#> Pos Team Pld W D L GF GA GD Pts
#> <int> <chr> <int> <int> <int> <int> <int> <int> <chr> <int>
#> 1 1 Manchester City (C) 38 27 5 6 83 32 +51 86
#> 2 2 Manchester United 38 21 11 6 73 44 +29 74
#> 3 3 Liverpool 38 20 9 9 68 42 +26 69
#> 4 4 Chelsea 38 19 10 9 58 36 +22 67
#> 5 5 Leicester City 38 20 6 12 68 50 +18 66
#> 6 6 West Ham United 38 19 8 11 62 47 +15 65
#> 7 7 Tottenham Hotspur 38 18 8 12 68 45 +23 62
#> 8 8 Arsenal 38 18 7 13 55 39 +16 61
#> 9 9 Leeds United 38 18 5 15 62 54 +8 59
#> 10 10 Everton 38 17 8 13 47 48 -1 59
#> 11 11 Aston Villa 38 16 7 15 55 46 +9 55
#> 12 12 Newcastle United 38 12 9 17 46 62 -16 45
#> 13 13 Wolverhampton Wande~ 38 12 9 17 36 52 -16 45
#> 14 14 Crystal Palace 38 12 8 18 41 66 -25 44
#> 15 15 Southampton 38 12 7 19 47 68 -21 43
#> 16 16 Brighton & Hove Alb~ 38 9 14 15 40 46 -6 41
#> 17 17 Burnley 38 10 9 19 33 55 -22 39
#> 18 18 Fulham (R) 38 5 13 20 27 53 -26 28
#> 19 19 West Bromwich Albio~ 38 5 11 22 35 76 -41 26
#> 20 20 Sheffield United (R) 38 7 2 29 20 63 -43 23
#> # ... with 1 more variable: `Qualification or relegation` <chr>
I would like to rename the Team column to Club so that it always has the name Club. I want general code that works for column 2 in other functions as well, since there are functions where the data are the same but the column names differ (and I want a single column name).
Something similar to the code below, which was given in a previous answer, is what I'm looking for:
dat <- read.csv(url)
names(dat)[2] <- "year"
dat
You can rename by index:
library(rvest)  # read_html(), html_table()
library(dplyr)  # %>% and rename()
read_prem_league <- function(year) {
  "https://en.wikipedia.org/wiki/" %>%
    paste0(year - 1, "-", substr(as.character(year), 3, 4), "_Premier_League") %>%
    read_html() %>%
    html_table() %>%
    getElement(5) %>%
    rename(Club = 2)  # rename the second column by position
}
It can be done as
library(rvest)
library(dplyr)
read_prem_league <- function(year) {
  dat <- "https://en.wikipedia.org/wiki/" %>%
    paste0(year - 1, "-", substr(as.character(year), 3, 4), "_Premier_League") %>%
    read_html() %>%
    html_table() %>%
    getElement(5)
  names(dat)[2] <- "Club"
  dat
}
Testing:
> read_prem_league(2015)
# A tibble: 20 × 11
Pos Club Pld W D L GF GA GD Pts `Qualification or relegation`
<int> <chr> <int> <int> <int> <int> <int> <int> <chr> <int> <chr>
1 1 Chelsea (C) 38 26 9 3 73 32 +41 87 "Qualification for the Champions League group stage"
2 2 Manchester City 38 24 7 7 83 38 +45 79 "Qualification for the Champions League group stage"
3 3 Arsenal 38 22 9 7 71 36 +35 75 "Qualification for the Champions League group stage"
4 4 Manchester United 38 20 10 8 62 37 +25 70 "Qualification for the Champions League play-off round"
5 5 Tottenham Hotspur 38 19 7 12 58 53 +5 64 "Qualification for the Europa League group stage[a]"
6 6 Liverpool 38 18 8 12 52 48 +4 62 "Qualification for the Europa League group stage[a]"
7 7 Southampton 38 18 6 14 54 33 +21 60 "Qualification for the Europa League third qualifying round[a]"
8 8 Swansea City 38 16 8 14 46 49 −3 56 ""
9 9 Stoke City 38 15 9 14 48 45 +3 54 ""
10 10 Crystal Palace 38 13 9 16 47 51 −4 48 ""
11 11 Everton 38 12 11 15 48 50 −2 47 ""
12 12 West Ham United 38 12 11 15 44 47 −3 47 "Qualification for the Europa League first qualifying round[b]"
13 13 West Bromwich Albion 38 11 11 16 38 51 −13 44 ""
14 14 Leicester City 38 11 8 19 46 55 −9 41 ""
15 15 Newcastle United 38 10 9 19 40 63 −23 39 ""
16 16 Sunderland 38 7 17 14 31 53 −22 38 ""
17 17 Aston Villa 38 10 8 20 31 57 −26 38 ""
18 18 Hull City (R) 38 8 11 19 33 51 −18 35 "Relegation to the Football League Championship"
19 19 Burnley (R) 38 7 12 19 28 53 −25 33 "Relegation to the Football League Championship"
20 20 Queens Park Rangers (R) 38 8 6 24 42 73 −31 30 "Relegation to the Football League Championship"
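The positional rename used in both answers can be tried without any web scraping. Below is a minimal base R sketch. The helper name rename_by_index and the two small data frames are hypothetical and only illustrate that the same call works whatever the second column was originally called.

```r
# Hypothetical helper: rename the column at position `index` to `new_name`.
rename_by_index <- function(dat, index, new_name) {
  stopifnot(index >= 1, index <= ncol(dat))
  names(dat)[index] <- new_name
  dat
}

# Two made-up tables whose second column has different names.
league_a <- data.frame(Pos = 1:2, Team  = c("Chelsea", "Arsenal"))
league_b <- data.frame(Pos = 1:2, Squad = c("Leeds United", "Everton"))

names(rename_by_index(league_a, 2, "Club"))
#> [1] "Pos"  "Club"
names(rename_by_index(league_b, 2, "Club"))
#> [1] "Pos"  "Club"
```

Because the rename is positional, it silently renames whatever sits in column 2, so it is only safe when the column order is known to be stable.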

Add a column to function with fixed variable

I have this code as a function which generates the table of a Premier League season from Wiki.
read_prem_league <- function(year) {
  "https://en.wikipedia.org/wiki/" %>%
    paste0(year - 1, "-", substr(as.character(year), 3, 4), "_Premier_League") %>%
    read_html() %>%
    html_table() %>%
    getElement(5)
}
read_prem_league(2021)
Which creates the following tibble:
#> # A tibble: 20 x 11
#> Pos Team Pld W D L GF GA GD Pts
#> <int> <chr> <int> <int> <int> <int> <int> <int> <chr> <int>
#> 1 1 Manchester City (C) 38 27 5 6 83 32 +51 86
#> 2 2 Manchester United 38 21 11 6 73 44 +29 74
#> 3 3 Liverpool 38 20 9 9 68 42 +26 69
#> 4 4 Chelsea 38 19 10 9 58 36 +22 67
#> 5 5 Leicester City 38 20 6 12 68 50 +18 66
#> 6 6 West Ham United 38 19 8 11 62 47 +15 65
#> 7 7 Tottenham Hotspur 38 18 8 12 68 45 +23 62
#> 8 8 Arsenal 38 18 7 13 55 39 +16 61
#> 9 9 Leeds United 38 18 5 15 62 54 +8 59
#> 10 10 Everton 38 17 8 13 47 48 -1 59
#> 11 11 Aston Villa 38 16 7 15 55 46 +9 55
#> 12 12 Newcastle United 38 12 9 17 46 62 -16 45
#> 13 13 Wolverhampton Wande~ 38 12 9 17 36 52 -16 45
#> 14 14 Crystal Palace 38 12 8 18 41 66 -25 44
#> 15 15 Southampton 38 12 7 19 47 68 -21 43
#> 16 16 Brighton & Hove Alb~ 38 9 14 15 40 46 -6 41
#> 17 17 Burnley 38 10 9 19 33 55 -22 39
#> 18 18 Fulham (R) 38 5 13 20 27 53 -26 28
#> 19 19 West Bromwich Albio~ 38 5 11 22 35 76 -41 26
#> 20 20 Sheffield United (R) 38 7 2 29 20 63 -43 23
#> # ... with 1 more variable: `Qualification or relegation` <chr>
What I would like to do is add a column called Season to the left of Pos that shows the season, so if it's the season ending in 2020 I want it to say 2019-20.
read_prem_league$Season <- (year)
The above code should work and I want to put it within the function. However, I get the error: Error in View : object of type 'closure' is not subsettable
We may use mutate
library(dplyr)
library(rvest)
read_prem_league <- function(year) {
  "https://en.wikipedia.org/wiki/" %>%
    paste0(year - 1, "-", substr(as.character(year), 3, 4),
           "_Premier_League") %>%
    read_html() %>%
    html_table() %>%
    getElement(5) %>%
    dplyr::mutate(Season = year, .before = Pos)
}
Testing:
> dat <- read_prem_league(2021)
> dat
# A tibble: 20 × 12
Season Pos Team Pld W D L GF GA GD Pts `Qualification or relegation`
<dbl> <int> <chr> <int> <int> <int> <int> <int> <int> <chr> <int> <chr>
1 2021 1 Manchester City (C) 38 27 5 6 83 32 +51 86 "Qualification for the Champions League group stage"
2 2021 2 Manchester United 38 21 11 6 73 44 +29 74 "Qualification for the Champions League group stage"
3 2021 3 Liverpool 38 20 9 9 68 42 +26 69 "Qualification for the Champions League group stage"
4 2021 4 Chelsea 38 19 10 9 58 36 +22 67 "Qualification for the Champions League group stage"
5 2021 5 Leicester City 38 20 6 12 68 50 +18 66 "Qualification for the Europa League group stage[a]"
6 2021 6 West Ham United 38 19 8 11 62 47 +15 65 "Qualification for the Europa League group stage[a]"
7 2021 7 Tottenham Hotspur 38 18 8 12 68 45 +23 62 "Qualification for the Europa Conference League play-off round[b]"
8 2021 8 Arsenal 38 18 7 13 55 39 +16 61 ""
9 2021 9 Leeds United 38 18 5 15 62 54 +8 59 ""
10 2021 10 Everton 38 17 8 13 47 48 −1 59 ""
11 2021 11 Aston Villa 38 16 7 15 55 46 +9 55 ""
12 2021 12 Newcastle United 38 12 9 17 46 62 −16 45 ""
13 2021 13 Wolverhampton Wanderers 38 12 9 17 36 52 −16 45 ""
14 2021 14 Crystal Palace 38 12 8 18 41 66 −25 44 ""
15 2021 15 Southampton 38 12 7 19 47 68 −21 43 ""
16 2021 16 Brighton & Hove Albion 38 9 14 15 40 46 −6 41 ""
17 2021 17 Burnley 38 10 9 19 33 55 −22 39 ""
18 2021 18 Fulham (R) 38 5 13 20 27 53 −26 28 "Relegation to the EFL Championship"
19 2021 19 West Bromwich Albion (R) 38 5 11 22 35 76 −41 26 "Relegation to the EFL Championship"
20 2021 20 Sheffield United (R) 38 7 2 29 20 63 −43 23 "Relegation to the EFL Championship"
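The output above stores the bare year in Season, whereas the question asked for a label such as 2019-20. Here is a minimal sketch of building that label, reusing the same paste0/substr expression that already constructs the URL. The helper name season_label and the small standings data frame are hypothetical.

```r
# Hypothetical helper: turn the year a season ends in into a "YYYY-YY" label.
season_label <- function(year) {
  paste0(year - 1, "-", substr(as.character(year), 3, 4))
}

season_label(2020)
#> [1] "2019-20"

# Made-up stand-in for the scraped table, with Season added as the first column.
standings <- data.frame(Pos = 1:2, Team = c("Manchester City", "Manchester United"))
standings <- cbind(Season = season_label(2021), standings)
names(standings)
#> [1] "Season" "Pos"    "Team"
```

Inside the function above, the same idea would presumably be written as mutate(Season = season_label(year), .before = Pos).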

R: order() not working as expected outside of RStudio

I have a character vector (consisting of randomly arranged numbers or letters) that I want to use to order a dataframe:
vals = as.numeric(dict$keys)
## ONE
vals = order(vals)
## TWO
dict = dict[vals,]
At ONE:
> vals
[1] 1 1 1 1 1 2 2 2 3 3 3 3 3 3 3 4 4 4 4 4 4 4 5 5 5
[26] 6 7 7 7 7 8 8 8 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9 10 10
[51] 10 10 10 11 11 11 11 11 12 12 12 12 12 12 12 12 12 13 13 13 14 14 15 15 15
[76] 15 16 16 16 16 16 16 16 16 16 16 17 17 17 17 17 18 18 18 18 18 18 18 18 18
[101] 18 19 19 19 19 19 19 20 20 20 20 20 20 20 20 20 20 21 21 21 21 21 21 21 21
[126] 22 22 22 22 22 22 22 22 22 22 22 22 23
At TWO:
> vals
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
[37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
[55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
[73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
[91] 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
[109] 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
[127] 127 128 129 130 131 132 133 134 135 136 137 138
When I execute this snippet in RStudio in Windows, it orders the dataframe dict fine. Numbers are ordered first, then letters are at the end (this is what I want).
However, on a Linux remote desktop where I run the script with Rscript, this snippet doesn't work and the dataframe remains as it was before these lines were executed.
I fixed this by setting stringsAsFactors = FALSE for all uses of data.frame in the script, as Henrik suggested. The issue lay in the different versions of R I was using on the two systems.
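A short sketch of why the version difference matters. Before R 4.0.0, data.frame() converted character columns to factors by default, and as.numeric() on a factor returns the internal level codes rather than the values. The keys below are made up. Whether the older system really created factors is an assumption based on the fix described above.

```r
# Made-up keys, stored as text the way they would be in a character column.
keys_chr <- c("10", "2", "33", "4")

# What an older R (default stringsAsFactors = TRUE) would have stored instead.
keys_fct <- factor(keys_chr)

# On a factor, as.numeric() yields the level codes, so ordering does nothing useful.
codes <- as.numeric(keys_fct)
codes
#> [1] 1 2 3 4
order(codes)
#> [1] 1 2 3 4

# Converting through character recovers the real values and the intended order.
values <- as.numeric(as.character(keys_fct))
values
#> [1] 10  2 33  4
order(values)
#> [1] 2 4 1 3
```

Using as.numeric(as.character(x)) works on both character and factor input, so it is a defensive alternative when the script has to run on several R versions.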

Distance Matrix from table in R

Good evening,
I need to solve a location problem in R and I'm stuck in one of the first steps.
From a .txt file I need to create a distance matrix using the euclidean method.
datos <- file.choose()
servidores <- read.table(datos)
servidores
From which I obtain the following information:
X50 shows the total number of servers.
X5 shows the number of hubs required.
X120 shows the total capacity.
The first column shows the distance of x.
The second column shows the distance of y.
The third column shows the requirements of the node.
X50 X5 X120
1 2 62 3
2 80 25 14
3 36 88 1
4 57 23 14
5 33 17 19
6 76 43 2
7 77 85 14
8 94 6 6
9 89 11 7
10 59 72 6
11 39 82 10
12 87 24 18
13 44 76 3
14 2 83 6
15 19 43 20
16 5 27 4
17 58 72 14
18 14 50 11
19 43 18 19
20 87 7 15
21 11 56 15
22 31 16 4
23 51 94 13
24 55 13 13
25 84 57 5
26 12 2 16
27 53 33 3
28 53 10 7
29 33 32 14
30 69 67 17
31 43 5 3
32 10 75 3
33 8 26 12
34 3 1 14
35 96 22 20
36 6 48 13
37 59 22 10
38 66 69 9
39 22 50 6
40 75 21 18
41 4 81 7
42 41 97 20
43 92 34 9
44 12 64 1
45 60 84 8
46 35 100 5
47 38 2 1
48 9 9 7
49 54 59 9
50 1 58 2
I tried to use the dist() function:
distance_matrix <- dist(servidores, method = "euclidean", diag = TRUE, upper = TRUE)
but since x and y are in different columns, I am not sure what to do to get a 50x50 matrix with all the distances.
Does anybody know how I could create such a matrix?
Many thanks in advance.
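One possible approach, sketched under the assumption that only the first two columns are coordinates: dist() treats every row as a point and every column as a dimension, so passing just the x and y columns gives the pairwise Euclidean distances, and as.matrix() expands the result into a full square matrix. The three rows below are copied from the top of the table above. The column names x, y and demand are made up for readability.

```r
# First three nodes from the table; column names are illustrative only.
servidores <- data.frame(
  x      = c(2, 80, 36),
  y      = c(62, 25, 88),
  demand = c(3, 14, 1)
)

# Keep only the coordinate columns so the demand column does not affect distances.
coords <- servidores[, 1:2]

# dist() returns a compact "dist" object; as.matrix() makes it a square matrix.
distance_matrix <- as.matrix(dist(coords, method = "euclidean"))

dim(distance_matrix)
#> [1] 3 3
distance_matrix[1, 2]  # equals sqrt((80 - 2)^2 + (25 - 62)^2)
```

With all 50 rows the same two lines would give a 50x50 matrix that is symmetric with zeros on the diagonal.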
