Count values in a data frame variable by group in R

Using the following dataset:
set.seed(2)
origin <- rep(c("DEU", "GBR", "ITA", "NLD", "CAN", "MEX", "USA", "CHN", "JPN", "KOR","DEU", "GBR", "ITA", "NLD", "CAN", "MEX", "USA", "CHN", "JPN", "KOR"), 4)
year <- rep(c(rep(1998, 10), rep(2000, 10)), 2)
type <- sample(1:10, size=length(origin), replace=TRUE)
value <- sample(100:10000, size=length(origin), replace=TRUE)
test.df <- as.data.frame(cbind(origin, year, type, value))
rm(origin, year, type, value)
### add some (6) missing values
test.df$value[sample(1:length(test.df$value), 6, replace = FALSE)] <- NA
I want to count how many types there are per country (origin) and year.
I tried:
count(test.df, origin, year)
and
test.df %>% group_by(origin, year) %>% count()
but I am not sure how to interpret the results.
Of course, rows where value is NA should not be counted.

To remove the rows where value is NA, use filter:
test.df %>% group_by(origin,year) %>%
filter(!is.na(value)) %>% count()
# A tibble: 20 x 3
# Groups: origin, year [20]
origin year n
<fct> <fct> <int>
1 CAN 1998 4
2 CAN 2000 3
3 CHN 1998 3
4 CHN 2000 4
5 DEU 1998 4
6 DEU 2000 4
7 GBR 1998 4
8 GBR 2000 4
9 ITA 1998 3
10 ITA 2000 4
11 JPN 1998 3
12 JPN 2000 3
13 KOR 1998 4
14 KOR 2000 4
15 MEX 1998 4
16 MEX 2000 4
17 NLD 1998 3
18 NLD 2000 4
19 USA 1998 4
20 USA 2000 4
Note, however, that this doesn't count how many types there are in each group, but how many rows there are. If you want to count the number of unique types, you can do this:
test.df %>% group_by(origin,year) %>%
filter(!is.na(value)) %>%
summarize(n_distinct(type)) # thanks, Frank!
# A tibble: 20 x 3
# Groups: origin [?]
origin year `n_distinct(type)`
<fct> <fct> <int>
1 CAN 1998 3
2 CAN 2000 3
3 CHN 1998 2
4 CHN 2000 3
5 DEU 1998 4
6 DEU 2000 3
7 GBR 1998 4
8 GBR 2000 4
9 ITA 1998 3
10 ITA 2000 4
11 JPN 1998 3
12 JPN 2000 2
13 KOR 1998 4
14 KOR 2000 4
15 MEX 1998 3
16 MEX 2000 3
17 NLD 1998 2
18 NLD 2000 3
19 USA 1998 3
20 USA 2000 4
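The difference between counting rows and counting distinct types can be seen side by side in a minimal sketch. The data frame toy and the column names n_rows and n_types are made up for illustration and are not part of the question's data:

```r
library(dplyr)

# Hypothetical toy data: DEU has two non-missing rows that share one type
toy <- data.frame(
  origin = c("DEU", "DEU", "DEU", "GBR", "GBR"),
  year   = c(1998, 1998, 1998, 1998, 1998),
  type   = c(1, 1, 2, 3, 4),
  value  = c(10, 20, NA, 40, 50)
)

res <- toy %>%
  filter(!is.na(value)) %>%              # drop rows with a missing value
  group_by(origin, year) %>%
  summarise(
    n_rows  = n(),                       # number of remaining rows per group
    n_types = n_distinct(type),          # number of different types per group
    .groups = "drop"
  )

print(res)
```

For DEU the two counts differ because both remaining rows have the same type.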

Related

Is it possible to make groups based on an ID of a person in R?

I have this data:
data <- data.frame(id_pers=c(4102,13102,27101,27102,28101,28102, 42101,42102,56102,73102,74103,103104,117103,117104,117105),
birthyear=c(1992,1994,1993,1992,1995,1999,2000,2001,2000, 1994, 1999, 1978, 1986, 1998, 1999))
I want to group the persons by family in a new column, so that persons 27101 and 27102 (siblings) are in group/family 1, 42101 and 42102 are in group 2, 117103, 117104 and 117105 are in group 3, and so on.
Person 4102 has no siblings and should be NA in the new column.
Two or more persons are always siblings if their IDs are no more than 6 apart.
My real dataset is far larger, with over 3000 rows. What is the most efficient way to do this?
You can use round with digits = -1 (or digits = -2 if the id_pers values within a family can span more than 10). If you want the family ids to be consecutive integers starting from 1, you can use cur_group_id:
library(dplyr)
data %>%
group_by(fam_id = round(id_pers - 5, digits = -1)) %>%
mutate(fam_gp = cur_group_id())
output
# A tibble: 15 × 4
# Groups: fam_id [10]
id_pers birthyear fam_id fam_gp
<dbl> <dbl> <dbl> <int>
1 4102 1992 4100 1
2 13102 1994 13100 2
3 27101 1993 27100 3
4 27102 1992 27100 3
5 28101 1995 28100 4
6 28102 1999 28100 4
7 42101 2000 42100 5
8 42102 2001 42100 5
9 56102 2000 56100 6
10 73102 1994 73100 7
11 74103 1999 74100 8
12 103104 1978 103100 9
13 117103 1986 117100 10
14 117104 1998 117100 10
15 117105 1999 117100 10
It looks like we can use the 1000s digit (and above) to delineate groups.
library(dplyr)
data %>%
mutate(
famgroup = trunc(id_pers/1000),
famgroup = match(famgroup, unique(famgroup))
)
# id_pers birthyear famgroup
# 1 4102 1992 1
# 2 13102 1994 2
# 3 27101 1993 3
# 4 27102 1992 3
# 5 28101 1995 4
# 6 28102 1999 4
# 7 42101 2000 5
# 8 42102 2001 5
# 9 56102 2000 6
# 10 73102 1994 7
# 11 74103 1999 8
# 12 103104 1978 9
# 13 117103 1986 10
# 14 117104 1998 10
# 15 117105 1999 10
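The question also asks for NA when a person has no siblings, which the outputs above do not do. A minimal sketch of one way to add that, assuming (as in the answer above) that siblings share everything above the last three digits of id_pers. The helper columns fam_key and fam_size and the result column family are illustrative names:

```r
library(dplyr)

data <- data.frame(
  id_pers   = c(4102, 13102, 27101, 27102, 28101, 28102, 42101, 42102,
                56102, 73102, 74103, 103104, 117103, 117104, 117105),
  birthyear = c(1992, 1994, 1993, 1992, 1995, 1999, 2000, 2001,
                2000, 1994, 1999, 1978, 1986, 1998, 1999)
)

res <- data %>%
  mutate(fam_key = id_pers %/% 1000) %>%        # assumed family key
  add_count(fam_key, name = "fam_size") %>%     # persons per family key
  mutate(
    # singletons get NA instead of a family key
    fam_key = ifelse(fam_size > 1, fam_key, NA),
    # number the remaining families 1, 2, ... in order of appearance
    family  = match(fam_key, unique(fam_key[!is.na(fam_key)]))
  ) %>%
  select(-fam_key, -fam_size)

print(res)
```

With this numbering, 28101 and 28102 also form a family, because they share the same key.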

Add rows and complete dyad by group

I have a dataset in a dyadic format and sorted by group and I am trying to add an observation to each group. I need this observation to also be integrated with the other pairs. Below is a reproducible example to show what I mean. Data is a simplified version of my dataset (it contains more groups essentially).
data <- data.frame(country1 = c("BEL", "FRA", "BEL", "FRA", "AUS", "ITA"),
country2 = c("FRA", "BEL", "FRA", "BEL", "ITA", "AUS"),
year = c(2001,2001,2002,2002,2002,2002),
id = c(1,1,1,1,2,2))
> data
country1 country2 year id
1 BEL FRA 2001 1
2 FRA BEL 2001 1
3 BEL FRA 2002 1
4 FRA BEL 2002 1
5 AUS ITA 2002 2
6 ITA AUS 2002 2
I would like to add a different country to each group. For instance, say I would like to add Luxembourg to group 1 and Portugal to group 2.
This is what the output I need should look like:
> data
country1 country2 year id
1 BEL FRA 2001 1
2 FRA BEL 2001 1
3 LUX BEL 2001 1
4 LUX FRA 2001 1
5 BEL LUX 2001 1
6 FRA LUX 2001 1
7 BEL FRA 2002 1
8 FRA BEL 2002 1
9 LUX BEL 2002 1
10 LUX FRA 2002 1
11 BEL LUX 2002 1
12 FRA LUX 2002 1
13 AUS ITA 2002 2
14 ITA AUS 2002 2
15 POR AUS 2002 2
16 POR ITA 2002 2
17 AUS POR 2002 2
18 ITA POR 2002 2
I found a workaround, but I don't know how to simplify and automate it.
id1 <- data%>%
filter(id== 1) %>%
mutate(country3 = "LUX")
id1_1 <- id1 %>%
select(!country2) %>%
rename("country2" = "country3") %>%
distinct()
id1_2 <- id1 %>%
select(!country1) %>%
rename("country1" = "country3") %>%
distinct()
id1_2 <- id1_2 [, c(2,1,3,4)]
id1 <- rbind(id1_1, id1_2)
data<- rbind(data, id1)
This completes the dyads but it is quite tedious to do since I am trying to add about 100 countries to a hundred groups.
I can create either a vector or a data frame containing all the countries I need to add (and arrange them by group if necessary), but I just don't know how to use them to fill the main data. Thanks for any tips!
Would something like this work for you?
library(tidyverse)
data <- data.frame(country1 = c("BEL", "FRA", "BEL", "FRA", "AUS", "ITA"),
country2 = c("FRA", "BEL", "FRA", "BEL", "ITA", "AUS"),
year = c(2001,2001,2002,2002,2002,2002),
id = c(1,1,1,1,2,2))
additions <- tribble(
~id, ~country1,
1, "LUX",
2, "POR"
)
unique_combos <- data |>
distinct(id, year, country1) |>
rows_append(additions) |>
expand(year, nesting(id, country1)) |>
filter(!is.na(year))
unique_combos |>
rename(country2 = country1) |>
full_join(unique_combos) |>
filter(country1 != country2) |>
arrange(id, year, country1, country2)
#> Joining, by = c("year", "id")
#> # A tibble: 24 × 4
#> year id country2 country1
#> <dbl> <dbl> <chr> <chr>
#> 1 2001 1 FRA BEL
#> 2 2001 1 LUX BEL
#> 3 2001 1 BEL FRA
#> 4 2001 1 LUX FRA
#> 5 2001 1 BEL LUX
#> 6 2001 1 FRA LUX
#> 7 2002 1 FRA BEL
#> 8 2002 1 LUX BEL
#> 9 2002 1 BEL FRA
#> 10 2002 1 LUX FRA
#> # … with 14 more rows
Created on 2022-06-29 by the reprex package (v2.0.1)
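Note that the result above has 24 rows rather than the 18 in the desired output, because expand() also creates year 2001 for group 2, which does not occur in the original data. A minimal base R sketch that only uses the (id, year) combinations actually observed is given below. The additions data frame and the intermediate names members, id_years and added are illustrative:

```r
data <- data.frame(country1 = c("BEL", "FRA", "BEL", "FRA", "AUS", "ITA"),
                   country2 = c("FRA", "BEL", "FRA", "BEL", "ITA", "AUS"),
                   year = c(2001, 2001, 2002, 2002, 2002, 2002),
                   id = c(1, 1, 1, 1, 2, 2))

# One country to add per group (hypothetical lookup table)
additions <- data.frame(id = c(1, 2), country = c("LUX", "POR"))

# Existing members of each group in each observed year
members <- unique(data[, c("id", "year", "country1")])
names(members)[3] <- "country"

# Attach each added country to every year its group is observed in
id_years <- unique(data[, c("id", "year")])
added <- merge(id_years, additions, by = "id")
members <- rbind(members, added)

# Self-merge gives all ordered pairs within each (id, year); drop self-pairs
dyads <- merge(members, members, by = c("id", "year"), suffixes = c("1", "2"))
dyads <- dyads[dyads$country1 != dyads$country2, ]
dyads <- dyads[order(dyads$id, dyads$year, dyads$country1, dyads$country2), ]

print(dyads)
```

To add more countries, extend additions with further rows; several rows per id are fine.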

r data.table adjust min and max years only if each set has at least one incrementing obs

I have a data set that holds an id, location, start year, end year, age1 and age2. For each group defined as id, location, age1 and age2, I would like to create a new start and end year. For instance, I may have three entries for china encompassing age 0 - age 4. One will be 2000 - 2000, the other is 2001 - 2001, and the final is 2005-2005. Since the years are incrementing by 1 in the first two entries, I'd want their corresponding newstart and newend to be 2000-2001. The third entry would have newstart==2005 and newend==2005 as this is not a part of a continuous set of years.
The data table I have resembles the following, except it has thousands of entries and many combinations:
id location start end age1 age2
1 brazil 2000 2000 0 4
1 brazil 2001 2001 0 4
1 brazil 2002 2002 0 4
2 argentina 1990 1991 1 1
2 argentina 1991 1991 2 2
2 argentina 1992 1992 2 2
2 argentina 1993 1993 2 2
3 belize 2001 2001 0.5 1
3 belize 2005 2005 1 2
I want to alter the data table so that it will look like the following
id location start end age1 age2 newstart newend
1 brazil 2000 2000 0 4 2000 2002
1 brazil 2001 2001 0 4 2000 2002
1 brazil 2002 2002 0 4 2000 2002
2 argentina 1990 1991 1 1 1991 1991
2 argentina 1991 1991 2 2 1991 1993
2 argentina 1992 1992 2 2 1991 1993
2 argentina 1993 1993 2 2 1991 1993
3 belize 2001 2001 0.5 1 2001 2001
3 belize 2005 2005 1 2 2005 2005
I have tried creating a variable that tracks the difference of the previous year and the current year using lag and then calculating the difference between these two years. I then created the newstart and newend by placing the min start and max end. I have found that this only works if there is a set of 2 in continuous years. If I have a larger set, this doesn't work as it has no way of tracking the number of obs in which the years increase by 1 for each grouping. I believe I need some type of loop.
Is there a more efficient way to accomplish this?
data.table
You tagged with data.table, so my first suggestion is this:
library(data.table)
# start a new run whenever the year does not follow the previous one by exactly 1
dat[, contiguous := cumsum(c(TRUE, diff(start) != 1)), by = .(id)]
dat[, c("newstart", "newend") := .(min(start), max(end)), by = .(id, contiguous)]
dat[, contiguous := NULL]
dat
# id location start end age1 age2 newstart newend
# 1: 1 brazil 2000 2000 0.0 4 2000 2002
# 2: 1 brazil 2001 2001 0.0 4 2000 2002
# 3: 1 brazil 2002 2002 0.0 4 2000 2002
# 4: 2 argentina 1990 1991 1.0 1 1990 1993
# 5: 2 argentina 1991 1991 2.0 2 1990 1993
# 6: 2 argentina 1992 1992 2.0 2 1990 1993
# 7: 2 argentina 1993 1993 2.0 2 1990 1993
# 8: 3 belize 2001 2001 0.5 1 2001 2001
# 9: 3 belize 2005 2005 1.0 2 2005 2005
base R
If instead you really just mean data.frame, then
dat <- transform(dat, contiguous = ave(start, id, FUN = function(a) cumsum(c(TRUE, diff(a) != 1))))
dat <- transform(dat,
newstart = ave(start, id, contiguous, FUN = min),
newend = ave(end , id, contiguous, FUN = max)
)
# Warning in FUN(X[[i]], ...) :
# no non-missing arguments to min; returning Inf
# Warning in FUN(X[[i]], ...) :
# no non-missing arguments to min; returning Inf
# Warning in FUN(X[[i]], ...) :
# no non-missing arguments to max; returning -Inf
# Warning in FUN(X[[i]], ...) :
# no non-missing arguments to max; returning -Inf
dat
# id location start end age1 age2 newstart newend contiguous
# 1 1 brazil 2000 2000 0.0 4 2000 2002 1
# 2 1 brazil 2001 2001 0.0 4 2000 2002 1
# 3 1 brazil 2002 2002 0.0 4 2000 2002 1
# 4 2 argentina 1990 1991 1.0 1 1990 1993 1
# 5 2 argentina 1991 1991 2.0 2 1990 1993 1
# 6 2 argentina 1992 1992 2.0 2 1990 1993 1
# 7 2 argentina 1993 1993 2.0 2 1990 1993 1
# 8 3 belize 2001 2001 0.5 1 2001 2001 1
# 9 3 belize 2005 2005 1.0 2 2005 2005 2
dat$contiguous <- NULL
Interesting point I just learned about ave: it uses interaction(...) (all grouping variables), which is going to give all possible combinations, not just the combinations observed in the data. Because of that, the FUNction may be called with zero data. In this case, it did, giving the warnings. One could suppress this with function(a) suppressWarnings(min(a)) instead of just min.
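The suppressWarnings wrapper mentioned above can be sketched as follows. The names quiet_min and quiet_max and the small data frame are illustrative only. Whether the unwrapped version warns may depend on the R version, so the sketch only looks at the results:

```r
# Small illustrative subset: id 1 is one run, id 3 has a gap
dat <- data.frame(
  id    = c(1, 1, 1, 3, 3),
  start = c(2000, 2001, 2002, 2001, 2005),
  end   = c(2000, 2001, 2002, 2001, 2005)
)

# Run id within each id: increases when the year does not follow by 1
dat$contiguous <- ave(dat$start, dat$id,
                      FUN = function(a) cumsum(c(TRUE, diff(a) != 1)))

# Wrappers that stay quiet if ave() calls them on an empty group
quiet_min <- function(a) suppressWarnings(min(a))
quiet_max <- function(a) suppressWarnings(max(a))

dat$newstart <- ave(dat$start, dat$id, dat$contiguous, FUN = quiet_min)
dat$newend   <- ave(dat$end,   dat$id, dat$contiguous, FUN = quiet_max)
dat$contiguous <- NULL

print(dat)
```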
We could use dplyr. After grouping by 'id', start a new run whenever 'start' does not follow the previous 'start' by exactly 1, use the cumulative count of those breaks as the run id 'grp', and create 'newstart' and 'newend' as the min of 'start' and the max of 'end' within each run:
library(dplyr)
df1 %>%
group_by(id) %>%
# computed in a separate mutate() so that it is evaluated per id
mutate(grp = cumsum(c(TRUE, diff(start) != 1))) %>%
group_by(grp, .add = TRUE) %>%
mutate(newstart = min(start), newend = max(end))
-output
# A tibble: 9 x 9
# Groups: id, grp [4]
# id location start end age1 age2 grp newstart newend
# <int> <chr> <int> <int> <dbl> <int> <int> <int> <int>
#1 1 brazil 2000 2000 0 4 1 2000 2002
#2 1 brazil 2001 2001 0 4 1 2000 2002
#3 1 brazil 2002 2002 0 4 1 2000 2002
#4 2 argentina 1990 1991 1 1 1 1990 1993
#5 2 argentina 1991 1991 2 2 1 1990 1993
#6 2 argentina 1992 1992 2 2 1 1990 1993
#7 2 argentina 1993 1993 2 2 1 1990 1993
#8 3 belize 2001 2001 0.5 1 1 2001 2001
#9 3 belize 2005 2005 1 2 2 2005 2005
Or with data.table
library(data.table)
setDT(df1)[, grp := cumsum(c(TRUE, diff(start) != 1)), by = id
][, c('newstart', 'newend') := .(min(start), max(end)), .(id, grp)][, grp := NULL]
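The question defines a group as id, location, age1 and age2, whereas the answers above group by id alone, which is why the argentina rows differ from the requested table. A minimal data.table sketch that uses all four keys is shown below. The vector keys and the temporary column run are illustrative names, and the data is retyped from the question:

```r
library(data.table)

dat <- data.table(
  id       = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  location = c(rep("brazil", 3), rep("argentina", 4), rep("belize", 2)),
  start    = c(2000, 2001, 2002, 1990, 1991, 1992, 1993, 2001, 2005),
  end      = c(2000, 2001, 2002, 1991, 1991, 1992, 1993, 2001, 2005),
  age1     = c(0, 0, 0, 1, 2, 2, 2, 0.5, 1),
  age2     = c(4, 4, 4, 1, 2, 2, 2, 1, 2)
)

keys <- c("id", "location", "age1", "age2")

# Years must be sorted within each group for diff() to be meaningful
setorderv(dat, c(keys, "start"))

# New run whenever the start year does not follow the previous one by 1
dat[, run := cumsum(c(TRUE, diff(start) != 1)), by = keys]
dat[, c("newstart", "newend") := .(min(start), max(end)), by = c(keys, "run")]
dat[, run := NULL]

print(dat)
```

Here the first argentina row (age1 = 1, age2 = 1) forms its own group, so it gets newstart 1990 and newend 1991 from its own start and end.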

Calculating percentage change of panel data for other entities

I have a very large data frame that takes the form of panel data. The data has economic information on production for each industry within countries for a range of years. I would like code that calculates year-to-year percentage changes in output within the same industry, but summed over all countries other than the one in the given row.
This is difficult to explain, so here is an example. Using this code:
panel <- cbind.data.frame(industry = rep(c("Logging" , "Automobile") , each = 9) ,
country = rep(c("Austria" , "Belgium" , "Croatia") , each = 3 , times = 2) ,
year = rep(c(2000:2002) , times = 6) ,
output = c(2,3,4,1,5,8,1,2,4,2,3,4,6,7,8,9,10,11))
That gives this data frame:
industry country year output
1 Logging Austria 2000 2
2 Logging Austria 2001 3
3 Logging Austria 2002 4
4 Logging Belgium 2000 1
5 Logging Belgium 2001 5
6 Logging Belgium 2002 8
7 Logging Croatia 2000 1
8 Logging Croatia 2001 2
9 Logging Croatia 2002 4
10 Automobile Austria 2000 2
11 Automobile Austria 2001 3
12 Automobile Austria 2002 4
13 Automobile Belgium 2000 6
14 Automobile Belgium 2001 7
15 Automobile Belgium 2002 8
16 Automobile Croatia 2000 9
17 Automobile Croatia 2001 10
18 Automobile Croatia 2002 11
I compute percentage changes per industry using tidyverse:
library(tidyverse)
panel <- panel %>%
group_by(country , industry) %>%
mutate(per_change = (output - lag(output)) / lag(output))
giving:
# A tibble: 18 x 5
# Groups: country, industry [6]
industry country year output per_change
<fct> <fct> <int> <dbl> <dbl>
1 Logging Austria 2000 2 NA
2 Logging Austria 2001 3 0.5
3 Logging Austria 2002 4 0.333
4 Logging Belgium 2000 1 NA
5 Logging Belgium 2001 5 4
6 Logging Belgium 2002 8 0.6
7 Logging Croatia 2000 1 NA
8 Logging Croatia 2001 2 1
9 Logging Croatia 2002 4 1
10 Automobile Austria 2000 2 NA
11 Automobile Austria 2001 3 0.5
12 Automobile Austria 2002 4 0.333
13 Automobile Belgium 2000 6 NA
14 Automobile Belgium 2001 7 0.167
15 Automobile Belgium 2002 8 0.143
16 Automobile Croatia 2000 9 NA
17 Automobile Croatia 2001 10 0.111
18 Automobile Croatia 2002 11 0.1
So I would like code that gives NA for row 1; for row 2, the sum of percentage changes for the logging industry in 2001 across all countries except Austria (4 + 1 = 5); for row 3, the sum for logging in 2002 except Austria (0.6 + 1 = 1.6); for row 4, NA again; for row 5, the sum for logging in 2001 except Belgium (0.5 + 1 = 1.5); and so on.
I wouldn't know how to do this other than by hand.
The code should also be flexible enough to handle N countries and Y industries.
You can do this in three steps:
1. Group the "panel" table by industry and year and sum "per_change".
2. Join this grouped table with your main table.
3. Subtract "per_change" from the grouped sum.
After your code:
d1 <- as.data.frame(panel)
# with the formula interface, aggregate() drops the NA rows (year 2000) by default
d2 <- aggregate(per_change ~ industry + year, data = d1, FUN = sum)
library(dplyr)
panel <- left_join(d1, d2, by = c("industry", "year"))
panel$exc_per_change <- panel$per_change.y - panel$per_change.x
output is
> head(panel)
industry country year output per_change.x per_change.y exc_per_change
1 Logging Austria 2000 2 NA NA NA
2 Logging Austria 2001 3 0.5000000 5.500000 5.000000
3 Logging Austria 2002 4 0.3333333 1.933333 1.600000
4 Logging Belgium 2000 1 NA NA NA
5 Logging Belgium 2001 5 4.0000000 5.500000 1.500000
6 Logging Belgium 2002 8 0.6000000 1.933333 1.333333
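The same "group total minus own value" idea can also be written as a single dplyr pipeline, without the intermediate join. This is a minimal sketch that assumes the rows are already ordered by year within each country and industry, as in the question; the name result is illustrative:

```r
library(dplyr)

panel <- data.frame(
  industry = rep(c("Logging", "Automobile"), each = 9),
  country  = rep(c("Austria", "Belgium", "Croatia"), each = 3, times = 2),
  year     = rep(2000:2002, times = 6),
  output   = c(2, 3, 4, 1, 5, 8, 1, 2, 4, 2, 3, 4, 6, 7, 8, 9, 10, 11)
)

result <- panel %>%
  # year-to-year change within each country and industry
  group_by(country, industry) %>%
  mutate(per_change = (output - lag(output)) / lag(output)) %>%
  # total over all countries for the industry and year, minus the row's own value
  group_by(industry, year) %>%
  mutate(exc_per_change = sum(per_change) - per_change) %>%
  ungroup()

print(head(result))
```

Because sum() returns NA as soon as one value is NA, every row of the first year is NA here. If only some countries were missing in a given year, that rule would need to be decided explicitly, for example with na.rm = TRUE.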

Add lines with NA values

I have a data frame like this:
indx country year death value
1 1 Italy 2000 hiv 1
2 1 Italy 2001 hiv 2
3 1 Italy 2005 hiv 3
4 1 Italy 2000 cancer 4
5 1 Italy 2001 cancer 5
6 1 Italy 2002 cancer 6
7 1 Italy 2003 cancer 7
8 1 Italy 2004 cancer 8
9 1 Italy 2005 cancer 9
10 4 France 2000 hiv 10
11 4 France 2004 hiv 11
12 4 France 2005 hiv 12
13 4 France 2001 cancer 13
14 4 France 2002 cancer 14
15 4 France 2003 cancer 15
16 4 France 2004 cancer 16
17 2 Spain 2000 hiv 17
18 2 Spain 2001 hiv 18
19 2 Spain 2002 hiv 19
20 2 Spain 2003 hiv 20
21 2 Spain 2004 hiv 21
22 2 Spain 2005 hiv 22
23 2 Spain ... ... ...
indx is a value linked to the country (same country = same indx).
In this example I used only 3 countries (country) and 2 diseases (death); the original data frame has many more.
I would like to have one row for each country for each disease from 2000 to 2005.
What I would like to get is:
indx country year death value
1 1 Italy 2000 hiv 1
2 1 Italy 2001 hiv 2
3 1 Italy 2002 hiv NA
4 1 Italy 2003 hiv NA
5 1 Italy 2004 hiv NA
6 1 Italy 2005 hiv 3
7 1 Italy 2000 cancer 4
8 1 Italy 2001 cancer 5
9 1 Italy 2002 cancer 6
10 1 Italy 2003 cancer 7
11 1 Italy 2004 cancer 8
12 1 Italy 2005 cancer 9
13 4 France 2000 hiv 10
14 4 France 2001 hiv NA
15 4 France 2002 hiv NA
16 4 France 2003 hiv NA
17 4 France 2004 hiv 11
18 4 France 2005 hiv 12
19 4 France 2000 cancer NA
20 4 France 2001 cancer 13
21 4 France 2002 cancer 14
22 4 France 2003 cancer 15
23 4 France 2004 cancer 16
24 4 France 2005 cancer NA
25 2 Spain 2000 hiv 17
26 2 Spain 2001 hiv 18
27 2 Spain 2002 hiv 19
28 2 Spain 2003 hiv 20
29 2 Spain 2004 hiv 21
30 2 Spain 2005 hiv 22
31 2 Spain ... ... ...
I.e. I would like to add rows with value = NA for the missing years, for each country and each disease.
For example, data on HIV in Italy is missing for 2002 to 2004, so I want to add those rows with value = NA.
How can I do that?
For a reproducible example:
indx <- c(rep(1, times=9), rep(4, times=7), rep(2, times=6))
country <- c(rep("Italy", times=9), rep("France", times=7), rep("Spain", times=6))
year <- c(2000, 2001, 2005, 2000:2005, 2000, 2004, 2005, 2001:2004, 2000:2005)
death <- c(rep("hiv", times=3), rep("cancer", times=6), rep("hiv", times=3), rep("cancer", times=4), rep("hiv", times=6))
value <- c(1:22)
dfl <- data.frame(indx, country, year, death, value)
Using base R, you could do:
# setDF(dfl) # run this first if you have a data.table
merge(expand.grid(lapply(dfl[c("country", "death", "year")], unique)), dfl, all.x = TRUE)
This first creates all combinations of the unique values in country, death, and year and then merges it to the original data, to add the values and where combinations were not in the original data, it adds NAs.
In the package tidyr, there's a special function that does this for you with a single command:
library(tidyr)
complete(dfl, country, year, death)
Here is a longer base R method (df below is the question's data frame, which is named dfl there). You create two new data.frames, one that contains all combinations of the country, year, and death, and a second that contains an index key.
# get data.frame with every combination of country, year, and death
dfNew <- with(df, expand.grid("country"=unique(country), "year"=unique(year),
"death"=unique(death)))
# get index key
indexKey <- unique(df[, c("indx", "country")])
# merge these together
dfNew <- merge(indexKey, dfNew, by="country")
# merge onto original data set
dfNew <- merge(df, dfNew, by=c("indx", "country", "year", "death"), all=TRUE)
This returns
dfNew
indx country year death value
1 1 Italy 2000 cancer 4
2 1 Italy 2000 hiv 1
3 1 Italy 2001 cancer 5
4 1 Italy 2001 hiv 2
5 1 Italy 2002 cancer 6
6 1 Italy 2002 hiv NA
7 1 Italy 2003 cancer 7
8 1 Italy 2003 hiv NA
9 1 Italy 2004 cancer 8
10 1 Italy 2004 hiv NA
11 1 Italy 2005 cancer 9
12 1 Italy 2005 hiv 3
13 2 Spain 2000 cancer NA
14 2 Spain 2000 hiv 17
15 2 Spain 2001 cancer NA
...
If df is a data.table, here are the corresponding lines of code:
# CJ is a cross-join
setkey(df, country, year, death)
dfNew <- df[CJ(country, year, death, unique=TRUE),
.(country, year, death, value)]
indexKey <- unique(df[, .(indx, country)])
dfNew <- merge(indexKey, dfNew, by="country")
dfNew <- merge(df, dfNew, by=c("indx", "country", "year", "death"), all=TRUE)
Note that rather than using CJ, it is also possible to use expand.grid as in the data.frame version:
dfNew <- df[, expand.grid("country"=unique(country), "year"=unique(year),
"death"=unique(death))]
tidyr::complete creates all combinations of the variables you pass it, but if two columns carry the same information (here indx and country), passing both will over-expand, and passing only one leaves NAs in the other. As a workaround you can use dplyr grouping (df %>% group_by(indx, country) %>% complete(death, year)) or just merge the two columns into one temporarily:
library(tidyr)
# merge indx and country into a single column so they won't over-expand
df %>% unite(indx_country, indx, country) %>%
# fill in missing combinations of new column, death, and year
complete(indx_country, death, year) %>%
# separate indx and country back to how they were
separate(indx_country, c('indx', 'country'))
# Source: local data frame [36 x 5]
#
# indx country death year value
# (chr) (chr) (fctr) (int) (int)
# 1 1 Italy cancer 2000 4
# 2 1 Italy cancer 2001 5
# 3 1 Italy cancer 2002 6
# 4 1 Italy cancer 2003 7
# 5 1 Italy cancer 2004 8
# 6 1 Italy cancer 2005 9
# 7 1 Italy hiv 2000 1
# 8 1 Italy hiv 2001 2
# 9 1 Italy hiv 2002 NA
# 10 1 Italy hiv 2003 NA
# .. ... ... ... ... ...
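tidyr also provides nesting(), which treats indx and country as one unit so that only the pairs present in the data are used, without uniting and separating the columns. A minimal sketch using the question's reproducible example; the name full is illustrative, and full_seq() is used on the assumption that every year between the smallest and largest observed year is wanted:

```r
library(tidyr)

indx <- c(rep(1, times = 9), rep(4, times = 7), rep(2, times = 6))
country <- c(rep("Italy", times = 9), rep("France", times = 7), rep("Spain", times = 6))
year <- c(2000, 2001, 2005, 2000:2005, 2000, 2004, 2005, 2001:2004, 2000:2005)
death <- c(rep("hiv", times = 3), rep("cancer", times = 6), rep("hiv", times = 3),
           rep("cancer", times = 4), rep("hiv", times = 6))
value <- c(1:22)
dfl <- data.frame(indx, country, year, death, value)

# nesting() keeps only the observed (indx, country) pairs;
# full_seq() fills in every year between the minimum and maximum
full <- complete(dfl, nesting(indx, country), death, year = full_seq(year, 1))

print(full)
```

This keeps indx numeric, whereas the unite/separate round trip above turns it into a character column.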
