Extracting numbers from very long string into vector - r

I have the fairly long string shown below (~50k characters)
https://gist.github.com/anonymous/9de31de2e6fc9888f3debeda4698b739
I want to extract the numbers (always 1 or 2 digits) that always sit between "'>" and "<" and add them to a vector, keeping them in their original order.
for example:
><td class='td-val ball-8'>13</td><td class='td-val ball-8'>9</td>
should produce the vector [13, 9].
I couldn't even get the string into R when I tried to assign it in this form:
mystring <- "text here"
When I pressed Enter, the console just showed a + next to the command line, so I think some of the symbols in the text were messing it up.

Since it's HTML that you're trying to parse, it's best to use an HTML parsing package like rvest:
library(rvest)
url <- 'https://gist.githubusercontent.com/anonymous/9de31de2e6fc9888f3debeda4698b739/raw/c07c2d6c6f00060806b15ec57ed06d4a4e0d9d74/gistfile1.txt'
url %>% read_html() %>% html_nodes('td.td-val') %>% html_text() %>% as.integer()
which returns
[1] 13 9 8 8 1 2 0 8 11 2 13 5 13 4 4 5 4 7 3 8 10 13 1 7 14 13 10 2 0 8
[31] 13 0 10 5 11 9 3 1 4 3 5 12 4 14 1 9 13 5 9 7 12 10 2 10 14 4 11 11 13 8
[61] 8 10 10 12 12 6 8 13 7 2 2 9 10 9 13 3 14 14 0 14 4 11 14 6 10 2 0 0 10 14
[91] 2 8 3 6 14 6 1 9 11 12 1 12 4 0 7 9 2 10 1 12 0 8 0 9 3 11 11 0 8 5
[121] 0 6 1 9 8 10 7 4 7 0 3 12 10 11 11 8 4 11 1 5 12 2 14 9 12 8 1 9 14 13
[151] 8 2 1 5 7 9 14 14 12 3 6 3 9 0 6 9 3 3 10 3 8 6 9 2 4 12 2 2 14 7
[181] 12 8 0 8 12 2 12 9 6 8 9 9 3 7 9 0 6 13 0 12 3 14 12 4 8 9 14 4 5 9
[211] 6 3 2 5 1 2 0 5 0 5 9 0 12 14 11 11 7 4 12 1 14 2 13 3 13 2 0 12 13 6
[241] 5 3 13 9 12 2 11 6 8 12 9 6 13 9 0 0 4 2 1 0 0 3 0 3 7 9 11 1 8 10
[271] 11 13 12 9 10 8 10 3 7 12 4 9 0 4 14 1 7 0 7 1 2 6 0 6 6 1 0 9 4 8
[301] 0 7 13 8 11 4 1 12 1 14 11 13 9 12 8 2 8 7 12 13 12 5 8 5 10 2 7 5 9 12
[331] 12 13 8 7 6 4 12 13 4 9 12 2 0 11 8 9 1 10 5 10 9 11 10 1 8 1 12 10 9 5
[361] 7 10 5 2 7 12 4 10 6 9 0 6 0 4 13 7 0 8 3 3 11 8 4 12 10 5 7 1 11 3
[391] 1 11 7 14 13 13 14 4 2 11 2 12 3 6 14 10 6 13 9 12 4 13 10 3 9 11 8 4 8 10
[421] 9 6 3 6 7 5 11 0 2 7 6 11 11 13 13 12 7 9 6 9 5 12 14 3 13 10 1 2 7 1
[451] 14 1 0 7 8 13 6 3 9 12 2 2 2 7 11 1 2 14 6 13 11 3 6 11 5 9 0 9 13 10
[481] 11 13 3 12 12 3 7 6 5 14 3 9 10 6 13 5 7 4 5 12 8 14 5 6 8 7 0 0 2 1
[511] 1 9 13 13 5 6 10 8 0 2 3 4 4 5 14 13 5 2 2 4 6 5 9 6 14 8 4 12 4 6
[541] 9 1 4 2 4 9 1 7 1 10 0 1 1 8 6 5 8 4 9 11 14 2 3 8 2 11 3 7 11 2
[571] 4 9 5 3 4 1 4 8 13 4 8 8 1 7 2 7 3 11 13 1 13 7 9 3 7 7 4 12 9 14
[601] 11 9 2 12 12 14 10 4 12 11 12 10 14 3 11 6 12 3 6 3 11 8 10 2 6 3 1 11 2 6
[631] 0 8 12 5 5 3 6 2 14 11 7 14 14 8 11 2 7 0 10 2 0 4 8 9 8 3 2 13 4 10
[661] 2 5 13 2 2 12 12 0 10 4 1 5 13 3 10 3 11 2 5 3 9 6 11 0 8 12 0 11 2 11
[691] 7 8 1 3 4 14 4 4 9 5 12 7 6 9 12 13 2 11 1 11 12 0 4 6 10 8 5 14 7 6
[721] 4 7 2 5 2 14 3 8 10 6 14 7 14 3 2 6 5 0 3 0 12 0 12 3 5 5 8 5 14 6
[751] 10 14 5 2 3 11 3 4 3 11 4 2 0 11 11 13 4 0 6 14 2 6 9 10 4 9 5 7 1 13
[781] 8 3 13 3 10 4 8 1 3 11 2 8 5 10 7 6 10 14 14 2 2 12 8 4 13 7 11 13 4 5
[811] 7 2 3 8 14 3 9 12 6 2 6 0 3 5 8 8 0 14 13 13 7 10 9 6 1 0 4 8 6 8
[841] 14 1 9 0 9 2 7 10 8 5 10 7 1 8 2 13 3 1 8 12 12 2 5 6 3 9 4 5 4 13
[871] 6 3 10 7 9 2 1 12 1 11 0 10 0 11 8 8 0 7 0 11 10 3 14 6 9 11 11 0 12 1
[901] 10 13 1 7 7 2 0 3 13 9 2 4 12 3 0 11 1 8 8 13 12 6 8 13 8 1 13 11 2 9
[931] 11 8 10 8 3 14 6 14 7 6 7 10 3 11 3 13 11 3 9 13 8 10 8 7 12 4 11 12 12 9
[961] 6 10 2 8 13 7 11 5 7 12 10 14 1 6 7 6 7 2 3 5 13 6 10 9 5 2 0 1 11 8
[991] 9 5 1 3 3 1 12 1 13 2 14 5 7 1 10 9 0 9 11 10 6 2 7 12 10 6 2 10 13 4
[1021] 9 9 14 4 4 5 7 13 13 13 6 7 12 1 6 11 12 14 4 11 6 4 10 0 9 12 10 10 13 8
[1051] 3 3 0 8 5 14 10 3 7 5 0 14 5 6 10 14 7 4 8 9 1 6 14 1 14 5 5 14 4 11
[1081] 12 14 9 13 14 13 2 13 11 9 14 2 1 9 8 11 13 11 14 13 3 4 9 6 9 6 10 13 1 12
[1111] 10 14 11 5 8 9 3 5 6 14 1 11 10 12 7 7 2 13 13 12 12 4 3 14 6 4 2 5 9 4
[1141] 14 11 6 4 11 6 4 4 8 2 2 5 14 1 7 11 8 9 11 11 10 6 14 3 0 3 8 8 14 13
[1171] 10 6 10 4 9 12 0 9 2 9 13 12 1 12 3 5 5 3 12 2 1 5 1 0 10 7 3 10 14 13
[1201] 11 8 0 10 12 9 4 5 4 8 5 6 2 11 7 5 5 8 4 9 9 10 14 3 7 9 1 9 9 8
[1231] 1 8 11 5 2 4 9 14 14 6 10 7 4 14 6 5 1 4 3 8 13 10 5 1 8 8 6 8 7 1
[1261] 14 4 4 7 2 12 10 8 10 5 6 7 2 3 5 13 1 2 9 8 5 14 1 11 9 5 8 12 13 0
[1291] 4 2 0 8 8 2 5 3 13 11 5 11 14 14 9 12 4 5 9 3 13 14 1 5 10 4 9 6 5 8
[1321] 7 5 7 3 14 8 4 8 4 6 5 8 11 0 14 13 2 13 12 13 3 4 7 8 11 4 14 12 3 6
[1351] 11 8 8 9 6 7 4 3 10 9 2 9 12 12 0 1 10 9 8 0 12 9 3 14 13 7 8 12 10 9
[1381] 10 10 2 11

You can use readLines to import the string from the raw URL, which you can get by clicking the Raw button on the gist.
mystring <- readLines("https://gist.githubusercontent.com/anonymous/9de31de2e6fc9888f3debeda4698b739/raw/c07c2d6c6f00060806b15ec57ed06d4a4e0d9d74/gistfile1.txt")
A regular expression like the following should then give you all the numbers you want:
library(stringr)
num <- gsub(">|<", "", str_extract_all(mystring, ">\\d+<", simplify = TRUE))
head(as.vector(num))
[1] "13" "9" "8" "8" "1" "2"
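A base-R sketch of the same extraction, for readers who would rather not load stringr. It assumes, as the question states, that every value is one or two digits sitting between `'>` and `<`. The variable names are illustrative, and the input is the short sample from the question rather than the full gist.

```r
# Sample from the question; the real input would come from readLines().
html <- "><td class='td-val ball-8'>13</td><td class='td-val ball-8'>9</td>"

# Look-behind for '> and look-ahead for <, so only the digits are matched.
# perl = TRUE is needed for look-around syntax.
pattern <- "(?<='>)\\d{1,2}(?=<)"
matches <- regmatches(html, gregexpr(pattern, html, perl = TRUE))

# unlist() keeps document order; as.integer() turns the strings into numbers.
nums <- as.integer(unlist(matches))
nums
```

Unlike the stringr version above, this returns an integer vector rather than character strings, which is closer to what the question asks for.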


Convert dataframe from vertical to horizontal

I have already checked many questions but can't seem to find a suitable answer.
I have this df:
df = data.frame(x = 1:10,y=11:20)
the output
    x  y
1   1 11
2   2 12
3   3 13
4   4 14
5   5 15
6   6 16
7   7 17
8   8 18
9   9 19
10 10 20
I would like the output to be:
   1  2  3  4  5  6  7  8  9 10
x  1  2  3  4  5  6  7  8  9 10
y 11 12 13 14 15 16 17 18 19 20
Thanks.
Try t() like below
> data.frame(t(df), check.names = FALSE)
   1  2  3  4  5  6  7  8  9 10
x  1  2  3  4  5  6  7  8  9 10
y 11 12 13 14 15 16 17 18 19 20
A transpose should do it
setNames(data.frame(t(df)), df[,"x"])
   1  2  3  4  5  6  7  8  9 10
x  1  2  3  4  5  6  7  8  9 10
y 11 12 13 14 15 16 17 18 19 20
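One caveat worth illustrating: t() converts a data frame to a matrix first, so the result has a single type. The sketch below uses a smaller version of the df from the question plus a hypothetical `mixed` data frame (not from the original post) to show what happens when a character column is present.

```r
df    <- data.frame(x = 1:3, y = 11:13)
mixed <- data.frame(id = 1:3, label = c("a", "b", "c"))  # hypothetical example

# All-numeric columns: the transposed matrix stays numeric.
num_mat <- t(df)

# Mixed columns: as.matrix() coerces everything to character before transposing.
chr_mat <- t(mixed)

c(is.numeric(num_mat), is.character(chr_mat))
```

For an all-numeric data frame like the one in the question, both answers above are therefore safe; with mixed column types the transposed values would need converting back.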

Limit Number of Items Displayed in Legend - GGplot R

I have a large taxonomic dataset that I need to plot as a stacked bar chart. Sample Data:
ID X A B C D E F G
1 5 9 6 7 4 8 10 6
2 6 3 9 10 3 10 4 8
3 6 6 5 8 8 8 8 1
4 9 3 2 8 4 1 5 8
5 6 6 2 8 3 7 4 10
6 0 7 8 9 1 4 9 10
7 3 2 6 8 8 1 8 7
8 4 7 10 2 9 7 9 8
9 5 7 9 10 8 2 2 1
10 0 4 6 8 9 10 7 1
11 8 9 2 2 6 5 1 7
12 8 6 0 9 7 9 8 1
13 2 8 4 4 4 2 6 7
14 4 6 6 4 9 9 3 5
15 8 1 0 6 5 8 1 1
16 6 6 9 3 9 2 1 1
17 2 4 0 2 4 8 10 9
18 5 9 8 9 4 9 3 9
19 0 2 1 6 6 9 6 2
20 3 3 7 10 4 5 6 8
21 2 6 6 9 8 10 9 4
22 7 7 1 6 8 3 7 1
23 1 9 4 5 8 9 7 7
24 0 8 5 9 1 8 9 1
25 2 1 0 1 1 2 10 7
26 10 4 1 8 2 5 9 0
27 2 7 10 10 2 3 8 6
28 6 4 2 6 7 3 1 0
29 8 1 3 4 1 10 3 6
30 1 6 5 4 7 9 7 10
31 4 4 3 2 2 9 0 4
32 9 6 6 1 6 1 5 2
The plotting part is no problem, using ggplot as below:
l5 <- read.xlsx(paste(taxawmeta,taxawmeta_files[2], sep = ""), sheetIndex = 1)
l5_long <- l5 %>% gather(taxa,value,-c(X.FinalSampleID,TimePoint_Luna))
ggplot(l5_long, aes(fill=taxa, y = value, x = X.FinalSampleID)) +
geom_bar(position='stack', stat='identity') +
theme_minimal() +
labs(x='Sample', y='Relative Abundance', title='Family Level Relative Abundance') +
theme(axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
legend.position="none")
Where I'm running into an issue is that the actual dataset has almost 200 variables, meaning the legend is completely out of control. I know I can just hide the legend with:
theme(legend.position="none")
... but what I'd like to do is keep say the top 10 entries as those are the ones of most interest. Is there any simple method to limit the number of items that are displayed in the legend? Anything I've found so far seems very convoluted and not directly applicable to this problem.
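One possible approach, sketched under assumptions: rank the taxa by total abundance and pass the top names to the `breaks` argument of a discrete fill scale, which restricts the legend keys while every taxon is still drawn. The toy data frame, the cutoff `n_keep`, and ranking by summed `value` are assumptions for illustration, not part of the original post.

```r
# Toy long-format data standing in for l5_long (values are made up).
set.seed(1)
l5_long <- data.frame(
  X.FinalSampleID = rep(1:4, each = 6),
  taxa            = rep(LETTERS[1:6], times = 4),
  value           = runif(24)
)

n_keep <- 3  # would be 10 for the real data

# Rank taxa by total abundance and keep the names of the top n_keep.
totals   <- tapply(l5_long$value, l5_long$taxa, sum)
top_taxa <- names(sort(totals, decreasing = TRUE))[seq_len(n_keep)]

# Only build the plot if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(l5_long, aes(x = X.FinalSampleID, y = value, fill = taxa)) +
    geom_bar(position = "stack", stat = "identity") +
    # breaks controls which levels get a legend key; all bars are still drawn.
    scale_fill_discrete(breaks = top_taxa)
}
```

Whether "top" should mean total abundance, mean abundance, or prevalence across samples depends on the analysis; only the ranking line would need to change.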

Change units of time dimension in NetCDF file from months to months since

I currently have multiple NetCDF files with 4 dimensions, (latitude, longitude, time, and depth). Each represents a single year of monthly data. The unit of time is "month", 1-12, and therefore quite useless if I want to merge these files across years to give me a single NetCDF file with a time dimension of size months*years.
The time dimension attributes for a single file:
time Size:12 *** is unlimited ***
long_nime: time
units: month
I used ncrcat of nco to merge.
ncrcat soda3.3.1*sst.nc -O soda3.3.1_1980_2015_sst.nc
This works except that when merged, time values read
#in R
soda.info$var$temp$dim[[3]]$vals
[1] 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1
[26] 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2
[51] 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3
[76] 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4
[101] 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5
[126] 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6
[151] 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7
[176] 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8
[201] 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9
[226] 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10
[251] 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11
[276] 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12
[301] 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1
[326] 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2
[351] 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3
[376] 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4
[401] 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5
[426] 6 7 8 9 10 11 12
...which obviously isn't much help if I want to keep track of time.
In the past I've only used NetCDF files with a "months since..." unit. Is there a way to change these rather groundless 'month' units to 'months since...'?
Would it suffice to enumerate the months sequentially?
ncap2 -s 'time=array(0,1,$time)' soda3.3.1_1980_2015_sst.nc out.nc
You can also add a "months since ..." unit to time as described in the comment by Chelmy and/or in the NCO manual. I leave that as an exercise for you, gentle reader.
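To make the renumbering concrete, here is a small R sketch of what the sequential index means. It assumes the series starts in January 1980 and runs through December 2015, which is inferred from the file name and not stated in the post; it only manipulates vectors and does not touch the NetCDF file.

```r
# Month values as they appear after ncrcat: 1..12 repeated once per year.
n_years <- 36                       # 1980-2015 inclusive (assumed from the file name)
months  <- rep(1:12, times = n_years)

# Sequential index starting at 0, i.e. "months since <origin>".
months_since <- seq_along(months) - 1

# Map the index back to calendar dates under the assumed origin.
dates <- seq(as.Date("1980-01-01"), by = "month", length.out = length(months))
range(dates)
```

With that origin, index 0 is January 1980 and index 12 is January 1981, so the repeating 1-12 cycle becomes a strictly increasing time axis.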

Histogram from character string in R (from chr to num)

I'm a newbie in R and have been stuck with this problem for a long time now. Given how simple it seems, I'm puzzled to be stuck with this for so long. So here we go:
Basically, I have a vector, let's call it "test", which contains a series of numbers.
[1] "9 29 7 22 5 5 5 8 14 5 5 8 7 9 15 15 7 5 5 6 6 5 9 5 6 7 6 7 11 5 6 10 5 5 7 8 23 11 15 24 5 5 11 5 7 19 6 6 30 6 7 7 24 9 8 15 5 5 29 10 17 6 6 11 26 9 19 32 7 8 14 5 8 8 18 6 5 9 6 11 5 7 6 8 5 6 54 6 7 8 22 7 5 8 6 31 6 5 8 26 12 9 7 5 11 6 27 9 6 15 17 5 8 5 6 5 5 5 9 6 5 7 7 9 10 11 33 19 13 6 18 6 9 7 5 6 8 5 5 5 6 5 6 5 18 6 6 7 8 9 5 8 5 8 16 5 8 6 8 7 12 8 13 11 5 17 15 5 12 7 7 11 6 6 5 10 9 5 5 14 7 12 6 5 5 7 5 30 7 5 8 5 9 10 21 6 14 9 7 14 26 23 7 24 7 13 7 5 5 9 12 11 6 5 5 6 5 6 7 76 5 10 6 16 5 12 11 15 6 28 7 14 8 5 6 5 8 5 12 6 5 10 5 14 7 8 6 5 5 8 19 15 10 7 5 14 5 15 7 8 6 6 5 35 5 6 5 11 5 13 5 7 12 11 5 6 10 5 15 6 12 9 11 5 7 9 8 17 8 8 11 6 7 5 15 10 8 8 9 26,6 25 6 13 11 6 15 5 7 7 38 9 5 10 10 11 6 8 6 13 10 7 5 18 9 12 6 16 13 8 8 6 5 5 8 8 8 5 6 5 5 5 5 7 13 6 12 6 6 10 8 8 18 6 5 12 5 8 17 5 18 5 5 17 8 7 6 7 16 10 7 6 10 6 6 10 17 5 10 7 10 6 11 9 5 25 12 13 6 11 5"
R interprets this as a character string:
str(test)
chr "9 29 7 22 5 5 5 8 14 5 5 8 7 9 15 15 7 5 5 6 6 5 9 5 6 7 6 7 11 5 6 10 5 5 7 8 23 11 15 24 5 5 11 5 7 19 6 6 30..."
What I wish to do is no more complex than this: I would like to create a histogram, plotting the frequency of each number in the character string above (in fact, this is the degree distribution for a network).
The problem is that I'm dealing with a character string.
> hist(test)
Error in hist.default(test) : 'x' must be numeric
However, if I try to convert "test" into numeric, it also fails.
> as.numeric(test)
[1] NA
Warning message:
NAs introduced by coercion
I'm sure the solution is something very simple here, but I've tried to search for a solution for a long time without success.
Thank you in advance for your help!
The output of str(test) shows that it is a single string, so we can extract the elements with scan and then use hist:
hist(scan(text = test, what = numeric(), quiet = TRUE))
Looking at the OP's data, the values are separated by spaces and, in one place, a comma. So we change these to a single delimiter and then use scan:
hist(scan(text = gsub(",", " ", test), what = numeric(), quiet = TRUE))
I suggest using the stringr package to split the character string into a list, then unlisting it and storing the result as a numeric vector:
a <- "9 29 7 22 5 5 5 8 14 5 5 8 7 9 15 15 7 5 5 6 6 5 9 5 6 7 6 7 11 5 6 10 5 5 7 8 23 11 15 24 5 5 11 5 7 19 6 6 30 6 7 7 24 9 8 15 5 5 29 10 17 6 6 11 26 9 19 32 7 8 14 5 8 8 18 6 5 9 6 11 5 7 6 8 5 6 54 6 7 8 22 7 5 8 6 31 6 5 8 26 12 9 7 5 11 6 27 9 6 15 17 5 8 5 6 5 5 5 9 6 5 7 7 9 10 11 33 19 13 6 18 6 9 7 5 6 8 5 5 5 6 5 6 5 18 6 6 7 8 9 5 8 5 8 16 5 8 6 8 7 12 8 13 11 5 17 15 5 12 7 7 11 6 6 5 10 9 5 5 14 7 12 6 5 5 7 5 30 7 5 8 5 9 10 21 6 14 9 7 14 26 23 7 24 7 13 7 5 5 9 12 11 6 5 5 6 5 6 7 76 5 10 6 16 5 12 11 15 6 28 7 14 8 5 6 5 8 5 12 6 5 10 5 14 7 8 6 5 5 8 19 15 10 7 5 14 5 15 7 8 6 6 5 35 5 6 5 11 5 13 5 7 12 11 5 6 10 5 15 6 12 9 11 5 7 9 8 17 8 8 11 6 7 5 15 10 8 8 9 26,6 25 6 13 11 6 15 5 7 7 38 9 5 10 10 11 6 8 6 13 10 7 5 18 9 12 6 16 13 8 8 6 5 5 8 8 8 5 6 5 5 5 5 7 13 6 12 6 6 10 8 8 18 6 5 12 5 8 17 5 18 5 5 17 8 7 6 7 16 10 7 6 10 6 6 10 17 5 10 7 10 6 11 9 5 25 12 13 6 11 5"
library(stringr)
# Split on runs of spaces or commas (the data contains "26,6"), then convert to numeric.
b <- as.numeric(unlist(str_split(a, "[ ,]+")))
hist(b)
The histogram I am getting: [image not preserved]
It looks like your test "vector" is just one long string.
A numeric vector is as follows:
nums <- c(1,2,3,4,5,6)
You could also make a character vector and convert it, like you tried:
chars <- c("1","2","3","4","5","6")
nums <- as.numeric(chars)
Your values are more like:
char <- "1 2 3 4 5 6"
which cannot be converted to numeric values with as.numeric(), as it is one long string rather than a vector of numbers or characters.
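A minimal base-R sketch of the missing step, splitting the single string into pieces before converting. The second input, `messy`, is a made-up fragment that mimics the stray comma in the question's data.

```r
char <- "1 2 3 4 5 6"

# strsplit() returns a list with one element per input string,
# so [[1]] pulls out the character vector of pieces.
pieces <- strsplit(char, " ", fixed = TRUE)[[1]]
nums   <- as.numeric(pieces)

# With mixed delimiters, split on any run of spaces or commas instead.
messy      <- "9 26,6 25"
nums_messy <- as.numeric(strsplit(messy, "[ ,]+")[[1]])
```

Once the values are in a numeric vector, hist(nums) works as expected.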

Need to count items from a table

I have this DF (partially shown) with 15 categories in the first column, and each cell holds a number between 1 and 15. This is just a small example; the 15 categories are repeated, each time with different numbers in the other columns.
What I need is a 16x15 matrix with the count of appearances of each value, as follows.
I could program this the old-fashioned way with IFs etc., but I am kind of lost using R.
I hope this is clear.
Any advice is welcome.
EDITED AS REQUESTED (I apologize for not being clear)
RESULTADOS DF
PREOCUPACIÓN 13 15 4 4 1 8 3 1
TRISTEZA 15 13 2 5 4 14 6 6
PERDIDA 4 11 3 2 14 12 7 10
ANGUSTIA 14 10 11 3 2 13 1 2
IMPOTENCIA 1 8 9 6 5 5 5 4
MUERTE 2 1 14 14 15 6 13 15
ENOJO 12 7 10 8 6 7 12 5
INJUSTICIA 3 9 12 7 12 2 14 13
AUSENCIA 11 14 6 1 8 11 11 11
DOLOR 5 12 5 9 7 15 8 8
CORRUPCIÓN 8 6 15 13 11 3 15 12
MIEDO 9 3 13 10 3 10 9 3
SECUESTRO 10 2 1 11 9 4 4 14
INSEGURIDAD 7 4 7 15 10 1 10 9
DESESPERACIÓN 6 5 8 12 13 9 2 7
PREOCUPACIÓN 14 2 5 4 3 8 8 7
TRISTEZA 5 7 1 8 7 9 13 9
PERDIDA 2 6 6 12 2 10 6 10
ANGUSTIA 13 3 15 9 8 11 7 4
IMPOTENCIA 12 11 7 5 10 12 12 1
MUERTE 3 10 14 2 13 13 9 2
ENOJO 11 5 10 10 11 7 11 5
INJUSTICIA 7 13 2 6 15 14 10 6
AUSENCIA 8 1 9 11 1 6 4 12
DOLOR 6 8 8 13 9 3 3 3
CORRUPCIÓN 10 15 3 14 14 15 5 11
MIEDO 9 4 13 15 4 4 14 8
SECUESTRO 4 9 11 1 12 5 15 13
INSEGURIDAD 1 12 4 7 6 1 1 14
DESESPERACIÓN 15 14 12 3 5 2 2 15
PREOCUPACIÓN 13 10 4 1 7 4 11 2
TRISTEZA 15 11 11 2 9 3 12 8
PERDIDA 2 15 7 4 15 7 3 13
ANGUSTIA 8 13 5 3 6 1 7 1
IMPOTENCIA 10 4 8 5 12 10 13 3
MUERTE 7 8 15 15 3 6 6 9
ENOJO 14 12 12 10 10 8 15 10
INJUSTICIA 4 1 13 6 1 9 2 6
AUSENCIA 12 9 1 7 8 11 1 14
DOLOR 9 14 2 12 5 2 14 12
CORRUPCIÓN 3 6 14 14 14 14 5 15
MIEDO 6 2 3 9 2 5 10 7
SECUESTRO 1 3 6 8 13 15 4 5
INSEGURIDAD 5 5 9 11 4 13 8 4
DESESPERACIÓN 11 7 10 13 11 12 9 11
...
The result I need is like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
PREOCUPACION 3 2 2 5 1 0 2 3 0 1 1 0 2 0 1
TRISTEZA 1 2 1 1 2 2 2 2 3 0 2 1 1 1 2
Using apply on every row, convert to factor and get table:
res <-
cbind.data.frame(name = df1[, 1],
t(apply(df1[, -1], 1, function(i){
table(factor(i, levels = 1:15))
})))
res
# name 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
# 1 PREOCUPACIÓN 2 1 2 2 0 2 0 1 0 0 0 0 1 0 1
# 2 TRISTEZA 0 2 0 1 2 3 0 0 1 0 0 0 1 1 1
# 3 PERDIDA 0 1 1 1 0 0 1 0 0 1 2 2 1 2 0
# 4 ANGUSTIA 2 2 1 1 0 0 0 0 1 1 1 0 1 1 1
# ...
Edit: If you have names repeated on multiple rows, then try below. Split dataframe on 1st column, then loop through each split dataframe and get counts per factor level.
res <- t(data.frame(
lapply(split(df1, df1$V1), function(i){
as.numeric(table(factor(unlist(i[-1, ]), levels = 1:15)))
})))
res
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
# ANGUSTIA 4 0 2 1 1 1 2 2 1 0 1 0 2 0 1
# AUSENCIA 4 2 0 1 0 1 1 2 2 0 2 2 0 1 0
# CORRUPCIÓN 0 0 4 0 2 1 0 0 0 1 1 0 0 6 3
# DESESPERACIÓN 0 2 1 2 1 0 1 0 1 1 3 2 1 1 2
# ...
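Note that `i[-1, ]` in the snippet above drops the first row of each group rather than the name column. If the intent is to count over every row of a category, a variant might look like the sketch below. The toy `df1`, its column names, and the three levels are made up for illustration.

```r
# Toy data: two categories, each repeated on two rows (values are made up).
df1 <- data.frame(
  V1 = c("A", "B", "A", "B"),
  V2 = c(1, 2, 1, 3),
  V3 = c(2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Split on the name column, drop that column with [, -1], and tabulate
# all remaining values of the group against a fixed set of levels.
counts <- t(sapply(split(df1, df1$V1), function(g) {
  table(factor(unlist(g[, -1]), levels = 1:3))
}))
counts
```

Fixing the levels (1:15 for the real data) keeps every row the same length even when a value never occurs in a group.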
