How Would I go About Subsetting this Data Frame? - r

I have the following data frame:
> resident
X LOS Age Meds MHealth DietRest ReligAff NmChores Employed EdLevel Courses
1 R1 27 35 2 1 3 2 2 0 2 1
2 R2 56 43 0 0 0 1 3 1 3 2
3 R3 101 41 1 1 0 0 2 2 2 3
4 R4 19 54 3 2 4 3 1 0 1 0
5 R5 34 29 0 0 0 2 3 0 2 1
6 R6 78 46 2 0 2 1 2 1 3 2
7 R7 134 51 3 2 4 0 1 1 3 2
8 R8 112 38 0 1 1 4 2 1 2 3
9 R9 83 61 3 1 3 2 2 0 4 3
10 R10 9 50 2 0 2 1 1 2 2 0
11 R11 67 23 0 1 0 0 2 0 3 1
12 R12 30 47 2 2 0 3 2 0 4 0
13 R13 95 65 4 1 4 2 2 0 3 2
14 R14 165 63 5 2 4 1 1 0 2 2
15 R15 29 40 0 1 0 0 3 2 5 0
16 R16 44 33 2 2 1 0 2 0 3 1
17 R17 36 48 2 1 0 3 2 0 1 1
18 R18 58 57 3 0 2 1 1 1 2 1
19 R19 116 39 0 1 0 2 2 1 3 1
20 R20 73 44 1 0 0 2 1 0 4 2
21 R21 79 30 3 2 3 3 1 0 2 1
22 R22 39 41 0 0 0 0 3 2 2 2
23 R23 18 50 2 1 2 1 1 1 3 0
24 R24 60 35 1 0 0 0 2 1 4 2
25 R25 106 48 3 2 3 2 2 0 2 2
26 R26 46 31 2 1 0 0 1 1 3 1
27 R27 52 59 2 0 1 1 3 2 2 1
28 R28 28 62 6 0 4 2 1 0 5 1
29 R29 79 45 4 2 3 3 2 1 3 2
30 R30 24 42 1 1 1 0 1 0 2 1
31 R31 123 36 3 1 0 2 2 1 3 4
32 R32 11 49 2 0 2 1 2 0 1 0
33 R33 95 26 1 1 0 1 3 0 3 4
34 R34 61 24 0 0 0 2 2 1 2 1
35 R35 88 63 2 1 0 1 1 1 4 2
36 R36 64 38 1 2 1 4 1 1 2 3
37 R37 99 40 2 0 0 1 3 2 4 1
>
LOS = length of stay
I am trying to go through the data frame and create a new column that consists of either a zero or a one, depending on whether the resident is completing an average of one course every thirty days. How would I go about doing this? I understand I would need to break things down so that if someone has been there between thirty and fifty-nine days and has completed at least one course, they receive a value of one. If someone has been there between sixty and eighty-nine days and has finished at least two courses, they receive a one, and so forth; otherwise they receive a zero. How would I create a function that does this and adds a value of either 1 or 0 to a new vector based on the data for each resident?
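One way to express the rule described above, as a minimal sketch and not a definitive answer: the number of courses required so far is the number of complete 30-day periods in LOS, and the flag is 1 when Courses reaches that number. The column name OnTrack is made up for illustration, only the first seven residents are reproduced, and treating residents with fewer than 30 days as on pace (a requirement of zero) is an assumption the question does not settle.

```r
# First seven residents from the question (only the columns needed here).
resident <- data.frame(
  X       = paste0("R", 1:7),
  LOS     = c(27, 56, 101, 19, 34, 78, 134),
  Courses = c(1, 2, 3, 0, 1, 2, 2)
)

# Courses required so far: one per complete 30-day period.
# ASSUMPTION: residents with LOS < 30 need 0 courses, so they get a 1.
required <- resident$LOS %/% 30

# OnTrack is a hypothetical column name: 1 if on pace, 0 otherwise.
resident$OnTrack <- as.integer(resident$Courses >= required)

print(resident)
```

Because %/% and >= are vectorised, no loop or subsetting is needed. If short stays should score 0 instead, adding `& resident$LOS >= 30` to the comparison would do that.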

Related

Plotting count data in r

I have counted crashes at intersections and am wondering how to plot this data as a time series. The data was counted over the years 2008 to 2018. The data is found at this link. I am interested in the code and the proper technique for plotting the data.
To get the data into table format, the melt command from reshape2 is required:
> attidtudeM=melt(df)
> head(attidtudeM)
variable value
1 F2008 0
2 F2008 1
3 F2008 1
4 F2008 2
5 F2008 0
6 F2008 1
> table(attidtudeM)
variable 0 1 2 3 4 5 6 7
F2008 235 38 11 3 0 0 0 0
F2009 244 27 8 6 2 0 0 0
F2010 237 9 31 3 2 2 3 0
F2011 241 33 11 0 1 0 1 0
F2012 246 31 8 1 1 0 0 0
F2013 251 28 7 1 0 0 0 0
F2014 265 16 5 0 0 1 0 0
F2015 261 6 17 0 2 0 1 0
F2016 263 17 5 0 1 0 0 1
F2017 275 7 4 0 0 0 0 1
F2008 F2009 F2010 F2011 F2012 F2013 F2014 F2015 F2016 F2017
1 1 1
1 2 1 1 2 1
1 1 2
2 1 2
1 1
3 1
1 1 2 3 2 2 1
3 1
2
1
1 1 4 1 1 2 2 2
2 1
2 1 1 1 1 2
1 3 2 2 1 5 4 1 7
1
2 2
1 6 2 1 2 1 1 2
1 2 1
5 2 1 2
2 1 1
1 2 2 1
2 2
1
1
1
1 0
1
4
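For the time-series plot asked about in the question, one possible sketch in base R is to turn the table of counts printed by table(attidtudeM) into total crashes per year and plot those totals against the year. The matrix below is typed in from that output; the names counts, totals and years are illustrative, and plotting yearly totals (instead of, say, the number of intersections with at least one crash) is an assumed choice.

```r
# Rows are years, columns are crashes per intersection (0 to 7);
# values copied from the table() output in the answer.
counts <- matrix(c(
  235, 38, 11, 3, 0, 0, 0, 0,
  244, 27,  8, 6, 2, 0, 0, 0,
  237,  9, 31, 3, 2, 2, 3, 0,
  241, 33, 11, 0, 1, 0, 1, 0,
  246, 31,  8, 1, 1, 0, 0, 0,
  251, 28,  7, 1, 0, 0, 0, 0,
  265, 16,  5, 0, 0, 1, 0, 0,
  261,  6, 17, 0, 2, 0, 1, 0,
  263, 17,  5, 0, 1, 0, 0, 1,
  275,  7,  4, 0, 0, 0, 0, 1
), ncol = 8, byrow = TRUE,
   dimnames = list(paste0("F", 2008:2017), as.character(0:7)))

years <- 2008:2017

# Total crashes per year: sum over k of k * (intersections with k crashes).
totals <- as.vector(counts %*% 0:7)

# Points joined by lines, one point per year.
plot(years, totals, type = "b",
     xlab = "Year", ylab = "Total crashes",
     main = "Crashes at intersections per year")
```

For a ts object instead of a plain vector, `ts(totals, start = 2008)` could be passed to plot() in the same way.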

data cleaning for plotting data frames

I am currently working with survey data in RStudio. I originally had two CSV files but merged them into one. Both CSV files contained sample IDs. The first file also contains binary (0/1) info, while the second contains ratings as a continuous variable.
Here is a sample of the data
ID O1 O2 O3 O4 O5 O6 O7 O8 S1 S2 S3 S4 S5 S6 S7 S8
22 0 1 0 1 0 1 0 1 4 6 2 6 4 3 6 2
23 0 1 0 0 1 1 0 1 5 6 10 4 5 7 7 6
24 0 1 1 0 1 0 0 1 7 4 7 8 7 6 3 9
25 0 0 1 1 0 0 1 1 3 5 5 7 4 6.9 6 5
26 0 1 0 0 1 1 0 1 2 2.5 7 5 4 5 4 3
27 0 1 1 1 0 1 0 0 6 3 4 6 5 6 5 6
28 0 1 1 1 0 0 0 1 7 4 2 8 2 1 4 5
29 0 0 1 0 1 1 1 0 2 5 1 2 4 3 2 2
30 0 1 0 1 1 1 0 0 8 2 6 7 1 7 5 4
31 0 0 0 1 0 1 1 1 7 4 3 2 4 5 7 2
32 0 0 1 0 0 1 1 1 4 7 5 3 1 6 2 3
33 0 1 1 0 1 1 0 0 7 4 5 8 8 5 6 7
For example the 0 in O1 corresponds to the 4 in S1.
I want to make a loop that will sum all of the values corresponding to variable 0 and 1.
if value in O1 is 0, add value in S1 to "sum of 0"
if value in O1 is 1, add value in S1 to "sum of 1"
repeat for all columns to get a total value for 0 and 1.
Any strategies or tips would be helpful going forward!
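The steps listed above can be sketched without an explicit loop, assuming the O and S columns line up by number (O1 with S1, O2 with S2, and so on). Only the first three rows of the sample are reproduced here, and the names survey, o, s, sum0 and sum1 are illustrative.

```r
# First three rows of the sample data from the question.
survey <- read.table(header = TRUE, text = "
ID O1 O2 O3 O4 O5 O6 O7 O8 S1 S2 S3 S4 S5 S6 S7 S8
22 0 1 0 1 0 1 0 1 4 6 2 6 4 3 6 2
23 0 1 0 0 1 1 0 1 5 6 10 4 5 7 7 6
24 0 1 1 0 1 0 0 1 7 4 7 8 7 6 3 9
")

# Two matrices of identical shape: indicators and ratings.
o <- as.matrix(survey[paste0("O", 1:8)])
s <- as.matrix(survey[paste0("S", 1:8)])

# Logical indexing picks the ratings whose paired indicator is 0 or 1.
sum0 <- sum(s[o == 0])
sum1 <- sum(s[o == 1])

c(sum_of_0 = sum0, sum_of_1 = sum1)
```

Since both matrices have the same dimensions, `o == 0` selects exactly the cells of s that sit in the same row and column position, which is what the loop in the question would do cell by cell.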

Wide to Long format with multiple variables? [duplicate]

This question already has answers here:
Reshaping multiple sets of measurement columns (wide format) into single columns (long format)
I'm looking to convert a data frame from wide to long format, while maintaining multiple columns.
Here is sample data:
df <- read.table(header=T, text='
Subject Day Correct1 Correct2 Correct3 Percent1 Percent2 Percent3
1 1 1 0 1 50 25 70
2 1 1 0 0 75 30 80
3 1 0 1 1 70 45 90
4 1 0 1 0 80 50 100
5 1 1 1 1 90 60 100
1 2 0 1 0 30 75 90
2 2 0 0 1 45 70 80
3 2 1 1 0 50 30 90
4 2 1 0 0 60 45 100
5 2 1 1 1 80 45 90
')
And would like it to look like this -- where I have a Correct and Percent column.
Subject Day Correct CorrectValue Percent PercentValue
1 1 1 1 1 50
2 1 1 1 1 75
3 1 1 0 1 70
4 1 1 0 1 80
5 1 1 1 1 90
1 1 2 0 2 25
2 1 2 0 2 30
3 1 2 1 2 45
4 1 2 1 2 50
5 1 2 1 2 60
1 1 3 1 3 70
2 1 3 0 3 80
3 1 3 1 3 90
4 1 3 0 3 100
5 1 3 1 3 100
1 2 1 0 1 30
2 2 1 0 1 45
3 2 1 1 1 50
4 2 1 1 1 60
5 2 1 1 1 80
1 2 2 1 2 75
2 2 2 0 2 70
3 2 2 1 2 30
4 2 2 0 2 45
5 2 2 1 2 45
1 2 3 0 3 90
2 2 3 1 3 80
3 2 3 0 3 90
4 2 3 0 3 100
5 2 3 1 3 90
Thank you!
With gather from tidyr:
library(dplyr)
library(tidyr)
df %>%
gather(Correct, CorrectValue, Correct1:Correct3) %>%
gather(Percent, PercentValue, Percent1:Percent3) %>%
mutate_at(vars(Correct, Percent), ~sub("[[:alpha:]]+", "", .))
Result:
Subject Day Correct CorrectValue Percent PercentValue
1 1 1 1 1 1 50
2 2 1 1 1 1 75
3 3 1 1 0 1 70
4 4 1 1 0 1 80
5 5 1 1 1 1 90
6 1 2 1 0 1 30
7 2 2 1 0 1 45
8 3 2 1 1 1 50
9 4 2 1 1 1 60
10 5 2 1 1 1 80
11 1 1 2 0 1 50
12 2 1 2 0 1 75
13 3 1 2 1 1 70
14 4 1 2 1 1 80
15 5 1 2 1 1 90
16 1 2 2 1 1 30
17 2 2 2 0 1 45
18 3 2 2 1 1 50
19 4 2 2 0 1 60
20 5 2 2 1 1 80
21 1 1 3 1 1 50
22 2 1 3 0 1 75
23 3 1 3 1 1 70
24 4 1 3 0 1 80
25 5 1 3 1 1 90
...
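Note that the printed result pairs every Correct column with every Percent column (90 rows in total), whereas the layout requested in the question has 30 rows, with Correct1 matched to Percent1 and so on. Below is a sketch of one way to get the matched layout, swapping in base R's reshape() for the tidyr approach. The column name Trial is an assumption, and a single index column stands in for the separate Correct and Percent index columns of the requested output.

```r
df <- read.table(header = TRUE, text = "
Subject Day Correct1 Correct2 Correct3 Percent1 Percent2 Percent3
1 1 1 0 1 50 25 70
2 1 1 0 0 75 30 80
3 1 0 1 1 70 45 90
4 1 0 1 0 80 50 100
5 1 1 1 1 90 60 100
1 2 0 1 0 30 75 90
2 2 0 0 1 45 70 80
3 2 1 1 0 50 30 90
4 2 1 0 0 60 45 100
5 2 1 1 1 80 45 90
")

# Stack both sets of columns at once so that Correct1 stays paired
# with Percent1, Correct2 with Percent2, and so on.
long <- reshape(
  df,
  direction = "long",
  varying   = list(paste0("Correct", 1:3), paste0("Percent", 1:3)),
  v.names   = c("CorrectValue", "PercentValue"),
  timevar   = "Trial",   # hypothetical name for the 1/2/3 index
  times     = 1:3,
  idvar     = c("Subject", "Day")
)

# Order as in the requested output: by Day, then Trial, then Subject.
long <- long[order(long$Day, long$Trial, long$Subject), ]
rownames(long) <- NULL
head(long)
```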

How to save for loop results in data frame using cbind

I have a data frame dfSub with a number of parameters inside. It holds hourly energy-use data. I need to sort the data by hour, i.e. for each hour get all of the energy values from the data frame. As a result I expect a data frame with 24 columns, one per hour, whose rows are filled with energy values.
The hour runs over 1:24 and is stored in the data frame as dfSub$hr.
The heat is dfSub$heat.
I constructed a for-loop and tried to save the results with cbind, but it does not work; the error message is about differing numbers of rows and columns.
I can print the results and see them on screen, but I can't save them as d (a data frame).
Here is the code:
d = NULL
for (i in 1:24) {
subh= subset(dfSub$heat, dfSub$hr == i)
print(subh)
d = cbind(d, as.data.frame(subh))
}
The append function is not applicable, since I don't know the expected length of the heat values for each hour.
Any help is appreciated.
Part of dfSub
hr wk month dyid wend t heat
1 2 1 1 0 -9.00 81
2 2 1 1 0 -8.30 61
3 2 1 1 0 -7.80 53
4 2 1 1 0 -7.00 51
5 2 1 1 0 -7.00 30
6 2 1 1 0 -6.90 31
7 2 1 1 0 -7.10 51
8 2 1 1 0 -6.50 90
9 2 1 1 0 -8.90 114
10 2 1 1 0 -9.90 110
11 2 1 1 0 -11.70 126
12 2 1 1 0 -9.70 113
13 2 1 1 0 -11.60 104
14 2 1 1 0 -10.00 107
15 2 1 1 0 -10.20 117
16 2 1 1 0 -9.00 90
17 2 1 1 0 -8.00 114
18 2 1 1 0 -7.80 83
19 2 1 1 0 -8.10 82
20 2 1 1 0 -8.20 61
21 2 1 1 0 -8.80 34
22 2 1 1 0 -9.10 52
23 2 1 1 0 -10.10 41
24 2 1 1 0 -8.80 52
1 2 1 2 0 -8.70 44
2 2 1 2 0 -8.40 50
3 2 1 2 0 -8.10 33
4 2 1 2 0 -7.70 41
5 2 1 2 0 -7.80 33
6 2 1 2 0 -7.50 43
7 2 1 2 0 -7.30 40
8 2 1 2 0 -7.10 8
The output expected as:
hr1 hr2 hr3 hr4..... hr24
81 61 53 51 ..... 52
44 50 33 41
A for-loop can be avoided in this case. One option is to use tidyr::spread to convert the hourly data to wide format.
library(tidyverse)
df %>% select(-t, -wend) %>%
mutate(hr = sprintf("hr%02d",hr)) %>%
spread(hr, heat)
Result:
# wk month dyid hr01 hr02 hr03 hr04 hr05 hr06 hr07 hr08 hr09 hr10 hr11 hr12 hr13 hr14 hr15 hr16 hr17 hr18 hr19 hr20 hr21 hr22 hr23 hr24
# 1 2 1 1 81 61 53 51 30 31 51 90 114 110 126 113 104 107 117 90 114 83 82 61 34 52 41 52
# 2 2 1 2 44 50 33 41 33 43 40 8 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Data:
df <- read.table(text =
"hr wk month dyid wend t heat
1 2 1 1 0 -9.00 81
2 2 1 1 0 -8.30 61
3 2 1 1 0 -7.80 53
4 2 1 1 0 -7.00 51
5 2 1 1 0 -7.00 30
6 2 1 1 0 -6.90 31
7 2 1 1 0 -7.10 51
8 2 1 1 0 -6.50 90
9 2 1 1 0 -8.90 114
10 2 1 1 0 -9.90 110
11 2 1 1 0 -11.70 126
12 2 1 1 0 -9.70 113
13 2 1 1 0 -11.60 104
14 2 1 1 0 -10.00 107
15 2 1 1 0 -10.20 117
16 2 1 1 0 -9.00 90
17 2 1 1 0 -8.00 114
18 2 1 1 0 -7.80 83
19 2 1 1 0 -8.10 82
20 2 1 1 0 -8.20 61
21 2 1 1 0 -8.80 34
22 2 1 1 0 -9.10 52
23 2 1 1 0 -10.10 41
24 2 1 1 0 -8.80 52
1 2 1 2 0 -8.70 44
2 2 1 2 0 -8.40 50
3 2 1 2 0 -8.10 33
4 2 1 2 0 -7.70 41
5 2 1 2 0 -7.80 33
6 2 1 2 0 -7.50 43
7 2 1 2 0 -7.30 40
8 2 1 2 0 -7.10 8",
header = TRUE, stringsAsFactors = FALSE)
With tidyr:
> df<-read.fwf(textConnection(
+ "hr,wk,month,dyid,wend,t,heat
+ 1 2 1 1 0 -9.00 81
+ 2 2 1 1 0 -8.30 61
+ 3 2 1 1 0 -7.80 53
+ 4 2 1 1 0 -7.00 51
+ 5 2 1 1 0 -7.00 30
+ 6 2 1 1 0 -6.90 31
+ 7 2 1 1 0 -7.10 51
+ 8 2 1 1 0 -6.50 90
+ 9 2 1 1 0 -8.90 114
+ 10 2 1 1 0 -9.90 110
+ 11 2 1 1 0 -11.70 126
+ 12 2 1 1 0 -9.70 113
+ 13 2 1 1 0 -11.60 104
+ 14 2 1 1 0 -10.00 107
+ 15 2 1 1 0 -10.20 117
+ 16 2 1 1 0 -9.00 90
+ 17 2 1 1 0 -8.00 114
+ 18 2 1 1 0 -7.80 83
+ 19 2 1 1 0 -8.10 82
+ 20 2 1 1 0 -8.20 61
+ 21 2 1 1 0 -8.80 34
+ 22 2 1 1 0 -9.10 52
+ 23 2 1 1 0 -10.10 41
+ 24 2 1 1 0 -8.80 52
+ 1 2 1 2 0 -8.70 44
+ 2 2 1 2 0 -8.40 50
+ 3 2 1 2 0 -8.10 33
+ 4 2 1 2 0 -7.70 41
+ 5 2 1 2 0 -7.80 33
+ 6 2 1 2 0 -7.50 43
+ 7 2 1 2 0 -7.30 40
+ 8 2 1 2 0 -7.10 8"
+ ),header=TRUE,sep=",",widths=c(5,3,6,5,5,7,5))
>
> library(dplyr)  # select() comes from dplyr, not tidyr
> library(tidyr)
> df1 <- select(df,dyid,hr,heat)
> df2 <- spread(df1,hr,heat)
> colnames(df2)[2:ncol(df2)] <- paste0("hr",colnames(df2)[2:ncol(df2)])
> df2
dyid hr1 hr2 hr3 hr4 hr5 hr6 hr7 hr8 hr9 hr10 hr11 hr12 hr13 hr14 hr15 hr16 hr17 hr18 hr19 hr20 hr21 hr22 hr23 hr24
1 1 81 61 53 51 30 31 51 90 114 110 126 113 104 107 117 90 114 83 82 61 34 52 41 52
2 2 44 50 33 41 33 43 40 8 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>
I found a solution that helped me solve my task here: Append data frames together in a for loop
by using an empty list and combining it later into a data frame:
datalist = list()
for (i in 1:24) {
subh= subset(dfSub$heat, dfSub$hr == i)
datalist[[i]] = subh
}
big_data = do.call(rbind, datalist)
Both cbind and rbind work.
Thanks everyone for help :)
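One caveat with the list approach: when the hours have different numbers of readings (as in the sample, where hours 9 to 24 have one value fewer), cbind and rbind recycle the shorter vectors instead of leaving gaps. A hedged sketch of a variant that pads each hour with NA before binding; the five-row dfSub below is a trimmed stand-in for the real data, and the names pieces, padded and n are illustrative.

```r
# Trimmed stand-in for dfSub: hours 1-3 on day 1, hours 1-2 on day 2.
dfSub <- data.frame(
  hr   = c(1, 2, 3, 1, 2),
  heat = c(81, 61, 53, 44, 50)
)

# One vector of heat values per hour.
pieces <- split(dfSub$heat, dfSub$hr)

# Pad every vector with NA up to the longest one, so nothing is recycled.
n <- max(lengths(pieces))
padded <- lapply(pieces, function(x) {
  length(x) <- n
  x
})

# Bind the equal-length vectors as columns: hr1, hr2, ...
d <- as.data.frame(do.call(cbind, padded))
names(d) <- paste0("hr", names(pieces))
d
```

Assigning to `length(x)` extends a vector with NA, which is what makes the columns line up without knowing the lengths in advance.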

Parsing Character String in R with "\r\n" from PDF Conversion

I am having issues parsing the following character string in R:
> dput(txt[1])
"NATIONAL BASKETBALL ASSOCIATION OFFICIAL SCORER'S REPORT\r\n FINAL BOX\r\nWednesday, January 17, 2018 Spectrum Center, Charlotte, NC\r\nOfficials: #15 Zach Zarba, #11 Derrick Collins, #12 CJ Washington\r\n Game Duration: 2:14\r\n Attendance: 11528\r\nVISITOR: Washington Wizards (25-20)\r\n POS MIN FG FGA 3P 3PA FT FTA OR DR TOT A PF ST TO BS +/- PTS\r\n22 Otto Porter Jr. F 22:45 2 6 1 3 1 2 0 2 2 2 0 0 0 0 -24 6\r\n 5 Markieff Morris F 15:26 1 5 0 2 0 0 0 5 5 1 4 0 1 0 -10 2\r\n13 Marcin Gortat C 19:42 0 3 0 0 0 2 3 5 8 1 2 2 0 0 -23 0\r\n 3 Bradley Beal G 27:43 10 19 4 6 2 2 1 2 3 2 1 0 5 1 -14 26\r\n 2 John Wall G 24:20 5 11 2 2 0 0 0 2 2 9 2 0 3 2 -20 12\r\n30 Mike Scott 25:13 7 10 2 2 2 2 0 2 2 3 3 1 0 0 -8 18\r\n12 Kelly Oubre Jr. 26:32 5 9 3 5 3 4 0 5 5 0 4 0 3 0 3 16\r\n28 Ian Mahinmi 08:47 1 2 0 0 2 2 0 1 1 2 1 0 0 1 -1 4\r\n31 Tomas Satoransky 25:49 2 3 0 0 2 2 1 2 3 7 1 0 1 0 -13 6\r\n20 Jodie Meeks 20:17 2 3 1 2 2 3 1 4 5 0 0 0 2 0 -10 7\r\n14 Jason Smith 13:50 4 8 0 0 2 2 1 2 3 3 5 0 1 2 0 10\r\n 1 Chris McCullough 06:48 1 3 0 1 0 0 0 1 1 0 0 0 0 1 -1 2\r\n 8 Tim Frazier 02:48 0 0 0 0 0 2 0 0 0 1 0 0 0 0 1 0\r\n 240:00 40 82 13 23 16 23 7 33 40 31 23 3 16 7 -24 109\r\n 48.8% 56.5% 69.6% TM REB: 7 TOT TO: 16 (20 PTS)\r\nHOME: CHARLOTTE HORNETS (18-25)\r\n POS MIN FG FGA 3P 3PA FT FTA OR DR TOT A PF ST TO BS +/- PTS\r\n14 Michael Kidd-Gilchrist F 22:59 8 11 0 0 5 6 0 4 4 2 1 3 0 0 26 21\r\n 2 Marvin Williams F 21:38 4 7 3 4 1 1 1 2 3 1 0 0 1 0 25 12\r\n12 Dwight Howard C 28:54 7 13 0 0 4 5 3 12 15 2 2 2 2 2 17 18\r\n 5 Nicolas Batum G 26:00 4 8 2 4 1 2 0 3 3 4 1 1 2 0 17 11\r\n15 Kemba Walker G 28:48 6 15 4 8 3 3 1 2 3 7 1 0 1 1 14 19\r\n 3 Jeremy Lamb 21:02 7 9 2 2 0 0 3 0 3 0 4 0 0 1 -4 16\r\n44 Frank Kaminsky 23:36 6 14 1 4 1 1 0 2 2 2 1 1 0 0 -5 14\r\n10 Michael Carter-Williams 15:12 0 2 0 1 3 4 0 3 3 5 2 0 0 1 8 3\r\n 8 Johnny O'Bryant III 19:07 2 6 1 2 2 2 3 3 6 1 1 1 2 0 7 7\r\n21 Treveon Graham 22:00 3 6 1 2 2 2 2 2 4 1 4 1 1 0 7 9\r\n 1 Malik 
Monk 03:59 1 5 1 3 0 0 0 1 1 1 0 1 0 0 2 3\r\n 7 Dwayne Bacon 03:59 0 2 0 1 0 0 0 0 0 0 0 0 0 0 2 0\r\n32 Julyan Stone 02:46 0 0 0 0 0 0 0 2 2 1 1 0 0 0 4 0\r\n 240:00 48 98 15 31 22 26 13 36 49 27 18 10 9 5 24 133\r\n 49% 48.4% 84.6% TM REB: 7 TOT TO: 10 (15 PTS)\r\nSCORE BY PERIOD 1 2 3 4 FINAL\r\n Wizards 36 25 18 30 109\r\n HORNETS 38 39 25 31 133\r\nInactive: Wizards - Mac (Injury/Illness - left achilles surgery), Robinson (G League Team - two-way player)\r\nInactive: Hornets - Mathiang, Paige (G League Team - two-way player), Zeller (Injury/Illness - left knee surgery)\r\nPoints in the Paint: Wizards 30 (15/27), HORNETS 50 (25/48) Biggest Lead: Wizards 2, HORNETS 28\r\n2nd Chance Points: Wizards 9 (4/7), HORNETS 21 (6/12) Lead Changes: 2\r\nFast Break Points: Wizards 16 (6/8), HORNETS 10 (5/8) Times Tied: 5\r\nTechnical fouls - Individual\r\nWizards (3): Wall 4:16 1st , Brooks 6:49 2nd , Frazier 4:00 4th\r\nHORNETS (2): Kidd-Gilchrist 3:08 2nd , Carter-Williams 4:00 4th\r\nTechnical fouls - Defensive Three Seconds\r\nWizards (0) : NONE\r\nHORNETS (1) : Howard 2:27 1st\r\nEjections\r\nWizards (1): Frazier 4:00 4th\r\nHORNETS (1): Carter-Williams 4:00 4th\r\nMEMO: Ejected for excessive communication and contact during stoppage in play.\r\nMEMO: Ejected for excessive communication and contact during stoppage in play.\r\n Copyright (c ) 2017-2018 NBA Properties, INC. All Rights Reserved\r\n"
I would like to extract the following sub-section of the above character string:
Technical fouls - Individual\r\nWizards (3): Wall 4:16 1st , Brooks 6:49 2nd , Frazier 4:00 4th\r\nHORNETS (2): Kidd-Gilchrist 3:08 2nd , Carter-Williams 4:00 4th\r\nTechnical fouls - Defensive Three Seconds\r\nWizards (0) : NONE\r\nHORNETS (1) : Howard 2:27 1st\r\n
To do this, I have tried different variations of the following approach:
library(stringr)
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\nEjections)")
> techs
[1] NA
But, as you can see, it doesn't work. This approach has worked successfully in other sections of the character string. Like here, for example:
> str_extract(txt[1], "(?<=\\bInactive).+?.(\\\r\nPoints)")
[1] ": Hornets - Mathiang, Paige (G League Team - two-way player), Zeller (Injury/Illness - left knee surgery)\r\nPoints"
So, why doesn't it work when I target \r\nEjections\r\n as my endpoint? The only difference I can see is that \r\n both precedes and succeeds Ejections, whereas \r\n only precedes Points. I've tried to account for the additional \r\n like so:
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\nEjections\r\n)")
> techs
[1] NA
But, that still doesn't work. What is this \r\n that's all over the place, and how do I account for it? I acquired this character string from a PDF like so:
library(pdftools)
> download.file("http://www.nba.com/data/html/nbacom/2017/gameinfo/20180117/0021700653_Book.pdf", "mydf", mode = "wb")
trying URL 'http://www.nba.com/data/html/nbacom/2017/gameinfo/20180117/0021700653_Book.pdf'
Content type 'application/pdf' length unknown
downloaded 308 KB
> txt <- pdf_text("mydf")
> txt[1]
[1] "NATIONAL BASKETBALL ASSOCIATION OFFICIAL SCORER'S REPORT\r\n FINAL BOX\r\nWednesday, January 17, 2018 Spectrum Center, Charlotte, NC\r\nOfficials: #15 Zach Zarba, #11 Derrick Collins, #12 CJ Washington\r\n Game Duration: 2:14\r\n Attendance: 11528\r\nVISITOR: Washington Wizards (25-20)\r\n POS MIN FG FGA 3P 3PA FT FTA OR DR TOT A PF ST TO BS +/- PTS\r\n22 Otto Porter Jr. F 22:45 2 6 1 3 1 2 0 2 2 2 0 0 0 0 -24 6\r\n 5 Markieff Morris F 15:26 1 5 0 2 0 0 0 5 5 1 4 0 1 0 -10 2\r\n13 Marcin Gortat C 19:42 0 3 0 0 0 2 3 5 8 1 2 2 0 0 -23 0\r\n 3 Bradley Beal G 27:43 10 19 4 6 2 2 1 2 3 2 1 0 5 1 -14 26\r\n 2 John Wall G 24:20 5 11 2 2 0 0 0 2 2 9 2 0 3 2 -20 12\r\n30 Mike Scott 25:13 7 10 2 2 2 2 0 2 2 3 3 1 0 0 -8 18\r\n12 Kelly Oubre Jr. 26:32 5 9 3 5 3 4 0 5 5 0 4 0 3 0 3 16\r\n28 Ian Mahinmi 08:47 1 2 0 0 2 2 0 1 1 2 1 0 0 1 -1 4\r\n31 Tomas Satoransky 25:49 2 3 0 0 2 2 1 2 3 7 1 0 1 0 -13 6\r\n20 Jodie Meeks 20:17 2 3 1 2 2 3 1 4 5 0 0 0 2 0 -10 7\r\n14 Jason Smith 13:50 4 8 0 0 2 2 1 2 3 3 5 0 1 2 0 10\r\n 1 Chris McCullough 06:48 1 3 0 1 0 0 0 1 1 0 0 0 0 1 -1 2\r\n 8 Tim Frazier 02:48 0 0 0 0 0 2 0 0 0 1 0 0 0 0 1 0\r\n 240:00 40 82 13 23 16 23 7 33 40 31 23 3 16 7 -24 109\r\n 48.8% 56.5% 69.6% TM REB: 7 TOT TO: 16 (20 PTS)\r\nHOME: CHARLOTTE HORNETS (18-25)\r\n POS MIN FG FGA 3P 3PA FT FTA OR DR TOT A PF ST TO BS +/- PTS\r\n14 Michael Kidd-Gilchrist F 22:59 8 11 0 0 5 6 0 4 4 2 1 3 0 0 26 21\r\n 2 Marvin Williams F 21:38 4 7 3 4 1 1 1 2 3 1 0 0 1 0 25 12\r\n12 Dwight Howard C 28:54 7 13 0 0 4 5 3 12 15 2 2 2 2 2 17 18\r\n 5 Nicolas Batum G 26:00 4 8 2 4 1 2 0 3 3 4 1 1 2 0 17 11\r\n15 Kemba Walker G 28:48 6 15 4 8 3 3 1 2 3 7 1 0 1 1 14 19\r\n 3 Jeremy Lamb 21:02 7 9 2 2 0 0 3 0 3 0 4 0 0 1 -4 16\r\n44 Frank Kaminsky 23:36 6 14 1 4 1 1 0 2 2 2 1 1 0 0 -5 14\r\n10 Michael Carter-Williams 15:12 0 2 0 1 3 4 0 3 3 5 2 0 0 1 8 3\r\n 8 Johnny O'Bryant III 19:07 2 6 1 2 2 2 3 3 6 1 1 1 2 0 7 7\r\n21 Treveon Graham 22:00 3 6 1 2 2 2 2 2 4 1 4 1 1 0 7 9\r\n 1 
Malik Monk 03:59 1 5 1 3 0 0 0 1 1 1 0 1 0 0 2 3\r\n 7 Dwayne Bacon 03:59 0 2 0 1 0 0 0 0 0 0 0 0 0 0 2 0\r\n32 Julyan Stone 02:46 0 0 0 0 0 0 0 2 2 1 1 0 0 0 4 0\r\n 240:00 48 98 15 31 22 26 13 36 49 27 18 10 9 5 24 133\r\n 49% 48.4% 84.6% TM REB: 7 TOT TO: 10 (15 PTS)\r\nSCORE BY PERIOD 1 2 3 4 FINAL\r\n Wizards 36 25 18 30 109\r\n HORNETS 38 39 25 31 133\r\nInactive: Wizards - Mac (Injury/Illness - left achilles surgery), Robinson (G League Team - two-way player)\r\nInactive: Hornets - Mathiang, Paige (G League Team - two-way player), Zeller (Injury/Illness - left knee surgery)\r\nPoints in the Paint: Wizards 30 (15/27), HORNETS 50 (25/48) Biggest Lead: Wizards 2, HORNETS 28\r\n2nd Chance Points: Wizards 9 (4/7), HORNETS 21 (6/12) Lead Changes: 2\r\nFast Break Points: Wizards 16 (6/8), HORNETS 10 (5/8) Times Tied: 5\r\nTechnical fouls - Individual\r\nWizards (3): Wall 4:16 1st , Brooks 6:49 2nd , Frazier 4:00 4th\r\nHORNETS (2): Kidd-Gilchrist 3:08 2nd , Carter-Williams 4:00 4th\r\nTechnical fouls - Defensive Three Seconds\r\nWizards (0) : NONE\r\nHORNETS (1) : Howard 2:27 1st\r\nEjections\r\nWizards (1): Frazier 4:00 4th\r\nHORNETS (1): Carter-Williams 4:00 4th\r\nMEMO: Ejected for excessive communication and contact during stoppage in play.\r\nMEMO: Ejected for excessive communication and contact during stoppage in play.\r\n Copyright (c ) 2017-2018 NBA Properties, INC. All Rights Reserved\r\n"
Obviously, the PDF does not show \r\ns in its text. Were they inserted upon conversion? Is there a way to convert without them? Or, is there a simple-enough way to work with them? Thanks for the help.
EDIT
It does not appear that adding \\ to \\\r\n makes a difference.
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\n)")
> techs
[1] " fouls - Individual\r\n"
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\nEjections)")
> techs
[1] NA
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\nEjections\r\n)")
> techs
[1] NA
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\nEjections\\\r\n)")
> techs
[1] NA
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\nEjections\\\r\\\n)")
> techs
[1] NA
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\\\nEjections\\\r\\\n)")
> techs
[1] NA
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\\\nEjections\\\r\n)")
> techs
[1] NA
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\\\nEjections\r\n)")
> techs
[1] NA
> techs <- str_extract(txt[1], "(?<=\\bTechnical).+?.(\\\r\\\nEjections)")
> techs
[1] NA
Am I doing it right?
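Some background that may help: \r\n is a carriage return followed by a line feed, the Windows line ending, and it appears to be how the line breaks of the PDF page are represented in the extracted string. A likely reason for the NA results is that `.` does not match line terminators by default, so `.+?` cannot span the several lines between "Technical" and "Ejections", whereas the "Inactive" example only had to match within one line. A minimal sketch in base R, using a shortened copy of the string and the `(?s)` flag so that `.` also matches line breaks; the variable names are illustrative.

```r
# Shortened excerpt of txt[1] from the question.
txt <- "Fast Break Points: Wizards 16 (6/8), HORNETS 10 (5/8) Times Tied: 5\r\nTechnical fouls - Individual\r\nWizards (3): Wall 4:16 1st , Brooks 6:49 2nd , Frazier 4:00 4th\r\nHORNETS (2): Kidd-Gilchrist 3:08 2nd , Carter-Williams 4:00 4th\r\nTechnical fouls - Defensive Three Seconds\r\nWizards (0) : NONE\r\nHORNETS (1) : Howard 2:27 1st\r\nEjections\r\nWizards (1): Frazier 4:00 4th\r\n"

# (?s) lets "." match \r and \n; .*? is lazy, and the lookahead
# (?=Ejections) stops the match just before "Ejections".
m <- regexpr("(?s)Technical fouls.*?(?=Ejections)", txt, perl = TRUE)
techs <- regmatches(txt, m)

# Split on the literal line ending to get one element per line.
tech_lines <- strsplit(techs, "\r\n", fixed = TRUE)[[1]]
tech_lines
```

With stringr, the equivalent would be to wrap the pattern in `regex(..., dotall = TRUE)`. Inside an R string, "\r\n" already denotes the two control characters, so no extra backslashes are needed to match them.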
