Difficulties adding data to R dataset - r

I'm not too advanced with R so any help would be appreciated. I am trying to add values to columns in my dataset and my dataset is called 'katie'.
For example, in the column 'word' I'd like to select instances where 'SUBJECTED' is written and then post 'middle' in the column 'pre.environment', on the same line as 'SUBJECTED' is written. Is there something that I am doing wrong? With this code, the initial line definitely works (as I can see how many "SUBJECTED" items are recognized in the column 'word') but nothing happens when I enter the second line of code.
>x=grep("SUBJECTED", katie$word)
>katie[x,]$pre.environment= c('middle')
I hope this example is sufficient. Thanks in advance for your help.

If I understand your question correctly, try the following code:
katie$pre.environment <- ifelse(grepl("SUBJECTED", katie$word),
yes = "middle",
no = katie$pre.environment)
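For what it's worth, the same update can also be written with plain logical indexing. This is only a sketch on a made-up stand-in for 'katie' (the three rows below are invented for illustration), and it assumes 'pre.environment' is a character column:

```r
# Toy stand-in for the asker's data; the values are invented for illustration.
katie <- data.frame(
  word = c("SUBJECTED", "OTHER", "SUBJECTED TO"),
  pre.environment = c("initial", "final", "initial"),
  stringsAsFactors = FALSE
)

# TRUE for every row whose 'word' contains "SUBJECTED"
hits <- grepl("SUBJECTED", katie$word)

# Overwrite 'pre.environment' only in the matching rows
katie$pre.environment[hits] <- "middle"
```

If 'pre.environment' is a factor and "middle" is not one of its levels, R assigns NA with a warning instead, so it may help to convert the column with as.character() first.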

Related

How to replace entries in a column in r

I'm brand new to R, and to coding, and am playing around with a dataset. I have what I think should be a really straightforward problem, but I can't figure it out, and haven't found any other code that will work.
I have a tibble with several columns. In column "RelationshipTypeCd" there are some values of "PT." I would like to change all of these to "PT" (essentially removing the period).
I am working in RStudio, and have loaded the tidyverse.
Thanks!
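You could use sub here (without fixed=TRUE, since the pattern is a regular expression):
dt$RelationshipTypeCd <- sub("^PT\\.$", "PT", dt$RelationshipTypeCd)
Or, with stringr, matching only the exact value "PT.":
dt$RelationshipTypeCd <- ifelse(stringr::str_detect(dt$RelationshipTypeCd, '^PT\\.$'), 'PT', dt$RelationshipTypeCd)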
Thanks for the replies.
I eventually found a solution:
tibble %>%
mutate(RelationshipTypeCd = replace(RelationshipTypeCd, RelationshipTypeCd == "PT.", "PT"))
I'm not really sure I understand all the arguments there, however. It appears to be using replace inside of mutate, and I don't know why the column "RelationshipTypeCd" is listed twice in the arguments for replace.
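On why the column appears twice: base R's replace(x, list, values) takes the vector to modify, an index saying which elements to change, and the new value. The first "RelationshipTypeCd" is the vector being modified, and the second one is only used to build the logical index. A small sketch outside of mutate, using a made-up vector:

```r
# Made-up values standing in for the RelationshipTypeCd column
x <- c("PT.", "MD", "PT", "PT.")

# Logical index: TRUE where the value is exactly "PT."
to_change <- x == "PT."

# replace(vector to modify, which elements, new value)
fixed <- replace(x, to_change, "PT")
```

Inside mutate the call does the same thing; the column name simply refers to that column of the tibble.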

Extra Alignment Error with KableExtra in R-markdown

I am relatively new to R and R-Markdown. I am receiving an error message of
! Extra alignment tab has been changed to \cr.
\endtemplate
in Table 4. I do not know what the error is telling me, and I don't know how to fix it. The code in Table 4 has the same format as the other tables. I have spent three days on this; I understand that this may be a simple fix, but I am at a loss.
I have added and removed l's and c's in the align argument. I also constructed a randomly generated data set, and with it Table 4 runs as it should, which leads me to believe that there is a problem with that specific column of data.
licen_area<-USC_cola%>%
group_by(Licen_area, .drop=F)%>%
summarise(count=n())%>%
ungroup()%>%
mutate(perc = round((count/sum(count)*100),2))
kable(licen_area,"latex",booktabs=T,align="lcc",
col.names=linebreak(c("Licensure\nArea","Count", "\\%"),align="c"),row.names=F, escape=F)
Any solutions or workarounds will be most helpful!
There was a special character in the data. Once that was removed it worked fine!
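The answer does not say which character it was. One common cause of this particular LaTeX error is an unescaped "&" in a table cell, because "&" is the column separator, and with escape=F kable passes cell contents through unchanged. Assuming that was the situation, a possible workaround is to escape the characters instead of deleting them. The helper name escape_latex below is hypothetical, not part of knitr or kableExtra:

```r
# Hypothetical helper: put a backslash in front of characters that
# LaTeX treats specially inside a table cell (&, %, #, _, $).
escape_latex <- function(x) {
  gsub("([&%#_$])", "\\\\\\1", x)
}

# Made-up licensure area names, for illustration only
areas <- c("Math & Science", "Special Ed", "Health_PE")
escaped <- escape_latex(areas)
```

The escaped column can then be handed to kable with escape=F as in the question.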

Remove everything after an empty row on a list in R

I have a quick question that I cannot figure out. I am reading some results from an output file using the code below and stored as a list in R that can be seen in the picture. I want to delete all of the information after an empty row, in other words, it would be everything after line 42:
Does anybody know anything that I could use? I tried using gsub but I was not very successful.
Thanks for all of the help. I am new to programming in R. Again, any help is very much appreciated.
LoadFFA <- function(filename, folder.out, TYPE = "PeakFQ_17C",
colStandard = TRUE){ # standardize column output names
require(data.table)
if(grepl("PEAKFQSA",TYPE)){ # PeakfqSA Bulletin 17C analysis
text.list<-lapply(fileinput,readLines)
skip.rows<-sapply(text.list, grep, pattern = '^Ann. Exc. Prob.\\s+EMA Est.')-1
PFA <- lapply(seq_along(text.list),function(i) read.delim(fileinput[i],skip=skip.rows[i],sep="\n",stringsAsFactors = TRUE,blank.lines.skip = FALSE))
}
EDIT
I don't know if I could upload directly so here is the google drive link.
Also, here is the command to run the function LoadFFA("03606500peaks.out","D:/Documents/hydraulic.failures","PEAKFQSA"). The screenshot is the result using print(PFA).
The reason why I am using a loop is that I am reading multiple output files; they contain a lot of data and have different lengths. I read the data beginning at Ann.Exc.Prob. and, as per the screenshot provided, I would like to stop after line 42 (at the first fully empty row). I hope that clears up some confusion.
Basically: read the output files, start reading at "Ann.Exc.Prob", and stop at the end of that block of data (line 42 for this particular file). I am using a function because I am running this several times.
Again, sorry for the trouble. Thank you for your time and I appreciate your patience.
https://drive.google.com/file/d/1PGbGWIHFj7IQRevTAEfqqA9Okg4fz7Mg/view?usp=sharing
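One possible approach, sketched here without access to the linked file: work on the output of readLines, find the header line, and cut everything from the first blank line after it. The function name keep_first_block and the sample lines are made up for illustration; the start pattern is taken from the question's code.

```r
# Hypothetical helper: keep lines from the first line matching
# start_pattern up to (not including) the first blank line after it.
keep_first_block <- function(lines, start_pattern = "^Ann. Exc. Prob.") {
  start <- grep(start_pattern, lines)[1]
  if (is.na(start)) return(character(0))    # header not found
  rest <- lines[start:length(lines)]
  blank <- which(!nzchar(trimws(rest)))      # positions of empty rows
  if (length(blank) > 0) rest <- rest[seq_len(blank[1] - 1)]
  rest
}

# Invented lines mimicking the shape of the output file
sample_lines <- c("some header text",
                  "Ann. Exc. Prob.   EMA Est.",
                  "0.5000   100",
                  "0.1000   200",
                  "",
                  "text that should be dropped")
block <- keep_first_block(sample_lines)
```

With several files, this could be applied to each element of text.list via lapply, and the kept lines parsed afterwards with read.table(text = ...).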

R: avoid repeating $

I'm new here and new to R and I think I have a simple question but don't know how to name it so I can't find any help by searching the web.
I have a data set and want to form a new Data set with several variables from the first one.
The working code looks like this:
em.table2 <- data.frame(em.table$item1,em.table$item2,...[here are some more]...,em.table$item22)
To keep it simpler, I want to get rid of the "em.table$" construction in front of every variable... unfortunately I don't know the function to do so...
I tried it like this, but it didn't work (and is a pretty embarrassing try, I guess):
em.table2 <- data.frame(em.table$(item1,item2,item3,item4))
Anyone here to help? Thanks a lot!
Instead of the $ operator, try the following:
em.table2 <- em.table[,c("item1","item2","item3","item4")]
Try with
em.table2 <- with(em.table, data.frame(item1, item2, item3, item4))
But if you just want to subset the data, there are better solutions.
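To make the options concrete, here is a small sketch on an invented em.table (the values and the extra column 'other' are made up). It also shows one side effect of the original data.frame(em.table$item1, ...) approach: the new columns get the prefix baked into their names.

```r
# Invented stand-in for em.table
em.table <- data.frame(item1 = 1:3, item2 = 4:6, item3 = 7:9,
                       item4 = 10:12, other = c("x", "y", "z"))

# Select by name with [ , ]
by_name <- em.table[, c("item1", "item2", "item3", "item4")]

# subset() lets you write bare column names, or a range of adjacent columns
by_subset <- subset(em.table, select = c(item1, item2, item3, item4))
by_range  <- subset(em.table, select = item1:item4)

# The $ approach works but renames the columns
with_dollar <- data.frame(em.table$item1, em.table$item2)
names(with_dollar)  # "em.table.item1" "em.table.item2"
```

The range form only helps when the wanted columns sit next to each other in the data frame.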

R - How to create array from console input

Hi all and thanks in advance for all your help.
In R, I'm sending a command to an external Windows program using system(command), which in turn outputs lines (with multiple values per line) that I see directly on the R console. They look something like this:
a,b,c,d,e,f,g,h
1,2,3,4,5,6,7,8
3,4,5,7,1,3,4,9
7,5,3,1,8,1,5,7
What I would like to do is create an array that has the top row as column names and each subsequent row from the input should be the values that go into these columns. Any and all help in making this work would be very appreciated.
This is my first foray into this territory so I'm quite stuck as to how to do it. I've meddled with scan(), pipe() and readLines() but haven't been able to succeed. I have no particular attachment to system(command), any function that will run the executable that will give me the output I need is fine by me if it helps achieve what I want.
The comment made by user1935457 did the trick.
read.table(text = system(command, intern=TRUE), sep = ",", header=TRUE)
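To try this without the external program, the sketch below feeds read.table a character vector shaped like what system(command, intern=TRUE) returns, one element per output line. The vector reuses the sample lines from the question; as.matrix is added only because the question asked for an array.

```r
# Stand-in for system(command, intern = TRUE): one string per output line
output <- c("a,b,c,d,e,f,g,h",
            "1,2,3,4,5,6,7,8",
            "3,4,5,7,1,3,4,9",
            "7,5,3,1,8,1,5,7")

# First line becomes the column names, the rest become rows
df <- read.table(text = output, sep = ",", header = TRUE)

# Optional: convert to a numeric matrix (a 2-d array)
m <- as.matrix(df)
```

Columns can then be accessed by name, for example df$d or m[, "d"].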
