I recently began using R with no prior coding experience after I was transferred to a new department and I want to understand how some R functions work. I have this written code:
read.csv("something.csv",header=TRUE)$DATE123
The csv file contains a time series with header that begins with DATE in A1 cell.
How does R know that column A is DATE123? Is it because of header=TRUE and $?
As explained in the comments, header=TRUE indicates that the first row of your file contains the column names. Thus, every entry in that row becomes the name of a column. In your case, there is probably a field in the first row of your csv file that is called DATE123.
A data frame consists of rows and columns. Each column in the data frame can be accessed by the $ sign. If the name of the data frame is df and one of the columns is named DATE123, then you can extract all data from that column by using the following command:
df$DATE123
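To see both pieces working together, here is a minimal sketch. The file contents and the VALUE column are made up for illustration; only DATE123 comes from the question.

```r
# Write a tiny CSV whose first row holds the column names (illustrative data).
path <- tempfile(fileext = ".csv")
writeLines(c("DATE123,VALUE", "2016-01-20,1.5", "2016-01-21,2.5"), path)

# header = TRUE tells read.csv() to use the first row as column names.
df <- read.csv(path, header = TRUE)
names(df)

# `$` extracts one column by name. The original one-liner does the same thing
# directly on the value returned by read.csv(), without storing it first.
dates <- df$DATE123
same  <- read.csv(path, header = TRUE)$DATE123
identical(dates, same)
```

So the column is not "classified" by R in any special way: the name is simply read from the first row, and `$` looks it up.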
I have a dataframe (pictured below) that contains a list of dataframes in its 2nd column, which is named content. It also has a 3rd column called racenames, and I'd like to use its values in the names of my csvs while running through the code. I can't figure out a way to write the list of dataframes to csv files in a loop or otherwise.
The code below works at writing a csv for the first dataframe in the content column, but I would like to write all of the dataframes at the same time so that I don't need to manually change the numbers/names for hours. All of the data has been scraped in one of my prior loops.
write.csv(ARCA_separate[[2]][[1]], file = "C:\\Users\\bubba\\Desktop\\RStuff\\Scraping\\ARCA 2012 Season\\*racename*.csv")
Here is what the data I'm working with looks like. The dataframe is called ARCA_separate.
How do I write all of the csvs and grab the corresponding racename in the same row to put into my csv name?
You can try this:
purrr::walk2(ARCA_separate$content, ARCA_separate$racename, function (x, y) readr::write_csv(x, paste0("C:\\Users\\bubba\\Desktop\\RStuff\\Scraping\\ARCA 2012 Season\\", y, ".csv")))
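If you would rather not depend on purrr and readr, the same idea can be sketched in base R with Map(). The data below is a made-up stand-in for ARCA_separate, and the output folder is a temporary directory, so treat the names as assumptions rather than your real objects.

```r
# Hypothetical stand-in for ARCA_separate: a list of data frames
# plus a matching vector of race names.
content  <- list(data.frame(pos = 1:2, driver = c("A", "B")),
                 data.frame(pos = 1:3, driver = c("C", "D", "E")))
racename <- c("Race_One", "Race_Two")

out_dir <- tempdir()  # replace with your own output folder

# Build one file path per race name.
paths <- file.path(out_dir, paste0(racename, ".csv"))

# Map() walks both inputs in parallel, much like purrr::walk2().
invisible(Map(function(x, p) write.csv(x, p, row.names = FALSE), content, paths))

file.exists(paths)
```

With the real data you would pass ARCA_separate$content and the race name column in place of the two stand-ins.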
When I ran a security report through the Office 365 Admin Email Explorer to obtain detailed information about emails and their respective types of attacks, I downloaded the .csv file and manually used Microsoft Excel to filter out rows by exact email subject and save each set to its own .csv file. Creating the individual CSV files took a long time, since there were quite a lot of emails with the same or differing subject titles as values.
Downloaded the .csv file from the Office 365 Admin portal with a date range of 7 days into the past (date-range).
Imported into R using the R command below:
Office_365_Report_CSV = "C:/Users/absnd/Documents/2022-11-18office365latestquarantine.csv"
Loaded the data.table library.
require(data.table)
Created a new variable to convert the data into a data-frame.
quarantine_data = fread(Office_365_Report_CSV, sep = ",", header = TRUE, check.names = FALSE)
Pull columns needed to filter through in the data-frame.
Quarantine_Columns = quarantine_data[,c("Email date (UTC)","Recipients","Subject","Sender","Sender IP","Sender domain","Delivery action","Latest delivery location","Original delivery location","Internet message ID","Network message ID","Mail language","Original recipients","Additional actions","Threats","File threats","File hash","Detection technologies","Alert ID","Final system override","Tenant system override(s)","User system override(s)","Directionality","URLs","Sender tags","Recipient tags","Exchange transport rule","Connector","Context" )]
Steps Needed to be done (I am not sure where to go from here):
-I would like to have R write to individual .csv file with the same "Subject" value rows that must contain all the above columns data in step 5.
Sub-step - ex. If the row data contains the value inside the column (named, "Threats") = "Phish" generate a file named, "YYYY-MM-DD Phishing <number increment +1>.csv."
Sub-step - ex. 2 If the row data contains the value inside the column (named, "Threats") = "Phish, Spam" generate a CSV file named, "YYYY-MM-DD Phishing and Spam <number increment +1>.csv."
Step 6 and so on would filter out like same "Subject" column values for rows and save the rows with same Subject email values into a single file that would be named based on the if-condition in the substeps above in step 6.
First of all, you are looking to do this in R - RStudio is an IDE to make usage of R easier.
If you save your data frames in a list, and then set a vector of the names of the files that you want to give each of those files, you can then use purrr::walk2() to iterate through the saving. Some reproducible code as an example:
library(purrr)
library(glue)
library(readr)
mydfs_l <- list(mtcars, iris)
file_names <- c("mtcars_file", "iris_file")
walk2(mydfs_l, file_names, function(x, y) { write_excel_csv(x, glue("mypath/to/files/{y}.csv")) })
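Applied to the question, the list of data frames can come from split() on the Subject column. The following is only a sketch in base R under stated assumptions: the data is a three-column stand-in for Quarantine_Columns, label_for() is a hypothetical helper for the file-name wording, and each Subject is assumed to carry a single Threats value.

```r
# Illustrative stand-in for Quarantine_Columns (only three columns shown).
quarantine <- data.frame(
  Subject = c("Invoice", "Invoice", "Reset password", "Win a prize"),
  Sender  = c("a@x.com", "b@x.com", "c@y.com", "d@z.com"),
  Threats = c("Phish", "Phish", "Phish, Spam", "Spam"),
  stringsAsFactors = FALSE
)

# Hypothetical mapping from the Threats value to the label used in file names.
label_for <- function(threats) {
  switch(threats,
         "Phish"       = "Phishing",
         "Phish, Spam" = "Phishing and Spam",
         threats)  # fall back to the raw value
}

out_dir    <- tempdir()  # replace with your own folder
by_subject <- split(quarantine, quarantine$Subject)  # one data frame per Subject

counter <- list()        # running number per label
written <- character(0)  # paths of the files written so far
for (grp in by_subject) {
  label <- label_for(grp$Threats[1])  # assumes one Threats value per Subject
  n <- if (is.null(counter[[label]])) 1L else counter[[label]] + 1L
  counter[[label]] <- n
  file_name <- paste0(format(Sys.Date(), "%Y-%m-%d"), " ", label, " ", n, ".csv")
  path <- file.path(out_dir, file_name)
  write.csv(grp, path, row.names = FALSE)
  written <- c(written, path)
}
basename(written)
```

Each file holds all rows that share one Subject, and the counter increments whenever a label is reused, which is how I read the "number increment +1" requirement.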
For some reason, when I run a line assigning column names to my dataframe (df) from another data frame (nm), I can no longer view my columns using the "$" operator; instead, when I type "df$" I get the following error: Cannot read property 'substr' of Null.
Loading either dataset does not produce this problem, only when I assign column names to df using the following line:
colnames(df) = nm$Var_Code
This problem has not been happening before when running this code and is rather new. I'm not sure how to approach the problem and any assistance would be appreciated.
I am also new to RStudio. The way I get around it is to write the row names in the first row of a text file, and then import the data from that text file, specifying that the row names of the data frame are its first row.
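Another thing that may be worth checking, and this is an assumption on my part rather than something confirmed in the question, is whether nm$Var_Code contains missing or empty entries or has a different length than the number of columns in df. Names like that can upset autocompletion after "df$". A minimal sketch with made-up data:

```r
# Illustrative stand-ins for df and nm (not the asker's data).
df <- data.frame(a = 1:2, b = 3:4, c = 5:6)
nm <- data.frame(Var_Code = c("X1", NA, "X3"), stringsAsFactors = FALSE)

new_names <- as.character(nm$Var_Code)

# Checks before assigning: same length, no missing or empty names.
length(new_names) == ncol(df)
bad <- is.na(new_names) | new_names == ""
which(bad)

# Replace problem entries with placeholders, then make names unique and valid.
new_names[bad] <- paste0("V", which(bad))
colnames(df) <- make.names(new_names, unique = TRUE)
names(df)
```

If which(bad) comes back empty and the lengths match, the names themselves are probably not the cause.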
Apologies if this is a trivial question. I saw others like it such as: How can I turn a part of the filename into a variable when reading multiple text files into R? , but I still seem to be having some trouble...
I have been given 50000 .txt files. Each file contains a single observation (a single row of data) with exactly 12 variables (number of columns). The name of each .txt file is fairly regular. Specifically, each .txt file has a code at the end indicating the type of observation across three dimensions. An example of this code is 'VL-VL-NE' or 'VL-M-N' or 'H-H-L' (not including the apostrophes). Therefore, an example of a file name could be 'I-love-using-R-20_01_2016-VL-VL-NE.txt'.
My problem is that I want to include this code at the end of the .txt file in the actual vector itself when I import into R, i.e., I want to add three more variables (columns) at the end of the table corresponding to the three parts of code at the end of the file name.
Any help would be greatly appreciated.
Because you have exactly the same number of columns in each file, why don't you import them into R using a loop that looks for all .txt files in a particular directory?
df <- c()
for (x in list.files(pattern = "\\.txt$")) {  # pattern is a regex, not a glob
u <- read.csv(x, skip = 6)  # skip = 6 assumes six preamble lines; adjust or drop it for your files
u$Label <- factor(x)  # a column that holds the file name
df <- rbind(df, u)
}
You'll note that the file name itself becomes a column. Once everything is in R, it should be fairly easy to use a regex function to extract the exact elements you need from the file name column (df$Label).
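As a rough sketch of that last step, assuming the three-part code never contains a hyphen itself and always sits right before ".txt": the file names below follow the pattern from the question, and the column names dim1 to dim3 are placeholders I made up.

```r
# Example file names following the pattern described in the question.
labels <- c("I-love-using-R-20_01_2016-VL-VL-NE.txt",
            "I-love-using-R-21_01_2016-H-H-L.txt")

# Drop the extension, split on "-", and keep the last three pieces.
stems  <- sub("\\.txt$", "", labels)
pieces <- strsplit(stems, "-", fixed = TRUE)
codes  <- t(vapply(pieces, function(p) tail(p, 3), character(3)))
colnames(codes) <- c("dim1", "dim2", "dim3")  # hypothetical column names

codes
```

With the real data you would use as.character(df$Label) in place of labels and then attach the result with cbind(df, codes).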
I have a matrix with 168 rows and about 6000 columns. The column names are stock identifiers, the rownames are dates. I would like to export this matrix as a .csv file. I tried the following:
write.csv(OAS_data, "Some Path")
The export works. But the header of the matrix (stock identifiers) is distributed over the first 3 rows of the .csv file (when opened in Excel). The first two rows each contain 2184 column names. The rest of the column names are in the third row. How can I avoid those breaks? The rest of the .csv file looks fine - no line breaks there.
Thank You.
Your best bet is probably to transpose your data and analyze it that way due to Excel's limits.
write.csv(t(OAS_data), "Some Path")
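If you want to check whether the breaks are in the file itself or only appear in the program that opens it, you can count the lines R actually wrote. This is a small sketch with a made-up 3 x 6000 matrix standing in for OAS_data:

```r
# Illustrative matrix: 3 dates by 6000 stock identifiers.
m <- matrix(0, nrow = 3, ncol = 6000,
            dimnames = list(c("2016-01-01", "2016-01-02", "2016-01-03"),
                            paste0("S", seq_len(6000))))

wide <- tempfile(fileext = ".csv")
long <- tempfile(fileext = ".csv")
write.csv(m, wide)     # one header line plus one line per date
write.csv(t(m), long)  # one header line plus one line per stock

n_wide <- length(readLines(wide))
n_long <- length(readLines(long))
c(wide = n_wide, long = n_long)
```

If the wide file has exactly one more line than the matrix has rows, the header was written on a single line and the wrapping happens on the viewing side. The transposed file trades 6000 columns for 6000 rows, which spreadsheet programs tend to handle more comfortably.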