I am making a for/if loop and I am missing a step somewhere and I can't figure it out - R

Below is my objective and the code I wrote for it. Row 19 is the original street text and 24 is where street2 is located.
The .csv is here: https://www.opendataphilly.org/dataset/shooting-victims/resource/a6240077-cbc7-46fb-b554-39417be606ee
Let's deal with the streets with '&' separating their names. Create a new column named street2 and set it equal to NA.
Then, iterate over the data frame using a for loop, testing if the street variable you created earlier contains an NA value.
In cases where this occurs, separate the names in block according to the & delimiter into the fields street and street2 accordingly.
Output the first 5 lines of the data frame to the screen.
Hint: for; if; :; nrow(); is.na(); strsplit(); unlist().
NewLocation$street2 <- 'NA'
Task7 <- unlist(NewLocation)
for (col in seq (1:dim(NewLocation)[19])) {
if (Task7[street2]=='NA'){
for row in seq (1:dim(NewLocation[24])){
NewLocation[row,col] <-strsplit(street,"&",(NewLocation[row,col]))
}
}
}
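One thing that may be worth checking before the loop itself: the code above assigns the string 'NA' rather than a missing value. A minimal sketch of the difference, using a throwaway vector x that is not part of the original data:

```r
# "NA" in quotes is ordinary text; NA without quotes is a missing value.
x <- c("NA", NA)

is.na(x)    # only the second element is treated as missing
x == "NA"   # comparing a real NA with == gives NA, not TRUE/FALSE
```

If street2 is meant to be tested with is.na(), it probably needs to be created with NA (no quotes).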

Related

Working on loop and wanting some feedback, re-adding this to update code and list .csv

Access to the data:
https://www.opendataphilly.org/dataset/shooting-victims/resource/a6240077-cbc7-46fb-b554-39417be606ee
I have gotten close and my loop now runs, but I am not getting the output I want.
I want to split street at any '&' and put the second name in a column called street2.
Main objective explained: Let's deal with the streets with & separating their names. Create a new column named street2 and set it equal to NA.
Then, iterate over the data frame using a for loop, testing if the street variable you created earlier contains an NA value.
In cases where this occurs, separate the names in block according to the & delimiter into the fields street and street2 accordingly.
Output the first 5 lines of the data frame to the screen.
Hint: mutate(); for; if; :; nrow(); is.na(); strsplit(); unlist().
library('readr')
NewLocation$street2 <- 'NA'
#head(NewLocation)
Task7 <- unlist(NewLocation$street2)
for (row in seq(from=1,to=nrow(NewLocation))){
if (is.na(Task7[NewLocation$street])){
NewLocation$street2 <-strsplit(NewLocation$street,"&",(NewLocation[row]))
}
}
This changes all of my street2 values to equal street and gets rid of my "NA"s.
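For what it's worth, here is a minimal sketch of the kind of row-by-row loop the objective describes. The three-row data frame is made up for illustration, and the assumption that the real data has a block column holding the full text and a street column that is NA for intersections comes only from the task wording above.

```r
# Toy stand-in for NewLocation; the real column layout is an assumption.
NewLocation <- data.frame(
  block  = c("1200 N BROAD ST", "N 5TH ST & W LEHIGH AVE", "ARCH ST & N 10TH ST"),
  street = c("N BROAD ST", NA, NA),
  stringsAsFactors = FALSE
)

# A real missing value, not the string 'NA'.
NewLocation$street2 <- NA_character_

for (row in seq_len(nrow(NewLocation))) {
  # Test one row at a time, not the whole column.
  if (is.na(NewLocation$street[row])) {
    # Split this row's block on "&" and trim the surrounding spaces.
    parts <- trimws(unlist(strsplit(NewLocation$block[row], "&", fixed = TRUE)))
    NewLocation$street[row]  <- parts[1]
    NewLocation$street2[row] <- parts[2]
  }
}

head(NewLocation, 5)
```

The main difference from the loop above is that every read and write is indexed by row, so each pass only touches one cell instead of overwriting the whole street2 column.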

Create a new row to assign M/F to a column based on heading, referencing second table?

I am new to R (and coding in general) and am really stuck on how to approach this problem.
I have a very large data set; columns are sample ID# (~7000 samples) and rows are gene expression (~20,000 genes). Column headings are BIOPSY1-A, BIOPSY1-B, BIOPSY1-C, ..., BIOPSY200-Z. Each number (1-200) is a different patient, and each sample for that patient is a different letter (-A, -Z).
I would like to do some comparisons between samples that came from men and women. Gender is not included in this gene expression table. I have a separate file with patient numbers (BIOPSY1-200) and their gender M/F.
I would like to code something that will look at the column ID (ex: BIOPSY7-A), recognize that it includes "BIOPSY7" (but not == BIOPSY7 because there is BIOPSY7-A through BIOPSY7-Z), find "BIOPSY7" in the reference file, extrapolate M/F, and create a new row with M/F designation.
Honestly, I am so overwhelmed with coding this that I tried to open the file in Excel to manually input M/F, for the 7000 columns as it would probably be faster. However, the file is so large that Excel crashes when it opens.
Any input or resources that would put me on the right path would be extremely appreciated!!
I don't know exactly what your data looks like, so I made my own based on your description. You should be able to adapt this answer to your needs and your dataset's structure:
library(data.table)
genderfile <-data.frame("ID"=c("BIOPSY1", "BIOPSY2", "BIOPSY3", "BIOPSY4", "BIOPSY5"),"Gender"=c("F","M","M","F","M"))
#you can just read in your gender file to r with the line below
#genderfile <- read.csv("~/gender file.csv")
View(genderfile)
df<-matrix(rnorm(45, mean=10, sd=5),nrow=3)
colnames(df)<-c("BIOPSY1-A", "BIOPSY1-B", "BIOPSY1-C", "BIOPSY2-A", "BIOPSY2-B", "BIOPSY2-C","BIOPSY3-A", "BIOPSY3-B", "BIOPSY3-C","BIOPSY4-A", "BIOPSY4-B", "BIOPSY4-C","BIOPSY5-A", "BIOPSY5-B", "BIOPSY5-C")
df<-cbind(Gene=seq(1:3),df)
df<-as.data.frame(df)
#you can read your main df into R with the line below; fread keeps the dashes in column names from turning into periods (requires the data.table package to be installed and loaded)
#df<-fread("~/first file.csv")
View(df)
Note that the following line of code removes the dash and letter from the column names of df (I removed the first column by df[,-c(1)] because it is the Gene id):
substr(x=names(df[,-c(1)]),start=1,stop=nchar(names(df[,-c(1)]))-2)
#[1] "BIOPSY1" "BIOPSY1" "BIOPSY1" "BIOPSY2" "BIOPSY2" "BIOPSY2" "BIOPSY3" "BIOPSY3" "BIOPSY3" "BIOPSY4" "BIOPSY4"
#[12] "BIOPSY4" "BIOPSY5" "BIOPSY5" "BIOPSY5"
Now, we are ready to match the columns of df with the ID in genderfile to get the Gender column:
Gender<-genderfile[, "Gender"][match(substr(x=names(df[,-c(1)]),start=1,stop=nchar(names(df[,-c(1)]))-2), genderfile[,"ID"])]
Gender
#[1] F F F M M M M M M F F F M M M
Last step is to add the Gender defined above as a row to the df:
df_withGender<-rbind(c("Gender", as.character(Gender)), df)
View(df_withGender)
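A possible variation, sketched under the assumption that sample IDs always look like PATIENT-SUFFIX: the substr() call above relies on the suffix being exactly a dash plus one letter, whereas sub() with a pattern strips everything from the last dash onward. The small genderfile and sample_ids below are made up for illustration.

```r
# Made-up reference table and sample IDs, mirroring the structure described above.
genderfile <- data.frame(
  ID     = c("BIOPSY1", "BIOPSY2"),
  Gender = c("F", "M"),
  stringsAsFactors = FALSE
)
sample_ids <- c("BIOPSY1-A", "BIOPSY1-B", "BIOPSY2-A", "BIOPSY2-Z")

# Drop the last dash and whatever follows it to get the patient ID.
patient_ids <- sub("-[^-]*$", "", sample_ids)

# Look up each patient in the reference table.
gender <- genderfile$Gender[match(patient_ids, genderfile$ID)]
names(gender) <- sample_ids
gender
```

Keeping gender as a separate named vector, rather than binding it in as a row, is a design choice: rbind() with a character row converts every expression column to character, while a separate vector leaves the numeric data untouched and can still be used to subset columns, e.g. df[, names(gender)[gender == "F"]].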

Delete row in dataframe R

I used
milsa <- edit(data.frame())
To open the R Data Editor and now I can type the data of my table.
My problem is: my table has 36 rows, but for some reason I have 39 rows appearing in the program (the 3 additional rows are all filled with NA).
When I try to use:
length(civil)
I'm getting 39 instead of 36. How can I solve this? I am trying to use fix(milsa) but it can't delete the additional rows.
PS: Civil is a variable of milsa.
Subset with the index:
You can reassign the data.frame to itself with only the rows you want to keep.
milsa <- milsa[1:36,]
To delete specific rows
milsa <- milsa[-c(row_num1, row_num2, row_num3), ]
To delete rows containing one or more NA's
milsa <- na.omit(milsa)
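Be aware that na.omit() drops every row that has at least one NA, which may also remove real observations with a single missing field. A minimal sketch of dropping only the rows that are entirely NA, using a made-up four-row milsa:

```r
# Made-up data: row 2 has one genuine missing value, rows 3 and 4 are empty.
milsa <- data.frame(
  civil  = c("single", "married", NA, NA),
  salary = c(4.0, NA, NA, NA),
  stringsAsFactors = FALSE
)

# TRUE for rows where every column is NA.
all_na <- rowSums(is.na(milsa)) == ncol(milsa)

cleaned <- milsa[!all_na, ]
nrow(cleaned)          # rows 1 and 2 are kept
nrow(na.omit(milsa))   # only row 1 survives
```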

R for Loop Add Value to New Column

I am trying to run a for loop in a R data frame to pull the Last Price of dataframe of stocks. I am having trouble appending the result to the original dataframe and using it as a second column. Here is the code I am working with thus far. I can get it to print but not add to a new column. I tried to set the loop value equal to a new column but I get an error
for (i in df_financials$Ticker){
df_financials$Last_Price=(bdp(i,'PX_LAST'))
}
Error in `$<-.data.frame`(`*tmp*`, "Last_Price", value = list(PX_LAST =
NA_real_)) :
replacement has 1 row, data has 147
print(df_financials)
Ticker
1 ENH Equity
2 AXS Equity
3 BOH Equity
4 CNA Equity
5 TRH Equity
First, loop over row positions rather than over the ticker values themselves, so the loop knows where to start and stop [i.e., use 1:length(df$Var) within for()]. Second, specify which row (i) of your new column to replace (i.e., df$var[i]). Give the code below a try and see if that works.
for (i in 1:length(df_financials$Ticker)){
df_financials$Last_Price[i]=(bdp(i,'PX_LAST'))
}
I'm not familiar with the bdp() function itself. However, I suspect the problem is that you are trying to pull data from a list with more stocks than you are interested in. If this is the case you need to reference the stock in row i that you want to obtain the last price for. If I'm understanding this correctly the code below should do the trick.
I'll assume that the list is something like
Stock<-data.frame(other_stocks = c("ENH","AXS","Rando1","BOH","CNA","TRH","Rando2","Rando3"),
PX_LAST=c(1,2,3,4,5,6,7,8))
Stock
for (i in 1:length(df$Ticker)){
df$Last_Price[i]=(bdp(df$Ticker[i],'PX_LAST'))
}
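Judging only from the error message in the question (value = list(PX_LAST = NA_real_)), bdp() seems to return a data-frame-like object with a PX_LAST column rather than a bare number, so it may help to pull that column out before assigning. A minimal sketch, where fake_bdp() is a hypothetical stand-in written so the example runs without a data connection, and the prices are invented:

```r
# Hypothetical stand-in for bdp(): one row per security, one column per field.
fake_bdp <- function(securities, fields) {
  prices <- c("ENH Equity" = 10, "AXS Equity" = 20, "BOH Equity" = 30)  # invented values
  out <- data.frame(unname(prices[securities]), row.names = securities)
  names(out) <- fields
  out
}

df_financials <- data.frame(
  Ticker = c("ENH Equity", "AXS Equity", "BOH Equity"),
  stringsAsFactors = FALSE
)

# Create the column first so each iteration fills exactly one cell.
df_financials$Last_Price <- NA_real_

for (i in seq_along(df_financials$Ticker)) {
  result <- fake_bdp(df_financials$Ticker[i], "PX_LAST")
  df_financials$Last_Price[i] <- result$PX_LAST  # extract the number, not the data frame
}

df_financials
```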

Google Scripts automatically add date to column in Google Sheets

I am trying to use a Google script that retrieves 2 securities fields from GOOGLEFINANCE and saves the output to a Google Sheet file. I need the script to also add the datetime to the first column of the Sheet.
I have created a basic Google Sheet with 3 columns:
A is formatted to DateTime. It has column name date in row 1 and is empty in rows 2 onwards
C has the column name price in row 1 and is empty in rows 2 onwards
D has the column name pe in row 1 and is empty in rows 2 onwards
Here is my function:
function myStocks() {
var sh = SpreadsheetApp.getActiveSpreadsheet();
sh.insertRowAfter(1);
sh.getRange("A2").setValue(new Date());
sh.getRange("C2").setFormula('=GOOGLEFINANCE("GOOG", "price")');
sh.getRange("D2").setFormula('=GOOGLEFINANCE("GOOG", "pe")');
}
Here is the output:
Date price pe
12/10/2017 22:44:31 1037.05 34.55
12/10/2017 22:43:24 1037.05 34.55
The output of columns C and D is correct. The output of column A is wrong. Every time I run the function, each new row is added ABOVE the last row:
The first time I ran the function was at 12/10/2017 22:43:24 and it added that row first.
The second time I ran the function was 12/10/2017 22:44:31 BUT it added that row ABOVE the last row in the sheet - I wanted it to add the new row BELOW the last row.
Is there a way to auto fill the datetime downwards in a single column in GoogleSheets, using a script function?
How about the following modifications?
Modification points :
sh.insertRowAfter(1) inserts a new row between row 1 and row 2.
In your situation, you can retrieve the last row using getLastRow().
getRange("A2").setValue(), getRange("C2").setFormula() and getRange("D2").setFormula() write the values to "A2", "C2" and "D2", respectively.
As a result, the values are always written to row 2.
When you want to write several values or formulas to a sheet at once, you can use setValues() and setFormulas().
A script reflecting the points above is as follows.
Modified script :
function myStocks() {
var sh = SpreadsheetApp.getActiveSheet();
var lastrow = sh.getLastRow() + 1; // The row just after the last row that has content.
sh.getRange(lastrow, 1).setValue(new Date());
var formulas = [['=GOOGLEFINANCE("GOOG", "price")', '=GOOGLEFINANCE("GOOG", "pe")']];
sh.getRange(lastrow, 3, 1, 2).setFormulas(formulas);
}
Note :
Your script writes the date and the 2 formulas in the same run. The modified script does the same.
References :
insertRowAfter()
getLastRow()
setValues()
setFormulas()
If I have misunderstood your question, please tell me and I will modify the answer.

Resources