I want to import the table from https://www.pro-football-reference.com/years/2020/draft.htm into a Google Sheet. However, I'm trying to avoid pulling in null cells as well as information I already have in other sheets. Here are my questions:
The only columns I want are Round (Col1), Pick (Col2), and Player (Col4). I've tried using IMPORTHTML, but so far all I can do is grab the whole table.
I want to create a new column called 'Rd.Pick' that converts the overall pick number into the pick's position within its round. For example, Pick 33 would display as 2.1.
Finally, I would like to remove the rows that appear between the last pick of one round and the first pick of the next. I'm not sure how to do that, given that the text in those rows matches the header row.
This is just to answer the question from your comment above - how to convert the sequential draft pick number to a number like 3.12, 12th pick in the 3rd round.
This formula is a bit brute force, but it works:
={"Round-Pick";
ArrayFormula(ifna(ifs(
D2:D=1,"1."& text(E2:E,"00"),
D2:D=2,"2."& text(E2:E-max(filter(D$2:E,D$2:D=1)),"00"),
D2:D=3,"3."& text(E2:E-max(filter(D$2:E,D$2:D=2)),"00"),
D2:D=4,"4."& text(E2:E-max(filter(D$2:E,D$2:D=3)),"00"),
D2:D=5,"5."& text(E2:E-max(filter(D$2:E,D$2:D=4)),"00"),
D2:D=6,"6."& text(E2:E-max(filter(D$2:E,D$2:D=5)),"00"),
D2:D=7,"7."& text(E2:E-max(filter(D$2:E,D$2:D=6)),"00")
),""))}
If you put that in NFLDraft!F1, it should do what you want. You could then hide Column E if you like.
UPDATED: To provide the format you've requested, with leading zero.
try:
=ARRAYFORMULA(QUERY({
QUERY(IMPORTHTML("https://www.pro-football-reference.com/years/2020/draft.htm",
"table", 1), "select Col4"),
QUERY(IMPORTHTML("https://www.pro-football-reference.com/years/2020/draft.htm",
"table", 1), "select Col1")&"."&
QUERY(IMPORTHTML("https://www.pro-football-reference.com/years/2020/draft.htm",
"table", 1), "select Col2")}, "where not Col2 matches '\.'", 1))
I have a bunch of chat logs and have managed to pull the email addresses from them and separate out the domains (e.g. "#bacon.edu"). I also have a list of domains, each matched with a category name.
Basically, I want to match the variable to a row in column 2 and pull the category name from column 1.
I should mention that everything is currently formatted as factors, but that can change.
In this example d1 = "bacon.edu" and namelist is a data frame set up like this:
d1 = "bacon.edu"
Workplace Name   Email List
Pancake          #bac.edu
Test place       #toe.edu
superworld       #bacon.edu
monkey gym       #aclu.edu
toaster oven     #yoyo.edu
The goal is to find bacon in row 3 and create a variable from column 1, row 3 (so abc = "superworld"), but I struggle to find the match to begin with.
I have tried:
which(d1, namelist$Email.List)
which(namelist$Email.List == d1)
which(grep
match(d1, namelist$Email.list)
which(grepl("bacon.edu, namelist$Email.List
Sadly I don't recall all the errors or what they came from, but they include:
integer(0)
object class not logical
level sets of factors are different.
I have since deleted the failed attempts. I'm sure it's simple and I feel bad asking, but any help would be appreciated!
We can use grep:
namelist$`Workplace Name`[grep(d1, namelist$`Email List`)]
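As a minimal sketch of that lookup on the example data: this assumes the data frame is called namelist, that its columns have the syntactic names Workplace.Name and Email.List (as in the question's own attempts), and that both are stored as factors. The fixed = TRUE argument and the as.character() call are my additions: the first makes the dot in d1 match literally instead of acting as a regex wildcard, the second turns the factor result into a plain string.

```r
# Rebuild the example data from the question; column names are assumed.
namelist <- data.frame(
  Workplace.Name = c("Pancake", "Test place", "superworld",
                     "monkey gym", "toaster oven"),
  Email.List = c("#bac.edu", "#toe.edu", "#bacon.edu",
                 "#aclu.edu", "#yoyo.edu"),
  stringsAsFactors = TRUE
)

d1 <- "bacon.edu"

# grep() coerces the factor to character and returns the matching row index.
idx <- grep(d1, namelist$Email.List, fixed = TRUE)

# Pull the category name from the same row and drop the factor levels.
abc <- as.character(namelist$Workplace.Name[idx])
abc
```

If several rows can contain the same domain, idx will have more than one element, so abc will be a vector rather than a single name.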
I have a CSV file containing the iPhone device roadmap: version number, name of model, release of model, price, etc. I have done the following:
I imported the data set in RStudio into a variable named iphonedetail with the following command: iphonedetail <- read.csv("iphodedata.csv")
Then I changed the attribute "name of model" to character using the following: iphonedetail$nameofmodel <- as.character(iphonedetail$nameofmodel)
Now I need to access the first 5 model names and store them in a vector.
I tried this: iphonesubset <- data.frame(iphonedetail$nameofmodel)
Then I typed iphonesubset in the console, but it gave 0 columns and 0 rows.
Could someone tell me whether the first 2 steps above are correct, and suggest how to fix the 3rd step?
If you want to extract the first five (non-unique):
iphonedf1to5 <- df[1:5,]
That means that you get the first 5 rows and all columns. Then if you want to get the unique first five elements it should be like:
iphonedf1to5 <- unique(df[1:5,])
Edit:
Here df is the data frame you read from the CSV, iphonedetail in your case.
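Since the question asks for a vector rather than a data frame, here is a small sketch of pulling just the first five values of one column. The toy data frame below is a hypothetical stand-in for the result of read.csv, and the column name nameofmodel is taken from the question as written.

```r
# Hypothetical stand-in for iphonedetail <- read.csv("iphodedata.csv")
iphonedetail <- data.frame(
  nameofmodel = c("model1", "model2", "model3", "model4", "model5", "model6"),
  stringsAsFactors = FALSE
)

# Check the real column names first; read.csv may have altered them.
names(iphonedetail)

# First five model names as a plain character vector.
first5 <- head(iphonedetail$nameofmodel, 5)
first5
```

A result of 0 columns and 0 rows from data.frame(x$col) is what R gives when x$col is NULL, that is, when no column of that name exists. If names(iphonedetail) shows a different spelling, such as name.of.model, use that spelling instead.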
I have a CSV file with almost 4 million records and 30+ columns.
The columns are of varied types, including numeric, alphanumeric, date, character, etc.
Attempt 1:
When I first read the file in R using the read.csv function, only 2 million of the records were read.
This may have happened because of some special characters in the data.
Attempt 2:
I provided the argument quote = "" to read.csv and all the records were read successfully.
However, this brings up 2 issues:
a. All the column names were prefixed with 'x.':
e.g. x.date, x.name
b. All the character columns were loaded into the data frame enclosed in double quotes.
Can someone please advise me how to resolve these 2 issues and get the data loaded in R successfully?
I work for a financial institution and the data is highly sensitive, so I cannot paste a screenshot here.
I also tried to recreate the scenario at home, but my efforts were of little or no avail.
The screenshot below is the closest I have come to the exact scenario:
DATAFRAME SCREENSHOT: Not exact copy
I am trying to write to a csv file with write.table but from what I've read it has limited capabilities. I am using the following command.
write.table(s$Nomen, "table.csv", row.names=FALSE, col.names=FALSE)
which exports a datasheet consisting of a single column (as I like it). However, that column contains a lot of duplicate values. I would like to remove duplicates and order the column alphabetically.
For example, if this is s$Nomen:
Nomen
------
archer
sent
chocolate
banana
arbitrary
column
paste
paste
knowledge
zen
banana
sent
surprise
The output should be:
arbitrary
archer
banana
chocolate
column
knowledge
paste
sent
surprise
zen
I'm assuming sort.list comes in handy, but I don't know how to remove the duplicates.
Note the data in the original column s$Nomen should not be altered! So I don't want to re-order the actual column, but I want to re-order the output.
You could try either unique or duplicated from base R:
sort(unique(s$Nomen))
Or
sort(s$Nomen[!duplicated(s$Nomen)])
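To tie this back to the export step, here is a short sketch using the example column from the question. It writes to a temporary file instead of "table.csv" only to keep the example side-effect free, and the object name out is made up for illustration. Because the sorted, de-duplicated values are stored in a separate object, s$Nomen itself stays untouched.

```r
# Example column from the question.
s <- data.frame(
  Nomen = c("archer", "sent", "chocolate", "banana", "arbitrary", "column",
            "paste", "paste", "knowledge", "zen", "banana", "sent", "surprise"),
  stringsAsFactors = FALSE
)

# De-duplicate, then sort alphabetically; s$Nomen is not modified.
out <- sort(unique(s$Nomen))

# Write the single column without row or column names.
outfile <- tempfile(fileext = ".csv")
write.table(out, outfile, row.names = FALSE, col.names = FALSE)

out
```

Adding quote = FALSE to write.table drops the double quotes around each value in the output file, if that is preferred.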