Dataframe updating in results but not when View() is used? - r

I have a markdown document that shows the result of reshaping one column (Industries), which holds multiple values in each cell, into a wide format where each value becomes its own variable, set to TRUE or FALSE depending on whether it appears in that row of the original Industries column.
When I run this code, the results below the chunk show the wide format with TRUE/FALSE values as expected, but when I use View() to look at the data frame on its own, it still shows the single original Industries column with multiple values in each cell, and I can't work out why.
Any ideas? I need to merge the new wide columns into another dataset, but I can't while the data frame is still in the single-column format.
[Screenshot: code and results showing the successful wide variable layout]
[Screenshot: data frame not showing the wide variable layout]

Fixed it. I never assigned the result to anything, so the chunk only printed the wide version and the original data frame stayed unchanged. I just needed to add df <- at the start of the second chunk and then view that df instead.
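A minimal sketch of the difference between printing a result and assigning it. The original code was only posted as a screenshot, so the data, the column values, and the name df_wide below are all made up for illustration, and base R is used rather than whatever reshaping function the original chunk called.

```r
# Hypothetical data: one column holding several industries per cell.
df <- data.frame(
  id = 1:3,
  Industries = c("Tech, Finance", "Finance", "Retail, Tech"),
  stringsAsFactors = FALSE
)

# Split each cell into its individual industries.
split_values <- strsplit(df$Industries, ",\\s*")
industries <- sort(unique(unlist(split_values)))

# Build one logical column per industry: TRUE if the row mentions it.
flags <- sapply(industries, function(ind) {
  vapply(split_values, function(x) ind %in% x, logical(1))
})

# Without an assignment the wide result is only printed; df is unchanged.
cbind(df["id"], flags)

# Assigning the result keeps it, so View(df_wide) would show the wide layout.
df_wide <- cbind(df["id"], flags)
names(df_wide)
```

After this runs, df still has its two original columns, while df_wide holds the id column plus one TRUE/FALSE column per industry and can be merged into another dataset.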

Related

How to stop R from reading first row as column name when scraping a pdf

Unfortunately, the pdf I'm scraping is sensitive so I can't share it.
It's about 50 pages long and none of the columns have actual column headers, so R takes the first row and uses it as the column names. That alone is not a huge deal, since I can always add that row back in and replace the column names. The problem is that each page has a different first line, so when I run all the pages, R takes the first line from each page and treats it as a new set of column names. Page one spits out 10 nice columns with the wrong names. Then it moves to page two and recognizes new column names, so in addition to adding new rows it adds another 10 columns. In the end, instead of 1000 obs. of 10 variables, I have 1000 obs. of 500 variables.
I hope this explanation makes sense.
Using extract_tables(), I'm able to specify table area and column widths. Is there a command I can use with extract_tables() to tell it not to assume/use column names?

How to shift rows down based on value in column R?

I have a data frame that looks kind of like this...but much larger
I want to look at the record_id column and shift the right side columns down when the row says admin_time. Then make that previous row NA. Then when I write it to a csv, I'll just use the na = "" to make those cells blank
For example, in the first few rows, it would look like this...
No need to try to recreate the data frame. I was thinking maybe a for loop would work with an embedded if statement to review the patient_id, record_id, and pk_day. I was just looking for alternate suggestions or how to use a statement within the loop to pick out the admin line and do what I mentioned above

Updating a File in R by adding a column/vector

Is there any way that I can update an existing .csv file by adding a column/vector that I have scraped from the web? I have a web scraper that pulls COVID-19 data, and I am trying to create a file that has positive cases in columns, where each column is the list of cases for a day in each county (x-axis is counties, y-axis is date). I have toyed around with many different ideas at this point and seem to have hit a roadblock. I'm fairly new to R, so any ideas would be appreciated!
Packages I am Currently Using/Planning to Use:
library(tidyverse)
library(funModeling)
library(Hmisc)
library(rvest)
library(ggplot2)
CODE:
# writing the original file
positive <- data.frame(Counties = counties_list, "06/12/2020" = positive_data)
positive[is.na(positive)] <- 0
positive <- positive[-c(76), ]
write.csv(positive, "C:/Users/Nathan May/Desktop/Research Files (ABI)/Covid/Data For Shiny/Positive/Positive Data.csv")
# creating the new vector and updating the existing file with it
datap <- read.csv("C:/Users/Nathan May/Desktop/Research Files (ABI)/Covid/Data For Shiny/Positive/Positive Data.csv")
positive_data <- positive_data[-c(76), ]
datap$DATE <- positive_data
NOTE: The end goal is to create a Shiny app that displays bar charts for positives, recoveries, and deaths by day in each county. This is the data wrangling portion.
First things first, if you are going to use the tidyverse, use tibble instead of data.frame. Tibbles are the Tidyverse version of data frames.
Next, be aware of the structure of your data frame. The way you create your data.frame now (and later probably your tibble) you get a variable "Counties" and one additional variable for each day. That means that you will have to add columns as time passes (the opposite of what you described: Moving along the x axis (along columns) will move along dates while moving along the y-axis (moving along rows) will move along counties). It's possible but I think a bit unconventional. You might want to initialize your data frame with one column for each county and an additional variable called "date". Then whenever you get new data you can add a row in your dataframe instead of a column (so you're "adding a new case" instead of "adding a new variable").
To actually add the row you will have to load the data as you do in your code, create the new row (or column, if you insist) and then "glue" it to the rest of the data.
Depending on how your data looks, you can create a single-row data frame using tibble_row() with the same counties as variable names as in your main data frame, and then glue them together with add_row(datap, your_new_row). Alternatively, if you want to add the row using only position and not column names, you can have the new row as a vector and use rbind() instead of add_row().
If you persist with the "one variable per date" approach, there are column equivalents (add_column() and cbind()) for both of these functions.
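As a rough illustration of both layouts, here is a base R sketch using rbind() and cbind(). The county names and counts are invented, and the data frames are built in memory rather than read from the CSV in the question. With the tibble package loaded, add_row() and add_column() would fill the same roles.

```r
# Hypothetical data: one row per date, one column per county.
datap <- data.frame(
  date  = as.Date(c("2020-06-12", "2020-06-13")),
  Adams = c(3, 5),
  Brown = c(0, 1)
)

# The new day's scrape as a one-row data frame with matching column names.
new_row <- data.frame(date = as.Date("2020-06-14"), Adams = 6, Brown = 1)

# "Adding a new case": append the day as a row.
datap <- rbind(datap, new_row)

# The "one variable per date" layout: counties in rows, one column per day.
# check.names = FALSE keeps the date as the literal column name.
wide <- data.frame(
  Counties = c("Adams", "Brown"),
  `2020-06-12` = c(3, 0),
  check.names = FALSE
)

# "Adding a new variable": append the day as a column.
wide <- cbind(wide, `2020-06-13` = c(5, 1))
```

In either layout, the updated data frame would then be written back over the existing file with write.csv().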
Hope this helps, Cheers

How to import reactive datasets in Rshiny?

I'm creating a risk dashboard, and the problem is that I need the dataset to be reactive. I have a simple dataset composed of countries (8), sectors, and values. I want my app to be able to deal with different datasets: for example, if we change the column names (country becomes pays) and change the position of the column, the app should still recognize the column as country. (In reality the dataset is composed of an undefined number of variables with unknown names.)
For the country column, for example, I thought of creating a list that contains all country names; when the first row of a column matches a country from that list, the column is renamed country.
That solves the problem for one variable, but what about the other ones?
I think that's unnecessary complexity.
I suggest you first build a script that cleans your data to those specifications, and then use its output as the source.
You can use pattern recognition to match columns, but be sure there aren't similar columns: if you have two numerical variables, for example, there's a big problem.
Via Shiny, I suggest you:
1. Use fileInput to import your database
2. Visualize your database using DT
3. Create as many textInput boxes as you have columns
4. Manually change the column names using dplyr::rename and the boxes
5. Use the transformed database in your dashboard
Other options can be built using base::grep and dplyr.
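One way the base::grep option might look, as a sketch only: a cleaning step that renames columns whose names match a known list of spellings, and stops when a match is ambiguous. The lookup table name_patterns, the accepted spellings, and the function standardise_names() are hypothetical and not part of the original answer.

```r
# Hypothetical lookup: canonical name -> regex of accepted spellings.
name_patterns <- c(
  country = "^(country|pays|pais)$",
  sector  = "^(sector|secteur)$",
  value   = "^(value|valeur)$"
)

# Rename any column whose name matches a pattern, whatever its position.
standardise_names <- function(df, patterns = name_patterns) {
  for (canonical in names(patterns)) {
    hits <- grep(patterns[[canonical]], names(df), ignore.case = TRUE)
    if (length(hits) > 1) {
      # Similar columns cannot be told apart by name alone.
      stop("Ambiguous match for '", canonical, "'")
    }
    if (length(hits) == 1) {
      names(df)[hits] <- canonical
    }
  }
  df
}

# Made-up upload with French names and a different column order.
raw <- data.frame(
  Valeur  = c(1.2, 3.4),
  Pays    = c("France", "Spain"),
  Secteur = c("Energy", "Banking")
)
clean <- standardise_names(raw)
names(clean)
```

In a Shiny app, a function like this could be applied to the uploaded file inside a reactive expression, with the textInput boxes as a manual fallback for columns it fails to recognize.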

Mixed types in Shiny datatable

So I have a Shiny App that is basically a dataframe passed to the datatable function and formatted in a specific way (see picture below).
I want to put a string where there are 99999999999% values and blank cells while I keep the rest of the cells as numeric (I need the rest of the cells to be of numeric type so as to apply a color scale formatting on them). However, because of the very nature of dataframes, it is not possible to have different types in the same columns.
The question is: Do you know a way to have strings and numeric types in the same column of a data frame? Should I make some other workaround? In this case: Any idea?
