I am new to R and I am facing difficulties to convert my dataframe (named dffinal) which contains list into a csv.
I tried the following code which gave a csv that is not usable:
dput(dffinal, file="out.txt")
new <- source("out.txt")
write.csv2(dffinal,"C:/Users\\final.csv", row.names = FALSE)
I tried all the option but I found nothing! Here is a sample of my dataframe:
dput(head(dffinal[1:2]))
structure(list(V1 = list("I heard about your products and I would like to give it a try but I'm not sure which product is better for my dry skin, Almond products or Shea Butter products? Thank you",
"Hi,\n\nCan you please tell me the difference between the shea shower oil limited edition and the other shower gels? I got a sample of one in a kit that had a purple label on it. (Please see attached photo.) I love it!\nBut, what makes it limited edition, the smell or what? It is out of stock and I was wondering if it is going to be restocked or not?\n\nAlso, what makes it different from the almond one?\n\nThank you for your help.",
"Hello, Have you discontinued Eau de toilette", "I both an eGift card for my sister and she hasn't received anything via her email\n\nPlease advise \n\nThank you \n\n cann",
"I do not get Coco Pillow Mist. yet. When are you going to deliver it? I need it before January 3rd.",
"Hello,\nI wish to follow up on an email I just received from Lol, notifying\nme that I've \"successfully canceled my subscription of bun Complete.\"\nHowever, I didn't request a cancelation and was expecting my next scheduled\nfulfillment later this month. Could you please advise and help? I'd\nappreciate it if you could reinstate my subscription.\n"),
V2 = list("How long can I keep a product before opening it? shea butter original hand cream large size 5oz, i like to buy a lot during sales promotions, is this alright or should i only buy what i'll use immediately, are these natural organic products that will still have a long stable shelf life? thank you",
"Hi,\nI recently checked to see if my order had been delivered, and I only received my gift box and free sample. Can you please send the advent calendar? Does not seem to have been included in the shipping. Thank you",
"Is the gade fragrance still available?", "I previously contacted you because I purchased your raspberry lip scrub. When I opened the scrub, 25% of the product was missing. Your customer service department agreed to send me a replacement, but I never received the replacement rasberry lip scrub. Could you please tell me when I will receive the replacement product? Thanks, me",
"To whom it may concern:\n\nI have 3 items in my order: 1 Shea Butter Intensive Hand Balm and 2 S‚r‚nit‚ Relaxing Pillow Mist. I have just received the hand balm this morning. I was wondering when I would receive the two bottles of pillow mist.\n\nThanks and regards,\n\nMe",
"I have not received 2X Body Scalp Essence or any shipment information regarding these items. Please let me know if and when you will be shipping these items, otherwise please credit my card. Thanks")), row.names = c(NA,
6L), class = "data.frame")
We can do this in tidyverse
library(dplyr)
library(readr)
dffinal %>%
mutate(across(everything(), unlist)) %>%
write_csv('result.csv')
If you have list of only length 1 for all the rows as shared in the example using unlist will work -
dffinal[] <- lapply(dffinal, unlist)
If the length of list is greater than 1 use -
dffinal[] <- lapply(dffinal, sapply, toString)
Write the data with write.csv -
write.csv(dffinal, 'result.csv', row.names = FALSE)
First of all, I must say I'm completely noob in R. So I apologize in advance for asking for help with such a simple task. My task is to form a graph of COVID-19 cases for a certain period using data from the CSV file. Unfortunately, at the moment I cannot contact the person from the World Health Organization who provided the data and the script for launching. But I was left with an error that I cannot fix either myself, not with the help of Google.
script.R
library(EpiEstim)
library(ggplot2)
COVID<-read.csv("dataset.csv")
res_parametric_si<-estimate_R(COVID$I,method="parametric_si",config=make_config(list(mean_si=4,std_si=3)))
plot(res_parametric_si)
dataset.csv
Date,Suspected per day,Total suspected,Discarded/pending,Confirmed per day,Total confirmed,Deaths per day,Deaths Total,Case fatality rate,Daily confirmed,Recovered per day,Recovered total,Active cases,Tested with PCR,# of PCR tests total,average tests/ 7 days,Inf HCW,Inf HCW/d,Vent HCW,Susp per day
01-Jul-20,1239,91172,45285,889,45887,12,1185,2.58%,889,505,20053,24649,11109,676684,10073,6828,63,,1239
02-Jul-20,1249,92421,45658,876,46763,27,1212,2.59%,876,505,20558,24993,13167,689851,9966,6874,46,,1249
03-Jul-20,1288,93709,46032,914,47677,15,1227,2.57%,914,597,21155,25295,11825,701676,9915.7,6937,63,,1288
04-Jul-20,926,94635,46135,823,48500,22,1249,2.58%,823,221,21376,25875,9934,711610,9957,6990,53,,926
05-Jul-20,680,95315,46272,543,49043,13,1262,2.57%,543,327,21703,26078,6696,718306,9963.7,7030,40,,680
06-Jul-20,871,96186,46579,564,49607,21,1283,2.59%,564,490,22193,26131,9343,727649,10303.9,7046,16,,871
07-Jul-20,1170,97356,46942,807,50414,23,1306,2.59%,807,926,23119,25989,13568,741217,10806,7092,46,,1170
Error
Error in process_I(incid) (script.R#4): incid must be a vector or a dataframe with either i) a column called 'I', or ii) 2 columns called 'local' and 'imported'.
For the example data the issue seems to be that it does only cover 7 data points, and the configurator assumes that there it can window over more than 7 days. What worked for me was the following code (working in the sense that it does not throw an error).
config <- make_config(incid = COVID$Daily.confirmed,
method="parametric_si",
list(mean_si=4,std_si=3, t_start = c(2,3),t_end = c(6,7)))
res_parametric_si<-estimate_R(COVID$Daily.confirmed,method="parametric_si",config=config)
plot(res_parametric_si)
I am using the newsanchor package in R to try to extract entire article content via NewsAPI. For now I have done the following :
require(newsanchor)
results <- get_everything(query = "Trump +Trade", language = "en")
test <- results$results_df
This give me a dataframe full of info of (maximum) a 100 articles. These however do not containt the entire actual article text. Rather they containt something like the following:
[1] "Tensions between China and the U.S. ratcheted up several notches over the weekend as Washington sent a warship into the disputed waters of the South China Sea. Meanwhile, Google dealt Huaweis smartphone business a crippling blow and an escalating trade war co… [+5173 chars]"
Is there a way to extract the remaining 5173 chars. I have tried to read the documentation but I am not really sure.
I don't think that is possible at least with free plan. If you go through the documentation at https://newsapi.org/docs/endpoints/everything in the Response object section it says :
content - string
The unformatted content of the article, where available. This is truncated to 260 chars for Developer plan users.
So all the content is restricted to only 260 characters. However, test$url has the link of the source article which you can use to scrape the entire content but since it is being aggregated from various sources I don't think there is one automated way to do this.
I apologize if this question has been asked with terminology I don't recognize but it doesn't appear to be.
I am using the function comm2sci in the library taxize to search for the scientific names for a database of over 120,000 rows of common names. Here is a subset of 10:
commnames <- c("WESTERN CAPERCAILLIE", "AARDVARK", "AARDWOLF", "ABACO ISLAND BOA",
"ABBOTT'S DAY GECKO", "ABDIM'S STORK", "ABRONIA GRAMINEA", "ABYSSINIAN BLUE
WINGED GOOSE",
"ABYSSINIAN CAT", "ABYSSINIAN GROUND HORNBILL")
When searching with the NCBI database in this function, it asks for user input if the common name is generic/general and not species specific, for example the following call will ask for clarification for "AARDVARK" by entering '1', '2' or 'return' for 'NA'.
install.packages("taxize")
library(taxize)
ncbioutput <- comm2sci(commnames, db = "ncbi")###querying ncbi database
Because of this, I cannot rely on this function to find the names of the 120000 species without me sitting and entering 'return' every few minutes. I know this question sounds taxize specific - but I've had this situation in the past with other functions as well. My question is: is there a general way to place the comm2sci call in a conditional statement that will return a specific value when user input is prompted? Or otherwise write a function that will return some input when prompted?
All searches related to this tell me how to ask for user input but not how to override user queries. These are two of the question threads I've found, but I can't seem to apply them to my situation: Make R wait for console input?, Switch R script from non-interactive to interactive
I hope this was clear. Thank you very much for your time!
So the get_* functions, used internally, all by default ask for user input when there is > 1 option. But, all of those functions have a sister function with an underscore, e.g., get_uid_ that do not prompt for input, and return all data. You can use that to get all the data, then process however you like.
Made some changes to comm2sci, so update first: devtools::install_github("ropensci/taxize")
Here's an example.
library(taxize)
commnames <- c("WESTERN CAPERCAILLIE", "AARDVARK", "AARDWOLF", "ABACO ISLAND BOA",
"ABBOTT'S DAY GECKO", "ABDIM'S STORK", "ABRONIA GRAMINEA",
"ABYSSINIAN BLUE WINGED GOOSE",
"ABYSSINIAN CAT", "ABYSSINIAN GROUND HORNBILL")
Then use get_uid_ to get all data
ids <- get_uid_(commnames)
Process the results in ids as you like. Here, for brevity, we'll just grab first row of each
ids <- lapply(ids, function(z) z[1,])
Then grab the uid's out
ids <- as.uid(unname(vapply(ids, "[[", "", "uid")), check = FALSE)
And pass to comm2sci
comm2sci(ids)
$`100830`
[1] "Tetrao urogallus"
$`9818`
[1] "Orycteropus afer"
$`9680`
[1] "Proteles cristatus"
$`51745`
[1] "Chilabothrus exsul"
$`8565`
[1] "Gekko"
$`39789`
[1] "Ciconia abdimii"
$`278977`
[1] "Abronia graminea"
$`8865`
[1] "Cyanochen cyanopterus"
$`9685`
[1] "Felis catus"
$`153643`
[1] "Bucorvus abyssinicus"
Note that NCBI returns common names from get_uid/get_uid_, so you can just go ahead and pluck those out if you want