How do I read a .dxl file in R using read.csv?

I opened the file in Excel and it displays in the proper format. How do I read it in R? I tried the read.csv function, but it reads all the columns together without any separator.

You cannot load it directly into a data frame. First load it as XML, then process it further.
Try the following:
library(XML)
data <- xmlParse('sample.dxl')
xml_data <- xmlToList(data)
Use this list to build your data frame.
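As a rough illustration of that last step, here is a minimal sketch that assumes xmlToList() returned a list of records in which every record has the same simple fields. The field names (name, age) and the hand-built list are made up for the example. A real .dxl file will be nested differently, so inspect it with str(xml_data) first.

```r
# Hypothetical stand-in for the output of xmlToList(); real DXL data will differ.
xml_data <- list(
  record = list(name = "Ann", age = "31"),
  record = list(name = "Bob", age = "45")
)

# Turn each record into a one-row data frame, then stack the rows.
rows <- lapply(xml_data, function(rec) as.data.frame(rec, stringsAsFactors = FALSE))
df <- do.call(rbind, unname(rows))

# XML values arrive as text, so convert numeric columns explicitly.
df$age <- as.integer(df$age)
print(df)
```

If the records do not all share the same fields, rbind will fail and the list needs to be padded or reshaped first.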

Related

Converting RData to CSV file returns incorrect CSV file

I have no expertise in R, and I have to convert RData files to CSV to analyze the data. I followed these links: Converting Rdata files to CSV and "filename.rdata" file Exploring and Converting to CSV. The second option seemed simpler, as I failed to understand the first one. This is what I have tried so far, along with the results:
> ddata <- load("input_data.RData")
> print(ddata)
[1] "input_data"
> print(ddata[[1]])
[1] "input_data"
> write.csv(ddata, "test.csv")
From the first link I learnt that we can inspect the type of the loaded object, and when I ran str(ddata) I found that it is a List of size 1. So I checked whether print(ddata[[1]]) would print anything other than "input_data". write.csv wrote the CSV without any errors, but the file contains just the following 2 lines:
"","x"
"1","input_data"
Can you please help me understand what I am doing wrong and show a way to get all the details into a CSV?
The object ddata contains only the name(s) of the object(s) that load() created, not the data itself. Type ls() to list the objects in your environment; one of them should be input_data. That is the actual object. If it is a data frame (check with str(input_data)), you can create the CSV file with
write.csv(input_data, "test.csv")
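To make the difference between the return value of load() and the loaded object concrete, here is a small self-contained sketch. The data frame contents and the temporary file paths are invented for the example, and loading into a separate environment is just one way to avoid cluttering your workspace.

```r
# Build a made-up data frame and save it, to stand in for the real input_data.RData.
input_data <- data.frame(id = 1:3, score = c(9.5, 7.2, 8.8))
rdata_path <- tempfile(fileext = ".RData")
save(input_data, file = rdata_path)
rm(input_data)

# load() returns the NAMES of the restored objects, not the objects themselves.
env <- new.env()
loaded_names <- load(rdata_path, envir = env)
print(loaded_names)

# Fetch the actual object by name, then write it out.
obj <- get(loaded_names[1], envir = env)
csv_path <- tempfile(fileext = ".csv")
write.csv(obj, csv_path, row.names = FALSE)

# Read the CSV back to confirm that the data, not the name, was written.
round_trip <- read.csv(csv_path)
print(round_trip)
```

If the .RData file holds several objects, loaded_names will have several entries, and each one can be fetched with get() in the same way.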

Error when parsing JSON file into R - how to fix?

Using the package rtweet, I have streamed some tweets and saved them in a JSON file.
When using the following: tweets_df <- parse_stream('file.json'), I get the following error during the process:
Does anyone have any idea how to fix this so that the JSON file can be read into R as a data frame?
Have you tried it this way? I don't use rtweet myself, but I do work with JSON files.
# load the library that reads JSON
library(jsonlite)
json_data <- fromJSON("db.json")
This reads the file as a nested list, which you can then convert to a data frame using
df <- rlist::list.stack(json_data, fill = TRUE)
You might have to adapt it, for example by using a loop if your JSON file contains several users.

read_csv does not separate on commas and does not capture separate rows

I am trying to parse a text log file like the one shown below. I can parse it with the default read.csv:
test <- read.csv("test.txt", header=FALSE)
It splits on every comma. The result is not a clean data frame, but further manipulation can improve it.
However, I cannot seem to do the same with the readr package:
test <- read_csv("test.txt", header=FALSE)
All observations end up in 1 row, with no separation at the commas.
I am still learning this package, so any help would be great.
{"dev_id":"f8:f0:05:xx:db:xx","data":[{"dist":[7270,7269,7269,7275,7270,7271,7265,7270,7274,7267,7271,7271,7266,7263,7268,7271,7266,7265,7270,7268,7264,7270,7261,7260]},{"temp":0},{"hum":0},{"vin":448}],"time":4485318,"transmit_time":4495658,"version":"1.0"}
{"dev_id":"f8:xx:05:xx:d9:xx","data":[{"dist":[6869,6868,6867,6871,6866,6867,6863,6865,6868,6869,6868,6860,6865,6866,6870,6861,6865,6868,6866,6864,6866,6866,6865,6872]},{"temp":0},{"hum":0},{"vin":449}],"time":4405316,"transmit_time":4413715,"version":"1.0"}
{"dev_id":"xx:f0:05:e8:da:xx","data":[{"dist":[5775,5775,5777,5772,5777,5770,5779,5773,5776,5777,5772,5768,5782,5772,5765,5770,5770,5767,5767,5777,5766,5763,5773,5776]},{"temp":0},{"hum":0},{"vin":447}],"time":4461316,"transmit_time":4473307,"version":"1.0"}
{"dev_id":"xx:f0:xx:e8:xx:0a","data":[{"dist":[4358,4361,4355,4358,4359,4359,4361,4358,4359,4360,4360,4361,4361,4359,4359,4356,4357,4361,4359,4360,4358,4358,4362,4359]},{"temp":0},{"hum":0},{"vin":424}],"time":5190320,"transmit_time":5198748,"version":"1.0"}
Thanks to @Dave2e for pointing out that this file is in JSON format (one JSON object per line). I found a way to parse it using ndjson::stream_in.
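For comparison, here is a sketch of the same idea with jsonlite::stream_in, which also reads one JSON object per line. This is not the ndjson call mentioned above. The two records are shortened, made-up versions of the log lines, and the jsonlite package is assumed to be installed.

```r
library(jsonlite)

# Two shortened, made-up log lines, one JSON object per line.
log_lines <- c(
  '{"dev_id":"aa:01","time":4485318,"transmit_time":4495658,"version":"1.0"}',
  '{"dev_id":"bb:02","time":4405316,"transmit_time":4413715,"version":"1.0"}'
)
log_path <- tempfile(fileext = ".txt")
writeLines(log_lines, log_path)

# stream_in() parses each line as a separate JSON record
# and combines the records into a data frame.
log_df <- stream_in(file(log_path), verbose = FALSE)
str(log_df)
```

With the full log lines, the nested data field would most likely come back as a list column that needs further unpacking.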

How to build a data frame in R with the count of complete cases in each of several files

I'm new to R. I have a directory containing Excel files, and I need to build a data frame that summarizes the number of complete cases in each file. How can I do this? I tried the following code, but it doesn't work. I appreciate your support.
Always begin by listing the steps required. You will need to do the following:
Read in your data
Clean up your data
Since you have not shown any code, here is a sketch for a single file.
library(readxl)
df <- read_excel(path)  # path: location of one Excel file (.xls or .xlsx)
df <- df[complete.cases(df), ]  # keep only rows with no missing values
nrow(df)  # number of complete cases in this file
You'll want to do that for each of your files. Once you are more comfortable with R, you can use lapply to loop over the list.files() listing of your Excel files.
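The loop over files could look roughly like the sketch below. So that it runs without any Excel files or extra packages, it writes two small made-up CSV files to a temporary directory and reads them with read.csv. For real Excel files you would swap in readxl::read_excel and a pattern such as "\\.xlsx?$". The file names and contents are assumptions for the example.

```r
# Create a temporary directory with two made-up files (stand-ins for the Excel files).
dir_path <- tempfile("files_")
dir.create(dir_path)
write.csv(data.frame(a = c(1, NA, 3), b = c("x", "y", NA)),
          file.path(dir_path, "one.csv"), row.names = FALSE)
write.csv(data.frame(a = c(1, 2), b = c("x", "y")),
          file.path(dir_path, "two.csv"), row.names = FALSE)

# List the files, then count the complete rows in each one.
files <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)
counts <- vapply(files, function(f) sum(complete.cases(read.csv(f))), integer(1))

# Collect the result in a summary data frame, one row per file.
summary_df <- data.frame(file = basename(files), complete = unname(counts))
print(summary_df)
```

vapply is used instead of lapply here only because it returns a plain integer vector that drops straight into data.frame().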

Reading and Setting Up CSV files on R Programming Language

I would like to clarify my understanding of both converting a file into CSV and reading it. Let's use a dataset that ships with R, for instance the one titled longley.
To set up a data frame, I can just use the write.table command as follows, right?
d1<-longley
write.table(d1, file="", sep="1,16", row.names=TRUE, col.names=TRUE)
Has this already become a data frame or am I missing something here?
Now let's say I want to read this CSV file. Would my code then be something like:
read.table(<dframe1>, header=FALSE, sep="", quote="\"")
It seems like before that I have to use a function called setwd(). I'm not really sure what it does or how it helps. Can someone help me here?
longley and, therefore, d1 are already data frames (type class(d1) in the console). A data frame is a fundamental data structure in R. Writing a data frame to a file saves the data in the data frame. In this case, you're trying to save the data in the data frame in CSV format, which you would do like this:
write.csv(d1, "myFileName.csv")
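write.csv is a wrapper for write.table that takes care of the settings needed for saving in CSV format. You could get much the same result with:
write.table(d1, "myFileName.csv", sep=",", col.names=NA)
sep="," tells R to write the file with values separated by a comma, and col.names=NA adds an empty header cell above the row names so that the header lines up with the data, as write.csv does.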
Then, to read the file into an R session you can do this:
df <- read.csv("myFileName.csv", header=TRUE, stringsAsFactors=FALSE)
This creates a new object called df, which is the data frame created from the data in myFileName.csv. Once again, read.csv is a wrapper for read.table that takes care of the settings for reading a CSV file.
setwd is how you change the working directory--that is, the default directory where R writes to and reads from. But you can also keep the current working directory unchanged and just give write.csv or read.csv (or any other function that writes or reads R objects) the full path to wherever you want to read from or write to. For example:
write.csv(d1, "/path/for/saving/file/myFileName.csv")
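One detail worth checking when you read the file back: write.csv also writes the row names, so read.csv returns them as an extra first column (named X) unless you ask for them to be used as row names again. A small sketch of the round trip, using a temporary directory as a stand-in for your own path:

```r
# Write the built-in longley data frame to a full path (a temp directory here).
csv_path <- file.path(tempdir(), "longley.csv")
write.csv(longley, csv_path)

# Default read: the saved row names come back as an extra first column called X.
with_x <- read.csv(csv_path)

# Treat the first column as row names to recover the original layout.
restored <- read.csv(csv_path, row.names = 1)

print(dim(with_x))
print(dim(restored))
print(head(rownames(restored)))
```

Alternatively, passing row.names = FALSE to write.csv leaves the row names out of the file altogether.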
