Unable to read Landsat 5 metadata using readMeta() in R

I am following a tutorial on remote sensing using R. The tutorial is available from here (p. 44).
I would like to read the metadata for a Landsat 5 image, specifically for 1984-06-22 path/row 170/072. The Landsat Product ID is LT05_L2SP_170072_19840622_20200918_02_T1. Here is the L5 metadata.
I am using the readMeta function from the RStoolbox package. This should be straightforward: I pass the path to my metadata file and specify raw = F so that the metadata is returned in a format suitable for further analyses.
mtl <- readMeta(file = ".", raw = F)
After doing this (reading the L5 metadata using readMeta) I get this error.
Error in `.rowNamesDF<-`(x, value = value) : invalid 'row.names' length
Now of course there are many ways of killing a rat, so I tried this method here, whereby the read.delim function is used to read the metadata file. This does bring in a dataframe with all the metadata values. However, when this dataframe is passed to the radCor function to convert the L5 DNs to top-of-atmosphere radiance, the following error appears:
Error in radCor(june_landsat2, metaData = metadata_june, method = "rad") :
metaData must be a path to the MTL file or an ImageMetaData object (see readMeta)
It seems radCor accepts nothing apart from the output of readMeta or the path to the MTL file itself; not even the result from read.delim will do. Because the first readMeta error mentioned a row.names length issue, I thought deleting the last, valueless row in the metadata file would solve it, but that only produced more complicated errors.
In short, I would like to find a way to make readMeta read my L5 metadata file, since its output is used in other parts of the tutorial. Thanks.
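For reference, here is the workflow the tutorial appears to intend, with readMeta() pointed at the MTL file itself rather than a directory (a minimal sketch; the MTL file name below is assumed from the product ID and may differ on disk):
library(RStoolbox)
library(raster)
# assumed file name, based on the product ID above; adjust to the actual file on disk
mtl_file <- "LT05_L2SP_170072_19840622_20200918_02_T1_MTL.txt"
# read the metadata into an ImageMetaData object
mtl <- readMeta(mtl_file, raw = FALSE)
# stack the bands listed in the metadata, then convert DNs to radiance
june_landsat <- stackMeta(mtl)
june_rad <- radCor(june_landsat, metaData = mtl, method = "rad")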

Related

Using readtext to extract text from XML

I am not used to working with XML files but need to extract text from various fields in XML files. Specifically, I've downloaded and saved XML files like the following: https://www.federalregister.gov/documents/full_text/xml/2007/09/18/07-4595.xml. I'm interested in the text within the tag "regtext" in this and other similar XML files.
I've downloaded the XML files and stored them on my computer, but when I set the directory and attempt to use the readtext package to read from the XML files, I get the following error:
regtext <- readtext("/regdata/RegDataValidation", text_field = "regtext")
Error in doc_parse_file(con, encoding = encoding, as_html = as_html, options = options) :
Start tag expected, '<' not found [4]
I've tried to search the error, but nothing I've come across has helped me figure out what might be going on. This basic command works like a charm on any number of other document types, including .csv or .docx, but for some reason it just doesn't seem to recognize the files I'm trying to work with here. Any pointers would be much appreciated; I'm too much of a novice, and the readtext documentation does not give examples of how to work with XML.
Pursuant to comments below, I've also tried to specify a single saved XML file, as follows:
> regtext <- readtext("/regdata/RegDataValidation/0579- AC01.xml", text_field = "regtext")
Error in xml2_to_dataframe(xml) :
The xml format does not fit for the extraction without xPath
Use xPath method instead
In addition: There were 50 or more warnings (use warnings() to see the first 50)
I tried to specify an xPath expression on a single file, and this did not return any errors, but it didn't actually extract any text (even though there should be plenty of text within the "regtext" node):
> regtext <- readtext("/regdata/RegDataValidation/0579- AC01.xml", text_field = "/regtext/*")
I end up with a dataframe with the correct doc_id, but no text.
From the error messages, the readtext function appears to be converting the xml file into a plain text document and the XML package is not accepting it as a valid document.
It is also likely that the XML parser is differentiating between "regtext" and "REGTEXT".
Here is a solution using the xml2 package (I find this package provides a simpler interface and is easier to use).
library(xml2)
url <- "https://www.federalregister.gov/documents/full_text/xml/2007/09/18/07-4595.xml"
page <- read_xml(url)
# parse out the nodes within the "REGTEXT" sections
regtext <- xml_find_all(page, ".//REGTEXT")
# convert the REGTEXT nodes into a character vector
xml_text(regtext)

XLConnect: Error: IllegalArgumentException (Java): Sheet index (-1) is out of range (no sheets)

I am trying to use XLConnect to load in a series of excel workbooks that I have. Using the code:
BASZ <- loadWorkbook("BASZ.xlsx", create = TRUE)
works every time and gives me a formal class workbook object. However, when I go to read in the worksheet I wish to use:
data <- readWorksheet("BASZ", sheet = "Sheet1")
I always get the same error:
"Error: IllegalArgumentException (Java): Sheet index (-1) is out of range (no sheets")
Just yesterday this code worked; I'm new to this and wondering why this keeps happening. Furthermore, it doesn't matter which Excel workbook I try to load, the same error occurs when trying to read in the specific sheet I want to work with. It must be a syntax issue or something I'm doing wrong, right? I fail to understand why it would work, then I close out RStudio, and the next day it won't.
If you have already loaded the Excel file using loadWorkbook(), you can use the function readWorksheet() to read individual sheets. You would only use readWorksheetFromFile() if you had not previously loaded the file. So your code should read:
BASZ <- loadWorkbook("BASZ.xlsx", create = TRUE)
data <- readWorksheet(BASZ, sheet = "Sheet1")
Note that in the second line, the first argument is the variable BASZ, not a quoted string.
Okay, just in case someone else makes the same mistake as me: you have to be working within the directory your .xlsx file is in.
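Either set the working directory before loading, or hand loadWorkbook() the full path (a minimal sketch; the path is only an example):
# option 1: point R at the folder containing the workbook
setwd("C:/path/to/workbooks")
BASZ <- loadWorkbook("BASZ.xlsx", create = TRUE)
# option 2: keep the current working directory and pass the full path instead
BASZ <- loadWorkbook("C:/path/to/workbooks/BASZ.xlsx", create = TRUE)
data <- readWorksheet(BASZ, sheet = "Sheet1")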

R save() not producing any output but no error

I am brand new to R and I am trying to run some existing code that should clean up an input .csv then save the cleaned data to a different location as a .RData file. This code has run fine for the previous owner.
The code seems to be pulling the .csv and cleaning it just fine. It also looks like the save is running (there are no errors) but there is no output in the specified location. I thought maybe R was having a difficult time finding the location, but it's pulling the input data okay and the destination is just a sub folder.
After a full day of extensive Googling, I can't find anything related to a save just not working.
Example code below:
save(data, file = "C:\\Users\\my_name\\Documents\\Project\\Data.RData", sep="")
Hard to believe you don't see any errors, unless something has switched errors off:
> data = 1:10
> save(data, file="output.RData", sep="")
Error in FUN(X[[i]], ...) : invalid first argument
It's a misleading error; the problem is the third argument, which doesn't do anything here. Remove it and it works:
> save(data, file="output.RData")
>
sep is an argument used when writing CSV files, to separate columns. save writes binary data, which doesn't have rows and columns.
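To illustrate where each function fits (a minimal sketch using base R only): save()/load() deal with binary .RData files, while a column separator only makes sense for the text writers such as write.table().
data <- 1:10
# binary .RData: no sep argument
save(data, file = "output.RData")
load("output.RData")   # restores the object `data`
# text output: this is where a separator applies
write.table(data.frame(x = data), file = "output.csv", sep = ",", row.names = FALSE)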

Why does creating a CSV file in sparklyr (R) show an error?

Introduction
I have written the following R code by referring to Link-1. Here, the sparklyr package is used in R to read a large JSON file. However, while creating the CSV file, it shows the error below.
R code
sc <- spark_connect(master = "local", config = conf, version = '2.2.0')
sample_tbl <- spark_read_json(sc, name = "example", path = "example.json",
                              header = TRUE, memory = FALSE, overwrite = TRUE)
sdf_schema_viewer(sample_tbl) # to create db schema
sample_tbl %>% spark_write_csv(path = "data.csv") # To write CSV file
The last line produces the following error. The dataset contains different data types; if required, I can show the database schema. It contains nested data columns.
Error
Error: java.lang.UnsupportedOperationException: CSV data source does not support struct,media:array,display_url:string,expanded_url:string,id:bigint,id_str:string,indices:array,media......
Question
How do I resolve this error? Is it due to the different data types, or to the columns nested two to three levels deep? Any help would be appreciated.
It seems that your dataframe has columns of array type, which are not supported by CSV; a CSV file cannot represent arrays or other nested structures in this scenario.
Therefore, if you want your data as human-readable text, write it out as an Excel file instead.
Note that Excel's CSV handling (a very special case) does support multi-line values by using "\n" inside quoted fields, but the row terminator then has to be "\r\n" (the Windows EOL).
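If the nested columns are not needed in the CSV, one workaround is to drop or stringify them before writing (a minimal sketch; the column names id, text and media are assumptions, so substitute the names shown by sdf_schema_viewer(), and whether to_json() accepts a given nested type depends on the Spark version):
library(dplyr)
# serialise a nested column to a JSON string and keep only flat columns
flat_tbl <- sample_tbl %>%
  mutate(media_json = to_json(media)) %>%   # to_json() is passed through to Spark SQL
  select(id, text, media_json)
spark_write_csv(flat_tbl, path = "data_flat.csv")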

Importing manual gating from FlowJo to R (flow cytometry analysis)

I am experiencing an issue with reading .xls or .wspt files into R. These contain a table of a flow cytometry manual gating schema. My code is as follows:
flowData = system.file("extdata",package="flowWorkspace")
file = list.files(flowData, pattern = "manual.xls", full = TRUE)  # or pattern = "manual.wspt"
ws = openWorkspace(file)
When I try to read with openWorkspace, the .xls file gives an error:
Start tag expected, '<' not found
I have seen this error in another post, but it doesn't seem to explain my case.
While opening the .wspt file, I receive this error:
Error in data.frame(...) : arguments imply differing number of rows: 1, 0
Both of these files (.xls and .wspt) contain the same information. I just wanted to try to read both of them.
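Since the .xls error reads like an XML parser complaint, openWorkspace() presumably expects an XML FlowJo workspace rather than a spreadsheet. A minimal sketch of that call (the "\\.xml$" pattern is an assumption about what ships in the package's extdata folder):
library(flowWorkspace)
flowData <- system.file("extdata", package = "flowWorkspace")
list.files(flowData)   # inspect which example files actually ship with the package
# openWorkspace() parses a FlowJo workspace file, so try an XML workspace
ws_file <- list.files(flowData, pattern = "\\.xml$", full.names = TRUE)[1]
ws <- openWorkspace(ws_file)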
