Does anyone know how to download TRMM 3B42 time series data?

I'm trying to download TRMM 3B42 3-hourly binary data for a given time span from this NASA FTP server.
There is an excellent function written by Florian Detsch to download the daily product (here is the link: https://github.com/environmentalinformatics-marburg/Rsenal/blob/master/R/downloadTRMM.R), included in the GitHub-only Rsenal package. Unfortunately, it does not work for the 3-hourly data.
I changed the code:
downloadTRMM <- function(begin, end, dsn = ".", format = "%Y-%m-%d.%H") {

  ## transform 'begin' and 'end' to 'Date' objects if necessary
  if (!class(begin) == "Date")
    begin <- as.Date(begin, format = format)
  if (!class(end) == "Date")
    end <- as.Date(end, format = format)

  ## trmm ftp server
  ch_url <- "ftp://disc2.nascom.nasa.gov/data/TRMM/Gridded/3B42_V7/"

  ## loop over daily sequence
  ls_fls_out <- lapply(seq(begin, end, 1), function(i) {

    # year and month (name of the corresponding folder)
    tmp_ch_yr <- strftime(i, format = "%Y%m")
    #tmp_ch_dy <- strftime(i, format = "%j")

    # trmm date format
    tmp_dt <- strftime(i + 1, format = "%Y%m%d.%H")

    # folder url on the server
    tmp_ch_url <- paste(ch_url, tmp_ch_yr, "", sep = "/")

    # download the *.bin data file and the *.bin.xml metadata file
    tmp_ch_fls <- tmp_ch_fls_out <- character(2L)
    for (j in 1:2) {
      tmp_ch_fls[j] <- paste0("3B42.", tmp_dt, "z.7.precipitation",
                              ifelse(j == 1, ".bin", ".bin.xml"))
      tmp_ch_fls[j] <- paste(tmp_ch_url, tmp_ch_fls[j], sep = "/")
      tmp_ch_fls_out[j] <- paste(dsn, basename(tmp_ch_fls[j]), sep = "/")
      download.file(tmp_ch_fls[j], tmp_ch_fls_out[j], mode = "wb")
    }

    # return data frame with *.bin and *.xml filenames
    tmp_id_xml <- grep("xml", tmp_ch_fls_out)
    data.frame(bin = tmp_ch_fls_out[-tmp_id_xml],
               xml = tmp_ch_fls_out[tmp_id_xml],
               stringsAsFactors = FALSE)
  })

  ## join and return names of processed files
  ch_fls_out <- do.call("rbind", ls_fls_out)
  return(ch_fls_out)
}

getwd()
setwd("C:/Users/joaoreis/Documents/Bases_Geograficas/trmm_3h/")

fls_trmm <- downloadTRMM(begin = "2008-01-01.00", end = "2008-01-05.00")
fls_trmm
But I get the following error:
trying URL 'ftp://disc2.nascom.nasa.gov/data/TRMM/Gridded/3B42_V7//200801//3B42.20080102.00z.7.precipitation.bin'
Error in download.file(tmp_ch_fls[j], tmp_ch_fls_out[j], mode = "wb") :
  cannot open URL 'ftp://disc2.nascom.nasa.gov/data/TRMM/Gridded/3B42_V7//200801//3B42.20080102.00z.7.precipitation.bin'
In addition: Warning message:
In download.file(tmp_ch_fls[j], tmp_ch_fls_out[j], mode = "wb") :
  InternetOpenUrl failed: ''
Called from: download.file(tmp_ch_fls[j], tmp_ch_fls_out[j], mode = "wb")
Does anyone know how to fix it using R?
Thanks!

As of commit 909f98a, I have enabled the automated retrieval of 3-hourly data from ftp://disc3.nascom.nasa.gov/data/s4pa/TRMM_L3. Make sure you have the latest version of Rsenal installed using
devtools::install_github("environmentalinformatics-marburg/Rsenal")
and then have a look at the examples in ?downloadTRMM. For now, the function supports both character input (which requires the 'format' argument passed on to strptime) and POSIXlt input. For example, something like
downloadTRMM(begin = "2015-01-01 12:00", end = "2015-01-03 12:00",
             type = "3-hourly", format = "%Y-%m-%d %H:%M")
to download 3-hourly data from 1-3 January 2015 (noon to noon) should now work just fine.
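Equivalently, the same range can be passed as POSIXlt objects (a sketch; strptime returns POSIXlt, and the 'format' argument to downloadTRMM should not be needed in that case):
begin <- strptime("2015-01-01 12:00", format = "%Y-%m-%d %H:%M")
end   <- strptime("2015-01-03 12:00", format = "%Y-%m-%d %H:%M")

downloadTRMM(begin = begin, end = end, type = "3-hourly")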
Note that in contrast to the FTP server you mentioned, the data comes in .HDF format and a rasterize method has not been implemented so far, meaning that you have to deal with the container files yourself. I'll try to figure out something more convenient soon regarding the automated rasterization of the data.
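Until then, here is a minimal sketch of how the downloaded .HDF containers might be inspected and read into raster objects. It assumes your GDAL/rgdal installation was built with HDF4 support, and the subdataset index holding the precipitation grid is an assumption that may differ for your files:
library(gdalUtils)   # for get_subdatasets()
library(raster)

## list the HDF containers returned by downloadTRMM()
fls <- list.files(pattern = "\\.HDF$", full.names = TRUE)

## inspect the subdatasets inside the first container
sds <- get_subdatasets(fls[1])
print(sds)

## read the (assumed) precipitation subdataset as a raster layer
prcp <- raster(sds[1])
plot(prcp)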

Related

R seems not able to read EXIF data of large MP4 video files using exifr package

6 months ago I used the R exifr package to extract EXIF information from large MP4 video files and export it to CSV. Now I get NAs for some files. I have re-run tests on old file sets that previously worked fine, and what worked in the past now doesn't. The initial dat table viewed in RStudio shows some NAs. Looking at the video files, it seems that small files of short duration are OK, but larger video files return NA. Is this a memory issue? I have updated to R v4.2.0.
library(exifr)
library(dplyr)
library(tidyverse)
library(hms)
library(lubridate)
library(tidyr)

setwd("D:/CAFNEC_GBRF/6_Hinchinbrook_Herbert/Victoria Ck/2021") # insert base folder location here

# Set file locations
survey.videos <- "Video_for_Analysis1/" # folder with videos

# Get EXIF information from video files
files2 <- list.files(survey.videos, pattern = NULL, recursive = TRUE, full.names = TRUE)
dat <- read_exif(files2, tags = c("FilePath", "FileName", "CreateDate", "Duration"))
dat <- mutate(dat, DateTimeOriginal = CreateDate)

# Separate DateTimeOriginal column into Date & Time
dat2 <- dat %>%
  separate(DateTimeOriginal, c("Date", "Time"), sep = "([\\ ])") %>%
  separate(Date, c("Year", "Month", "Day"), sep = "([\\:])")

dat2$Time <- strptime(dat2$Time, format = "%H:%M:%S")
dat2$Time <- dat2$Time + lubridate::hours(10)
dat2$Time <- substr(dat2$Time, 12, 19)

# Convert video start time to hh:mm:ss
dat2$Video_Start <- as_hms(dat2$Time)

# Convert video duration to hh:mm:ss
dat2$Vid_duration <- as_hms(dat2$Duration)

# Calculate video end time
dat3 <- mutate(dat2, Vid_End = Video_Start + Vid_duration)

# Convert end time to hh:mm:ss
dat4 <- as_hms(dat3$Vid_End)

# Add video end time as a column
dat5 <- mutate(dat2, Vid_End = dat4)

# Round video end time to nearest second
dat5 <- mutate(dat5, Vid_Stop = round_hms(dat5$Vid_End, secs = 1))

# Export to CSV
write.csv(dat5, "Output1.csv", row.names = FALSE)
Solved! To read EXIF info of large video files using exifr, you need to add a .ExifTool_config file to the ExifTool folder within exifr (@StarGeek FYI, exifr calls ExifTool to read the EXIF info). Here are the steps I followed, in case they are of use to anyone else in the future:
1. Copy the text located at https://exiftool.org/config.html
2. Save it as .ExifTool_config within the exifr exiftool folder of your R library (in my case C:\Users\YOURNAMEHERE\AppData\Local\R\win-library\4.2\exifr\exiftool)
3. At the bottom of the text, where you see

   %Image::ExifTool::UserDefined::Options = (
       CoordFormat => '%.6f',  # change default GPS coordinate format
       Duplicates => 1,        # make -a default for the exiftool app
       GeoMaxHDOP => 4,        # ignore GPS fixes with HDOP > 4
       RequestAll => 3,        # request additional tags not normally generated
   );

   insert

       LargeFileSupport => 1,

4. Save the file.
Refer to https://exiftool.org/forum/index.php?topic=3916.15 for more info.
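If you are unsure where that folder lives on your machine, the path can be looked up from R. This assumes the bundled ExifTool sits in the package's exiftool directory, as in the path above:
# locate the exiftool folder shipped with the installed exifr package
system.file("exiftool", package = "exifr")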

For Loop in R import multiple CSV from URL

I am importing 500 CSVs that share the following format:
"https://www.quandl.com/api/v3/datasets/WIKI/stockname/data.csv?column_index=11&transform=rdiff&api_key=keyname"
where stockname is the ticker symbol of a single stock. I have the list of stock tickers saved in a data frame called stocklist.
I'd like to use lapply to iterate through my list of stocks. Here's what I have so far:
lst <- lapply(stocklist, function(i) {
  url <- paste0("https://www.quandl.com/api/v3/datasets/WIKI/", i,
                "/data.csv?column_index=11&transform=rdiff&api_key=XXXXXXXXXXXXXXX")
  spdata <- read.csv(url, stringsAsFactors = FALSE)
})
I get the following error:
Error in file(file, "rt") : invalid 'description' argument
What could be causing this error? I tried using a for loop as well, but was unsuccessful and I have been told lapply is a better method in R for this type of task.
Edit:
Structure of stocklist:
> dput(droplevels(head(stocklist)))
structure(list(`Ticker symbol` = c("MMM", "ABT", "ABBV", "ABMD",
"ACN", "ATVI")), .Names = "Ticker symbol", row.names = c(NA,
6L), class = "data.frame")
Second edit (solution):
stockdata <- lapply(
  paste0("https://www.quandl.com/api/v3/datasets/WIKI/", stocklist[1][[1]],
         "/data.csv?column_index=11&transform=rdiff&api_key=XXXXXXX"),
  read.csv, stringsAsFactors = FALSE)
Add names to stockdata:
names(stockdata) <- stocklist[1][[1]]
I believe your 'i' variable is a vector.
Make sure you are subscripting it properly and only passing one stock at a time: inside the function, i should be a single ticker, i.e. something like i[x].
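For example, iterating over the ticker column itself rather than over the data frame passes one symbol at a time (the column name is taken from the dput output above; the API key is a placeholder):
# iterate over the ticker column, not the data frame, so each call gets one symbol
lst <- lapply(stocklist[["Ticker symbol"]], function(i) {
  url <- paste0("https://www.quandl.com/api/v3/datasets/WIKI/", i,
                "/data.csv?column_index=11&transform=rdiff&api_key=XXXXXXXXXXXXXXX")
  read.csv(url, stringsAsFactors = FALSE)
})
names(lst) <- stocklist[["Ticker symbol"]]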
I can't tell what i is, but I'm guessing it's a number. Can you try this and see if it works for you? I think it's pretty close; just make sure the URL matches the pattern of the actual URL you are fetching data from (I tried to find it but couldn't).
mydownload <- function(start_date, end_date) {
  start_date <- as.Date(start_date)  ## convert to Date object
  end_date <- as.Date(end_date)      ## convert to Date object
  dates <- as.Date("1970/01/01") + (start_date:end_date)  ## daily date sequence
  ## loop over the dates and download one csv per day
  for (i in 1:length(dates)) {
    string_date <- as.character(dates[i])
    myfile <- paste0("C:/Users/Excel/Desktop/", string_date, ".csv")
    myurl <- paste("https://www.quandl.com/api/v3/datasets/WIKI/", i,
                   "/data.csv?column_index=11&transform=rdiff&api_key=xxxxxxxxxx",
                   sep = "")
    download.file(url = myurl, destfile = myfile, quiet = TRUE)
  }
}

Error when converting csv stock data to xts

Below is the code I am trying to use to convert the csv file to xts so that I can perform analysis on it, but nothing seems to work. I have even tried answers posted for similar issues on this platform, but nothing seems to be working.
toDate <- function(x) as.Date(x, origin = "2015-02-15")
z <- read.zoo("Nasdaq.csv", header = TRUE, sep = ",", FUN = toDate)
x <- as.xts(z)
I get the error below:
7. stop("character string is not in a standard unambiguous format")
6. charToDate(x)
5. as.Date.character(x, origin = "2015-02-15")
4. as.Date(x, origin = "2015-02-15")
3. FUN(...)
2. processFUN(ix)
1. read.zoo("Nasdaq.csv", header = TRUE, sep = ",", FUN = toDate)
The problem is in how you are reading the file as zoo.
If it is a time series, try loading the file first, then turning it into a ts:
data <- read.csv("anyfile.csv")
Y <- ts(data$Y, start = c(2015, 2), end = c(2017, 1), frequency = 12)
my_xts <- as.xts(Y)
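If you would rather keep the read.zoo/xts route, the traceback suggests the date column is not in a standard unambiguous format, so telling as.Date exactly how the dates are written may be enough. This is a sketch that assumes the first column of Nasdaq.csv holds dates like 02/15/2015; adjust the format string to match your file:
library(zoo)
library(xts)

# tell as.Date how the dates in the csv are written (the format string is an assumption)
toDate <- function(x) as.Date(x, format = "%m/%d/%Y")

z <- read.zoo("Nasdaq.csv", header = TRUE, sep = ",", FUN = toDate)
x <- as.xts(z)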

R/quantmod using getSymbols to get closed prices from csv file with specified date

Trying to fetch the closing prices for a list of tickers listed in a csv file using the following code:
date <- "2017-03-03"
tickers <- read.csv("us_tickerfeed.csv", header = TRUE)

for (i in 1:nrow(tickers)) {
  data <- getSymbols(tickers$ticker_th[i], from = date, to = date, src = "yahoo")
  tickers$close_price[i] <- Cl(get(data))[[1]]
}
This code worked before, but now I'm getting the following error message:
Error in do.call(paste("getSymbols.", symbol.source, sep = ""), list(Symbols = current.symbols, :
could not find function "getSymbols.6"
Thanks!
I had this problem previously; could you try as.character(tickers$ticker_th)?
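For reference, here is the loop from the question with that conversion applied; reading the CSV with stringsAsFactors = FALSE also keeps the ticker column as plain character (file and column names are taken from the question):
library(quantmod)

date <- "2017-03-03"
tickers <- read.csv("us_tickerfeed.csv", header = TRUE, stringsAsFactors = FALSE)

for (i in 1:nrow(tickers)) {
  sym <- as.character(tickers$ticker_th[i])  # make sure a plain string is passed to getSymbols
  data <- getSymbols(sym, from = date, to = date, src = "yahoo")
  tickers$close_price[i] <- Cl(get(data))[[1]]
}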

Reading data from text file and combining it with date in r

I downloaded data from the internet and want to extract it and create a data frame. You can find the data via the following filtered data set link: http://www.esrl.noaa.gov/gmd/dv/data/index.php?category=Ozone&type=Balloon . At the bottom of the page you can choose any of the 9 filtered data sets, say Suva, Fiji (SUV).
I have written the following code to create a data frame that includes the launch date for each file.
setwd("C:/Users/")
path = "~C:/Users/"
files <- lapply(list.files(pattern = '\\.l100'), readLines)
test.sample<-do.call(rbind, lapply(files, function(lines){
data.frame(datetime = as.POSIXct(sub('^.*Launch Date : ', '', lines[grep('Launch Date :', lines)])),
# and the data, read in as text
read.table(text = lines[(grep('Sonde Total', lines) + 1):length(lines)]))
}))
The files come from an FTP server, and the .l100 file extension doesn't look familiar to me; I also tried reading them as .txt, but that didn't work. Can you please tweak the above code, or suggest other code, to get a data frame?
Thank you in advance.
I think the problem is that the search string "Launch Date :" does not match what is in the files (at least the one I checked).
This should work:
lines <- "Launch Date : 11 June 1991"
lubridate::dmy(sub('^.*Launch Date.*: ', '', lines[grep('Launch Date', lines)]))
The code would probably also be easier to debug if you broke the problem down into steps rather than writing it as one expression.
I took the following approach:
td <- tempdir()
setwd(td)

ftp <- 'ftp://ftp.cmdl.noaa.gov/ozwv/Ozonesonde/Suva,%20Fiji/100%20Meter%20Average%20Files/'

files <- RCurl::getURL(ftp, dirlistonly = TRUE)
files <- strsplit(files, "\n")
files <- unlist(files)

dat <- list()
for (i in 1:length(files)) {
  download.file(paste0(ftp, files[i]), 'data.txt')
  df <- read.delim('data.txt', sep = "", skip = 17)
  ld <- as.character(read.delim('data.txt')[9, ])
  ld <- strsplit(ld, ":")[[1]][2]
  df$launch.date <- stringr::str_trim(ld)
  dat[[i]] <- df
  rm(df)
}
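To get the single data frame the question asks for, the list elements can then be stacked, assuming all files share the same column layout:
## stack the per-file data frames into one, keeping the launch.date column
test.sample <- do.call(rbind, dat)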
