R: error while reading .h5 files using the rhdf5 package

I am new to HDF5 files and am trying to read some sample files from the URL below.
https://support.hdfgroup.org/ftp/HDF5/examples/files/exbyapi/
While trying to read one of the .h5 files in R:
library(rhdf5)
h5ls("h5ex_d_sofloat.h5")
I get the following error:
Error in H5Fopen(file, "H5F_ACC_RDONLY") :
HDF5. File accessability. Unable to open file.
Any help is appreciated.

The problem was with the download on Windows, not with rhdf5. With default arguments, download.file() transfers the file in text mode, which corrupts binary files such as HDF5. Set mode = "wb" when downloading:
file_url <- "http://support.hdfgroup.org/ftp/HDF5/examples/files/exbyapi/h5ex_d_sofloat.h5"
library(rhdf5)
download.file(url = file_url,destfile = "h5ex_d_sofloat.binary.h5",mode = "wb")
h5ls("h5ex_d_sofloat.binary.h5")
  group name       otype dclass     dim
0     /  DS1 H5I_DATASET  FLOAT 64 x 32
I got this solution from the Bioconductor support site:
https://support.bioconductor.org/p/97311/#97362
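To check whether a download was damaged in this way, one option is to compare the first bytes of the file against the HDF5 format signature. The following is a minimal sketch, not part of rhdf5: `looks_like_hdf5` is a hypothetical helper, and it assumes the file has no user block in front of the signature. The two temporary files simulate an intact header and one where each line feed was expanded to CR LF, as a text-mode transfer on Windows would do.

```r
# The 8-byte signature that begins a standard HDF5 file: \x89 H D F \r \n \x1a \n
hdf5_signature <- as.raw(c(0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a))

# Hypothetical helper: TRUE if the file starts with the HDF5 signature.
# Assumption: the signature sits at byte 0 (no user block).
looks_like_hdf5 <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  header <- readBin(con, what = "raw", n = 8L)
  identical(header, hdf5_signature)
}

# Simulate an intact header and a text-mode-damaged one,
# where every LF (0x0a) has been rewritten as CR LF (0x0d 0x0a).
good <- tempfile(fileext = ".h5")
bad  <- tempfile(fileext = ".h5")
writeBin(hdf5_signature, good)
writeBin(as.raw(c(0x89, 0x48, 0x44, 0x46, 0x0d, 0x0d, 0x0a, 0x1a, 0x0d, 0x0a)), bad)

looks_like_hdf5(good)  # TRUE
looks_like_hdf5(bad)   # FALSE
```

Running `looks_like_hdf5()` on the originally downloaded file and on the one fetched with mode = "wb" should show the difference, if the text-mode transfer was indeed the cause.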

Related

Accessing netcdf files from a URL link with ncdf4 in R

Are there any workarounds to the problem with ncdf4::nc_open not being able to access some .nc files from a URL? I would like to avoid having to download the file first since this is for a shiny app deployed on a server and so I want to avoid users being able to download files to the server.
Some URLs work e.g. this OPeNDAP URL from a THREDDS server:
library(ncdf4)
nc <- nc_open("https://dapds00.nci.org.au/thredds/dodsC/uc0/Test_pixel_count.nc")
But others do not e.g. this NetCDF Subset Service URL from a THREDDS server:
nc <- nc_open("https://dapds00.nci.org.au/thredds/ncss/uc0/Test_pixel_count.nc?var=Band1&north=-22.9556&west=142&east=143&south=-25.0706&disableProjSubset=on&horizStride=1")
# Error in nc_open("https://dapds00.nci.org.au/thredds/ncss/uc0/Test_pixel_count.nc?var=Band1&north=-22.9556&west=142&east=143&south=-25.0706&disableProjSubset=on&horizStride=1") :
# Error in nc_open trying to open file https://dapds00.nci.org.au/thredds/ncss/uc0/Test_pixel_count.nc?var=Band1&north=-22.9556&west=142&east=143&south=-25.0706&disableProjSubset=on&horizStride=1
or this file directly from a website:
nc <- nc_open("https://www.unidata.ucar.edu/software/netcdf/examples/ECMWF_ERA-40_subset.nc")
# syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
# context: <!DOCTYPE^ HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><html><head><title>404 Not Found</title></head><body><h1>Not Found</h1><p>The requested URL was not found on this server.</p></body></html>
# Error in R_nc4_open: NetCDF: file not found
# Error in nc_open("https://www.unidata.ucar.edu/software/netcdf/examples/ECMWF_ERA-40_subset.nc") :
# Error in nc_open trying to open file https://www.unidata.ucar.edu/software/netcdf/examples/ECMWF_ERA-40_subset.nc
What could be the reason why some methods work and some don't, and are there any ways to fix this?
I ran into the same issue and contacted the creator of ncdf4. Here is his reply:
The ncdf4 R package is an interface to the underlying netcdf library on your machine. The R package cannot avoid limitations or problems with the netcdf library itself. You can always see how the underlying netcdf library handles a URL by just trying to use "ncdump" with the URL. For example,
% ncdump "https://dapds00.nci.org.au/thredds/dodsC/uc0/Test_pixel_count.nc"
Gives:
netcdf Test_pixel_count {
dimensions:
lat = 283 ;
lon = 451 ;
variables:
byte crs ;
crs:_Unsigned = "false" ;
crs:grid_mapping_name = "latitude_longitude" ;
crs:long_name = "CRS definition" ;
crs:longitude_of_prime_meridian = 0. ;
crs:semi_major_axis = 6378137. ;
crs:inverse_flattening = 298.257222101 ;
crs:spatial_ref = "GEOGCS[\"GEOCENTRIC DATUM of AUSTRALIA\",DATUM[\"GDA94\",SPHEROID[\"GRS80\",6378137,298.257222101]],PRIMEM[\"Greenwich\",0],UNIT[\"degree\",0.0174532925199433],AXIS[\"Longitude\",EAST],AXIS[\"Latitude\",NORTH]]" ;
crs:GeoTransform = "141.4607205456306 0.0075 0 -22.95189554231917 0 -0.0075 " ;
etc. I.e., the netcdf library itself can read this URL/file.
However when I do this:
% ncdump "https://dapds00.nci.org.au/thredds/ncss/uc0/Test_pixel_count.nc?var=Band1&north=-22.9556&west=142&east=143&south=-25.0706&disableProjSubset=on&horizStride=1"
I get this error:
ncdump "https://dapds00.nci.org.au/thredds/ncss/uc0/Test_pixel_count.nc?var=Band1&north=-22.9556&west=142&east=143&south=-25.0706&disableProjSubset=on&horizStride=1"
ncdump: https://dapds00.nci.org.au/thredds/ncss/uc0/Test_pixel_count.nc?var=Band1&north=-22.9556&west=142&east=143&south=-25.0706&disableProjSubset=on&horizStride=1: https://dapds00.nci.org.au/thredds/ncss/uc0/Test_pixel_count.nc?var=Band1&north=-22.9556&west=142&east=143&south=-25.0706&disableProjSubset=on&horizStride=1:
NetCDF: Malformed or unexpected Constraint
So the netcdf library itself cannot access that URL/file, and therefore ncdf4 cannot do so. There's not anything I would be able to do to the ncdf4 library to make up for this.
You might try contacting the netcdf library folks and ask if this is expected behavior or a bug.
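If the underlying netcdf library cannot read a URL directly, one possible workaround is to fetch the file into the R session's temporary directory, read it, and delete it straight away, so nothing persists on the server. This is a minimal sketch rather than anything from the ncdf4 author: `with_remote_file` and its `fetch` argument are hypothetical names, and it assumes the URL returns a complete NetCDF file, which has not been verified for the NCSS URL above.

```r
# Hypothetical helper: download `url` to a temporary file, pass the path to
# `reader`, and remove the temporary copy when done (even if `reader` fails).
with_remote_file <- function(url, reader, fileext = ".nc",
                             fetch = function(u, dest) {
                               # mode = "wb" keeps binary files intact on Windows
                               utils::download.file(u, dest, mode = "wb", quiet = TRUE)
                             }) {
  tmp <- tempfile(fileext = fileext)
  on.exit(unlink(tmp), add = TRUE)
  fetch(url, tmp)
  reader(tmp)
}

# Offline demonstration with a stand-in fetch function that writes three bytes.
size <- with_remote_file(
  "https://example.org/data.nc",
  reader = function(path) file.size(path),
  fetch  = function(u, dest) writeBin(as.raw(1:3), dest)
)
size

# Intended use with ncdf4 (not run here; needs the package and network access):
# band <- with_remote_file(ncss_url, function(path) {
#   nc <- ncdf4::nc_open(path)
#   on.exit(ncdf4::nc_close(nc))
#   ncdf4::ncvar_get(nc, "Band1")
# })
```

The reader must extract everything it needs before returning, because the temporary file is gone afterwards.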

How to deal with errors for loading a rdata file?

I need the data in an .rdata file for text mining. It is my dataset, but I don't know exactly what's in the file. The problem is that I can't load it.
I tried to open the file on different Windows computers, but got the same errors each time. I am using an up-to-date version of RStudio. I googled the error messages, but nothing worked. Since I can open other .rdata files, it should not be a registry problem. I also opened the file in a basic Windows editor to see what is in it, but there were only characters like: ‹ ìùuPo³6
’#p ÜÝ‚»»»[pw—…»»»»»†àînÁuA°E`!‡ßûîýí}æÌLMÕÌùæŸÝõTñÈ}÷ÝrõÕ½
¢hXˆ
I tried different ways to open the file in RStudio, each with a different error message, as follows:
with load()
require("readr")
setwd("C:/Users/..")
options(stringsAsFactors = F)
load("file")
# Error in load("file") :
# bad restore file magic number (file may be corrupted) -- no data loaded
# In addition: Warning message:
# file ‘.rdata’ has magic number ''
# Use of save versions prior to 2 is deprecated
with source()
require("readr")
setwd("C:/Users/..")
options(stringsAsFactors = F)
source("file")
# Error in source("file") :
# file.rdata:1:1: unexpected input
# 1:
# ^
with readRDS()
setwd("C:/Users/..")
options(stringsAsFactors = F)
readRDS("file")
# Error in readRDS("file") : unknown input format
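One way to narrow this down is to look at the first bytes of the file rather than at the error messages. The sketch below is only a diagnostic under stated assumptions: `sniff_r_file` is a hypothetical helper, and it relies on files written by save() starting with RDX or RDA once decompressed, and files written by saveRDS() starting with X, A or B followed by a newline. It cannot repair a file; it only suggests whether load() or readRDS() is the right reader, or whether the file is something else with an .rdata extension.

```r
# Hypothetical helper: report what the leading bytes say about the file format.
sniff_r_file <- function(path) {
  # Raw view: gzip-compressed files start with the bytes 1f 8b.
  con <- file(path, "rb")
  first <- readBin(con, what = "raw", n = 2L)
  close(con)

  # Decompressed view: gzfile() reads gzip, bzip2 and xz files
  # as well as uncompressed ones.
  gz <- gzfile(path, "rb")
  on.exit(close(gz))
  head5 <- readBin(gz, what = "raw", n = 5L)

  starts_with <- function(prefix) {
    p <- charToRaw(prefix)
    length(head5) >= length(p) && identical(head5[seq_along(p)], p)
  }

  kind <- if (starts_with("RDX") || starts_with("RDA")) {
    "save() workspace: try load()"
  } else if (starts_with("X\n") || starts_with("A\n") || starts_with("B\n")) {
    "saveRDS() object: try readRDS()"
  } else {
    "not a recognised R data file"
  }

  list(gzip = identical(first, as.raw(c(0x1f, 0x8b))), kind = kind)
}

# Demonstration on a file written by save() in this session.
demo_file <- tempfile(fileext = ".rdata")
demo_value <- 1:3
save(demo_value, file = demo_file)
sniff_r_file(demo_file)
```

If the helper reports a gzip file that is not a recognised R data file, the content may be some other gzip-compressed format (for example a compressed text export), in which case reading it through gzfile() with readLines() is worth a try.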

how to import .rec files in R

I have a .rec file that I want to import into R. I have saved the .rec file to my working directory. This is what I have tried.
library(foreign)
library(RODBC)
data.test <- read.epiinfo("data_in.rec")
I get this error:
Error in if (headerlength <= 0L)
stop("file has zero or fewer variables: probably not an EpiInfo file") :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In readLines(file, 1L, ok = TRUE) :
  line 1 appears to contain an embedded nul
2: In strsplit(line, " ") : input string 1 is invalid in this locale
I have looked online and at the read.epiinfo help page in R. The help page says
Some later versions of Epi Info use the Microsoft Access file format
to store data. That may be readable with the RODBC package.
I have two questions.
1. Is the error I am getting because the .rec file I have is from an Epi Info version later than 6?
2. How do I use the RODBC library to open the .rec file?
The .rec (or .REC) file turned out to be a .EDF (European Data Format) file type. It was easily opened in R using the library edfReader. The edfReader library help file is very useful for opening the file and extracting the time series data. See code below for what I used. Code was adapted from the help file.
install.packages("edfReader")  # one-time install
library(edfReader)
?edfReader                     # package help; the code below is adapted from it
# Read the header first, then the signals it describes.
# The .rec file is in the working directory, so no path lookup is needed.
CHdr <- readEdfHeader("data_in.rec")
CSignals <- readEdfSignals(CHdr)
summary(CSignals)

trouble unzipping file under Windows

I have the following code:
download.file(
"http://www.wikipathways.org//wpi/batchDownload.php?species=Homo%20sapiens&fileType=txt",
destfile="human.zip")
files <- unzip( "human.zip", list=T)
It works on Linux, but throws the following error on Windows:
Error in unzip("human.zip", list = T) :
error -103 with zipfile in unzGetCurrentFileInfo
Do you happen to know what's the problem?
In ?download.file, we read that:
If mode is not supplied and url ends in one of .gz, .bz2, .xz, .tgz,
.zip, .rda or .RData a binary transfer is done. Since Windows (unlike
Unix-alikes) does distinguish between text and binary files, care is
needed that other binary file types are transferred with mode = "wb".
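Note that the check is on how the URL ends. Your URL ends in fileType=txt, not .zip, so no binary transfer is done automatically even though the downloaded file is a zip archive. So you need to pass mode="wb".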
I cannot reproduce your example, but passing mode="wb" solved my identical problem. Here is an example:
url <- "https://www.bls.gov/cex/pumd/ce_pumd_interview_diary_dictionary.xlsx"
download.file(url, 'file1.xlsx')
download.file(url, 'file2.xlsx', mode="wb") # Try this instead
library(readxl)
read_xlsx('file1.xlsx', sheet='Variables') # Fails
# Error in sheets_fun(path) :
# Evaluation error: error -103 with zipfile in unzGetCurrentFileInfo
read_xlsx('file2.xlsx', sheet='Variables') # Works
# A tibble: 3,580 x 13
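The rule quoted from ?download.file can be written out to see which URLs get a binary transfer by default. This is an illustrative sketch of the quoted sentence only, not R's internal code: `url_gets_binary_default` is a hypothetical name, and the suffix list is copied from the quote above, so other R versions may recognise more suffixes.

```r
# Suffixes copied from the ?download.file text quoted above (assumption:
# the list matches the R version in use).
binary_suffixes <- c(".gz", ".bz2", ".xz", ".tgz", ".zip", ".rda", ".RData")

# Hypothetical helper: does the URL end in one of the listed suffixes?
url_gets_binary_default <- function(url) {
  any(vapply(binary_suffixes, function(s) endsWith(url, s), logical(1)))
}

# The WikiPathways URL delivers a zip archive but ends in "fileType=txt".
url_gets_binary_default(
  "http://www.wikipathways.org//wpi/batchDownload.php?species=Homo%20sapiens&fileType=txt"
)  # FALSE

# An .xlsx file is a zip container, but the suffix is not in the quoted list.
url_gets_binary_default(
  "https://www.bls.gov/cex/pumd/ce_pumd_interview_diary_dictionary.xlsx"
)  # FALSE
```

Both URLs fall outside the rule, which is consistent with both downloads needing an explicit mode="wb" on Windows.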

Cannot read data from an xlsx file in RStudio

I have installed the required packages - gdata and ggplot2 and I have installed perl.
library(gdata)
library(ggplot2)
# Read the data from the excel spreadsheet
df = data.frame(read.xls ("AssignmentData.xlsx", sheet = "Data", header = TRUE, perl = "C:\\Strawberry\\perl\\bin\\perl.exe"))
However when I run this I get the following error:
Error in xls2sep(xls, sheet, verbose = verbose, ..., method = method, :
Intermediate file 'C:\Users\CLAIRE~1\AppData\Local\Temp\RtmpE3UYWA\file8983d8e1efc.csv' missing!
In addition: Warning message:
running command '"C:\STRAWB~1\perl\bin\perl.exe" "C:/Users/Claire1992/Documents/R/win-library/3.1/gdata/perl/xls2csv.pl" "AssignmentData.xlsx" "C:\Users\CLAIRE~1\AppData\Local\Temp\RtmpE3UYWA\file8983d8e1efc.csv" "Data"' had status 2
Error in file.exists(tfn) : invalid 'file' argument
Thanks to @Stibu I realised I had to set my working directory. The command to run in RStudio is setwd("C/Documents..."), where the path is the folder containing the Excel file.
I had the same issue but solved it differently.
My file had an Excel extension (.xls) but was actually a plain-text file.
After I corrected the file, the R function ran without any further errors.
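Since the working directory was the cause in the first answer, it can help to fail early with a message that names the directory R actually searched. The following is a minimal sketch: `require_file` is a hypothetical helper, not part of gdata, and the read.xls() call in the comment simply reuses the arguments from the question.

```r
# Hypothetical helper: stop with a clear message if the file is not found,
# otherwise return its full path.
require_file <- function(path) {
  if (!file.exists(path)) {
    stop("Cannot find '", path, "' from working directory '", getwd(), "'. ",
         "Use setwd() or pass a full path.")
  }
  normalizePath(path)
}

# Demonstration with a temporary file standing in for the spreadsheet.
demo_xlsx <- tempfile(fileext = ".xlsx")
file.create(demo_xlsx)
require_file(demo_xlsx)

# Intended use with the code from the question (not run here):
# xlsx_path <- require_file("AssignmentData.xlsx")
# df <- read.xls(xlsx_path, sheet = "Data", header = TRUE,
#                perl = "C:\\Strawberry\\perl\\bin\\perl.exe")
```

Passing the full path to read.xls() also means the external Perl script receives an absolute location rather than a bare file name.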
