"no valid lines found for this format" error when using readcol to read a FITS file - idl

I'm trying to write a script that not only reads in FITS files, but will then print and save the data to a table. So far my code does not seem to have a problem reading the files; printing them seems to be the issue. This is my code so far (when I run this I get the error message "no valid lines found for this format"):
;Planck File read (used to read in and print individual fits files)
pro planck_file_read
readcol,'COM_PCCS_857_R1.20.fits',name,glon,glat,ra,dec,detflux,detflux_err,aperflux,aperflux_err,psfflux,psfflux_err,gauflux,gauflux_err,gau_semi1,gau_semi1_err,gau_semi2,gau_semi2_err,gau_theta,gau_theta_err,gau_fwhm_eff,extended,cirrus_n,ext_val,ercsc
openw,lun,'fits_857.tbl',/get_lun,width=400
printf,lun,'; ; name GLON GLAT RA DEC DETFLUX DETFLUXERR APERFLUX APERFLUXERR PSFFLUX PSFFLUXERR GAUFLUX GAUFLUXERR GAUSEMI1 GAUSEMI1ERR GAUSEMI2 GAUSEMI2ERR GAUTHETA GAUTHETAERR GAUFWHMEFF EXTENDED CIRRUSN EXTVAL ERCSC
printf,lun,'; ; DEG DEG DEG DEG MJY MJY MJY MJY MJY MJY MJY MJY ARCMIN ARCMIN ARCMIN ARCMIN DEG DEG ARCMIN NONE NONE NONE NONE
for i=0,n_elements(name)-1 do printf,lun,name[i],glon[i],glat[i],ra[i],dec[i],detflux[i],detflux_err[i],aperflux[i],aperflux_err[i],psfflux[i],psfflux_err[i],gauflux[i],gauflux_err[i],gau_semi1[i],gau_semi1_err[i],gau_semi2[i],gau_semi2_err[i],gau_theta[i],gau_theta_err[i],gau_fwhm_eff[i],extended[i],cirrus_n[i],ext_val[i],ercsc[i]
free_lun,lun
end

That error message is coming from READCOL. READCOL is designed to read ASCII files, not FITS files. Use FITS routines like FITS_OPEN, FITS_READ, and FITS_CLOSE to read the data.

READCOL is designed to read free-format ASCII files with columns of data into vectors. You have to provide it with the exact number of columns you have in your data file in order to read the file in correctly. For instance, if I write
READCOL, 'file.txt', name, date, ID, num_cookies
and the file actually has another column of number of cakes, it won't read anything in because it will look for lines where there are only 4 variables. You can skip variables you don't want if you include a FORMAT string in your call to READCOL, like
READCOL, 'file.txt', name, date, ID, num_cookies, FORMAT = '(A,F,F,I,X)'
where 'X' indicates that there is a variable there that you are skipping.
But if your file is a FITS file, it is formatted differently (FITS is not a free-format ASCII table), and you should look into the FITS routines as @mgalloy suggested above.
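To illustrate why a column-oriented ASCII reader finds "no valid lines" in a FITS file, here is a minimal sketch in Python (not IDL, and not a replacement for the FITS routines mentioned above). It relies only on the general FITS layout: the header is a run of fixed 80-character cards with no newlines, terminated by an END card, and the data that follows is binary. The function names are hypothetical and chosen for illustration.

```python
def looks_like_fits(data):
    """Return True if the bytes start with the mandatory first FITS keyword."""
    return data[:9] == b"SIMPLE  ="


def read_fits_header_cards(data):
    """Split the primary header into 80-character cards, stopping at END.

    There are no line breaks between cards, which is why a reader that
    expects one table row per text line cannot make sense of the file.
    """
    cards = []
    for start in range(0, len(data), 80):
        card = data[start:start + 80].decode("ascii", errors="replace")
        if card[:8].rstrip() == "END":
            return cards
        cards.append(card.rstrip())
    raise ValueError("no END card found; probably not a FITS header")
```

Calling `looks_like_fits(open(path, "rb").read(9))` on the file before choosing a reader is one way to catch this kind of mismatch early.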

Related

Failure of unz() to unzip from a zip file offset of more than 2^31 bytes

I have been obtaining .zip archives of genome annotation from NCBI (mainly gff files). In order to save disk space I prefer not to unzip the archive, but to read these files directly into R using unz(). However, it seems that unz() is unable to extract files from the end of 'large' zip files:
ncbi.zip <- "file_location/name.zip"
files <- unzip(ncbi.zip, list=TRUE)
gff.files <- files$Name[ grep("gff$", files$Name) ]
## this works
gff.128 <- readLines( unz(ncbi.zip, gff.files[128]) )
## this gives an empty data structure (read.table() stops
## with an error saying no lines or similar)
gff.129 <- readLines( unz(ncbi.zip, gff.files[129]) )
## there are 31 more gff files after the 129th one.
## no lines are read from any of these.
The zip file itself seems to be fine; I can unzip the specific files using unzip on the command line and unzip -t does not report any errors.
I've tried this with R versions 3.5 (openSUSE Leap 15.1), 3.6, and 4.2 (CentOS 7) and with more than one zip file and get exactly the same result.
I attached strace to R whilst reading in the 128th and 129th files. In both cases I get a lot of lseek calls towards the end of the file (offset 2845892608, larger than 2^31) to start with. This is where I assume the zip directory can be found. For the 128th file (the one that can be read), I eventually get an lseek to an offset slightly below 2^31, followed by a set of lseeks and reads (that extend beyond 2^31).
For the 129th file, I get the same reads towards the end of the file, but then rather than finding a position within the file I get:
lseek(3, 2845933568, SEEK_SET) = 2845933568
lseek(3, 4294963200, SEEK_SET) = 4294963200
read(3, "", 4096) = 0
lseek(3, 4095, SEEK_CUR) = 4294967295
read(3, "", 4096) = 0
Which is a bit weird since the file itself is only about 2.8 GB. 4294967295 is, of course, 2^32 - 1.
To me this feels like an integer overflow bug, and I am considering posting a bug report. But I am wondering if anyone has seen something similar before or if I am doing something stupid.
Having done what I should have started with (reading the zip64 format specification), it's actually clear that this is not an integer overflow error.
Zip files contain a central directory at the end of the archive; this contains amongst other things the names of the compressed files and the offset of the compressed data in the zip archive. The offset (and file size fields) are only given 4 bytes each in the standard directory field; when the offset is larger than this it should instead be given in the extra fields section and the value in the standard field should be set to 0xFFFFFFFF. Since this is the offset that gets used when reading the file it seems clear that the problem lies in the parsing of the extra field.
I had a look at the source code for R 4.2.1 and it seems that the problem is due to the way the offset specified in the standard offset field is tested:
if(file_info.uncompressed_size == (ZPOS64_T)(unsigned long)-1)
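Changing the right-hand side of this comparison to 0xFFFFFFFF seems to fix the problem. (On 64-bit Linux, unsigned long is 64 bits wide, so the cast of -1 yields 0xFFFFFFFFFFFFFFFF, which can never equal the 32-bit marker value stored in the directory field.)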
I've submitted a bug report to R. Hopefully changing the check will not have any unintended consequences and the issue will be fixed.
Still, I'm curious as to whether anyone else has come across the same issue. Seems a bit unlikely that my experience is unique.
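The directory layout described above can be sketched in Python as follows. This is only an illustration of the zip64 rule (a 32-bit field set to 0xFFFFFFFF means "look in the zip64 extra field"), not the code R uses; the function name and the example values in the usage note are hypothetical.

```python
import struct

# Fixed 46-byte part of a central directory file header (little-endian).
CDIR_FMT = "<IHHHHHHIIIHHHHHII"
CDIR_SIG = 0x02014B50
ZIP64_EXTRA_ID = 0x0001
MARKER32 = 0xFFFFFFFF  # "real value is in the zip64 extra field"


def parse_central_entry(buf):
    """Return (name, compressed_size, uncompressed_size, local_header_offset)."""
    (sig, _made_by, _needed, _flags, _method, _mtime, _mdate, _crc,
     csize, usize, name_len, extra_len, _comment_len, _disk,
     _int_attr, _ext_attr, offset) = struct.unpack_from(CDIR_FMT, buf, 0)
    if sig != CDIR_SIG:
        raise ValueError("not a central directory header")
    pos = struct.calcsize(CDIR_FMT)
    name = buf[pos:pos + name_len].decode("utf-8", errors="replace")
    extra = buf[pos + name_len:pos + name_len + extra_len]

    # Walk the extra field records: 2-byte id, 2-byte size, then the body.
    i = 0
    while i + 4 <= len(extra):
        rec_id, rec_size = struct.unpack_from("<HH", extra, i)
        body = extra[i + 4:i + 4 + rec_size]
        if rec_id == ZIP64_EXTRA_ID:
            # 64-bit values appear in a fixed order, but only for the
            # fields whose 32-bit slot holds the marker value.
            j = 0
            if usize == MARKER32:
                usize = struct.unpack_from("<Q", body, j)[0]
                j += 8
            if csize == MARKER32:
                csize = struct.unpack_from("<Q", body, j)[0]
                j += 8
            if offset == MARKER32:
                offset = struct.unpack_from("<Q", body, j)[0]
                j += 8
        i += 4 + rec_size
    return name, csize, usize, offset
```

For an entry whose sizes fit in 32 bits but whose offset does not, only the offset is stored in the extra field, so a parser that compares the marker against the wrong constant would keep 0xFFFFFFFF as the offset and seek to 2^32 - 1, which matches the strace output above.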

Parse .a2l and .hex files

I'm trying to parse some .a2l and .hex files to extract variables and their values. So far I don't know how to find the values of the variables in the .hex file. Here is a link to download an example of these files.
To be more specific: how can I read the value at the address 0x810600 in the .hex file?
/begin CHARACTERISTIC ASAM.C.DEPENDENT.REF_1.SWORD
"Dependent SWORD"
VALUE
0x810600
RL.FNC.SWORD.ROW_DIR
0
CM.IDENTICAL
-32268 32267
/begin DEPENDENT_CHARACTERISTIC
"X1 + 5"
ASAM.C.SCALAR.SBYTE.IDENTICAL
/end DEPENDENT_CHARACTERISTIC
DISPLAY_IDENTIFIER DI.ASAM.C.DEPENDENT.REF_1.SWORD
/end CHARACTERISTIC
In the same A2L, find the RL.FNC.SWORD.ROW_DIR item. I guess it describes a signed word (2 bytes) type.
I'm not sure whether this is some kind of array or a special type... I assume it is just a single variable (scalar).
Next, find the CM.IDENTICAL item. As its name suggests, it is probably an identical compu_method. This means a HEX value of 0 is displayed as 0, a HEX value of 100 is displayed as 100, and so on: the internal value and the physical value are identical. No special conversion, I guess.
Go to address 0x810600 in the HEX file and you will find the value there. Since the compu_method is identical, I guess the value in the HEX file is displayed unchanged in the M/C software (INCA, Vision, CANape, ...).
The HEX file is in Intel HEX format. This format maps each part of the file to a part of the device's virtual address space. If you use Linux, you can also dump its contents with the following command:
objdump -s file.hex
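As a rough sketch of "go to the address in the HEX file", the Python below parses Intel HEX records into an address-to-byte map and decodes a signed 16-bit value. It handles data, end-of-file, extended segment address, and extended linear address records only. The function names are hypothetical, and the little-endian default is an assumption: the actual byte order is declared in the A2L file and should be checked there.

```python
import struct


def parse_intel_hex(text):
    """Return a dict mapping absolute address -> byte value."""
    memory = {}
    base = 0  # upper address bits, set by record types 02 and 04
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError("not an Intel HEX record: %r" % line)
        raw = bytes.fromhex(line[1:])
        # All bytes of a record, checksum included, sum to 0 modulo 256.
        if sum(raw) & 0xFF != 0:
            raise ValueError("checksum mismatch: %r" % line)
        count = raw[0]
        offset = (raw[1] << 8) | raw[2]
        rectype = raw[3]
        data = raw[4:4 + count]
        if rectype == 0x00:    # data
            for i, value in enumerate(data):
                memory[base + offset + i] = value
        elif rectype == 0x01:  # end of file
            break
        elif rectype == 0x02:  # extended segment address
            base = ((data[0] << 8) | data[1]) << 4
        elif rectype == 0x04:  # extended linear address
            base = ((data[0] << 8) | data[1]) << 16
    return memory


def read_sword(memory, address, little_endian=True):
    """Decode a signed 16-bit value (SWORD) stored at the given address."""
    raw = bytes(memory[address + i] for i in range(2))
    return struct.unpack("<h" if little_endian else ">h", raw)[0]
```

With the file contents loaded as text, `read_sword(parse_intel_hex(text), 0x810600)` would return the raw value; with an identical compu_method that raw value is also the physical value.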

Reading binary file of lake depths in R

I am trying to open a file in R, which is binary and written in Fortran. The file is called GlobalLakeDepth.dat and is available at: http://www.flake.igb-berlin.de/gldbv2.tar.gz
The instructions specify that to open GlobalLakeDepth.dat (in Fortran), one would need to do the following:
An example of opening the binary file in FORTRAN90:
-- open(1, file = 'GlobalLakeDepth.dat', form='unformatted', access='direct', recl=2)
An example of reading the binary file in FORTRAN90:
-- read(1,rec=n) LakeDepth
-- where: n - record number, INTEGER(8);
LakeDepth - mean lake depth in decimeters, INTEGER(2).
My question is: Given these instructions in Fortran, how can I open this file in R? That is, is there an 'R way' of doing this?
I've been following the instructions at http://www.ats.ucla.edu/stat/r/faq/read_binary.htm, but I am still not any closer to getting anything from the data file. All I need is the information provided on the measured lake bathymetry for 36 large lakes.
You can use readBin to read a binary file. For this file, I think the correct command is
lk <- readBin("GlobalLakeDepth.dat", n = 43200 * 21600, what = "integer", endian = "little", size = 2)
This makes a very long vector that could be made into a 43200 * 21600 matrix.
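For comparison, the Fortran direct-access read quoted in the question (`recl=2`, one `INTEGER(2)` per record) can be sketched in Python as a seek to `(n - 1) * 2` bytes followed by a 2-byte read. This assumes the record length is counted in bytes and the file is little-endian, as in the readBin call above; the function name is hypothetical.

```python
import struct


def read_lake_depth(path, n):
    """Read record n (1-based, as in Fortran) as a signed 16-bit integer.

    Assumes 2-byte records and little-endian byte order.
    """
    if n < 1:
        raise ValueError("record numbers start at 1")
    with open(path, "rb") as handle:
        handle.seek((n - 1) * 2)
        raw = handle.read(2)
    if len(raw) != 2:
        raise EOFError("record %d is past the end of the file" % n)
    return struct.unpack("<h", raw)[0]
```

This reads a single value without loading the whole vector, which may be convenient when only a few grid cells are needed.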

Trouble concatenating netcdf files with ncrcat

I have a list of netcdf files that I am trying to concatenate along the time dimension.
I am attempting to use the steps outlined here, which seem simple enough. However, I am running into some errors (likely some small/stupid oversight on my part...)
When I try to first make time a record dimension, I am using the following command:
ncks -O --mk_rec_dmn time TiMREX_20080526_000001.nc test_out.nc
This, however, gives me the following error:
ncks: invalid option -- '-'
It seems like this is just some simple syntax/typo error on my part, but try as I might I can't find anything wrong.
Just to be sure, when I run a ncdump -h on the file, it confirms that there is indeed a time dimension
ncdump -h TiMREX_20080526_000001.nc
netcdf TiMREX_20080526_000001 {
dimensions:
time = 1 ;
bounds = 2 ;
x0 = 300 ;
y0 = 300 ;
z0 = 40 ;
Additionally, if I try to skip this step and just go right to the ncrcat part...
ncrcat -O TiMREX_20080526_000001.nc TiMREX_20080526_000733.nc test_out.nc
I get the following error:
ncopen: filename "TiMREX_20080526_000001.nc": Not a netCDF file
Which is especially odd... I'm pretty confident it is indeed a netCDF file (I just ran ncdump on it after all, and have no problem viewing it with ncview...)
Any thoughts? What simple step am I embarrassingly missing?
This is a weird error, as your command looks syntactically correct. To be sure, I copied it to my machine, where it ran as expected with no 'invalid option' error. Thus I am unable to reproduce the problem. Based on the error message you report, it seems as though you might (somehow) be using a character that the system does not understand as a dash. In other words, the error you report is what I would expect if ncks received a funky character that looks like a dash but is not really a dash. Maybe when you copy it to stackoverflow it gets converted to a dash, so it works for me (try copying your own command above back into your console). Make sure the dash character you type is the same as the minus sign on a normal keyboard, and not something else. Some keyboards/character sets produce characters that look similar to dashes but are not ASCII dashes. Good luck.
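One way to check for the look-alike characters described above is to scan the command text for non-ASCII dashes. The Python sketch below is a hypothetical helper, not part of NCO: it flags any character in the Unicode "dash punctuation" category other than the ASCII hyphen-minus, plus the mathematical minus sign.

```python
import unicodedata


def find_dash_lookalikes(command):
    """Return (index, code point, Unicode name) for each non-ASCII dash."""
    hits = []
    for index, ch in enumerate(command):
        if ch == "-":
            continue  # the ordinary ASCII hyphen-minus is what ncks expects
        # "Pd" is the dash punctuation category; U+2212 is the minus sign.
        if unicodedata.category(ch) == "Pd" or ch == "\u2212":
            hits.append((index, "U+%04X" % ord(ch),
                         unicodedata.name(ch, "UNKNOWN")))
    return hits
```

Pasting the command exactly as typed into `find_dash_lookalikes(...)` returns an empty list if every dash is a plain ASCII one.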

Reading in a binary grid file in Fortran 90

I'm having issues when trying to read in a binary file that I previously wrote in another program. I have been able to open it and read it into an array without compilation errors; however, the array is not populated (all 0's). Any suggestions or thoughts would be great. Here is the open/read statement I'm using:
allocate(dummy(imax,jmax))
open(unit=io, file=trim(input), form='binary', access='stream', &
iostat=ioer, status='old', action='READWRITE')
if(ioer/=0) then
print*, 'Cannot open file'
else
print*,'success opening file'
end if
read(unit=io, fmt=*, iostat=ioer) dummy
j=0
k=0
size: do j=1, imax
do k=1, jmax
if(dummy(j,k) > 0.) print*,dummy(j,k)
end do
end do size
Please let me know if you need more info.
Here is how the file is originally written:
out_file = trim(output_dir)//'SEVIRI_FRP_.08deg_'//trim(season)//'.bin'
print*, out_file
print*, i_max,' i_max,',j_max,' j_max'
open (io, file = out_file, access = 'direct', status = 'replace', recl = i_max*j_max*4)
write(io, rec = 1) sev_frp
write(io, rec = 2) count_sev_frp
write(io, rec = 3) sum_sev_frp
check: do n=1, i_max
inna: do m=1, j_max
!if (sev_frp(n,m) > 0) print*, count_sev_frp(n,m)
end do inna
end do check
print*,'n-',n,'m-',m
close(io)
First of all, the form specifier takes two possible values as far as I know: "FORMATTED" or "UNFORMATTED".
Second, to read, you should use an open statement that is symmetric to the one you used to write the file, unless you know exactly what you are doing. I suggest that for reading, you open with:
open(unit=io, file=trim(input), access='direct', &
iostat=ioer, status='old', action='READ', recl = i_max*j_max*4)
That corresponds to the open statement that you used to save the file.
As innoSPG says, you have a mismatch in the way the file is written and how it is read.
An external file may be connected with one of three access methods: sequential; direct; stream. Further, a connection may be formatted or unformatted.
When the file is opened for writing it uses the direct access method with unformatted records. The records are unformatted because this is the default for direct access (in the absence of the form= specifier).
When you open the file for reading you use the non-standard extension of form="binary" and stream access. There is possibly nothing wrong with this, but it does require care.
However, with the read statements you are using formatted (list-directed) input. This will not be allowed.
The way suggested in the previous answer, of using a similar access method and record length will require a further change to the code. [You'll also need to set the value of the record length somehow.]
Not only will you need to remove the format, to match the unformatted records written, but you'll want to use the rec= specifier to access the records of the file.
Finally, if you are using the iostat= specifier you really should check the resulting value.
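To make the record layout concrete, here is a small Python sketch (not Fortran) of what reading record `rec` from the file written above amounts to: skip `(rec - 1) * recl` bytes, read `recl` bytes, and interpret them as a column-major array. It assumes 4-byte little-endian reals and a record length counted in bytes; some compilers count `recl` in 4-byte units by default, so that should be checked against the compiler options used. The function name is hypothetical.

```python
import struct


def read_direct_record(path, rec, i_max, j_max):
    """Read 1-based record `rec` as an i_max x j_max grid of 4-byte reals.

    Returns a list of rows so that grid[i][j] corresponds to the Fortran
    element (i + 1, j + 1).
    """
    recl = i_max * j_max * 4  # bytes per record, matching the write side
    with open(path, "rb") as handle:
        handle.seek((rec - 1) * recl)
        raw = handle.read(recl)
    if len(raw) != recl:
        raise EOFError("record %d is incomplete or missing" % rec)
    flat = struct.unpack("<%df" % (i_max * j_max), raw)
    # Fortran stores arrays column-major: the first index varies fastest.
    return [[flat[i + j * i_max] for j in range(j_max)]
            for i in range(i_max)]
```

Under these assumptions, record 1 holds `sev_frp`, record 2 holds `count_sev_frp`, and record 3 holds `sum_sev_frp`, in the order they were written.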
