I have a gzip file of size 325 MB. I just figured it that it is truncated by 361 bytes from the beginning.
Please advise how can I recover the compressed files from it.
You need to find the next deflate block boundary. Such a boundary can occur at any bit location. You will need to attempt decompression starting at every bit until you get successful decoding for at least a few deflate blocks.
You can use zlib's inflatePrime() to feed less than a byte to inflate(). You can use inflateSetDictionary() to provide a faux 32K dictionary to precede the data being inflated, in order to avoid distance-too-far-back errors.
Once you find a block boundary, you have solved half the problem. The next half is to find where in the deflate stream there is no longer a dependence on the unknown uncompressed data derived from that missing 361 bytes of compressed data. It is possible for such a dependency to very long lasting. For example, if the word " the " appears in that missing section, then it can be referred to after that as a missing string. However, you don't know that it is " the ". All you know is that there is a reference to a five-byte string in the missing data. Then where that five-byte string is copied to can itself be referenced by a later match. This could, in principle, propagate through the entire 325 MB, making the whole thing completely unrecoverable.
However that is unlikely. It is more likely that at some point the propagation of strings from the first 361 bytes stops. From there on, you can recover the uncompressed data.
In order to tell whether you are still seeing propagation or not, do the decompression twice. Once with an initial faux dictionary of all 0's, and once with an initial faux dictionary of all 1's. Where the decompressed data is the same for both decompressions, you have successfully recovered that data.
Then you will need to go up to the next level of structure in that data, and see if you can somehow make use of what you have recovered.
Good luck. And don't cut off the first 361 bytes next time.
Below is example code that does what is described above.
/* salvage -- recover data from a corrupted deflate stream
* Copyright (C) 2015 Mark Adler
* Version 1.0 28 June 2015 Mark Adler
*/
/*
This software is provided 'as-is', without any express or implied
warranty. In no event will the author be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
Mark Adler
madler#alumni.caltech.edu
*/
/* Attempt to recover deflate data from a corrupted stream. The corrupted data
is read on stdin, and any reliably decompressed data is written to stdout. A
deflate stream is deemed to have been found successfully if there are eight
or fewer bytes of compressed data unused when done. This can be changed
with the MAXLEFT macro below, or the conditional that currently uses
MAXLEFT. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <assert.h>
#include "zlib.h"
/* Get the size of an allocated piece of memory (usable size -- not necessarily
the requested size). */
#if defined(__APPLE__) && defined(__MACH__)
# include <malloc/malloc.h>
# define memsize(p) malloc_size(p)
#elif defined (__linux__)
# include <malloc.h>
# define memsize(p) malloc_usable_size(p)
#elif defined (_WIN32)
# include <malloc.h>
# define memsize(p) _msize(p)
#else
# error You need to find an allocated memory size function
#endif
#define local static
/* Load an entire file into a memory buffer. load() returns 0 on success, in
which case it puts all of the file data in *dat[0..*len - 1]. That is,
unless *len is zero, in which case *dat is NULL. *data is allocated memory
which should be freed when done with it. load() returns zero on success,
with *data == NULL and *len == 0. The error values are -1 for read error or
1 for out of memory. To guard against bogging down the system with
extremely large allocations, if limit is not zero then load() will return an
out of memory error if the input is larger than limit. */
local int load(FILE *in, unsigned char **data, size_t *len, size_t limit)
{
size_t size = 1048576, have = 0, was;
unsigned char *buf = NULL, *mem;
*data = NULL;
*len = 0;
if (limit == 0)
limit--;
if (size >= limit)
size = limit - 1;
do {
/* if we already saturated the size_t type or reached the limit, then
out of memory */
if (size == limit) {
free(buf);
return 1;
}
/* double size, saturating to the maximum size_t value */
was = size;
size <<= 1;
if (size < was || size > limit)
size = limit;
/* reallocate buf to the new size */
mem = realloc(buf, size);
if (mem == NULL) {
free(buf);
return 1;
}
buf = mem;
/* read as much as is available into the newly allocated space */
have += fread(buf + have, 1, size - have, in);
/* if we filled the space, make more space and try again until we don't
fill the space, indicating end of file */
} while (have == size);
/* if there was an error reading, discard the data and return an error */
if (ferror(in)) {
free(buf);
return -1;
}
/* if a zero-length file is read, return NULL for the data pointer */
if (have == 0) {
free(buf);
return 0;
}
/* resize the buffer to be just big enough to hold the data */
mem = realloc(buf, have);
if (mem != NULL)
buf = mem;
/* return the data */
*data = buf;
*len = have;
return 0;
}
#define DICTSIZE 32768
#if UINT_MAX <= 0xffff
# define BUFSIZE 32768
#else
# define BUFSIZE 1048576
#endif
/* Inflate the provided buffer starting at a specified bit offset. Use an
already-initialized inflate stream structure for rapid repeated attempts.
The structure needs to have been initialized using inflateInit2(strm, -15).
Inflation begins at data[off], starting at bit bit in that byte, going from
that bit to the more significant bits in that byte, and then on to the next
byte. bit must be in the range 0..7. bit == 0 uses the entire byte at
data[off]. bit == 7 uses only the most significant bit of the byte at
data[off]. Before inflation, the dictionary is initialized to
dict[0..DICTSIZE-1] so that references before the start of the uncompressed
data do not stop inflation. Inflation continues as long as possible, until
either an error is encountered, the end of the deflate stream is reached, or
data[len-1] is processed. On entry *recoup is a pointer to allocated memory
or NULL, and on return *recoup points to allocated memory with the
decompressed data. *got is set to the number of bytes of decompressed data
returned at *recoup.
inflate_at() returns Z_DATA_ERROR if an error was detected in the alleged
deflate data, Z_STREAM_END if the end of a valid deflate stream was reached,
or Z_OK if the end of the provided compressed data was reached without
encountering an erorr or the end of the stream. */
local int inflate_at(z_stream *strm, unsigned char *data, size_t len,
size_t off, int bit, size_t *unused, unsigned char *dict,
unsigned char **recoup, size_t *got)
{
int ret;
size_t left, size;
/* check input */
assert(data != NULL && off < len && bit >= 0 && bit <= 7);
assert(dict != NULL && recoup != NULL);
/* set up inflate engine, feeding first few bits if necessary */
ret = inflateReset(strm);
assert(ret == Z_OK);
ret = inflateSetDictionary(strm, dict, DICTSIZE);
assert(ret == Z_OK);
if (bit) {
ret = inflatePrime(strm, 8 - bit, data[off] >> bit);
assert(ret == Z_OK);
off++;
}
/* inflate as much as possible */
strm->next_in = data + off;
left = len - off;
*got = 0;
do {
strm->avail_in = left > UINT_MAX ? UINT_MAX : left;
left -= strm->avail_in;
do {
/* assure at least BUFSIZE available in recoup */
size = memsize(*recoup);
if (*got + BUFSIZE > size) {
size = size ? size << 1 : BUFSIZE;
assert(size != 0);
*recoup = reallocf(*recoup, size);
assert(*recoup != NULL);
}
/* inflate into recoup */
strm->next_out = *recoup + *got;
strm->avail_out = BUFSIZE;
ret = inflate(strm, Z_NO_FLUSH);
assert(ret != Z_STREAM_ERROR && ret != Z_MEM_ERROR);
/* set the number of compressed bytes unused so far, in case we
return */
if (unused != NULL)
*unused = left + strm->avail_in;
/* update the number of uncompressed bytes generated */
*got += BUFSIZE - strm->avail_out;
/* if we cannot continue to decompress, then return the reason */
if (ret == Z_DATA_ERROR || ret == Z_STREAM_END)
return ret;
/* continue with provided input data until all output generated */
} while (strm->avail_out == 0);
assert(strm->avail_in == 0);
/* provide more input data, if any */
} while (left);
/* ran through all compressed data with no errors or end of stream */
return Z_OK;
}
/* The criteria for success is the completion of inflate with no more than this
many bytes unused. (8 is the length of a gzip trailer.) */
#define MAXLEFT 8
/* Read a corrupted (or not) deflate stream from stdin and write the reliably
recovered data to stdout. */
int main(void)
{
int ret, bit;
unsigned char *data = NULL, *recoup = NULL, *comp = NULL;
size_t len, off, unused, got;
z_stream strm;
unsigned char dict[DICTSIZE] = {0};
/* read input into memory */
ret = load(stdin, &data, &len, 0);
if (ret < 0)
fprintf(stderr, "file error reading input\n");
if (ret > 0)
fprintf(stderr, "ran out of memory reading input\n");
assert(ret == 0);
fprintf(stderr, "read %lu bytes\n", len);
/* initialize inflate structure */
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;
strm.next_in = Z_NULL;
strm.avail_in = 0;
ret = inflateInit2(&strm, -15);
assert(ret == Z_OK);
/* scan for an acceptable starting point for inflate */
for (off = 0; off < len; off++)
for (bit = 0; bit < 8; bit++) {
ret = inflate_at(&strm, data, len, off, bit, &unused, dict,
&recoup, &got);
if ((ret == Z_STREAM_END || ret == Z_OK) && unused <= MAXLEFT)
goto done;
}
done:
/* if met the criteria, show result and write out reliable data */
if (bit != 8 && (ret == Z_STREAM_END || ret == Z_OK)) {
fprintf(stderr,
"decoded %lu bytes (%lu unused) at offset %lu, bit %d\n",
len - off - unused, unused, off, bit);
/* decompress again with a different dictionary to detect unreliable
data */
memset(dict, 1, DICTSIZE);
inflate_at(&strm, data, len, off, bit, NULL, dict, &comp, &got);
{
unsigned char *p, *q;
/* search backwards from the end for the first unreliable byte */
p = recoup + got;
q = comp + got;
while (q > comp)
if (*--p != *--q) {
p++;
q++;
break;
}
/* write out the reliable data */
fwrite(q, 1, got - (q - comp), stdout);
fprintf(stderr,
"%lu bytes of reliable uncompressed data recovered\n",
got - (q - comp));
fprintf(stderr,
"(out of %lu total uncompressed bytes recovered)\n", got);
}
}
/* otherwise declare failure */
else
fprintf(stderr, "no deflate stream found that met criteria\n");
/* clean up */
free(comp);
free(recoup);
inflateEnd(&strm);
free(data);
return 0;
}
Related
Here, I tried to write a program(kprobe) to include the enum tcp mib like #tcp_states in the book BPF Performance Tools bpftrace. The enum tcp mib is in '/include/uapi/linux/snmp.h':
#!/usr/local/bin/bpftrace
#include <net/net_namespace.h>
#include <net/netns/mib.h>
#include <net/snmp.h>
#include <uapi/linux/snmp.h>
#define TCP_MIB_MAX __TCP_MIB_MAX
kprobe:sk_alloc
{
$net = (struct net *)arg0;
$mi = (struct netns_mib *)$net->mib;
$ib = (struct tcp_mib *)$mi;
#mib[1] = "TCP_MIB_NUM";
#mib[2] = "TCP_MIB_RTOALGORITHM";
#mib[3] = "TCP_MIB_RTOMIN";
#mib[4] = "TCP_MIB_RTOMAX;
#mib[5] = "TCP_MIB_MAXCONN";
#mib[6] = "TCP_MIB_ACTIVEOPENS";
#mib[7] = "TCP_MIB_PASSIVEOPENS";
#mib[8] = "TCP_MIB_ATTEMPTFAILS";
#mib[9] = "TCP_MIB_ESTABRESETS";
#mib[10] = "TCP_MIB_CURRESTAB";
#mib[11] = "TCP_MIB_INSEGS";
#mib[12] = "TCP_MIB_OUTSEGS";
#mib[13] = "TCP_MIB_RETRANSSEGS";
#mib[14] = "TCP_MIB_INERRS";
#mib[15] = "TCP_MIB_OUTRSTS";
#mib[16] = "TCP_MIB_CSUMERRORS";
printf("-------------------------------\n");
time();
printf("sk_alloc: %s pid: %d\n", comm, pid);
printf("\n");
printf("$ib: %u\n", $ib->miss[6]);
$mib_s = $ib->mibs[TCP_MIB_MAX];
$mib_str = #mib[$mib_s];
printf("TCP mib is: %s\n", $mib_str);
clear(#mib);
}
And when I tried to run it the output was:
the index 94779518808448 is out of bounds for array of size 16
Then I tried to instead of TCP_MIB_MAX, to put specific array positions e.g 5, (I modify the above code):
$mib_s = $ib->mibs[5];
And when I tried to run it, the output was:
...
-----------------------------
21:40:15
sk_alloc: systemd-logind pid: 920
$ib: 1516359680
TCP mib is:
-----------------------------
21:40:15
sk_alloc: systemd-logind pid: 920
$ib: 1516359680
TCP mib is:
...
Why does not show TCP mib? and shows nothing in the output?
How can I use the array properly to show #mib?
TCP_MIB_MAX and __TCP_MIB_MAX are equal to 16, which is equal to the size of the struct tcp_mib in the kernel:
enum
{
TCP_MIB_NUM = 0,
TCP_MIB_RTOALGORITHM, /* RtoAlgorithm */
TCP_MIB_RTOMIN, /* RtoMin */
TCP_MIB_RTOMAX, /* RtoMax */
TCP_MIB_MAXCONN, /* MaxConn */
TCP_MIB_ACTIVEOPENS, /* ActiveOpens */
TCP_MIB_PASSIVEOPENS, /* PassiveOpens */
TCP_MIB_ATTEMPTFAILS, /* AttemptFails */
TCP_MIB_ESTABRESETS, /* EstabResets */
TCP_MIB_CURRESTAB, /* CurrEstab */
TCP_MIB_INSEGS, /* InSegs */
TCP_MIB_OUTSEGS, /* OutSegs */
TCP_MIB_RETRANSSEGS, /* RetransSegs */
TCP_MIB_INERRS, /* InErrs */
TCP_MIB_OUTRSTS, /* OutRsts */
TCP_MIB_CSUMERRORS, /* InCsumErrors */ // == 15
__TCP_MIB_MAX // == 16
};
and
#define TCP_MIB_MAX __TCP_MIB_MAX
struct tcp_mib {
unsigned long mibs[TCP_MIB_MAX];
};
(include/uapi/linux/snmp.h and include/net/snmp.h)
But because arrays are indexed from 0, you can go only up to TCP_MIB_MAX - 1 when indexing $ib->mibs. This is why you get a complaint about the out-of-bound index.
Then when you select a smaller index, you can access to the array item as expected. But I'm not sure what you are trying to do with:
$mib_s = $ib->mibs[5];
$mib_str = #mib[$mib_s];
To me it looks like you are reading the value from the MIB ($ib->mibs[TCP_MIB_ACTIVEOPENS]), which may supply any value, possibly big, and likely null (I suspect this is the case here). Then you use that value as... an index in #mib? So if the counter is at 10k, you try to take the 10,000th cell of a 16-sized array? I suppose in your case the value is 0, so you are doing $mib_str = #mib[0], which is likely an empty string because you never set a value for #mib[0].
To fix all of this I would start by using the correct indices (from 0 to 15) for the #mib array, to avoid any confusion. Then you probably need to rethink what you are trying to print exactly, but I'm not sure that the two lines above is what you want.
I'm having a problem getting a CTRL slice.
I'm trying to analyze OpenSSL by running this:
the code is like below
int dtls1_process_heartbeat(SSL *s)
{
unsigned char *p = &s->s3->rrec.data[0], *pl;
unsigned short hbtype;
unsigned int payload;
unsigned int padding = 16; /* Use minimum padding */
/* Read type and payload length first */
hbtype = *p++;
n2s(p, payload);
pl = p;
if (s->msg_callback)
s->msg_callback(0, s->version, TLS1_RT_HEARTBEAT,
&s->s3->rrec.data[0], s->s3->rrec.length,
s, s->msg_callback_arg);
if (hbtype == TLS1_HB_REQUEST)
{
unsigned char *buffer, *bp;
int r;
/* Allocate memory for the response, size is 1 byte
* message type, plus 2 bytes payload length, plus
* payload, plus padding
*/
buffer = OPENSSL_malloc(1 + 2 + payload + padding);
bp = buffer;
/* Enter response type, length and copy payload */
*bp++ = TLS1_HB_RESPONSE;
s2n(payload, bp);
/*# slice pragma stmt; */
memcpy(bp, pl, payload);
bp += payload;
/* Random padding */
RAND_pseudo_bytes(bp, padding);
r = dtls1_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding);
if (r >= 0 && s->msg_callback)
s->msg_callback(1, s->version, TLS1_RT_HEARTBEAT,
buffer, 3 + payload + padding,
s, s->msg_callback_arg);
OPENSSL_free(buffer);
if (r < 0)
return r;
}
else if (hbtype == TLS1_HB_RESPONSE)
{
unsigned int seq;
/* We only send sequence numbers (2 bytes unsigned int),
* and 16 random bytes, so we just try to read the
* sequence number */
n2s(pl, seq);
if (payload == 18 && seq == s->tlsext_hb_seq)
{
dtls1_stop_timer(s);
s->tlsext_hb_seq++;
s->tlsext_hb_pending = 0;
}
}
return 0;
}
`
frama-c ./ssl/d1_both.c -main dtls1_process_heartbeat -slice-calls memcpy -cpp-command "gcc -C -E -I ./include/ -I ./" -then-on 'Slicing export' -print
That produced nothing, so I then tried this: want to get a backforward slicing
frama-c ./ssl/d1_both.c -main dtls1_process_heartbeat -slice-pragma dtls1_process_heartbeat -cpp-command "gcc -C -E -I ./include/ -I ./" -then-on 'Slicing export' -print
But I still get nothing like that
void dtls1_process_heartbeat(void);
void dtls1_process_heartbeat(void)
{
return;
}
How can I get a slice like that?
function A (){
…
memcpy()
...
}
function B (){
…
…
...
}
function C (){
…
memcpy()
...
}
I want to capture everything to do with memcpy(), so I want to keep A and C, but not B.
How should I choose an entry point? How do I choose the pragma?
I hope I've stated my question clearly; it's had me confused for days.
First, notice that Frama-C Fluorine is an obsolete version. It has been released more than 3 years ago. Some slicing-related bugs have been fixed in the meantine. Please upgrade to a newer version, preferably Aluminium.
Second, the documentation for option -slicing-value is
select the result of left-values v1,...,vn at the
end of the function given as entry point (addresses are
evaluated at the beginning of the function given as entry
point)
It is unlikely to do what you want. Did you try option -slice-calls, more precisely -slice-calls memcpy ?
Also, keep in mind that B will be kept in the slice if it computes a value that is later used within a call to memcpy.
I'm just learning how to handle sockets and TCP connections in C. I've got an application (a long one) which basically sends and receives char arrays with the system call write from server to client and vice versa (two separate C applications of course). As long as I use it with a local connection, on the same PC, running the server on a terminal and the client on an another, everything just works fine and the data arrives at the destination. But if I try it with the server on one computer and the client on another but on the same internet line, passing to the client an address like 192.168.1.X (took from the machine on which the server is running), after the connection is established, I've got an error that tells me that the number of expected bytes (which I pass before sending the real char[]) isn't arrived. Same thing if I try the server on my PC, and the client on another one with a different line on a different provider.
There's something I'm missing, are there any limitations in sending a bunch of bytes in sequence?
The code where the error pops up.
SERVER SIDE:
r=htonl(lghstr);
w=write(myFd,&r,sizeof(int));//writes the number of incoming bytes
if(w<0) perror("writeServer4"),exit(-1);
w=write(myFd,tmp->string,lghstr);
if(w<0) perror("writeServer5"),exit(-1);
if(w!=lghstr) perror("ERROR");
CLIENT SIDE
rC=read(fdc,&cod,sizeof(int));//read incoming number of bytes
lghstr=ntohl(cod);
if(rC<0) perror("readClient3"),exit(-1);
rC=read(fdc,dest,lghstr);
if(rC<0) perror("readClient4"),exit(-1);
if(rC!=lghstr) perror("error : "), printf("didn't read the right number of bytes"),exit(-1);
Now this is basically repeated a lot of times, let's even say 300 times, and it's with big numbers that the program doesn't work.
This is the problem:
rC=read(fdc,dest,lghstr);
...
if(rC!=lghstr) perror("error : ")
The #1 fallacy with socket programming is expecting that recv() and read() will return exactly the same number of bytes corresponding to the write/send call made by the other side.
In reality, partial data is extremely likely and expected. The simple workaround is to loop on read/recv until you get the exact number of bytes expected:
size_t count = 0;
while (count < lghstr)
{
ssize_t readresult = read(fdc, dest+count, lghstr-count);
if (readresult == -1)
{
// socket error - handle appropriately (typically, just close the connection)
}
else if (readresult == 0)
{
// The other side closed the connection - handle appropriately (close the connection)
}
else
{
count += readresult;
}
}
The other alternative to looping is to the use the MSG_WAITALL flag with the socket. This means, using recv() instead of read(). You'll still need to handle the error cases.
rc = recv(fdc, dest, lghstr, MSG_WAITALL);
if (rc == -1)
{
// socket error
}
else if (rc == 0)
{
// socket closed by remote
}
else if (rc < lghstr)
{
// the other side likely closed the connection and this is residual data (next recv will return 0)
}
You do ntohl() on one side and not the other. That might be interpreting the bytes with the wrong value.
You should printf() the bytes on both sides and see what the int is being evaluated to.
Edit: I'm convinced this is a programming bug for the record.
If I had to guess, I'd say that you are not synchronous with the other side for some reason. You say this runs 'about 300 times'.
Try adding a magic integer to the protocol.
Heres an example of a client that sends in this order.
A magic integer which is always constant.
A lengh of bytes about to be sent.
The bytes to be sent.
This uses scatter gather mechanics (its nicer for serialization) but other than that it effectively is doing the same thing yours is doing, as a client, just adding a magic value.
When the receiver receives the data, it can validate that the data is coming in the right order, by checking what the magic number was that came in. If the magic is wrong it means the client or server has lost themselves positionally in the stream.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <err.h>
#include <time.h>
#define MAGIC 0xDEADBEEFLU
#define GARBAGE_MAX 65536
const int iterations = 3000;
char * create_garbage_buf(
void)
{
int rc = -1;
int fd = -1;
char *buf = NULL;
buf = malloc(GARBAGE_MAX);
if (!buf)
err(1, "Cannot allocate buf");
fd = open("/dev/urandom", O_RDONLY);
if (fd < 0)
err(1, "Cannot open urandom");
rc = read(fd, buf, GARBAGE_MAX);
if (rc < 0)
err(1, "Cannot read from urandom");
else if (rc != GARBAGE_MAX)
errx(1, "Expected %d bytes, but got %d reading from urandom",
GARBAGE_MAX, rc);
close(fd);
return buf;
}
int main() {
int fd, offset, i, rc;
uint32_t magic = MAGIC;
uint32_t blen = 0;
char *buf = NULL;
struct iovec vecs[3];
/* Seed poor random number generator */
srand(time(NULL));
/* Use a file for demonstration, but a socket will do just fine */
fd = open("/dev/null", O_WRONLY);
/* Create some garbage to send */
buf = create_garbage_buf();
if (fd < 0)
err(1, "Cannot open file");
/* The first vector, is always the magic */
vecs[0].iov_len = sizeof(uint32_t);
vecs[0].iov_base = &magic;
for (i=0; i < iterations; i++) {
/* The second vector represents lengh of what we send
* in this demonstration it is a number between 0 and
* GARBAGE_MAX/2.
*/
blen = rand() % (GARBAGE_MAX / 2);
vecs[1].iov_len = sizeof(uint32_t);
vecs[1].iov_base = &blen;
/* The last record is the data to send. Its another random
* number between 0 and GARBAGE_MAX which represents the offset
* in our garbage data to send */
offset = rand() % (GARBAGE_MAX / 2);
vecs[2].iov_len = blen;
vecs[2].iov_base = &buf[offset];
rc = writev(fd, vecs, 3);
if (rc < 0)
err(1, "Could not write data");
if (rc != (sizeof(uint32_t)*2 + blen))
errx(1, "Did not write proper number of bytes to handle");
printf("Wrote %u bytes from offset %u in garbage\n", blen, offset);
}
free(buf);
printf("Done!\n");
return 0;
}
Closely read the documentation for read()/write() and learn that those two functions do not necessarily read()/write() as much bytes as they were told to, but few. So looping around such calls counting until all data expected had been read/written is a good idea, not to say an essential necessity.
For examples how this could be done for writing you might like to have look at this answer: https://stackoverflow.com/a/24260280/694576 and for reading on this answer: https://stackoverflow.com/a/20149925/694576
I'm decompressing gzip data received from a http server, using the zlib library from Qt. Because qUncompress was no good, I followed the advice given here: Qt quncompress gzip data and created my own method to uncompress the gzip data, like this:
QByteArray gzipDecompress( QByteArray compressData )
{
//strip header
compressData.remove(0, 10);
const int buffer_size = 16384;
quint8 buffer[buffer_size];
z_stream cmpr_stream;
cmpr_stream.next_in = (unsigned char *)compressData.data();
cmpr_stream.avail_in = compressData.size();
cmpr_stream.total_in = 0;
cmpr_stream.next_out = buffer;
cmpr_stream.avail_out = buffer_size;
cmpr_stream.total_out = 0;
cmpr_stream.zalloc = Z_NULL;
cmpr_stream.zfree = Z_NULL;
cmpr_stream.opaque = Z_NULL;
int status = inflateInit2( &cmpr_stream, -8 );
if (status != Z_OK) {
qDebug() << "cmpr_stream error!";
}
QByteArray uncompressed;
do {
cmpr_stream.next_out = buffer;
cmpr_stream.avail_out = buffer_size;
status = inflate( &cmpr_stream, Z_NO_FLUSH );
if (status == Z_OK || status == Z_STREAM_END)
{
QByteArray chunk = QByteArray::fromRawData((char *)buffer, buffer_size - cmpr_stream.avail_out);
uncompressed.append( chunk );
}
else
{
inflateEnd(&cmpr_stream);
break;
}
if (status == Z_STREAM_END)
{
inflateEnd(&cmpr_stream);
break;
}
}
while (cmpr_stream.avail_out == 0);
return uncompressed;
}
Eveything seems to work fine if the decompressed data fits into the output buffer (ie. is smaller than 16 Kb). If it doesn't, the second call to inflate returns a Z_DATA_ERROR. I know for sure the data is correct because the same chunk of data is correctly decompressed if the output buffer is made large enough.
The server doesn't return a header with the size of the uncompressed data (only the size of the compressed one) so I followed the usage instructions in zlib: http://www.zlib.net/zlib_how.html
And they do exactly what I'm doing. Any idea what I could be missing? the next_in and avail_in members in the stream seem to be updated correctly after the first iteration. Oh, and if it's any useful, the error message when the data error is issued is: "invalid distance too far back".
Any thoughts? Thanks.
The Deflate/Inflate compression/decompression algorithm uses a 32Kb circular buffer. So a 16Kb buffer can never work if the decompressed data is bigger than 16Kb. (Not strictly true, because the data is allowed to be split into blocks, but you need to assume that there may be 32Kb blocks in there.) So just set buffer_size = 32768 and you should be OK.
i am trying to read file from a stream.
and i am using stream.read method to read the bytes. So the code goes like below
FileByteStream.Read(buffer, 0, outputMessage.FileByteStream.Length)
Now the above gives me error because the last parameter "outputMessage.FileByteStream.Length" returns a long type value but the method expects an integer type.
Please advise.
Convert it to an int...
FileByteStream.Read(buffer, 0, Convert.ToInt32(outputMessage.FileByteStream.Length))
It's probably an int because this operation blocks until it's done reading...so if you're in a high volume application, you may not want to block while you read in a massively large file.
If what you're reading in isn't reasonably sized, you may want to consider looping to read the data into a buffer (example from MSDN docs):
//s is the stream that I'm working with...
byte[] bytes = new byte[s.Length];
int numBytesToRead = (int) s.Length;
int numBytesRead = 0;
while (numBytesToRead > 0)
{
// Read may return anything from 0 to 10.
int n = s.Read(bytes, numBytesRead, 10);
// The end of the file is reached.
if (n == 0)
{
break;
}
numBytesRead += n;
numBytesToRead -= n;
}
This way you don't cast, and if you pick a reasonably large number to read into the buffer, you'll only go through the while loop once.