Modifying GDCL MP4 Muxer (Incorrect Header/Footer) - directshow

I want to modify the GDCL MP4 Muxer so that it does not send its data to a downstream writer but records it to a file itself.
It would no longer be just a muxer; it would be a writer that contains an MP4 muxer.
But first I have to figure out where the final [muxed] data ends up, so that I can write it to a file.
Where do I have to capture the data in order to get a playable file?
My Attempts:
I added debug output and saw that the muxer calls Append, and that this method calls Replace periodically. I wrote out the buffer [BYTE pBuffer] that is passed to the Append method of MuxOutput. I got binary data which has some headers but is not playable, so either this is the wrong place or I am doing it wrong. Then I checked what calls Append: the FillSpace method and YUVVideoHandler::WriteDescriptor. But I could not get any useful information from the other methods that call Append.
Well, I am able to write the data to a file in the MuxOutput::Replace method. The problem is that the header info and the footer (the tables at the end of the file) are wrong. The other data [payload data] is correct. The playable file recorded by the File Writer starts with 00 00 00 18 [hexadecimal], but my recorded data starts with 00 00 00 08 [hexadecimal]. When I use a hex editor to replace the MP4 header and footer parts with those from the file generated by the File Writer, the two files become identical and mine plays.
What may be the problem?

In Mpeg4Mux::Pause, the MovieWriter is created with a pointer to an AtomWriter interface (in my case, implemented by the output pin by calls to the downstream file-writer filter). All writes to the file are via this interface. The data is written first, and then on stop, the index data (moov chunk) is written and the file header and data chunk headers are updated.

I think your problem is caused by the random file access requirement, which the File Writer filter supports by default. The steps you need to follow are:
1) Create an empty file at the beginning
std::ofstream outFile;
outFile.open("c:\\out.mp4", ios_base::out | ios_base::binary);
outFile.close();
2) Open the file for random access
outFile.open("c:\\out.mp4", ios_base::in | ios_base::out | ios_base::binary);
3) Right after the output pin's Write() method, add these lines (replace position, buffer and bufferSize with suitable variable names)
outFile.seekp(position, ios_base::beg);
outFile.write(buffer, bufferSize);
4) At the end of the recording session (somewhere like a Close() method of the muxer) add
outFile.close();
And you are done.
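The symptom in the question (a stale 00 00 00 08 at the start of the file where 00 00 00 18 is expected) is consistent with positioned writes being appended one after another instead of being applied at their offsets. Here is a minimal Python sketch of the seek-then-write pattern from the steps above. The name `apply_writes` and the sample bytes are illustrative assumptions and are not taken from the GDCL source:

```python
import os
import tempfile


def apply_writes(path, writes):
    """Apply (position, data) pairs to a file, honouring each position."""
    # Step 1: create (or truncate) the file so that it exists.
    with open(path, "wb"):
        pass
    # Step 2: reopen for update; "r+b" allows seeking back without truncating.
    with open(path, "r+b") as out:
        for position, data in writes:
            out.seek(position)  # step 3: go to the offset the muxer asked for
            out.write(data)
    # Step 4: leaving the with-block closes the file.


def demo():
    path = os.path.join(tempfile.mkdtemp(), "out.bin")
    apply_writes(path, [
        (0, b"\x00\x00\x00\x08"),  # placeholder header, written first
        (4, b"payload"),           # sample data
        (0, b"\x00\x00\x00\x18"),  # header rewritten at offset 0 on stop
    ])
    with open(path, "rb") as f:
        return f.read()
```

If the position were ignored and every buffer simply appended, the first four bytes would keep the placeholder value and the corrected header would land at the end of the file.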


Writing chunks of a large HTTP response to disk as soon as chunks arrive, in Squeak

I am trying to download files to disk from squeak.
My method worked fine for small text/html files,
but due to lack of buffering,
it was very slow for the large binary file
Also, after it finished, the file was much larger (113 MB)
than shown on download page (75MB).
My code looks like this:
download: anURL
"download a file over HTTP and save it to disk under a name extracted from url."
| ios name |
name := ((anURL findTokens: '/') removeLast findTokens: '?') removeFirst.
ios := FileStream oldFileNamed: name.
ios nextPutAll: ((HTTPClient httpGetDocument: anURL) content).
ios close.
Transcript show: 'done'; cr.
I have tried [bytes = stream next bufSize. bytes printTo: ios] for fixed size blocks in HTTP response's contentStream using a [stream atEnd] whileFalse: loop, but that garbled the output file with single quotes around each block, and also extra content after the blocks, which looked like all characters of the stream, each single quoted.
How can I implement buffered writing of an HTTP response to a disk file?
Also, is there a way to do this in squeak while showing download progress?
As Leandro already wrote the issue is with #binary.
Your code is nearly correct, I have taken the liberty to run it - now it downloads the whole file correctly:
| ios name anURL |
anURL := ''.
name := ((anURL findTokens: '/') removeLast findTokens: '?') removeFirst.
ios := FileStream newFileNamed: 'C:\Users\user\Downloads\_squeak\', name.
ios binary.
ios nextPutAll: ((HTTPClient httpGetDocument: anURL) content).
ios close.
Transcript show: 'done'; cr.
As for the freezing, I think the issue is that the whole environment runs in a single thread while you are downloading. That means that until the whole file has been downloaded you won't be able to use Squeak.
Just tested in Pharo (easier install) and the following code works as you want:
ZnClient new
url: '';
downloadTo: 'C:\Users\user\Downloads\_squeak'.
The WebResponse class, when building the response content, creates a buffer large enough to hold the entire response, even for huge responses! I think this happens due to code in WebMessage>>#getContentWithProgress:.
I tried to copy data from the input SocketStream of WebResponse directly to an output FileStream.
I had to subclass WebClient and WebResponse, and write two methods.
Now the following code works as required.
| client link |
client := PkWebClient new.
link := 'http://localhost:8000/'.
client download: link toFile: '/home/yo/test'.
I have verified block by block update and integrity of the downloaded file.
I include source below. The method streamContentDirectToFile: aFilePathString is the one that does things differently and solves the problem.
WebClient subclass: #PkWebClient
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'PK'!
!PkWebClient commentStamp: 'pk 3/28/2018 20:16' prior: 0!
Trying to download http directly to file.!
!PkWebClient methodsFor: 'as yet unclassified' stamp: 'pk 3/29/2018 13:29'!
download: urlString toFile: aFilePathString
"Try to download large files sensibly"
| res |
res := self httpGet: urlString.
res := PkWebResponse new copySameFrom: res.
res streamContentDirectToFile: aFilePathString! !
WebResponse subclass: #PkWebResponse
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'PK'!
!PkWebResponse commentStamp: 'pk 3/28/2018 20:49' prior: 0!
To make getContentwithProgress better.!
!PkWebResponse methodsFor: 'as yet unclassified' stamp: 'pk 3/29/2018 13:20'!
streamContentDirectToFile: aFilePathString
"stream response's content directly to file."
| buffer ostream |
stream binary.
buffer := ByteArray new: 4096.
ostream := FileStream oldFileNamed: aFilePathString.
ostream binary.
[stream atEnd]
whileFalse: [buffer := stream nextInBuffer: 4096.
stream receiveAvailableData.
ostream nextPutAll: buffer].
stream close.
ostream close! !
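For comparison, the same block-by-block copy with a progress callback can be sketched in Python. This is not a translation of the Squeak classes above; `copy_in_blocks` and `on_progress` are hypothetical names, and the source is assumed to be any binary object with a `read(n)` method:

```python
import io


def copy_in_blocks(source, target, block_size=4096, on_progress=None):
    """Copy source to target in fixed-size blocks; return the byte count."""
    total = 0
    while True:
        block = source.read(block_size)
        if not block:  # an empty read means end of stream
            break
        target.write(block)  # raw bytes, no printed representation
        total += len(block)
        if on_progress is not None:
            on_progress(total)  # e.g. update a progress bar
    return total
```

The response object returned by `urllib.request.urlopen` can serve as `source`, with a file opened in `"wb"` mode as `target`, so that only one block at a time is held in memory.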

With Qt, how to check if stdin is empty?

I have a Qt program that processes stdin data like this:
QTextStream qtin(stdin);
QString stdindata = qtin.readAll();
QByteArray ba;
ba = stdindata.toUtf8();
QJsonDocument exJSONDoc(QJsonDocument::fromJson(ba));
QJsonObject extRoot;
extRoot = exJSONDoc.object();
QStringList keys;
keys = extRoot.keys();
for (int n=0; n <= keys.count()-1; n++)
qDebug() << extRoot.value(keys[n]).toString();
It works when I call my program like this:
myprogram < ./data.json
But if I call it without any "<" it hangs in qtin.readAll().
How can I check with Qt if the stdin is empty?
(I am assuming a Linux -or at least POSIX- operating system)
QTextStream qtin(stdin);
QString stdindata = qtin.readAll();
This would read stdin till end-of-file is reached. So works with a redirected input like
myprogram < ./data.json
But if I call it without any "<" it hangs ...
But then (that is, if you run myprogram alone) stdin is not empty. It is the same as your shell's stdin, and your program, being the foreground job, is waiting for input on the terminal you are typing in (see also tty(4)). Try (in that case) typing some input on the terminal (which you can end with Ctrl-D to signal an end-of-file condition). Read about job control and "The TTY demystified", and see also termios(3).
Perhaps you could detect that situation with e.g. isatty(3) on STDIN_FILENO. But that won't detect a pipe(7) like
tail -55 somefile | myprogram
You need to define what an empty stdin is for you. I have no idea what that means to you, and I would instead think of myprogram < /dev/null (see null(4)) as the way to get an empty stdin.
Perhaps you should design myprogram so that some program
option (perhaps --ignore-stdin) makes it skip any read from stdin.
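The isatty idea can be sketched as follows, in Python rather than Qt/C++ to keep it short. The function name `load_json_from_stdin` is an assumption made for illustration. As noted above, the check only tells a terminal apart from a redirect or pipe; a pipe that never delivers data would still block in `read()`:

```python
import io
import json
import sys


def load_json_from_stdin(stream=sys.stdin):
    """Return parsed JSON from stream, or None when there is nothing to read."""
    if stream.isatty():
        # Interactive terminal: nothing was redirected, so do not block on it.
        return None
    text = stream.read()  # reads until end-of-file
    if not text.strip():
        # For example: myprogram < /dev/null
        return None
    return json.loads(text)
```

Passing the stream as a parameter keeps the function easy to exercise with an in-memory `io.StringIO`.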
Problem here is readAll. See documentation:
Reads the entire content of the stream, and returns it as a QString.
Avoid this function when working on large files, as it will consume a
significant amount of memory.
So it reads stdin until it encounters end of file, and since stdin is attached to the console you have to signal end of file yourself. Usually that is Ctrl-D on an empty line (Ctrl-Z followed by Enter on Windows).
It is more likely that you want to read stdin line by line.
To allow the user to edit text, the console passes data to the application's standard input only line by line. It was designed like this ages ago, when a computer had only a printer as its user interface (no screen).
Now the question is how to read JSON from a stdin that is connected to the console, without end-of-file information.
I would use some SAX parser, but this would be too complicated for you.
So is there another way to detect the end of the JSON?
You can try this approach (this is the basic idea, not a final solution, so it has a couple of shortcomings):
QFile file(stdin);
QByteArray data = file.peek(largeNumber);
QJsonParseError error;
QJsonDocument doc = QJsonDocument::fromJson(data, &error);
while (doc.isNull() && JSonNotTerminatedError(error.error)) {
    // TODO: wait for new data - it would be best to use readyRead signal
    data = file.peek(largeNumber);
    doc = QJsonDocument::fromJson(data, &error);
}
Where JSonNotTerminatedError returns true for those QJsonParseError::ParseError values (see the linked documentation) that indicate unterminated JSON data.
Now I see QFile doesn't have the required constructor, but the main concept should be clear: read data from stdin and check whether it is a valid JSON document.
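The "keep reading until the document parses" idea can be sketched in Python as well. This is only an illustration under simplifying assumptions: `read_json_document` is a hypothetical helper, and it treats every parse error as "not terminated yet", so malformed input is only reported once the input ends:

```python
import json


def read_json_document(lines):
    """Accumulate lines until the buffered text parses as one JSON document."""
    buffered = ""
    for line in lines:
        buffered += line
        try:
            return json.loads(buffered)  # complete document: stop reading
        except json.JSONDecodeError:
            continue  # assume the document is not terminated yet
    raise ValueError("input ended before a complete JSON document was read")
```

Calling it with `sys.stdin` as `lines` would return as soon as the user has typed a complete document, without waiting for Ctrl-D.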

S3: How to do a partial read / seek without downloading the complete file?

Although they resemble files, objects in Amazon S3 aren't really "files", just like S3 buckets aren't really directories. On a Unix system I can use head to preview the first few lines of a file, no matter how large it is, but I can't do this on S3. So how do I do a partial read on S3?
S3 objects can be huge, but you don't have to fetch the entire thing just to read the first few bytes. The S3 APIs support the HTTP Range: header (see RFC 2616), which takes a byte range argument.
Just add a Range: bytes=0-NN header to your S3 request, where NN is the offset of the last byte you want (the range is inclusive, so this returns NN+1 bytes), and you'll fetch only those bytes rather than read the whole file. Now you can preview that 900 GB CSV file you left in an S3 bucket without waiting for the entire thing to download. See the full GET Object documentation in Amazon's developer docs.
The AWS .NET SDK appears to allow only fixed-ended ranges (re: public ByteRange(long start, long end)). What if I want to start in the middle and read to the end? An HTTP range of Range: bytes=1000- is perfectly acceptable for "start at 1000 and read to the end". I do not believe they have allowed for this in the .NET library.
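A small Python sketch of how such Range values are formed, covering both the closed and the open-ended form. `byte_range` and `range_length` are hypothetical helpers, not part of any AWS SDK:

```python
def byte_range(start, end=None):
    """Build an HTTP Range header value; both offsets are inclusive."""
    if start < 0:
        raise ValueError("start must not be negative")
    if end is None:
        return f"bytes={start}-"  # from start to the end of the object
    if end < start:
        raise ValueError("end must not be smaller than start")
    return f"bytes={start}-{end}"


def range_length(start, end):
    """Number of bytes selected by an inclusive range."""
    return end - start + 1
```

So `bytes=0-99` selects exactly the first 100 bytes, and `bytes=1000-` selects everything from offset 1000 onwards.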
The get_object API has a Range argument for partial reads:
s3 = boto3.client('s3')
resp = s3.get_object(Bucket=bucket, Key=key, Range='bytes={}-{}'.format(start_byte, stop_byte-1))
res = resp['Body'].read()
Using Python you can preview the first records of a compressed file.
Connect using boto:
s3 = boto.connect_s3()
self.bucket = s3.get_bucket(bname, validate=False)
Read the first 20 lines from a gzip-compressed file:
import csv
import io
from gzip import GzipFile
from boto.s3.key import Key
limit = 20  # read the first 20 records
k = Key(self.bucket)
k.key = 'my_file.gz'
gzipped = GzipFile(None, 'rb', fileobj=k)
reader = csv.reader(io.TextIOWrapper(gzipped, newline="", encoding="utf-8"), delimiter='^')
for id, line in enumerate(reader):
    if id >= int(limit):
        break
    print(id, line)
So it's the equivalent of the following Unix command:
zcat my_file.gz|head -20
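The decompress-and-stop-early part can be tried without S3 at all. The sketch below assumes only a binary file-like object holding gzip data (an in-memory buffer here, standing in for the S3 key); `head_gzip` is a hypothetical name:

```python
import gzip
import io
import itertools


def head_gzip(fileobj, count):
    """Return the first `count` text lines of a gzip-compressed stream."""
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as unzipped:
        text = io.TextIOWrapper(unzipped, encoding="utf-8")
        # islice stops pulling lines after `count`, so the remainder of the
        # stream is never decompressed.
        return list(itertools.islice(text, count))
```

Because reading stops after the requested lines, only the leading part of the compressed object has to be fetched and decompressed.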

How do I convert my 5GB 1 liner file to lines based on pattern?

I have a 5 GB file of JSON data that is all on one line, in which each record starts with the pattern "{"created". I need to be able to use Unix commands on my Mac to convert this monster of a one-liner into as many lines as it deserves. Any commands?
ASCII English text, with very long lines, with no line terminators
If you have enough memory you can open the file once with the TextWrangler application (BBEdit's free cousin) and use a regular search/replace on the whole file. Use \r in the replacement to add a return. It will be very slow to open the file, and may even hang if memory is low, but in the end it will probably work. No scripting, no commands, etc. I did this with big SQL files and sometimes it did the job.
You have to replace your line-start string with the same string with \n or \r or \r\n in front of it.
Unclear how it can be a “one liner” file but then each line starts with "{"created", but perhaps python -mjson.tool can help you get started:
cat your_source_file.json | python -mjson.tool > nicely_formatted_file.json
Piping raw JSON through python -mjson.tool will cleanly format it to be more human readable.
OS X ships with both flex and bison, you can use those to write a parser for your data.
You can use PHP as a shell command (if PHP is installed): just save a text file named "myscript" with the appropriate code (I cannot test the code right now, but the idea is as follows)
#!/usr/bin/env php
<?php
$REPLACE_STRING = '{"created'; // anything you like
$keep = strlen($REPLACE_STRING) - 1; // bytes held back between chunks
// open input file with fopen() in read mode
$inFp = fopen('big_in_file.txt', "r");
// open output file with fopen() in write mode
$outFp = fopen('big_out_file.txt', "w+");
$carry = ''; // end of the previous chunk, which may hold half of the string
// while not end of file
while (!feof($inFp)) {
    // read file chunks here with fread() in variable $chunk
    $chunk = $carry . fread($inFp, 8192);
    // add a newline in front of every occurrence
    // (use "\r\n" for Windows end of lines)
    $chunk = str_replace($REPLACE_STRING, "\n" . $REPLACE_STRING, $chunk);
    // problem: the chunk may contain half the string at the end,
    // so hold back the final part and save it for the next iteration
    $carry = $keep > 0 ? (string) substr($chunk, -$keep) : '';
    // write the rest of $chunk to the output file
    fwrite($outFp, substr($chunk, 0, strlen($chunk) - strlen($carry)));
}
// End while: write what is left and close both files
fwrite($outFp, $carry);
fclose($inFp);
fclose($outFp);
After you save it, you must make it executable with sudo chmod a+x ./myscript
and then launch it as ./myscript in the terminal.
After this, the myscript file works like any other Unix command.
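The chunk-boundary problem mentioned in the comments (a chunk ending with half of the search string) can be handled by holding back the last few bytes of every chunk. Here is a Python sketch of that technique; `split_before_pattern` is a hypothetical name, and the pattern is assumed not to contain a newline itself:

```python
import io


def split_before_pattern(src, dst, pattern=b'{"created', chunk_size=1 << 20):
    """Copy src to dst, inserting a newline before each pattern occurrence."""
    keep = len(pattern) - 1  # longest pattern prefix a chunk can end with
    carry = b""
    wrote_anything = False
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        data = (carry + block).replace(pattern, b"\n" + pattern)
        if not wrote_anything and data.startswith(b"\n"):
            data = data[1:]  # avoid an empty first line
        cut = max(len(data) - keep, 0)
        if cut:
            dst.write(data[:cut])
            wrote_anything = True
        carry = data[cut:]  # held back; may be the start of a pattern
    dst.write(carry)
```

Memory use stays at roughly one chunk regardless of the file size, which is the point for a 5 GB input. Open both files in binary mode when using it on real data.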

unix: can i write to the same file in parallel without missing entries?

I wrote a script that executes commands in parallel. I let them all write an entry to the same log file. It does not matter if the order is wrong or entries are interleaved, but I noticed that some entries are missing. I should probably lock the file before writing; however, is it true that if multiple processes try to write to a file simultaneously, it will result in missing entries?
Yes, if different processes independently open and write to the same file, it may result in overlapping writes and missing data. This happens because each process will get its own file pointer, that advances only by local writes.
Instead of locking, a better option might be to open the log file once in an ancestor of all worker processes, have it inherited across fork(), and used by them for logging. This means that there will be a single shared file pointer, that advances when any of the processes writes a new entry.
In a script you should use ">> file" (double greater than) to append output to that file. The interpreter will open the destination in "append" mode. If your program also wants to append, follow the directives below:
Open a text file in "append" mode ("a+") and give preference to printing only full lines (don't do multiple 'print' followed by a final 'println', but print the entire line with a single 'println').
The fopen documentation states this:
The fopen() function opens the file whose pathname is the
string pointed to by filename, and associates a stream with
it.
The argument mode points to a string beginning with one of
the following sequences:
r or rb Open file for reading.
w or wb Truncate to zero length or create file
for writing.
a or ab Append; open or create file for writing
at end-of-file.
r+ or rb+ or r+b Open file for update (reading and
writing).
w+ or wb+ or w+b Truncate to zero length or create file
for update.
a+ or ab+ or a+b Append; open or create file for update,
writing at end-of-file.
The character b has no effect, but is allowed for ISO C
standard conformance (see standards(5)). Opening a file with
read mode (r as the first character in the mode argument)
fails if the file does not exist or cannot be read.
Opening a file with append mode (a as the first character in
the mode argument) causes all subsequent writes to the file
to be forced to the then current end-of-file, regardless of
intervening calls to fseek(3C). If two separate processes
open the same file for append, each process may write freely
to the file without fear of destroying output being written
by the other. The output from the two processes will be
intermixed in the file in the order in which it is written.
It is because of this intermixing that you want to give preference to
using only 'println' (or its equivalent).
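The difference between independent file offsets and append mode can be reproduced in a few lines. This is a sketch assuming a POSIX-like system; `two_writers` is a hypothetical helper that opens the same file twice, as two unrelated processes would, and writes one complete entry through each descriptor:

```python
import os
import tempfile


def two_writers(append):
    """Write one entry through each of two descriptors for the same file."""
    path = os.path.join(tempfile.mkdtemp(), "log.txt")
    flags = os.O_WRONLY | os.O_CREAT
    if append:
        flags |= os.O_APPEND  # every write goes to the current end of file
    first = os.open(path, flags, 0o644)
    second = os.open(path, flags, 0o644)
    try:
        # One os.write call per complete line, like a single println.
        os.write(first, b"entry from A\n")
        os.write(second, b"entry from B\n")
    finally:
        os.close(first)
        os.close(second)
    with open(path, "rb") as f:
        return f.read()
```

Without O_APPEND the second descriptor still points at offset 0 and overwrites the first entry; with it, both entries survive.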
