Converting MP3 Duration to Megabytes - math

If I have an MP3 that has a duration of 3:02 with a bitrate of 192kbps is it possible to get an approximate, or exact, size of the file programatically?
So, taking the 192kbps and multiplying by 182 seconds (3:02) gives
192 x 182 = 34944
Convert that to megabytes and you get 4.26562
In PHP:
($this->duration * $this->bitrate) / 8192;
Is it safe to assume that the approximate filesize of the given MP3 would be 4.2 megabytes?

Yes, you're absolutely right. I even found a forum with a similar discussion ending with the same conclusion. It contains interesting examples : http://www.wjunction.com/5-general-discussion/80348-calculate-mp3-time-length-bitrate-file-size.html

Yes, excluding the metadata. Unless it includes lyrics and a thumbnail, the base estimate should be accurate.

Related

How does sound data look like?

I read how sounds represented with numbers in computer here.
And I figured out that usual representation is that, we get 44,100 numbers between [-32767, 32767] per second.
Then to my imagination, there's got to be a big one-column matrix, right?
I'm a R user, so speaking in R, sound data of 3 seconds would be,
s <- 3
sound <- matrix(0, ncol = 1, nrow = 44100 * s)
nrow(sound)
#> [1] 132300
one-column matrix with 132,300 rows.
Is this really the case?
I want some analogous picture in my head, say, in case of a picture with 256 * 256,
if we RGB that picture, we get 3 matrices each with 256 * 256.
And in the case of sounds, we get a long long column? As I think about this again, it's not even a matrix after all. It's a column.
Am I right? I can't find any similar dataset searching Internet.
Any advices will be welcomed. Thanks.
The raw format that is created early in that linked question could look a lot like a single dimension array. And probably the signal that is sent to the speaker to make the sound could be represented similarly.
But you're unlikely to find a file on your computer that looks like that for several reasons:
Sound can be stored at different bit depth - that is how many bits for each 'number' CD Audio tracks have a 16 bit depth, but you could have 8 or 32 bits etc. In a straight stream of these numbers you need some how to know how far to read to the next number, so that information needs to be safed somewhere.
Sample rate can vary. If you've got a sequence of numbers representing an audio signal, then you need to know how long each number lasts for.
mostly sounds are more complex. Instead of a single source, you have stereo, or 5 channel, or whatever, so the system needs to be able to store / decode multiple pieces of information for the sounds you want to hear at a particular time
much of sound is repetitive, and so can often benefit from compression.
So most sounds are stored in a compressed format that includes wrapper information about how to decode it. The wrapper information includes how to decode the different audio channels, what sort of compression was used etc.
The closest you're likely to find are a .wav file (Windows) or .aiff (Mac). But even these include some metadata (sample rate and bit depth to start).

fast performance when reading multiple *.gif images

My computer (i5-6500 3.2 GHZ, 8 GB RAM) takes a long time: something like 10 minutes (havent yet measured exactly).
i currently have to
read 400 images. (*.gif format, should all be b&w, resolution of approx. 200*400 px.) (3520 images in total)
i want to "add" all images "cell-wise".
here is how im doing it at the moment: Read image with raster than turn it into matrix, then sum it.
library(rgdal)
library(raster)
library(magrittr)
oldPic <- raster("initalImage.gif") %>% as.matrix
for (pat_IND in currSide) {
newPic <- raster(pat_IND) %>% as.matrix
oldPic <- oldPic + newPic
}
This takes for ever. I used caTools::read.gif() which was even much slower. Do i have a bottle neck in my code? Is there a faster implementation?
Edit: Image Properties
i use "no dither", mono palette (b&w).
Edit2
i want to add the images pixel-wise. Lets take pic A and pic B.
A + B = C. If A(1,1) = 1 and B(1,1) = 1, C(1,1) should be 2. Its a simple matrix addition.
test image:
reading with raster takes 0.03699994 secs
reading with raster + as.matrix takes: 0.201 secs
you need to measure... without any sample image is hard to say and we can only guess. You need to take into account that loading/decoding JPG take time in milliseconds and encoding of GIF can be time consuming even 200 ms. Depends on kind of encoding. To speed up GIF encoding you can:
use single global palette + dithering
GIF is 8 bpp and JPG is 24 bpp so your encoder needs to do the transformation. That is called color quantization and is the most expensive operation while encoding which can take even ~200 ms per frame on average PC machine in well optimized C++ code. for more info see:
Effective gif/image color quantization?
To remedy this you can use single palette dedicated to dithering (like default VGA or use some WEB palette they have the same purpose) and use dithering with is much much faster. See:
simple and fast Dithering
btw if you need to preserve colors take a look at this:
Images lose quality after saving as GIF
So try to find out how to configure your encoder to force dithering instead of color quantization based on K-means or similar ....
limit encoding dictionary to less then 4096
The encoding/decoding is based on creating dictionary and encoding need to search it more than once on per pixel basis. So lovering its size to 1024 gets significant boost to speed. Of coarse you need to access to encoding code to change this unless this can be configured somehow in it... The compression will be decreased by this however and more clear codes will be present in the stream.
use multi-threading
you can fully parallelize this and encode with each core present in your system.
I strongly recommend you to measure how long it take to encode single frame of GIF. If you take advantage on both bullets #1,#2 then I estimate you can get near times around ~5 ms per frame with dithering and ~60 ms per frame with fast quantization. So with 3520 frames it would take around 17.6 or 211.2 seconds just to encode GIF so add the file memory and JPG manipulation and take into account all is heavily guessed/estimated as you did not provide sample data. And divide by number of cores if you use #3 +/- shared disc access waits.

How can a file be compressed to less than its entropy?

I created a file that contains 100,000 numbers that were drawn uniformly (with probability 1/8) from the set {1,2,3,4,5,6,7,8}.
When a look at the size of this file on my hard-disk it is 293 KB (kilo-byte) which makes sense because one needs 3 bits to "identify" a number between 1 and 8 and 3*100,000 = 300 KB.
Next I compress the file using Win-zip and find that the file is reduced to only 57 KB ! How can this be since I expect that the random-number generator I used for my draws is - for all practical purposes - ideal. This means that the sequence should be truly random and the size of the file should therefore be given by its entropy ( which is 300 KB)?
I am afraid you are confused about certain concepts.
3 bits times 100,000 gives you 300,000 bits, and there are 8 bits to the byte, which corresponds to roughly 37.5 KB. That's a far cry from 300 KB.
(And in any case, if you were to create "a file that contains 100,000 numbers", there is no magic fairy sitting on your hard disk, who will figure out the min & max range of your numbers, and store them in the file using the smallest number of bits necessary to represent them all.)
So, it is very important to get it out of the way that 300 KB has absolutely nothing to do with the entropy of 100,000 single-digit numbers.
You told us absolutely nothing about how you created that file, so its file format is a mystery, but we can make some simple calculations and guesses: 293 KB times 1024 is 300,000, so what you have is a 300,000 byte file. Which means that you are writing 3 bytes per number. Which means that you have written these numbers as text, in a text file, either each digit followed by a comma, then followed by a space, or each digit followed by a carriage return and a linefeed, or something similar.
Text file formats are extremely wasteful in terms of storage space.
So, yes, this is a highly compressible file consisting mostly of identical bytes, and even the bytes that are not identical (the digits) all map to just 3 bits each, so it is no wonder that the entire file gets compressed so well.
No laws of nature were harmed during the making of this question.

Storing a BMP image in a QR code

I'm trying to create (or, if I've somehow missed it in my research, find) an algorithm to encode/decode a bmp image into/from a QR code format. I've been using a guide (Thonky) to try to understand the basics of QR codes and I'm still not sure how to go about this problem, specifically:
Should I encode the data as binary or would numeric be more reasonable (assuming each pixel will have a max. value of 255)?
I've searched for information on the structured append capabilities of QR codes but haven't found much detail beyond the fact that it's supported by QR codes -- how could I implement/utilize this functionality?
And, of course, if there are any tips/suggestions to better store an image as binary data, I'm very open to suggestions!
Thanks for your time,
Sean
I'm not sure you'll be able to achieve that, as the amount of information a QR Code can hold is quite limited.
First of all, you'll probably want to store your image as raw bytes, as the other formats (numeric and alphanumeric) are designed to hold text/numbers and would provide less space to store your image. Let's assume you choose the biggest possible QR Code (version 40), with the smallest level of error correction, which can hold up to 2953 bytes of binary information (see here).
First option, as you suggest, you store the image as a bitmap. This format allows no compression at all and requires (in the case of an RGB image without alpha channel) 3 bytes per pixel. If we take into account the file header size (14 to 54 bytes), and ignore the padding (each row of image data must be padded to a length being a multiple of 4), that allows you to store roughly 2900/3 = 966 pixels. If we consider a square image, this represents a 31x31 bitmap, which is small even for a thumbnail image (for example, my avatar at the end of this post is 32x32 pixels).
Second option, you use JPEG to encode your image. This format has the advantage of using a compression algorithm that can reduce the file size. This time there is no exact formula to get the size of an image fitting in 2.9kB, but I tried using a few square images and downsizing them until they fit in this size, keeping a good (93) quality factor: this gives an average of about 60x60 pixel images. (On such small images, it's normal not to see an incredible compression factor between jpeg and bmp, as the file header in a jpeg file is far larger than in a bmp file: about 500 bytes). This is better than bitmap, but remains quite small.
Finally, even if you succeed in encoding your image in this QR Code, you will encounter an other problem: a QR Code this big is very, very hard to scan successfully. As a matter of fact, this QR Code will have a size of 177x177 modules (a "module" being a small white or black square). Assuming you scan it using a smartphone providing so-called "HD" frames (1280x720 pixels), each module will have a maximum size on the frame of about 4 pixels. If you take into account the camera noise, the aliasing and the blur due to the fact that the user is never perfectly idle when scanning, the quality of the input frames will make it very hard for any QR Code decoding algorithm to successfully get the QR Code (don't forget we set its error correction level on low at the beginning of this!).
Even though it's not very good news, I hope this helps you!
There is indeed a way to encode information on several (up to 16) QR Codes, using a special header in your QR Codes called "Structured append". The best source of information you can use is the norm about QR Codes (ISO 18004:2006); it's possible (but not necessarily easy) to find it for free on the web.
The relevant part (section 9) of this norm says:
"Up to 16 QR Code symbols may be appended in a structured format. If a symbol is part of a Structured Append message, it is indicated by a header block in the first three symbol character positions.
The Structured Append Mode Indicator 0011 is placed in the four most significant bit positions in the first symbol character.
This is immediately followed by two Structured Append codewords, spread over the four least significant bits of the first symbol character, the second symbol character and the four most significant bits of the third symbol character. The first codeword is the symbol sequence indicator. The second codeword is the parity data and is identical in all symbols in the message, enabling it to be verified that all symbols read form part of the same Structured Append message. This header is immediately followed by the data codewords for the symbol commencing with the first Mode Indicator."
Nevertheless, i'm not sure most QR Code scanners can handle this, as it's a quite advanced feature.
You can define a fixed image size, reduce jpg header parts and using just vital information about it, so you can save up to 480bytes of a ~500bytes normal header.
I was using this method to store people photos for a small-club ID cards, images about 64x64 pixels is enough.

Converting bytes to megabytes

I've seen three ways of doing conversion from bytes to megabytes:
megabytes=bytes/1000000
megabytes=bytes/1024/1024
megabytes=bytes/1024/1000
Ok, I think #3 is totally wrong but I have seen it. I think #2 is right, but I am looking for some respected authority (like W3C, ISO, NIST, etc) to clarify which megabyte is a true megabyte. Can anyone cite a source that explicitly explains how this calculation is done?
Bonus question: if #2 is a megabyte what are #1 and #3 called?
BTW: Hard drive manufacturers don't count as authorities on this one!
Traditionally by megabyte we mean your second option -- 1 megabyte = 220 bytes. But it is not correct actually because mega means 1 000 000. There is a new standard name for 220 bytes, it is mebibyte (http://en.wikipedia.org/wiki/Mebibyte) and it gathers popularity.
There's an IEC standard that distinguishes the terms, e.g. Mebibyte = 1024^2 bytes but Megabyte = 1000^2 (in order to be compatible to SI units like kilograms where k/M/... means 1000/1000000). Actually most people in the IT area will prefer Megabyte = 1024^2 and hard disk manufacturers will prefer Megabyte = 1000^2 (because hard disk sizes will sound bigger than they are).
As a matter of fact, most people are confused by the IEC standard (multiplier 1000) and the traditional meaning (multiplier 1024). In general you shouldn't make assumptions on what people mean. For example, 128 kBit/s for MP3s usually means 128000 bits because the multiplier 1000 is mostly used with the unit bits. But often people then call 2048 kBit/s equal to 2 MBit/s - confusing eh?
So as a general rule, don't trust bit/byte units at all ;)
Divide by 2 to the power of 20, (1024*1024) bytes = 1 megabyte
1024*1024 = 1,048,576
2^20 = 1,048,576
1,048,576/1,048,576 = 1
It is the same thing.
BTW: Hard drive manufacturers don't count as authorities on this one!
Oh, yes they do (and the definition they assume from the S.I. is the correct one). On a related issue, see this post on CodingHorror.
for convert byte to megabyte(MB)
use totalbyte/1000/1000
for convert byte to mebibyte (MiB)
use totalbyte/1024/1024
https://en.wikipedia.org/wiki/Byte#Multiple-byte_units
The answer is that #1 is technically correct based on the real meaning of the Mega prefix, however (and in life there is always a however) the math for that doesn't come out so nice in base 2, which is how computers count, so #2 is what people really use.
Megabyte means 2^20 bytes. I know that technically that doesn't mesh with the SI units, and that some folks have come up with a new terminology to mean 2^20. None of that matters. Efforts to change the language to "clarify" things are doomed to failure.
Hard-drive manufacturers use it to mean 1,000,000 bytes, because that's what it means in SI so they figure technically they aren't lying (while actually they are). That falls under lies, damn lies, and marketing.
Use the computation your users will most likely expect. Do your users care to know how many actual bytes are on a disk or in memory or whatever, or do they only care about usable space? The answer to that question will tell you which calculation makes the most sense.
This isn't a precision question as much as it is a usability question. Provide the calculation that is most useful to your users.
In general, it's wrong to use decimal SI prefixes (e.g. kilo, mega) when referring to binary data sizes (except in casual usage). It's ambiguous and causes confusion. To be precise you can use binary prefixes (e.g. 1 mebibyte = 1 MiB = 1024 kibibytes = 2^20 bytes). When someone else uses decimal SI prefixes for binary data you need to get more information before you can know what is meant.
Microsoft Windows Explorer shows file size in the "Properties" window. This is a conversion from the byte count using 2^20

Resources