How to tell Byte size of MP3?


_PJ_(Posted 2010) [#1]
Does anybody have a clue how to work out whether an mp3 is encoded at 16, 24, or maybe even 8 or 32-bit?

There don't appear to be details in the Audioframe headers for this specifically, except for the frame bitrate and the samplerate.
If there's a link between these values, that might help.

The only other thing I can think of is that it's somehow part of the CODEC information field ???


Rob the Great(Posted 2010) [#2]
You're pretty brave to do this with an .mp3 file. I have a hard enough time trying to figure out the header on a .wav file, and that's uncompressed.

I did a quick search, and I found the link below, which may be helpful. Not sure if you've seen the same thing or not, but other than this, there's not much I can do to help.

http://www.mp3-tech.org/programmer/frame_header.html


Ginger Tea(Posted 2010) [#3]
I'm not clicking on the link to find out, but it looks like the one I posted for darkshadowwing way back when (I don't call him rez as I might type ruz and cause confusion)
it gave me a headache and I went cross-eyed at it a lot
especially as each second or so had its own frame info instead of a global header like a wav file


stanrol(Posted 2010) [#4]
filesize ?


_PJ_(Posted 2010) [#5]
No, stanrol, thanks, but what I'm really after is the "Word Size", which I suppose is the more appropriate term. As in, how many bits (and therefore bytes) are required for each piece of information.

i.e. 8-bit, 16-bit, 24 or 32 etc.

This is especially important for decoding stereo, since the channels are encoded in a manner like so:

Channel 1 Word 1
Channel 2 Word 1
Channel 1 Word 2
Channel 2 Word 2
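
For reference, a minimal Python sketch (not Blitz3D) of splitting that interleaved layout into two channels. It assumes uncompressed 16-bit little-endian PCM such as the data chunk of a typical WAV file; the function name and the sample values are made up for illustration, and this layout is not what is stored inside MP3 frames.

```python
import struct

def deinterleave_stereo_16bit(pcm_bytes):
    """Split interleaved 16-bit little-endian stereo PCM into two channel lists."""
    count = len(pcm_bytes) // 2  # number of complete 16-bit words
    words = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    channel_1 = list(words[0::2])  # Channel 1 Word 1, Channel 1 Word 2, ...
    channel_2 = list(words[1::2])  # Channel 2 Word 1, Channel 2 Word 2, ...
    return channel_1, channel_2

# Hypothetical data: two stereo sample pairs.
sample = struct.pack("<4h", 1, -1, 2, -2)
print(deinterleave_stereo_16bit(sample))
```

With a different word size only the unpack format and the divisor change, which is why the word size matters so much for raw PCM.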

Thanks for the link, Rob and Noel, yes it is the same (at least, contains the same description).
The actual format isn't so bad once you get into it. In fact, I find it more meaningful than OGG, though of course not as straightforward as WAV, which is to be expected. The audioframes are all labelled and defined, so one can find them by iterating over the bytes within the file (or bank). But as you could see on that webpage, there doesn't appear to be any reference at all to the word length.

I've been able to extract the audioframe data with all its relevant details no problem, but to actually identify the waveform bytes themselves is a stumbling block due to not knowing how many bytes to read in sequence each time.
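
The byte-by-byte search for audioframes mentioned above can be sketched roughly as follows in Python (not Blitz3D). The function name is hypothetical. It only looks for the 11-bit sync pattern (a 0xFF byte followed by a byte whose top three bits are set), so it can report false positives inside tags or audio data; a real scanner would also validate the rest of the header.

```python
def find_sync_candidates(data):
    """Return offsets where an 11-bit frame sync pattern appears in the bytes."""
    offsets = []
    for i in range(len(data) - 1):
        # Sync: first byte all ones, top three bits of the second byte set.
        if data[i] == 0xFF and (data[i + 1] & 0xE0) == 0xE0:
            offsets.append(i)
    return offsets

# Hypothetical buffer: 7 bytes of junk, then one frame header, then zeros.
buffer = b"ID3junk" + bytes([0xFF, 0xFB, 0x90, 0x00]) + bytes(4)
print(find_sync_candidates(buffer))
```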

The need for so many individual frames, each with its own header, not only gives greater security against corruption but also allows for more VBR control, a key feature for increasing compression with less quality loss. It can seem complex and awkward, but once you separate the frames, it's really no different from working with a number of different files, such as in an archive.

The webpage itself only covers the frame headers; it's the actual audio data following each frame header that I need. The note at the end of the description says:

Frames may also feature an optional CRC checksum. It is 16 bits long and, if it exists, immediately follows the frame header. After the CRC comes the audio data. By re-calculating the CRC and comparing its value to the stored one, you can check if the frame has been altered during transmission of the bitstream.

This does, however, introduce a new complexity: whether or not the CRC is included determines whether the audio data starts straight after the header or 16 bits later.
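
A rough Python sketch (not Blitz3D) of reading the header fields discussed here and working out where the data after the header begins. It assumes MPEG-1 Layer III only, uses the bitrate and sample-rate tables for that combination, and rejects everything else; the function and key names are invented for this example. Note that in Layer III what follows the header (and optional CRC) is side information and Huffman-coded frequency data, not fixed-width sample words.

```python
# Lookup tables for MPEG-1 Layer III only; None marks free/invalid indices.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]

def parse_mpeg1_layer3_header(header):
    """Decode a 4-byte frame header; raise ValueError if unsupported."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    b0, b1, b2 = header[0], header[1], header[2]
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        raise ValueError("no frame sync")
    version = (b1 >> 3) & 0x03  # 0b11 means MPEG-1
    layer = (b1 >> 1) & 0x03    # 0b01 means Layer III
    if version != 0b11 or layer != 0b01:
        raise ValueError("this sketch only handles MPEG-1 Layer III")
    has_crc = (b1 & 0x01) == 0  # protection bit 0 means a CRC follows
    bitrate_kbps = BITRATES_KBPS[(b2 >> 4) & 0x0F]
    sample_rate = SAMPLE_RATES_HZ[(b2 >> 2) & 0x03]
    padding = (b2 >> 1) & 0x01
    if bitrate_kbps is None or sample_rate is None:
        raise ValueError("free-format or reserved value")
    # Frame length in bytes, header included.
    frame_length = 144 * bitrate_kbps * 1000 // sample_rate + padding
    return {
        "bitrate_kbps": bitrate_kbps,
        "sample_rate": sample_rate,
        "has_crc": has_crc,
        "data_offset": 4 + (2 if has_crc else 0),  # skip 16-bit CRC if present
        "frame_length": frame_length,
    }

print(parse_mpeg1_layer3_header(bytes([0xFF, 0xFB, 0x90, 0x00])))
```

The frame length gives the offset of the next header, so frames can be walked one after another without knowing any word size.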

Again, I'm left assuming that the WordLength must be keyed into the actual codec reference itself - OR - there are no distinguished words as such, but only a sequence of bits (i.e. Bytes may be represented as nibbles, bytes or doublebytes etc. as and when required) but even so, there must be something to "tell" the decoder what length to use when, surely?


Adam Novagen(Posted 2010) [#6]
Try this:

Anatomy of an MP3 File: http://www.mp3-converter.com/mp3codec/mp3_anatomy.htm

I basically just typed "mp3 file structure" into Google. Google is your friend, c'mon people... It's not hard... XD


_PJ_(Posted 2010) [#7]
Well, Adam, I looked on that site, and aside from the standard header details, nothing seemed to indicate what I was after.
Just the usual fileheader and frameheader info, which for some reason doesn't seem to include the size of the bytes.

I'm glad you found it so simple, but commenting that "it's not hard" when I certainly find it so invites the obvious response: please could you write something for me then, since I've pored over many such documents but the actual information I'm looking for still eludes me.

The only possible conclusion is that the words are always 32 bits, and only those with the private or reserved bits in use restrict the audio data to 16 or 24 bits or such???


jfk EO-11110(Posted 2010) [#8]
What is really confusing me is the fact that, for example, an MP3 file may be encoded at 128 kbps, with say 44100 Hz. Now, when you divide 128000 by 44100, you should get the bit resolution for a single amplitude, no? But this is only about 3 bits. If you assume 128000 is meant for the (psychoacoustically and otherwise) packed data, and that it might be ten times bigger once extracted and ready to be played at 44.1 kHz, then it really could be 32 bit.
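
The division above can be checked with a few lines of Python. The comparison figure assumes the decoded audio is 16-bit stereo PCM at 44.1 kHz (the CD-style format), which is an assumption for illustration rather than something stored in the MP3 itself.

```python
mp3_bitrate = 128_000  # bits per second of compressed data
sample_rate = 44_100   # samples per second, per channel

# Compressed bits available per sample instant (both channels together).
bits_per_sample_instant = mp3_bitrate / sample_rate

# Assumed decoded format: 16 bits per sample, 2 channels.
pcm_bitrate = sample_rate * 16 * 2

# How much bigger the decoded stream is than the compressed one.
expansion_ratio = pcm_bitrate / mp3_bitrate

print(round(bits_per_sample_instant, 2), pcm_bitrate, round(expansion_ratio, 3))
```

So the "about 3 bits" figure and the guess of roughly ten times more data after decoding are both in the right region under these assumptions.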


_PJ_(Posted 2010) [#9]
Well, jfk, the value given as the bitrate is an average. Each audioframe has a separate bitrate (perhaps some audioframes are less 'noisy' than others etc.) - bloody longwinded way of doing things, but it gives greater precision when compressing, at the expense of taking a bit longer to process.
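
As a small illustration of the averaging idea in Python: if every frame covers the same length of audio, the overall bitrate of a VBR file is just the mean of the per-frame bitrates. The list of frame bitrates below is invented for the example, and the function name is hypothetical.

```python
def average_bitrate_kbps(frame_bitrates_kbps):
    """Mean bitrate, assuming every frame covers the same duration of audio."""
    if not frame_bitrates_kbps:
        raise ValueError("no frames")
    return sum(frame_bitrates_kbps) / len(frame_bitrates_kbps)

# Hypothetical VBR stream: quiet frames use fewer bits than busy ones.
frames = [128, 64, 192, 32]
print(average_bitrate_kbps(frames))
```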

Also, this bitrate is shared so each channel will only have half of that.

The frequency does not necessarily correspond to the bitrate, since the audio frequency can be thought of as the sound 'resolution'. As an example, I'll use a graphics analogy.

Say you have a widescreen, fancy monitor with a resolution of 64000x19200 or so.
That's pretty high res, and should give a real fine picture...
But if you're looking at an image of a snowfield, a huge chunk might just be white. 255,255,255 pixels which don't matter if there are 100 of them in a row or just 10, the result is the same, just a white blob.
This white blob, when encoded, can be streamed at a pretty fast (high) bitrate, since it is unchanging for x pixels.
This way, the resolution has little obvious effect on the bitrate and vice versa. This happens throughout the process, but is dependent on the OVERALL average bitrate due to the compression level chosen.

Also, remember that 128 Kbits per second is only 192 Bytes per second, not to be confused with 6 kilobytes.


jfk EO-11110(Posted 2010) [#10]
Hu? 128 kBits only 192 Bytes? Are you serious? I thought its 16 Kilobytes, no? Where 44.1 kHz at 32 Bit takes 192 KB/s - are you mixing these 2?
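
The unit conversion in question, written out in Python. It assumes "k" in kbps means 1000 and takes 4 bytes per sample instant (either one 32-bit sample or a 16-bit stereo pair); the variable names are made up for the example.

```python
BITS_PER_BYTE = 8

# 128 kbit/s of compressed data, expressed in bytes per second.
mp3_bytes_per_second = 128 * 1000 // BITS_PER_BYTE

# Uncompressed audio at 44.1 kHz with 4 bytes per sample instant.
pcm_bytes_per_second = 44_100 * 4

# The same uncompressed rate in units of 1024 bytes.
pcm_kib_per_second = pcm_bytes_per_second / 1024

print(mp3_bytes_per_second, pcm_bytes_per_second, round(pcm_kib_per_second, 2))
```

Under these assumptions 128 kbit/s is 16,000 bytes per second, and the uncompressed stream is 176,400 bytes per second (192,000 would correspond to 48 kHz).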


skidracer(Posted 2010) [#11]
"Does anybody have a clue how to resolve whether an mp3 is encoded at 16,24 or maybe even 8 or 32-bit?"


From my understanding most of the magic of mp3 compression is based in the frequency domain.

Put simply, mp3 is not an encoding of amplitude over time and hence the data in an mp3 file has no native bit depth or sample rate to speak of.
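
One way to picture this, as a hedged Python sketch: a decoder reconstructs sample values (here assumed to be floats between -1.0 and 1.0) and only then rounds them to whatever output word size the player asks for. The function name is hypothetical and real decoders may use fixed-point arithmetic and dithering instead.

```python
def quantize(sample, bits):
    """Map a float sample in [-1.0, 1.0] to a signed integer of the given width."""
    max_value = (1 << (bits - 1)) - 1         # e.g. 32767 for 16 bits
    clipped = max(-1.0, min(1.0, sample))     # guard against overshoot
    return int(round(clipped * max_value))

# The same decoded value written out at three different word sizes.
for bits in (8, 16, 24):
    print(bits, quantize(1.0, bits))
```

The bit depth is therefore a property of the decoder's output, chosen at playback time, rather than something recorded in the frame headers.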


_PJ_(Posted 2010) [#12]

"Also, remember that 128 Kbits per second is only 192 Bytes per second, not to be confused with 6 kilobytes."

"Hu? 128 kBits only 192 Bytes? Are you serious? I thought its 16 Kilobytes, no? Where 44.1 kHz at 32 Bit takes 192 KB/s - are you mixing these 2?"


I think you're right. I'm really not sure what I was on about there... I think I mixed up kilobits and kilobytes somewhere, along with the relationship of a Kilobyte being 1024 Bytes or something...

Anyway, yeah, the frequency (resolution) is a constant, that's why the compression can allow for bitrate to change, if a bunch of the 44100 different values in a second of sound can be represented by fewer bits.

This means that the 8, 16 or 32 bit notation for the bytes MUST be linked to the length of sound (a number of milliseconds) that each byte can address?
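
On the question of how much sound a given chunk of data covers, one fixed quantity that can be relied on is that an MPEG-1 Layer III frame decodes to 1152 samples per channel, whatever its bitrate. A short Python sketch of what that means in milliseconds (the function name is made up for the example):

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III, per channel

def frame_duration_ms(sample_rate_hz):
    """Length of audio covered by one frame, in milliseconds."""
    return SAMPLES_PER_FRAME / sample_rate_hz * 1000.0

def frames_per_second(sample_rate_hz):
    """How many frames are needed for one second of audio."""
    return sample_rate_hz / SAMPLES_PER_FRAME

print(round(frame_duration_ms(44100), 3), round(frames_per_second(44100), 3))
```

So at 44.1 kHz each frame covers roughly 26 ms of sound, and a higher bitrate only means more bytes are spent describing that same 26 ms.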