how to read the original data of opus file into python/numpy array? #76

yongxuUSTC · 2023-05-27T00:07:32Z

I am not familiar with the format of an opus file (created with opusenc in.wav out.opus).
How many bytes are in the header of out.opus? Is everything after that actual encoded audio data (2 bytes per value)? Are there any other non-data bytes at the end?

Could anyone post sample code to help me read out.opus into a python/numpy array in uint16 format? I do not need to decode out.opus; I just need the original encoded data stored in the file.

Thank you very much

rillian · 2023-05-27T01:00:47Z

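It's not as simple as header and tail bytes. The compressed opus audio packets are split into segments, which are grouped under periodic headers that include timestamps for seeking. The format is documented in

https://datatracker.ietf.org/doc/html/rfc8251.html and

https://datatracker.ietf.org/doc/html/rfc3533.html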

I looked briefly and didn't find a pure Python library that mentioned access to the raw encoded data, although several will decode opus files to PCM audio in numpy arrays.

You could, however, use the pyogg.ogg ctypes wrapper to access the libogg C implementation directly and pull the data out that way. It's also possible that some of the higher-level Python libraries have accessible decapsulation functions.
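As an alternative to the ctypes route, the Ogg page layout described in RFC 3533 is regular enough to walk by hand. The following is a minimal sketch, not the pyogg or libogg API: the function name `iter_ogg_packets` is hypothetical, it assumes a single logical stream (which is what a plain opusenc run normally produces), and it does not verify page checksums.

```python
import struct

# Fixed 27-byte Ogg page header (RFC 3533): capture pattern, version, flags,
# granule position, stream serial, page sequence, CRC, segment count.
PAGE_HEADER = struct.Struct("<4sBBqIIIB")


def iter_ogg_packets(data: bytes):
    """Yield each complete packet from a single-stream Ogg byte string.

    Assumptions: one logical stream, no CRC checking, no resync on damage.
    """
    pos = 0
    partial = bytearray()  # holds a packet that continues across segments/pages
    while pos + PAGE_HEADER.size <= len(data):
        magic, _ver, _flags, _granule, _serial, _seq, _crc, nsegs = (
            PAGE_HEADER.unpack_from(data, pos)
        )
        if magic != b"OggS":
            raise ValueError(f"no Ogg page at offset {pos}")
        table_start = pos + PAGE_HEADER.size
        lacing = data[table_start : table_start + nsegs]  # one size byte per segment
        body = table_start + nsegs
        for size in lacing:
            partial += data[body : body + size]
            body += size
            if size < 255:  # a lacing value below 255 terminates the packet
                yield bytes(partial)
                partial.clear()
        pos = body  # the next page starts right after this page's payload


# Usage (file name taken from the question above):
#   with open("out.opus", "rb") as f:
#       packets = list(iter_ogg_packets(f.read()))
```

In an Ogg Opus file the first two packets are the OpusHead and OpusTags headers rather than audio, so the compressed audio packets would be everything from the third packet onward.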

But you mention Uint16 which is confusing. That doesn't make a lot of sense for the compressed opus-encoded data, which is a complex, entropy-coded data structure packed into bytes.
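To illustrate the point about bytes: once the packets have been extracted by some means, each one can be viewed as a uint8 array. This is a minimal sketch under stated assumptions; `packets_to_arrays` is a hypothetical helper and `example_packets` holds made-up bytes standing in for real packet data.

```python
import numpy as np


def packets_to_arrays(packets):
    """Wrap each packet (a bytes object) in a read-only uint8 NumPy array."""
    return [np.frombuffer(p, dtype=np.uint8) for p in packets]


# Made-up payloads for illustration only; real packets come from the file.
example_packets = [b"\xfc\x01\x02", b"\xfc\x03\x04\x05\x06"]
arrays = packets_to_arrays(example_packets)

# Packets differ in length, so they stay a list of 1-D arrays rather than
# one rectangular 2-D array.
lengths = [a.size for a in arrays]

# The first byte of an Opus packet is the TOC byte (RFC 6716); its top five
# bits select the coding configuration.
toc_configs = [int(a[0]) >> 3 for a in arrays]
```

A uint16 view would not fit this data in general: packets can have an odd number of bytes, and `np.frombuffer` raises a ValueError when the buffer size is not a multiple of the element size.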

yongxuUSTC · 2023-05-28T06:50:59Z

> It's not as simple as header and tail bytes. The compressed opus audio packages are split into segments which are grouped with periodic headers including timestamps for seeking. The format is documented in
>
> https://datatracker.ietf.org/doc/html/rfc8251.html and
>
> https://datatracker.ietf.org/doc/html/rfc3533.html
>
> I looked briefly and didn't find a pure python library that mentioned access to the raw encoded data, although several will decode opus files to pcm audio in numpy arrays.
>
> You could however use the pyogg.ogg ctypes wrapper to access the libogg C implementation directly and pull the data out that way. It's also possible some of the higher-level python libraries have accessible decapsulation functions.
>
> But you mention Uint16 which is confusing. That doesn't make a lot of sense for the compressed opus-encoded data, which is a complex, entropy-coded data structure packed into bytes.

Thank you very much for your reply. Is it possible to get a discrete representation (indices or integers) of the codes in the opus file? For example, neural-network-based codecs such as SoundStream (https://arxiv.org/abs/2107.03312) produce a discrete representation through residual vector quantization (RVQ).
