audio, music and speech

speech

lame

lame 3.100 vs ffmpeg 5.1

Use lame --abr 8 on *.wav or *.mp4 files
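lame's basic usage is one input file and one output file per call, so a wildcard that expands to several files is best handled with a loop. A minimal sketch, assuming a POSIX shell; the function name encode_all_wav and the LAME variable (a hook for substituting another command, e.g. for a dry run) are made up for this example:

```shell
# Encode every WAV file in the current directory to MP3 at ~8 kbit/s ABR.
# LAME is a hypothetical override hook: LAME=echo prints the arguments
# instead of encoding.
encode_all_wav() {
    for f in *.wav; do
        [ -e "$f" ] || continue            # the glob matched nothing
        "${LAME:-lame}" --abr 8 "$f" "${f%.wav}.mp3"
    done
}
```

Call encode_all_wav in the directory holding the recordings; each a.wav gets an a.mp3 next to it.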

codec2, c2enc, c2dec

Compile from https://github.com/drowe67/codec2

Compress, decompress and play a file using Codec 2 at 2400 bit/s:

./src/c2enc 2400 ../raw/hts1a.raw hts1a_c2.bit
./src/c2dec 2400 hts1a_c2.bit hts1a_c2_2400.raw

Play with aplay (its defaults of 8000 Hz and one channel match the decoded output):

aplay -f S16_LE hts1a_c2_2400.raw

Or using Codec 2 in mode 700C (700 bit/s):

./src/c2enc 700C ../raw/hts1a.raw hts1a_c2.bit
./src/c2dec 700C hts1a_c2.bit hts1a_c2_700.raw
aplay -f S16_LE hts1a_c2_700.raw
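To get a feel for what these bitrates mean, here is a small worked calculation of the storage needed for one minute of speech. It assumes the raw input is 8 kHz, 16-bit mono (the format Codec 2 works on); bytes_for is a helper name invented for this sketch. These are nominal figures from the bitrates alone; the .bit file on disk can be somewhat larger if frames are padded to whole bytes.

```shell
# Bytes needed to store a given duration at a given bitrate.
# usage: bytes_for <bit/s> <seconds>
bytes_for() {
    echo $(( $1 * $2 / 8 ))
}

raw_rate=$(( 8000 * 16 ))        # 8 kHz x 16-bit mono = 128000 bit/s

bytes_for "$raw_rate" 60         # uncompressed raw input: 960000
bytes_for 2400 60                # Codec 2 at 2400 bit/s:  18000
bytes_for 700 60                 # Codec 2 700C:           5250
```

So one minute shrinks from roughly 960 kB of raw samples to about 18 kB at 2400 bit/s and about 5 kB at 700C.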

opus

ffmpeg -i a.mp3 -vn -c:a libopus -ac 1 -ar 8000 -b:a 500 -vbr constrained -compression_level 0 -application lowdelay output22.mkv
ffmpeg -i <input> -c:a libopus -ac 1 -ar 16000 -b:a 8K -vbr constrained out.opus

In the second command, -ac 1 sets the output to mono, -ar 16000 sets the sampling rate to 16 kHz, and -b:a 8K sets the bitrate to 8 kbit/s. The constrained variable bitrate mode is used here. In principle, it’s not strictly necessary to downsample and downmix to mono with ffmpeg, as that is something libopus will do on its own to reach the specified bitrate target.
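For comparing several bitrates on the same recording, the second command can be wrapped in a small function. This is only a sketch: to_opus_speech and the FFMPEG variable (a hook for dry runs) are names invented here, and the 8k default simply mirrors the command above.

```shell
# Encode speech to Opus: mono, 16 kHz, constrained VBR.
# usage: to_opus_speech <input> <output.opus> [bitrate, default 8k]
# FFMPEG is a hypothetical override hook: FFMPEG=echo prints the arguments
# instead of encoding.
to_opus_speech() {
    "${FFMPEG:-ffmpeg}" -i "$1" -vn -c:a libopus -ac 1 -ar 16000 \
        -b:a "${3:-8k}" -vbr constrained "$2"
}
```

For example, to_opus_speech talk.wav talk_12k.opus 12k encodes at 12 kbit/s, while omitting the third argument falls back to 8 kbit/s.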

music

Opus can also be used as a high fidelity music compression codec.

Opus has little support on iOS.

text to speech

festival

flite

espeak

mbrola

google

speech to text

google

samsung