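## 1. Preface

This example is based mainly on the official FFmpeg sample [decode_audio.c](http://ffmpeg.org/doxygen/trunk/decode_audio_8c-example.html). That sample decodes audio in the MP2 compressed format into raw PCM audio and does not involve demuxing. As a result, this example does not call `av_read_frame()` to read AVPacket objects from a container. Instead, it reads raw binary data from the file with `fread()` and then uses `av_parser_parse2()` to split the buffered data into frames and fill the AVPacket's data buffer.

Note also that since FFmpeg 3.1 the audio decoding API has changed from `avcodec_decode_audio4` to `avcodec_send_packet` paired with `avcodec_receive_frame`. If you are still using the old FFmpeg API, parts of this article may not apply to you. In that case, see [Lei Xiaohua's blog](https://blog.csdn.net/leixiaohua1020/article/details/42181571), which uses an older FFmpeg version.

## 2. FFmpeg audio decoding API call flow

The calls are made in the following order, and each one is covered in Section 4: `avcodec_find_decoder()` → `av_parser_init()` → `avcodec_alloc_context3()` → `avcodec_open2()` → `fread()` → `av_parser_parse2()` → `avcodec_send_packet()` → `avcodec_receive_frame()`, followed by a final flush and resource cleanup.

## 3. Functions used

### 3.1 avcodec_find_decoder(): find a decoder

```cpp
AVCodec* avcodec_find_decoder(enum AVCodecID id);
```

This function looks up the decoder that matches a codec ID. In this example my input audio is in `aac` format, so I need the decoder for `AV_CODEC_ID_AAC`. The official sample uses `AV_CODEC_ID_MP2`, so if you use the official code unchanged, your input file must be in MP2 format.

### 3.2 avcodec_open2(): initialize the AVCodecContext with the decoder

```cpp
int avcodec_open2(AVCodecContext *avctx, const AVCodec *codec, AVDictionary **options);
```

- avctx: the AVCodecContext to initialize
- codec: the AVCodec to open the context for (an input parameter)
- options: a dictionary of options. For example, when encoding with libx264, settings such as "preset" and "tune" can be passed through this parameter.

### 3.3 av_parser_parse2(): parse data to obtain an AVPacket

```cpp
int av_parser_parse2(AVCodecParserContext *s,
                     AVCodecContext *avctx,
                     uint8_t **poutbuf, int *poutbuf_size,
                     const uint8_t *buf, int buf_size,
                     int64_t pts, int64_t dts, int64_t pos);
```

- poutbuf: set to point to the parsed compressed frame data (output)
- buf: points to the input compressed data
- Return value: the number of bytes of the input bitstream that were consumed

### 3.4 avcodec_send_packet() and avcodec_receive_frame()

FFmpeg 3.x reworked the decoding interface considerably. The earlier video decoding function avcodec_decode_video2 and audio decoding function avcodec_decode_audio4 were marked deprecated and merged into one unified interface. Decoding is also split into two steps: avcodec_send_packet first, then avcodec_receive_frame. As the names suggest, the first step submits a packet of encoded data and the second retrieves the decoded data. Is this only a change of interface, or is there something else to watch out for? The main point is that one packet may yield zero, one, or several frames, so avcodec_receive_frame must be called in a loop until it returns `AVERROR(EAGAIN)` or `AVERROR_EOF`, as the `decode()` function in Section 5 does.

```cpp
int avcodec_send_packet(AVCodecContext *avctx, const AVPacket *avpkt);
```

- avctx: the AVCodecContext object, the audio/video decoding context that holds the decoder
- avpkt: the encoded audio/video frame data

```cpp
int avcodec_receive_frame(AVCodecContext *avctx, AVFrame *frame);
```

- avctx: same as in the function above
- frame: the decoded audio/video frame data

### 3.5 av_get_bytes_per_sample(): get the number of bytes per sample

```cpp
int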
av_get_bytes_per_sample(enum AVSampleFormat sample_fmt);
```

## 4. Decoding procedure

### 4.1 Find the AAC decoder

```cpp
codec = avcodec_find_decoder(AV_CODEC_ID_AAC);
if (!codec) {
    fprintf(stderr, "Codec not found\n");
    exit(1);
}
```

### 4.2 Initialize the parser

```cpp
parser = av_parser_init(codec->id);
if (!parser) {
    fprintf(stderr, "Parser not found\n");
    exit(1);
}
```

### 4.3 Allocate the codec context

```cpp
// allocate a codec context for the decoder
c = avcodec_alloc_context3(codec);
if (!c) {
    fprintf(stderr, "Could not allocate audio codec context\n");
    exit(1);
}
```

### 4.4 Open the decoder with the AVCodecContext

```cpp
if (avcodec_open2(c, codec, NULL) < 0) {
    fprintf(stderr, "Could not open codec\n");
    exit(1);
}
```

### 4.5 Read data in a loop and decode

```cpp
while (!feof(f)) {
    if (!decoded_frame) {
        // allocate the AVFrame
        if (!(decoded_frame = av_frame_alloc())) {
            fprintf(stderr, "Could not allocate audio frame\n");
            exit(1);
        }
    }

    /* read raw data from the input file */
    data_size = fread(inbuf, 1, AUDIO_INBUF_SIZE, f);
    if (!data_size)
        break;

    /* use the parser to split the data into frames */
    data = inbuf;
    while (data_size > 0) {
        // parse the data to obtain an AVPacket
        ret = av_parser_parse2(parser, c, &pkt->data, &pkt->size,
                               data, data_size,
                               AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);
        if (ret < 0) {
            fprintf(stderr, "Error while parsing\n");
            exit(1);
        }
        data      += ret;
        data_size -= ret;

        if (pkt->size)
            // decode the packet
            decode(c, pkt, decoded_frame, outfile);
    }
}
```

### 4.6 Flush the codec at end of stream

```cpp
pkt->data = NULL;
pkt->size = 0;
decode(c, pkt, decoded_frame, outfile);
```

### 4.7 Release resources

```cpp
fclose(outfile);
fclose(f);

avcodec_free_context(&c);
av_parser_close(parser);
av_frame_free(&decoded_frame);
av_packet_free(&pkt);
```

## 5. Code

The code below follows the official FFmpeg sample [decode_audio.c](http://ffmpeg.org/doxygen/trunk/decode_audio_8c-example.html). The difference is that the official sample performs one extra step inside the loop that reads data:

```cpp
if (data_size < AUDIO_REFILL_THRESH) {
    memmove(inbuf, data, data_size);
    data = inbuf;
    len = fread(data + data_size, 1,
                AUDIO_INBUF_SIZE - data_size, f);
    if (len > 0)
        data_size += len;
}
```
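The refill step can also be tried in isolation. The following is a minimal sketch of the same pattern with no FFmpeg dependency; it is an illustration under assumptions, not code from the official sample. `read_some`, `consume_some`, `drain_with_refill`, and `roundtrip` are hypothetical helpers, the buffer sizes are shrunk so that the refill triggers often, `std::istream::read` stands in for `fread()`, and `consume_some` stands in for `av_parser_parse2()` by simply consuming a fixed number of bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical sizes, far smaller than AUDIO_INBUF_SIZE / AUDIO_REFILL_THRESH.
constexpr std::size_t kInbufSize = 16;
constexpr std::size_t kRefillThresh = 8;

// Stand-in for fread(): reads up to n bytes and returns how many were read.
static std::size_t read_some(std::istream &in, unsigned char *dst, std::size_t n)
{
    in.read(reinterpret_cast<char *>(dst), static_cast<std::streamsize>(n));
    return static_cast<std::size_t>(in.gcount());
}

// Stand-in for av_parser_parse2(): consumes at most max_chunk bytes,
// appends them to out, and returns the number of bytes used.
static std::size_t consume_some(const unsigned char *data, std::size_t size,
                                std::size_t max_chunk, std::string &out)
{
    std::size_t used = std::min(size, max_chunk);
    out.append(reinterpret_cast<const char *>(data), used);
    return used;
}

// Pushes the whole stream through a small fixed buffer using the refill pattern.
std::string drain_with_refill(std::istream &in, std::size_t max_chunk)
{
    unsigned char inbuf[kInbufSize];
    unsigned char *data = inbuf;
    std::string out;

    std::size_t data_size = read_some(in, inbuf, kInbufSize);
    while (data_size > 0) {
        std::size_t used = consume_some(data, data_size, max_chunk, out);
        if (used == 0)
            break;  // demo-only guard so a zero-sized chunk cannot loop forever
        data      += used;
        data_size -= used;

        if (data_size < kRefillThresh) {
            std::memmove(inbuf, data, data_size);  // move leftover bytes to the front
            data = inbuf;
            std::size_t len = read_some(in, data + data_size,
                                        kInbufSize - data_size);
            if (len > 0)
                data_size += len;  // leftover plus fresh bytes form the new buffer
        }
    }
    return out;
}

// Convenience wrapper for trying the loop on an in-memory string.
std::string roundtrip(const std::string &input, std::size_t max_chunk)
{
    std::istringstream in(input);
    return drain_with_refill(in, max_chunk);
}
```

Calling `roundtrip(input, 5)` on any string should give back the input unchanged, which is a quick way to convince yourself that no bytes are lost or duplicated across refills.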
The refill code in the official sample (the `if (data_size < AUDIO_REFILL_THRESH)` block) works as follows: when the amount of buffered data drops below the threshold `AUDIO_REFILL_THRESH`, it moves the leftover bytes to the start of the buffer, reads up to `AUDIO_INBUF_SIZE - data_size` more bytes from the file, and appends them to the leftover bytes to form a new buffer. The loop then continues to feed that buffer through `av_parser_parse2()` into the AVPacket for decoding. This part is hard for beginners to follow; at least I was puzzled when I first read it. So I commented out the whole data reading loop from the original source and rewrote it, which is the code shown in **Section 4.5, Read data in a loop and decode**.

The code below is my modified version of the official sample. It can be run directly to see the result (input: AAC audio; output: raw PCM audio data). I have commented the key parts.

```cpp
/*
 * Copyright (c) 2001 Fabrice Bellard
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */

/**
 * @file
 * audio decoding with libavcodec API example
 *
 * @example decode_audio.c
 */

#include <QCoreApplication>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

extern "C"{
#include <libavutil/frame.h>
#include <libavutil/mem.h>
#include <libavcodec/avcodec.h>
}

#define AUDIO_INBUF_SIZE 20480
#define AUDIO_REFILL_THRESH 4096

static int get_format_from_sample_fmt(const char **fmt,
                                      enum AVSampleFormat sample_fmt)
{
    struct sample_fmt_entry {
        enum AVSampleFormat sample_fmt; const char *fmt_be, *fmt_le;
    } sample_fmt_entries[] = {
        { AV_SAMPLE_FMT_U8,  "u8",    "u8"    },
        { AV_SAMPLE_FMT_S16, "s16be", "s16le" },
        { AV_SAMPLE_FMT_S32, "s32be", "s32le" },
        { AV_SAMPLE_FMT_FLT, "f32be", "f32le" },
        { AV_SAMPLE_FMT_DBL, "f64be", "f64le" },
    };
    *fmt = NULL;

    for (unsigned int i = 0; i < FF_ARRAY_ELEMS(sample_fmt_entries); i++) {
        struct sample_fmt_entry *entry = &sample_fmt_entries[i];
        if (sample_fmt == entry->sample_fmt) {
            /* pick the name that matches the host byte order */
            *fmt = AV_NE(entry->fmt_be, entry->fmt_le);
            return 0;
        }
    }

    fprintf(stderr,
            "sample format %s is not supported as output format\n",
            av_get_sample_fmt_name(sample_fmt));
    return -1;
}

static void decode(AVCodecContext *dec_ctx, AVPacket *pkt, AVFrame *frame,
                   FILE *outfile)
{
    int i, ch;
    int ret, data_size;

    /* send the packet with the compressed data to the decoder */
    ret = avcodec_send_packet(dec_ctx, pkt);
    if (ret < 0) {
        fprintf(stderr, "Error submitting the packet to the decoder\n");
        exit(1);
    }

    /* read all the output frames (in general there may be any number of them) */
    while (ret >= 0) {
        ret = avcodec_receive_frame(dec_ctx, frame);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
            return;
        else if (ret < 0) {
            fprintf(stderr, "Error during decoding\n");
            exit(1);
        }
        data_size = av_get_bytes_per_sample(dec_ctx->sample_fmt);
        if (data_size < 0) {
            /* This should not occur, checking just for paranoia */
            fprintf(stderr, "Failed to calculate data size\n");
            exit(1);
        }
        printf("nb_samples: %d\n", frame->nb_samples);
        /* interleave: for each sample index, write that sample from every
         * channel plane; this assumes a planar sample format */
        for (i = 0; i < frame->nb_samples; i++)
            for (ch = 0;
                 ch < dec_ctx->channels; ch++)
                fwrite(frame->data[ch] + data_size*i, 1, data_size, outfile);
    }
}

int main(int argc, char **argv)
{
    QCoreApplication a(argc, argv);

    const char *outfilename, *filename;
    const AVCodec *codec;
    AVCodecContext *c= NULL;
    AVCodecParserContext *parser = NULL;
    int len, ret;
    FILE *f, *outfile;
    uint8_t inbuf[AUDIO_INBUF_SIZE + AV_INPUT_BUFFER_PADDING_SIZE];
    uint8_t *data;
    size_t   data_size;
    AVPacket *pkt;
    AVFrame *decoded_frame = NULL;
    enum AVSampleFormat sfmt;
    int n_channels = 0;
    const char *fmt;

    if (argc <= 2) {
        fprintf(stderr, "Usage: %s <input file> <output file>\n", argv[0]);
        exit(0);
    }
    filename    = argv[1];
    outfilename = argv[2];

    pkt = av_packet_alloc();

    // find the AAC decoder
    codec = avcodec_find_decoder(AV_CODEC_ID_AAC);
    if (!codec) {
        fprintf(stderr, "Codec not found\n");
        exit(1);
    }

    // initialize the parser
    parser = av_parser_init(codec->id);
    if (!parser) {
        fprintf(stderr, "Parser not found\n");
        exit(1);
    }

    // allocate a codec context for the decoder
    c = avcodec_alloc_context3(codec);
    if (!c) {
        fprintf(stderr, "Could not allocate audio codec context\n");
        exit(1);
    }

    /* open it */
    if (avcodec_open2(c, codec, NULL) < 0) {
        fprintf(stderr, "Could not open codec\n");
        exit(1);
    }

    f = fopen(filename, "rb");
    if (!f) {
        fprintf(stderr, "Could not open %s\n", filename);
        exit(1);
    }
    outfile = fopen(outfilename, "wb");
    if (!outfile) {
        av_free(c);
        exit(1);
    }

    while (!feof(f)) {
        if (!decoded_frame) {
            if (!(decoded_frame = av_frame_alloc())) {
                fprintf(stderr, "Could not allocate audio frame\n");
                exit(1);
            }
        }

        /* read raw data from the input file */
        data_size = fread(inbuf, 1, AUDIO_INBUF_SIZE, f);
        if (!data_size)
            break;

        /* use the parser to split the data into frames */
        data = inbuf;
        while (data_size > 0) {
            ret = av_parser_parse2(parser, c, &pkt->data, &pkt->size,
                                   data, data_size,
                                   AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);
            if (ret < 0) {
                fprintf(stderr, "Error while parsing\n");
                exit(1);
            }
            data      += ret;
            data_size -= ret;

            if (pkt->size)
                decode(c, pkt, decoded_frame, outfile);
        }
    }

    /* decode until eof */
    // the official loop is kept below, commented out, for reference
    // start address of the buffer
//    data = inbuf;
//    // each element is 1 byte; read up to AUDIO_INBUF_SIZE elements
//    data_size = fread(inbuf, 1, AUDIO_INBUF_SIZE, f);

//    while (data_size > 0) {
//        if (!decoded_frame) {
//            if (!(decoded_frame = av_frame_alloc())) {
//                fprintf(stderr, "Could not allocate audio frame\n");
//                exit(1);
//            }
//        }
//        // hand the data to the AVPacket; ret is the number of input bytes consumed
//        ret = av_parser_parse2(parser, c, &pkt->data, &pkt->size,
//                               data, data_size,
//                               AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);
//        printf("ret: %d, data_size: %zu\n", ret, data_size);
//        if (ret < 0) {
//            fprintf(stderr, "Error while parsing\n");
//            exit(1);
//        }
//        data      += ret;
//        data_size -= ret;

//        if (pkt->size)
//            decode(c, pkt, decoded_frame, outfile);
//        // when the buffered data drops below the threshold, read another part of the audio stream;
//        // the newly read data and the leftover data together form the new buffer
//        if (data_size < AUDIO_REFILL_THRESH) {
//            memmove(inbuf, data, data_size);
//            data = inbuf;
//            // store the data read from the file right after the leftover bytes, at data + data_size
//            len = fread(data + data_size, 1,
//                        AUDIO_INBUF_SIZE - data_size, f);
//            if (len > 0)
//                data_size += len;
//        }
//    }

    /* flush the decoder */
    // At end of stream the codec must be flushed, because it may still hold
    // several frames or packets in its internal buffers.
    // Set the packet's data to NULL and its size to 0 to signal end of stream.
    pkt->data = NULL;
    pkt->size = 0;
    decode(c, pkt, decoded_frame, outfile);

    /* print output PCM information, because raw PCM has no metadata */
    // audio sample format
    sfmt = c->sample_fmt;

    // If the sample format is planar, switch to the packed equivalent,
    // because SDL cannot play planar formats.
    if (av_sample_fmt_is_planar(sfmt)) {
        const char *packed = av_get_sample_fmt_name(sfmt);
        printf("Warning: the sample format the decoder produced is planar "
               "(%s). This example will output the first channel only.\n",
               packed ?
               packed : "?");
        sfmt = av_get_packed_sample_fmt(sfmt);
    }

    // FFmpeg 5.1 and later deprecate `channels` in favor of c->ch_layout.nb_channels
    n_channels = c->channels;
    if ((ret = get_format_from_sample_fmt(&fmt, sfmt)) < 0)
        goto end;

    printf("Play the output audio file with the command:\n"
           "ffplay -f %s -ac %d -ar %d %s\n",
           fmt, n_channels, c->sample_rate,
           outfilename);
end:
    fclose(outfile);
    fclose(f);

    avcodec_free_context(&c);
    av_parser_close(parser);
    av_frame_free(&decoded_frame);
    av_packet_free(&pkt);

    // Return directly: a.exec() would enter the Qt event loop, and nothing
    // here ever quits it, so the process would not exit after decoding.
    return 0;
}
```

## 6. References

[1] [A simple analysis of the FFmpeg source code: avcodec_open2()](https://blog.csdn.net/leixiaohua1020/article/details/44117891)

[2] [The latest FFmpeg 3 decoding interface: avcodec_send_packet and avcodec_receive_frame](https://blog.csdn.net/boonya/article/details/80598419)

[3] [FFmpeg study notes: av_parser_parse2()](https://blog.csdn.net/wqwqh/article/details/90713396?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522161404889116780264046549%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=161404889116780264046549&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-1-90713396.first_rank_v2_pc_rank_v29&utm_term=av_parser_parse2)

Last modification: June 15th, 2021 at 06:00 pm

© Reposting permitted with attribution