AudioRecord is the Android class for recording raw PCM audio data. WebRTC's wrapper around it lives in org/webrtc/audio/WebRtcAudioRecord.java. In this section we look at how the AudioRecord is created and started, how captured audio data is read, and how it is destroyed.
Creation and initialization
private int initRecording(int sampleRate, int channels) {
  Logging.d(TAG, "initRecording(sampleRate=" + sampleRate + ", channels=" + channels + ")");
  if (audioRecord != null) {
    reportWebRtcAudioRecordInitError("InitRecording called twice without StopRecording.");
    return -1;
  }
  final int bytesPerFrame = channels * (BITS_PER_SAMPLE / 8);
  final int framesPerBuffer = sampleRate / BUFFERS_PER_SECOND;
  byteBuffer = ByteBuffer.allocateDirect(bytesPerFrame * framesPerBuffer);
  Logging.d(TAG, "byteBuffer.capacity: " + byteBuffer.capacity());
  emptyBytes = new byte[byteBuffer.capacity()];
  // Rather than passing the ByteBuffer with every callback (requiring
  // the potentially expensive GetDirectBufferAddress) we simply have the
  // native class cache the address to the memory once.
  nativeCacheDirectBufferAddress(byteBuffer, nativeAudioRecord);
  // Get the minimum buffer size required for the successful creation of
  // an AudioRecord object, in byte units.
  // Note that this size doesn't guarantee a smooth recording under load.
  final int channelConfig = channelCountToConfiguration(channels);
  int minBufferSize =
      AudioRecord.getMinBufferSize(sampleRate, channelConfig, AudioFormat.ENCODING_PCM_16BIT);
  if (minBufferSize == AudioRecord.ERROR || minBufferSize == AudioRecord.ERROR_BAD_VALUE) {
    reportWebRtcAudioRecordInitError("AudioRecord.getMinBufferSize failed: " + minBufferSize);
    return -1;
  }
  Logging.d(TAG, "AudioRecord.getMinBufferSize: " + minBufferSize);
  // Use a larger buffer size than the minimum required when creating the
  // AudioRecord instance to ensure smooth recording under load. It has been
  // verified that it does not increase the actual recording latency.
  int bufferSizeInBytes = Math.max(BUFFER_SIZE_FACTOR * minBufferSize, byteBuffer.capacity());
  Logging.d(TAG, "bufferSizeInBytes: " + bufferSizeInBytes);
  try {
    audioRecord = new AudioRecord(audioSource, sampleRate, channelConfig,
        AudioFormat.ENCODING_PCM_16BIT, bufferSizeInBytes);
  } catch (IllegalArgumentException e) {
    reportWebRtcAudioRecordInitError("AudioRecord ctor error: " + e.getMessage());
    releaseAudioResources();
    return -1;
  }
  if (audioRecord == null || audioRecord.getState() != AudioRecord.STATE_INITIALIZED) {
    reportWebRtcAudioRecordInitError("Failed to create a new AudioRecord instance");
    releaseAudioResources();
    return -1;
  }
  if (effects != null) {
    effects.enable(audioRecord.getAudioSessionId());
  }
  logMainParameters();
  logMainParametersExtended();
  return framesPerBuffer;
}
This initialization method does two main things.
Creating the buffer
Since the code that actually consumes the data is in the native layer, a Java direct buffer is created here. AudioRecord also provides an interface for reading data into a ByteBuffer, and the code that actually copies the data into the ByteBuffer runs in the native layer as well, so a direct buffer is the more efficient choice.
The ByteBuffer's capacity equals the size of a single read. Android stores audio in packed (interleaved) format: with multiple channels, all channels of one sample point are stored contiguously, followed by the channels of the next sample point; a frame is the set of all channel data for one sample point. Each read fetches 10 ms worth of frames, i.e. the sample rate divided by 100 (a sample count equal to the sample rate corresponds to 1 s of data, so dividing by 100 gives 10 ms). The ByteBuffer capacity is therefore frames × channels × bytes per sample (PCM 16-bit means two bytes per sample); the worked example below makes this concrete.
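To make the arithmetic concrete, here is a small worked example; the 48 kHz stereo figures are illustrative choices of mine, not values fixed by WebRTC:

import java.nio.ByteBuffer;

public class BufferSizeExample {
  public static void main(String[] args) {
    // Illustrative parameters: 48 kHz stereo, PCM 16-bit.
    final int sampleRate = 48000;
    final int channels = 2;
    final int bytesPerFrame = channels * (16 / 8); // 2 channels * 2 bytes = 4 bytes per frame
    final int framesPerBuffer = sampleRate / 100;  // 10 ms of samples = 480 frames
    ByteBuffer byteBuffer = ByteBuffer.allocateDirect(bytesPerFrame * framesPerBuffer);
    System.out.println(byteBuffer.capacity());     // 4 * 480 = 1920 bytes per 10 ms read
  }
}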
The nativeCacheDirectBufferAddress JNI function called here caches the ByteBuffer's memory address in the native layer up front, so there is no need to call an interface to fetch the address each time audio data has been read.
Creating the AudioRecord object. The constructor takes quite a few parameters, analyzed below (a construction sketch follows the list):
audioSource
The audio capture source/mode. The default is VOICE_COMMUNICATION, a mode that engages the hardware AEC (acoustic echo cancellation).
sampleRate
The sample rate.
channelConfig
The channel configuration, converted from the channel count by channelCountToConfiguration.
audioFormat
The audio data format. AudioFormat.ENCODING_PCM_16BIT is used here, i.e. 16-bit PCM.
bufferSize
The buffer size the system uses when creating the AudioRecord. The larger of two values is chosen: twice the minimum buffer size returned by AudioRecord.getMinBufferSize, and the capacity of the ByteBuffer used for reads. As the code comment explains, using twice the minimum ensures capture keeps running smoothly even under high system load, and the larger buffer does not increase capture latency.
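A minimal construction sketch following these parameters; the concrete source, rate, and channel values are assumptions for illustration (WebRTC derives them at runtime), and the factor of two stands in for BUFFER_SIZE_FACTOR:

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

// Illustrative construction; requires the RECORD_AUDIO permission.
AudioRecord createRecorder() {
  int audioSource = MediaRecorder.AudioSource.VOICE_COMMUNICATION; // default source, hardware AEC
  int sampleRate = 48000;                                          // assumed sample rate
  int channelConfig = AudioFormat.CHANNEL_IN_MONO;                 // assumed mono capture
  int minBufferSize = AudioRecord.getMinBufferSize(
      sampleRate, channelConfig, AudioFormat.ENCODING_PCM_16BIT);
  // 10 ms read size: 1 channel * 2 bytes per sample * (sampleRate / 100) frames.
  int byteBufferCapacity = 1 * 2 * (sampleRate / 100);
  // Larger of twice the minimum and the 10 ms read size, as in initRecording above.
  int bufferSizeInBytes = Math.max(2 * minBufferSize, byteBufferCapacity);
  return new AudioRecord(audioSource, sampleRate, channelConfig,
      AudioFormat.ENCODING_PCM_16BIT, bufferSizeInBytes);
}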
Starting
private boolean startRecording() {
  Logging.d(TAG, "startRecording");
  assertTrue(audioRecord != null);
  assertTrue(audioThread == null);
  try {
    audioRecord.startRecording();
  } catch (IllegalStateException e) {
    reportWebRtcAudioRecordStartError(AudioRecordStartErrorCode.AUDIO_RECORD_START_EXCEPTION,
        "AudioRecord.startRecording failed: " + e.getMessage());
    return false;
  }
  if (audioRecord.getRecordingState() != AudioRecord.RECORDSTATE_RECORDING) {
    reportWebRtcAudioRecordStartError(
        AudioRecordStartErrorCode.AUDIO_RECORD_START_STATE_MISMATCH,
        "AudioRecord.startRecording failed - incorrect state :"
            + audioRecord.getRecordingState());
    return false;
  }
  audioThread = new AudioRecordThread("AudioRecordJavaThread");
  audioThread.start();
  return true;
}
This method first starts the audioRecord, then checks that the recorder has actually entered the recording state before spawning the read thread.
Reading data
private class AudioRecordThread extends Thread {
  private volatile boolean keepAlive = true;

  public AudioRecordThread(String name) {
    super(name);
  }

  // TODO(titovartem) make correct fix during webrtc:9175
  @SuppressWarnings("ByteBufferBackingArray")
  @Override
  public void run() {
    Process.setThreadPriority(Process.THREAD_PRIORITY_URGENT_AUDIO);
    Logging.d(TAG, "AudioRecordThread" + WebRtcAudioUtils.getThreadInfo());
    assertTrue(audioRecord.getRecordingState() == AudioRecord.RECORDSTATE_RECORDING);
    long lastTime = System.nanoTime();
    while (keepAlive) {
      int bytesRead = audioRecord.read(byteBuffer, byteBuffer.capacity());
      if (bytesRead == byteBuffer.capacity()) {
        if (microphoneMute) {
          byteBuffer.clear();
          byteBuffer.put(emptyBytes);
        }
        // It's possible we've been shut down during the read, and stopRecording() tried and
        // failed to join this thread. To be a bit safer, try to avoid calling any native methods
        // in case they've been unregistered after stopRecording() returned.
        if (keepAlive) {
          nativeDataIsRecorded(bytesRead, nativeAudioRecord);
        }
        if (audioSamplesReadyCallback != null) {
          // Copy the entire byte buffer array. Assume that the start of the byteBuffer is
          // at index 0.
          byte[] data = Arrays.copyOf(byteBuffer.array(), byteBuffer.capacity());
          audioSamplesReadyCallback.onWebRtcAudioRecordSamplesReady(
              new AudioSamples(audioRecord, data));
        }
      } else {
        String errorMessage = "AudioRecord.read failed: " + bytesRead;
        Logging.e(TAG, errorMessage);
        if (bytesRead == AudioRecord.ERROR_INVALID_OPERATION) {
          keepAlive = false;
          reportWebRtcAudioRecordError(errorMessage);
        }
      }
      if (DEBUG) {
        long nowTime = System.nanoTime();
        long durationInMs = TimeUnit.NANOSECONDS.toMillis((nowTime - lastTime));
        lastTime = nowTime;
        Logging.d(TAG, "bytesRead[" + durationInMs + "] " + bytesRead);
      }
    }
    try {
      if (audioRecord != null) {
        audioRecord.stop();
      }
    } catch (IllegalStateException e) {
      Logging.e(TAG, "AudioRecord.stop failed: " + e.getMessage());
    }
  }

  // Stops the inner thread loop and also calls AudioRecord.stop().
  // Does not block the calling thread.
  public void stopThread() {
    Logging.d(TAG, "stopThread");
    keepAlive = false;
  }
}
The logic that fetches data from the AudioRecord lives in the run() method of the AudioRecordThread.
When the thread starts, it first raises its own priority to URGENT_AUDIO by calling Process.setThreadPriority.
It then loops, calling audioRecord.read to pull captured data into the ByteBuffer, and invokes the nativeDataIsRecorded JNI function to notify the native layer that the data has been read and can be processed further; a condensed sketch of this loop follows.
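Condensed into a self-contained sketch, the same read-loop pattern looks like this (error reporting, mute handling, and the native callbacks are stripped out; the CaptureLoop name and consumer hook are mine, not WebRTC's):

import android.media.AudioRecord;
import android.os.Process;
import java.nio.ByteBuffer;

// Minimal illustration of the AudioRecordThread pattern, not WebRTC's actual class.
class CaptureLoop extends Thread {
  private final AudioRecord audioRecord;
  private final ByteBuffer byteBuffer;
  private volatile boolean keepAlive = true;

  CaptureLoop(AudioRecord audioRecord, ByteBuffer byteBuffer) {
    super("CaptureLoop");
    this.audioRecord = audioRecord;
    this.byteBuffer = byteBuffer;
  }

  @Override
  public void run() {
    // Highest audio priority available to application threads.
    Process.setThreadPriority(Process.THREAD_PRIORITY_URGENT_AUDIO);
    while (keepAlive) {
      // Blocks until one 10 ms buffer is filled, or returns an error code.
      int bytesRead = audioRecord.read(byteBuffer, byteBuffer.capacity());
      if (bytesRead != byteBuffer.capacity()) {
        break; // short read or error; WebRTC logs and reports here
      }
      // Consume byteBuffer here; WebRTC calls nativeDataIsRecorded instead.
      byteBuffer.rewind(); // reset position after any consumer reads
    }
    try {
      audioRecord.stop();
    } catch (IllegalStateException e) {
      // Already stopped or released; nothing to do.
    }
  }

  void stopLoop() {
    keepAlive = false; // loop exits after the in-flight read returns
  }
}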
Stopping and destruction
private boolean stopRecording() {
  Logging.d(TAG, "stopRecording");
  assertTrue(audioThread != null);
  audioThread.stopThread();
  if (!ThreadUtils.joinUninterruptibly(audioThread, AUDIO_RECORD_THREAD_JOIN_TIMEOUT_MS)) {
    Logging.e(TAG, "Join of AudioRecordJavaThread timed out");
    WebRtcAudioUtils.logAudioState(TAG);
  }
  audioThread = null;
  if (effects != null) {
    effects.release();
  }
  releaseAudioResources();
  return true;
}
As you can see, this first sets the keepAlive condition of AudioRecordThread's read loop to false, then calls ThreadUtils.joinUninterruptibly to wait for the AudioRecordThread to exit.
One detail worth mentioning: keepAlive is marked with the volatile keyword, because the variable is written and read on different threads; volatile guarantees that a write becomes visible to subsequent reads immediately, as the sketch below illustrates.
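A minimal sketch of this volatile stop-flag pattern (my own illustration, not WebRTC code):

// Without volatile, the worker may keep seeing a stale cached 'true' and never stop.
class StoppableWorker extends Thread {
  private volatile boolean keepAlive = true;

  @Override
  public void run() {
    while (keepAlive) {
      // ... one unit of work, e.g. a blocking audioRecord.read() ...
    }
  }

  void shutdown() {
    keepAlive = false; // write is immediately visible to run()'s next check
  }
}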
After AudioRecordThread exits its loop, it calls audioRecord.stop() to stop capture; once the thread has exited, releaseAudioResources calls audioRecord.release() to free the AudioRecord object.
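releaseAudioResources itself is not shown above; judging from the behavior described, it amounts to something like the following sketch (reconstructed from the text, not copied from the source):

// Inferred shape of releaseAudioResources, based on the description above.
private void releaseAudioResources() {
  if (audioRecord != null) {
    audioRecord.release(); // frees the underlying native AudioRecord resources
    audioRecord = null;
  }
}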
That, in broad strokes, is the Java-layer flow of Android WebRTC audio capture.
Reference: 《WebRTC 开发实战》
https://chromium.googlesource.com/external/webrtc/+/HEAD/sdk/android/src/java/org/webrtc/audio/WebRtcAudioRecord.java