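Mixing 16-bit linear PCM streams and avoiding clipping/overflow

I am trying to mix two 16-bit linear PCM audio streams together, and I can't seem to get rid of the noise. I think the samples are overflowing when they are mixed.

I have the following function...

    short int mix_sample(short int sample1, short int sample2) {
        return #mixing_algorithm#;
    }

...and this is what I have tried as #mixing_algorithm#: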

    sample1/2 + sample2/2
    2*(sample1 + sample2) - 2*(sample1*sample2) - 65535
    (sample1 + sample2) - sample1*sample2
    (sample1 + sample2) - sample1*sample2 - 65535
    (sample1 + sample2) - ((sample1*sample2) >> 0x10) // same as divide by 65535

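Some of them produced better results than others, but even the best result contained quite a lot of noise.

Any ideas on how to solve this?

Here is a descriptive implementation: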

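    #include <cstdint>
    #include <limits>

    short int mix_sample(short int sample1, short int sample2) {
        // Add in 32 bits so the sum itself cannot overflow.
        const int32_t result(static_cast<int32_t>(sample1) + static_cast<int32_t>(sample2));
        typedef std::numeric_limits<short int> Range;
        // Clip the sum back into the 16-bit range.
        if (Range::max() < result)
            return Range::max();
        else if (Range::min() > result)
            return Range::min();
        else
            return static_cast<short int>(result);
    }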

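Mixing is just add and clip!

To avoid clipping artifacts, you will want to use saturation or a limiter. Ideally, you will have a small int32_t buffer with a small amount of lookahead. This will introduce latency.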
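As a rough illustration of the saturation option, here is a minimal sketch that sums in 32 bits and then bends the result smoothly toward full scale instead of cutting it off. The function name `mix_soft_saturate` and the cubic curve `1.5x - 0.5x^3` are assumptions chosen for illustration, not something the answer prescribes; a real limiter with lookahead would be more involved.

```c
#include <stdint.h>

/* Hypothetical helper: mix two samples with a cubic soft-clip curve.
 * The curve maps [-1, 1] onto [-1, 1] and flattens near the ends, so
 * loud sums are compressed gradually rather than hard-clipped. */
int16_t mix_soft_saturate(int16_t s1, int16_t s2)
{
    int32_t sum = (int32_t)s1 + (int32_t)s2;  /* range -65536 .. 65534 */
    double x = sum / 65536.0;                 /* normalize to [-1, 1) */
    double y = 1.5 * x - 0.5 * x * x * x;     /* cubic soft clip */
    return (int16_t)(y * 32767.0);            /* back to 16 bits */
}
```

Note that this curve changes the level of quiet material as well (the overall small-signal gain is 0.75), which is the price paid for never exceeding the 16-bit range.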

More common than limiting everywhere is to leave some "headroom" in the signal.
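One way to picture headroom, as a minimal sketch: attenuate each input by a fixed number of bits before adding, so that the sum of two full-scale samples still fits in 16 bits. The name `mix_with_headroom` and the `headroom_bits` parameter are hypothetical, and the sketch assumes `headroom_bits` is small (well below 31).

```c
#include <stdint.h>

/* Hypothetical helper: attenuate each input by 2^headroom_bits, then add.
 * With headroom_bits = 1, two full-scale inputs cannot overflow 16 bits.
 * Division is used instead of >> because right-shifting a negative value
 * is implementation-defined in C. */
int16_t mix_with_headroom(int16_t s1, int16_t s2, unsigned headroom_bits)
{
    int32_t divisor = (int32_t)1 << headroom_bits;
    int32_t sum = s1 / divisor + s2 / divisor;
    /* This clamp can only trigger when headroom_bits is 0. */
    if (sum > INT16_MAX) sum = INT16_MAX;
    if (sum < INT16_MIN) sum = INT16_MIN;
    return (int16_t)sum;
}
```

The trade-off is that every source gets quieter by the chosen amount, whether or not the other source is loud at that moment.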

The best solution I have found is given by Viktor Toth. He provides a solution for 8-bit unsigned PCM; adapting it to 16-bit signed PCM produces this:

    int a = 111; // first sample (-32768..32767)
    int b = 222; // second sample
    int m;       // mixed result will go here

    // Make both samples unsigned (0..65535)
    a += 32768;
    b += 32768;

    // Pick the equation (products are computed in 64 bits: a * b can exceed INT_MAX)
    if ((a < 32768) && (b < 32768)) {
        // Viktor's first equation when both sources are "quiet"
        // (ie less than middle of the dynamic range)
        m = (int)((long long)a * b / 32768);
    } else {
        // Viktor's second equation when one or both sources are loud
        m = (int)(2LL * (a + b) - (long long)a * b / 32768 - 65536);
    }

    // Output is unsigned (0..65536) so convert back to signed (-32768..32767)
    if (m == 65536) m = 65535;
    m -= 32768;

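Using this algorithm means there is almost no need to clip the output, since it is only one value short of always staying in range. Unlike straight averaging, it does not reduce the volume of one source when the other source is silent.

Here is what I did in my recent synthesizer project.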

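    /* Needs <stdlib.h> (malloc, free, abs) and <limits.h> (SHRT_MAX). */

    /* Sum both streams into an int buffer; the tail of the longer stream is copied as-is. */
    int *unfiltered = (int *)malloc(lengthOfLongPcmInShorts * sizeof(int));
    int i;
    for (i = 0; i < lengthOfShortPcmInShorts; i++) {
        unfiltered[i] = shortPcm[i] + longPcm[i];
    }
    for (; i < lengthOfLongPcmInShorts; i++) {
        unfiltered[i] = longPcm[i];
    }

    /* Find the largest absolute value in the summed signal. */
    int max = 0;
    for (i = 0; i < lengthOfLongPcmInShorts; i++) {
        int val = abs(unfiltered[i]);
        if (val > max) max = val;
    }

    /* Scale so the peak maps to SHRT_MAX. Multiply before dividing,
       otherwise the integer division truncates almost everything to 0. */
    short int *newPcm = (short int *)malloc(lengthOfLongPcmInShorts * sizeof(short int));
    for (i = 0; i < lengthOfLongPcmInShorts; i++) {
        newPcm[i] = max ? (short int)((long long)unfiltered[i] * SHRT_MAX / max) : 0;
    }
    free(unfiltered);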

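I add all the PCM data into an integer array, so that I have the full, unfiltered sum.

After that, I look for the absolute maximum value in the integer array.

Finally, I copy the integer array into a short int array, dividing each element by that maximum value and multiplying by the maximum short int value.

This way you get the minimum amount of "headroom" needed to fit the data.

You might be able to compute some statistics on the integer array and combine this with some clipping, but for what I needed, the minimum amount of headroom was good enough.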

I think they should be functions mapping [MIN_SHORT, MAX_SHORT] -> [MIN_SHORT, MAX_SHORT], and they clearly are not (except for the first one), so overflow occurs.
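This can be checked numerically. The following is a minimal sketch, assuming a 16-bit `short`, that evaluates the first and third formulas from the question in a wide type and tests whether the result still fits in a `short`. The helper names are made up for illustration.

```c
#include <limits.h>

/* Returns nonzero when v is representable as a short. */
int fits_in_short(long long v)
{
    return v >= SHRT_MIN && v <= SHRT_MAX;
}

/* First formula from the question: sample1/2 + sample2/2 */
long long formula_half_sum(short s1, short s2)
{
    return (long long)(s1 / 2) + (s2 / 2);
}

/* Third formula from the question: (sample1 + sample2) - sample1*sample2 */
long long formula_sum_minus_product(short s1, short s2)
{
    return ((long long)s1 + s2) - (long long)s1 * s2;
}
```

For example, with sample1 = sample2 = 1000 the third formula gives 2000 - 1000000 = -998000, far outside the 16-bit range, while the first formula stays in range even for two full-scale inputs.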

If unwind's suggestion doesn't work, you can also try:

    ((long int)(sample1) + sample2) / 2

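Since you are in the time domain, the frequency information is in the difference between successive samples, and dividing by two damages that information. That is why adding and clipping works better. Clipping will of course add very high frequency noise, which can probably be filtered out.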
