************
Volume level
************

.. Every program that works with audio has some sort of volume slider. And, they all have different meanings! In Blender, you can find it as a Property of the Sound Strip. It is a number between 0 and 100. A value > 1 will amplify the sound; a value < 1 will decrease the sound level. Simple, isn't it? But how loud is a volume level of 0.5? Is it half as loud as 1.0? And what about the max value of 100? Really? One hundred?

オーディオを扱うすべてのプログラムには、何らかのボリューム スライダーがあります。
そして、それらはそれぞれ異なる意味を持っています！
Blender では、サウンド ストリップのプロパティとして見つけることができます。
これは 0 ～ 100 の数値です。値 > 1 はサウンドを増幅します。値が 1 未満の場合、サウンド レベルが減少します。
シンプルですね。しかし、音量レベル 0.5 はどれくらい大きいのでしょうか? 1.0の半分くらいの音量でしょうか？最大値の 100 はどうでしょうか?本当に？100？


.. Relationship between Volume level value and loudness
Volume level value と loudness(音量) の関係
====================================================

.. You can calculate the gain or loss in decibels (dB) with the following formula:
次の式を使用して、ゲインまたは損失をデシベル (dB) 単位で計算できます。

* dB = 20 x log (volume level)
* volume level = antilog (dB/20)

.. So, if you decrease the volume level to 0.5, this will result in a loss of -6 dB. Increasing the volume level to 100 will result in a gain of 20 x log(100) = 40 db. You can use the antilog function (10<sup>x</sup>) of a calculator to do the inverse: antilog(-6dB/20) = 0.5012.

したがって、ボリューム レベルを 0.5 に下げると、-6 dB の損失が生じます。音量レベルを 100 に上げると、20 x log(100) = 40 db のゲインになります。電卓の逆対数関数 (10<sup>x</sup>) を使用すると、antilog(-6dB/20) = 0.5012 という逆計算を行うことができます。

.. csv-table:: デシベルと音量レベルの関係
   :header: "db loss/gain", "-40", "-30", "-20", "-10", "-6", "0", "+6", "+10", "+20", "+30", "+40""
   :widths: 12, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4

   "Volume Level", 0.01, 0.0316, 0.1, 0.316, 0.501, 1, 1.995, 3.162, 10, 31.62, 100

.. According to psychological research, perceived loudness is doubled every 10 dB. So, to make a sound twice as loud, you have to set the volume level to antilog(+10dB/20) = 3.16. To reduce the perceived sound to half, you have to decrease the volume level to antilog(-10dB/20) = 0.316.

心理学の研究によると、知覚される音量は 10 dB ごとに 2 倍になります。したがって、サウンドを 2 倍大きくするには、音量レベルを antilog(+10dB/20) = 3.16 に設定する必要があります。知覚される音を半分に減らすには、音量レベルをantilog(-10dB/20) = 0.316まで下げる必要があります。

.. Remember that decibel is not a linear but a logarithmic scale. This is the reason of the asymmetrical Volume slider (0 - 100) instead of 0 - 1. If you set the slider to +100, you'll increase the loudness with 40 dB. To decrease the loudness with the same amount, you'll have to set the volume to 0.01. A handy conversion table can be found at `Wikipedia <https://en.wikipedia.org/wiki/Decibel>`_.

デシベルは線形ではなく対数スケールであることに注意してください。これが、ボリューム スライダーが 0 ～ 1 ではなく非対称 (0 ～ 100) になっている理由です。スライダーを +100 に設定すると、音量が 40 dB 増加します。同じ量で音量を下げるには、音量を 0.01 に設定する必要があります。便利な変換表が `Wikipedia <https://en.wikipedia.org/wiki/Decibel>`_ にあります。

.. When a sound is digitized, the analog sound wave is sampled at discrete moments in time and the amplitude (the loudness) of the wave is stored. Two concepts are very important:
音がデジタル化されると、アナログ音波が離散的な瞬間にサンプリングされ、波の振幅 (音量) が保存されます。次の 2 つの概念が非常に重要です。

..
  * Sample rate: a typical sample rate is 44.100 Hz (audio CD quality), which means that 44100 samples per second are taken. More samples results in better quality but also higher storage needs.
  * Bit depth:  at each sample the amplitude of the analog sound wave is converted to a voltage that is stored in a number, typically a 16 or 24 bit integer. Because a 16 bit integer has only 2<sup>16</sup> = 65536 values, there will be a maximum amplitude that can be represented. These numbers are converted to a floating point number between -1 and +1. Amplitudes that exceed these thresholds (-1 or +1) will be clipped at -1 or +1  and result in distortion.
..
* サンプル レート: 一般的なサンプル レートは 44.100 Hz (オーディオ CD 品質)、つまり 1 秒あたり 44100 個のサンプルが取得されることを意味します。サンプルが多いほど品質は向上しますが、必要なストレージも増えます。
* ビット深度: 各サンプルで、アナログ音波の振幅が電圧に変換され、数値 (通常は 16 ビットまたは 24 ビットの整数) で保存されます。 16 ビット整数には 2<sup>16</sup> = 65536 個の値しかないため、表現できる最大振幅が存在します。これらの数値は、-1 から +1 までの浮動小数点数に変換されます。これらのしきい値 (-1 または +1) を超える振幅は、-1 または +1 でクリップされ、歪みが生じます。


.. The volume or loudness of a sound is a perceptual/psychological phenomenon. As such, the loudness of a sound is influenced by many factors: the amplitude of the sound wave but also the frequency (bass tones need more power), the environment (e.g. distance to the sound source), the habituation of the listener, ... To simplify the following discussion, we reduce loudness (a psychological phenomenon) to the amplitude or voltage of the sound wave (a physical phenomenon).

音の音量や大きさは、知覚的/心理的現象です。そのため、音の大きさは多くの要因によって影響されます。音波の振幅だけでなく、周波数（低音にはより多くのパワーが必要です）、環境（音源までの距離など）、リスナーの慣れなどです。以下の議論を簡単にするために、ラウドネス (心理的現象) を音波の振幅または電圧 (物理的現象) に換算します。


.. The decibel scale is a relative scale; it's the ratio of a sound compared to some reference sound. For digital sound the reference is a sound with an amplitude of +1V or the maximum amplitude that can be stored with the chosen bit depth. Because of this definition - dB = log(sample voltage/1 V) - the dB value of a sample voltage = 1V is zero, because log(1/1) = 0. Zero dB is thus the maximum sound level we can have without distortion. Decreasing the sound level will result in a negative dB.

dbという尺度は相対的な尺度です。それはあるサウンドと基準サウンドとの比です。
デジタル サウンドの場合、基準は +1V の振幅、または選択したビット深度で保存できる最大振幅のサウンドです。この定義( ``dB = log(sample voltage/1 V) `` )のため 、サンプル電圧 = 1V の dB 値はゼロです。 ``log(1/1) = 0`` であるためです。
したがって、ゼロ dB は、何もせずに得られる最大音響レベルです。サウンドレベルを下げるとマイナスのdBになります。

.. How to calculate the dB value in Blender Python?
Blender Python で dB 値を計算するにはどうすればよいですか?
================================================

.. The attached blend-file contains a Python script to calculate the dB value of a sound strip, as well as some other sound-parameters. The code is based on [Snuq's VSEQF-addon](https://github.com/snuq/VSEQF) (vu_meter.py). As example we use Blender's open movie [Spring](https://www.youtube.com/watch?v=WhWc3b3KhnY).

添付のブレンド ファイルには、サウンド ストリップの dB 値とその他のサウンド パラメータを計算する Python スクリプトが含まれています。コードは `Snuq's VSEQF-addon <https://github.com/snuq/VSEQF>`_ (vu_meter.py) に基づいています。
例として、Blender のオープン ムービー `Spring <https://www.youtube.com/watch?v=WhWc3b3KhnY>`_ を使用します。

.. You first need a handle to the selected audio strip and the sound data that it contains. Because strip properties (eg Volume) can be animated, we need access to the animated data blocks which are continuously updated in the dependency graph for the current scene and view layer.

まず、選択したオーディオ ストリップとそれに含まれるサウンド データへのハンドルが必要です。ストリップのプロパティ (ボリュームなど) はアニメーション化できるため、現在のシーンおよびビュー レイヤの依存関係グラフで継続的に更新されるアニメーション化されたデータ ブロックにアクセスする必要があります。

.. code-block:: Python

   import bpy
   import math

   strip = bpy.context.scene.sequence_editor.active_strip
   depsgraph = bpy.context.evaluated_depsgraph_get()
   sound = strip.sound.evaluated_get(depsgraph).factory


.. dB value for the whole strip
ストリップ全体のdB値
----------------------------

.. There are quite a few measures. The most used are:
対策はかなりあります。最もよく使用されるのは次のとおりです。

.. * Peak value: what is the highest/lowest dB value in the strip. This is what you see in Volume Meter(VU-meter).
.. * RMS (Root-Mean-Squared): the average dB of the whole strip.
* ピーク値: ストリップ内の最高/最低 dB 値。これはボリュームメーター(VUメーター)で表示されるものです。
* RMS (二乗平均平方根): ストリップ全体の平均 dB。

.. First, we need access to the sampled data. This is a two dimensional numpy float array. The first dimension is the number of samples. The second is the number of channels (1 for mono, 2 for stereo). The values are floating point numbers between 0 - 1.

まず、サンプリングされたデータにアクセスする必要があります。これは 2 次元の numpy float 配列です。最初の次元はサンプル数です。 2 番目はチャンネル数です (モノラルの場合は 1、ステレオの場合は 2)。値は 0 ～ 1 の浮動小数点数です。

.. code-block:: Python

   max = sound.data().max()
   min = sound.data().min()

   if abs(max) > abs(min):
      peak = abs(max)
   else:
      peak = abs(min)

   db = 20 * math.log10(peak)

.. The data()-method of the Sound-object returns the sampled data for the entire strip, even if the strip is trimmed and without the Volume-level applied. In other words, these are the raw sampled data. The peak value of a sound, however, is sometimes not a good estimate of the loudness of the sound. We need some kind of average. Because a sound-wave contains positive and negative sample values, a simple average will cancel out to about zero-level. A RMS value is the solution. Each sample value is squared (eliminating the negative numbers), then the mean (average) of these squares is calculated and reduced again to the original level with a square root.

Sound オブジェクトの data() メソッドは、ストリップがトリミングされ、ボリューム レベルが適用されていない場合でも、ストリップ全体のサンプリングされたデータを返します。言い換えれば、これらはサンプリングされた生のデータです。ただし、サウンドのピーク値は、サウンドの大きさの適切な推定値ではない場合があります。ある種の平均値が必要です。音波には正と負のサンプル値が含まれているため、単純な平均ではほぼゼロレベルに打ち消されます。 RMS 値が解決策です。各サンプル値は 2 乗され (負の数が削除され)、これらの 2 乗の平均 (平均) が計算され、平方根で元のレベルに再び減らされます。

.. Because the sampled data are contained in a numpy array, the code is relatively simple.
サンプリングされたデータは numpy 配列に含まれているため、コードは比較的単純です。

.. code-block:: Python

   samples = sound.data()
   m = np.mean(samples**2)
   rms =  np.sqrt(m)
   db = 20 * math.log10(rms)

.. dB value of the current frame or a section of a strip
現在のフレームまたはストリップのセクションの dB 値
-----------------------------------------------------

.. Suppose, you want the dB value for the sound samples underneath the playhead/cursor in the timeline. Audio strips however work with time code, not frames. This is the cause of many misunderstandings. For example, if you have a MP4-file (video + audio) and you change the frame-per-second parameter of the scene, then the length of the video will change but not the length of the audio strip. The duration of an audio-strip can be calculated, based on the sample rate and the number of samples.

タイムラインのPlayhead/カーソルの下にあるサウンド サンプルの dB 値が必要だとします。ただし、オーディオ ストリップはフレームではなくタイムコードで機能します。これが多くの誤解の原因となっています。たとえば、MP4 ファイル (ビデオ + オーディオ) があり、シーンのフレーム/秒パラメータを変更すると、ビデオの長さは変わりますが、オーディオ ストリップの長さは変わりません。オーディオ ストリップの長さは、サンプル レートとサンプル数に基づいて計算できます。

.. code-block:: Python

   sample_rate = sound.specs[0]
   nr_of_samples = sound.length
   duration = nr_of_samples/sample_rate


.. You would expect that sound.length is equal to len(sound.data()) but apparently that's not true. For the open movie spring: sound.length = 20466685 and len(sound.data() = 20466688.

sound.length が len(sound.data()) に等しいと期待するかもしれませんが、どうやらそうではありません。オープン ムービー スプリングの場合: sound.length = 20466685 および len(sound.data() = 20466688)。

.. Because there are many more samples than frames, the sound data for each frame will contain several samples.

サンプル数はフレームよりもはるかに多いため、各フレームのサウンド データには複数のサンプルが含まれます。

.. code-block:: Python

   nr_of_frames = strip.frame_final_end
   nr_of_samples_per_frame = nr_of_samples/nr_of_frames

.. In case of the open movie "Spring": duration = 464.1s, nr_of_frames = 11139 and nr_of_sample_per_frame = 1837.38. So, for each frame there are 1837 samples. To calculate the dB value underneath the playhead, we need the peak or rms value of these 1837 samples.

公開中のムービー "Spring" の場合: duration = 464.1s、nr_of_frames = 11139、nr_of_sample_per_frame = 1837.38 です。
したがって、各フレームには 1837 個のサンプルがあります。
Playhead の下の dB 値を計算するには、これらの 1837 サンプルのピーク値または rms 値が必要です。

.. You can access these samples with the limit-method. This method works with timecode. So, you need the start and end time code of the current frame. These start and end times are relative to the selected strip. The first frame of the selected strip is zero. The current frame however is relative to the timeline. So, you have to subtract the strip.frame_start from the current frame. The calculation of the db or RMS value is the same as above.

これらのサンプルには、limit メソッドを使用してアクセスできます。この方法はタイムコードで機能します。したがって、現在のフレームの開始タイムコードと終了タイムコードが必要になります。これらの開始時間と終了時間は、選択したストリップを基準としています。選択したストリップの最初のフレームはゼロです。ただし、現在のフレームはタイムラインを基準としています。したがって、現在のフレームからstrip.frame_startを減算する必要があります。 db または RMS 値の計算は上記と同じです。

.. code-block:: Python

   cur_frame = bpy.context.scene.frame_current
   fps = bpy.context.scene.render.fps / bpy.context.scene.render.fps_base
   time_from = (cur_frame - 1 - strip.frame_start) / fps
   time_to = (cur_frame - strip.frame_start) / fps
   sound_cur_frame = sound.limit(time_from, time_to)

.. The decibel value of a trimmed strip
トリミングされたストリップのデシベル値
------------------------------------

.. Because the Data method of a Sound object always returns all samples of the entire strip, you also have to use the limit-method (see above section 2.2) for a trimmed strip. [Trimming and cutting](../video/trimming.md) of a strip is extensively described in a separate post. Fig. 3 summarizes the different achor points.

Sound オブジェクトの Data メソッドは常にストリップ全体のすべてのサンプルを返すため、トリミングされたストリップに対しては、limit メソッド (上記のセクション 2.2 を参照) も使用する必要があります。ストリップの [トリミングとカット](../video/trimming.md) については、別の投稿で詳しく説明します。図 3 は、さまざまなアンカー ポイントをまとめたものです。

.. code-block:: Python

   time_from = (strip.frame_final_start - strip.frame_start)/fps
   time_to = (strip.frame_final_end -  strip.frame_start)/fps
   sound_trimmed_strip = sound.limit(time_from, time_to)

.. The decibel value of an animated strip
アニメーションストリップのデシベル値
--------------------------------------

.. The sound data contain the raw sampled data. To account for a changed volume level for the entire strip, you simply multiply this level with the raw data.
サウンド データには、生のサンプリング データが含まれています。ストリップ全体のボリューム レベルの変更を考慮するには、このレベルと生データを単純に乗算します。

.. code-block:: Python

   max = sound.data().max() * strip.volume

.. The volume level of a strip, however can be changed and animated on a per frame basis. In section 2.2 you calculated the dB value for one frame. Thanks to the dependency graph, we can use the strip.volume value because this value will updated for that specific frame. So, to calculate the dB value for an entire animated strip, the easiest (but perhaps not the most efficient way) will be to loop through the strip frame by frame, set the playhead to that frame, retrieving the sound.data for the frame, multiplying it with the sound volume and cumulating these data into an array.

ただし、ストリップのボリューム レベルは、フレームごとに変更したりアニメーション化したりできます。セクション 2.2 では、1 フレームの dB 値を計算しました。依存関係グラフのおかげで、strip.volume 値がその特定のフレームに対して更新されるため、この値を使用できます。したがって、アニメーション ストリップ全体の dB 値を計算するには、ストリップをフレームごとにループし、Playheadをそのフレームに設定し、そのフレームの sound.data を取得するのが最も簡単です (ただし、最も効率的ではないかもしれません)。 、それに音量を乗算し、これらのデータを配列に累積します。

.. code-block:: Python

   def get_rms(samples):
      m = np.mean(samples**2)
      rms =  np.sqrt(m)
      return 20 * math.log10(rms)

   animated_samples = np.empty(shape=[0, 1])
   for f in range (strip.frame_final_start, strip.frame_final_end):
      bpy.context.scene.frame_set(f)

      time_from = (f - strip.frame_start - 1)/fps
      time_to   = (f - strip.frame_start)/fps

      chunk = sound.limit(time_from, time_to).data()
      chunk = chunk * strip.volume


      animated_samples = np.append(animated_samples, chunk, axis=0)
      print("frame ", bpy.context.scene.frame_current,"time:", time_from, "-", time_to, "-",
        "size:", len(chunk), "cum", len(animated_samples),
        "RMS:", get_rms(chunk), "volume:" ,strip.volume)
    print("total rms: ", get_rms(animated_samples))


Some useful URL
---------------

* https://www.sawvideo.com/index.php/news/2020/measuring-sound-levels-volume-and-loudness
* http://www.sengpielaudio.com/TableOfSoundPressureLevels.htm
* http://www.sengpielaudio.com/calculator-soundlevel.htm
* http://sengpielaudio.com/calculator-gainloss.htm
* http://www.sengpielaudio.com/calculator-loudness.htm
* http://sengpielaudio.com/calculator-levelchange.htm
* https://archive.org/details/fundamentalsofhe00yost/mode/2up