Paper notes - essential reading for HOA: poletti2005

Three-dimensional surround sound systems based on spherical harmonics

Poletti, M. A. (2005). Three-dimensional surround sound systems based on spherical harmonics. Journal of the Audio Engineering Society, 53(11), 1004-1025.


Introduction

Approaches to 3D sound field reproduction

  1. The Kirchhoff–Helmholtz integral
    • The Kirchhoff–Helmholtz integral shows that reproduction is possible inside a region if the pressure and normal velocity are known on the surface of the region.
    • It is the basis for wave field synthesis (WFS).
    • In practice simplifications are possible; for example, monopole sources are sufficient, and only those transducers in the direction of the sound source are required.
  2. Inverse method
    • An inverse matrix is derived for a given geometry of loudspeakers and receiver positions, which allows the required sound pressure to be created at a set of discrete points.
  3. 3D Ambisonics approach
    • Based on a spherical harmonic decomposition of the sound field.

This paper reviews and extends the theory of 3D sound systems based on spherical harmonics.

Description of 3D sound fields

This section mainly covers how to record a sound field and convert it into spherical harmonic coefficients.

Spherical harmonic/spherical Bessel description

For the interior case, where all sources lie outside the region of interest:

$$
p(r,\theta,\phi,k)=\sum_{n=0}^{\infty}{\sum_{m=-n}^{n}{A_n^m(k)j_n(kr)Y_n^m}(\theta,\phi)}
$$

$j_n(kr)$: the spherical Bessel function of the first kind.

An important feature of this expansion is that for small $kr$, that is, for low frequencies or small distances from the origin, the summation in $n$ may be truncated to a finite value $N$ with little error, because only the low-order spherical Bessel functions have significant values for small $kr$. The pressure is then accurately represented by a total of $(N+1)^2$ terms.
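The truncation argument can be checked numerically. The following is a minimal sketch, assuming NumPy and SciPy are available; the helper name `bessel_magnitudes` and the value `kr = 1.0` are illustrative choices, not taken from the paper. It prints $|j_n(kr)|$ for increasing order $n$ to show how quickly the higher-order terms fall away when $kr$ is small.

```python
import numpy as np
from scipy.special import spherical_jn


def bessel_magnitudes(kr, max_order):
    """Return |j_n(kr)| for n = 0..max_order (illustrative helper)."""
    orders = np.arange(max_order + 1)
    return np.abs(spherical_jn(orders, kr))


kr = 1.0  # assumed example value: low frequency or small radius
mags = bessel_magnitudes(kr, 6)
for n, mag in enumerate(mags):
    # Number of (n, m) terms needed if the sum is truncated at N = n
    print(f"n={n}: |j_n(kr)| = {mag:.3e}, terms up to this order = {(n + 1) ** 2}")
```

For this value of $kr$ the magnitudes decrease monotonically with $n$, which is why a small $N$ already captures most of the field near the origin.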

For the exterior case, where all sources lie within a region of space:

$$
p(r,\theta,\phi,k)=\sum_{n=0}^{\infty}{\sum_{m=-n}^{n}{B_n^m(k)h_n(kr)Y_n^m}(\theta,\phi)}
$$

$h_n(kr)$: the spherical Hankel function of the first kind, $h_n(kr)=j_n(kr)+iy_n(kr)$, where $y_n(kr)$ is the spherical Bessel function of the second kind.

The Wronskian relationship:

$$
j_n(x)h_n'(x)-j_n'(x)h_n(x)=\frac{i}{x^2}
$$
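As a quick sanity check of the Wronskian relationship, here is a small sketch assuming SciPy's `spherical_jn` and `spherical_yn`. The helper names `spherical_hn` and `wronskian` and the test point `x = 2.5` are illustrative only.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn


def spherical_hn(n, x, derivative=False):
    """Spherical Hankel function of the first kind, h_n = j_n + i*y_n."""
    return (spherical_jn(n, x, derivative=derivative)
            + 1j * spherical_yn(n, x, derivative=derivative))


def wronskian(n, x):
    """Evaluate j_n(x) h_n'(x) - j_n'(x) h_n(x)."""
    return (spherical_jn(n, x) * spherical_hn(n, x, derivative=True)
            - spherical_jn(n, x, derivative=True) * spherical_hn(n, x))


x = 2.5  # assumed test point
for n in range(4):
    # Both columns should agree up to rounding error
    print(n, wronskian(n, x), 1j / x**2)
```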

Spherical harmonics form an orthonormal basis for functions defined on a sphere, so any such function can be expanded as

$$
f(\theta,\phi)=\sum_{n=0}^{\infty}{\sum_{m=-n}^{n}{A_n^mY_n^m}(\theta,\phi)}
$$

with coefficients

$$
A_n^m = \int_0^{2\pi}{\int_0^{\pi}{f(\theta,\phi)Y_n^m(\theta,\phi)^*\sin(\theta)d\theta}d\phi}
$$
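The coefficient integral can be approximated by quadrature on the sphere. The sketch below is one possible way to do it, under stated assumptions: $\theta$ is the polar angle and $\phi$ the azimuth, the harmonics are orthonormal with the Condon-Shortley phase supplied by SciPy's `lpmv`, and the grid uses Gauss-Legendre nodes in $\cos\theta$ with uniform samples in $\phi$. All function names are hypothetical helpers, not part of the paper.

```python
import numpy as np
from math import factorial
from scipy.special import lpmv


def sph_harm_nm(n, m, theta, phi):
    """Orthonormal Y_n^m(theta, phi); theta = polar angle, phi = azimuth (assumed)."""
    am = abs(m)
    norm = np.sqrt((2 * n + 1) / (4 * np.pi) * factorial(n - am) / factorial(n + am))
    y = norm * lpmv(am, n, np.cos(theta)) * np.exp(1j * am * phi)
    # Negative orders via Y_n^{-m} = (-1)^m conj(Y_n^m)
    return (-1) ** am * np.conj(y) if m < 0 else y


def sphere_grid(n_theta, n_phi):
    """Gauss-Legendre nodes in cos(theta), uniform nodes in phi, plus weights."""
    x, w = np.polynomial.legendre.leggauss(n_theta)
    theta = np.arccos(x)
    phi = np.arange(n_phi) * 2 * np.pi / n_phi
    T, P = np.meshgrid(theta, phi, indexing="ij")
    W = np.outer(w, np.full(n_phi, 2 * np.pi / n_phi))  # includes sin(theta) d(theta)
    return T, P, W


def sh_coefficient(f_vals, n, m, T, P, W):
    """Approximate A_n^m = integral of f * conj(Y_n^m) over the sphere."""
    return np.sum(W * f_vals * np.conj(sph_harm_nm(n, m, T, P)))


T, P, W = sphere_grid(8, 16)
# Test function with known coefficients: A_1^0 = 1 and A_2^{-1} = 0.5
f = sph_harm_nm(1, 0, T, P) + 0.5 * sph_harm_nm(2, -1, T, P)
for n, m in [(1, 0), (2, -1), (2, 1)]:
    print(n, m, np.round(sh_coefficient(f, n, m, T, P, W), 12))
```

The quadrature is exact here only because the test function is band-limited to low order; a measured field would need a grid matched to its highest significant order.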

Plane and spherical wave expansions

The spherical harmonic expansion of a plane wave arriving from incidence angles $(\theta_i, \phi_i)$ is

$$
e^{i\boldsymbol{k}_i\cdot\boldsymbol{r}}=4\pi\sum_{n=0}^{\infty}{i^nj_n(kr)\sum_{m=-n}^{n}{Y_n^m(\theta,\phi)Y_n^m(\theta_i,\phi_i)^*}}.
$$
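The plane wave expansion can be verified numerically. This sketch assumes the standard addition theorem $\sum_{m}Y_n^m(\theta,\phi)Y_n^m(\theta_i,\phi_i)^*=\frac{2n+1}{4\pi}P_n(\cos\gamma)$, where $\gamma$ is the angle between the two directions, so the inner sum is replaced by a Legendre polynomial. The function names and the numerical values are illustrative.

```python
import numpy as np
from scipy.special import spherical_jn, eval_legendre


def unit_vector(theta, phi):
    """Cartesian unit vector for polar angle theta and azimuth phi."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])


def plane_wave_exact(k, r, theta, phi, theta_i, phi_i):
    cos_gamma = unit_vector(theta, phi) @ unit_vector(theta_i, phi_i)
    return np.exp(1j * k * r * cos_gamma)


def plane_wave_truncated(k, r, theta, phi, theta_i, phi_i, order):
    """Sum the expansion up to n = order using the addition theorem."""
    cos_gamma = unit_vector(theta, phi) @ unit_vector(theta_i, phi_i)
    n = np.arange(order + 1)
    terms = (2 * n + 1) * (1j ** n) * spherical_jn(n, k * r) * eval_legendre(n, cos_gamma)
    return terms.sum()


# Assumed example values: kr = 2, arbitrary observation and incidence angles
args = (2.0, 1.0, 0.7, 1.1, 1.9, -0.4)
exact = plane_wave_exact(*args)
for order in (1, 2, 4, 8, 15):
    err = abs(plane_wave_truncated(*args, order) - exact)
    print(f"N={order:2d}: |error| = {err:.2e}")
```

The error should shrink rapidly once the order exceeds $kr$, which is the same truncation behaviour noted for the interior expansion.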

The spherical harmonic expansion of the wavefield due to a point source with position $(r_s, \theta_s, \phi_s)$, and with $r<r_s$, is

$$
G(\boldsymbol{r}|\boldsymbol{r_s},k)=\frac{e^{ik|\boldsymbol{r}-\boldsymbol{r_s}|}}{4\pi|\boldsymbol{r}-\boldsymbol{r_s}|}=ik\sum_{n=0}^{\infty}{j_n(kr)h_n(kr_s)\sum_{m=-n}^{n}{Y_n^m(\theta,\phi)Y_n^m(\theta_s,\phi_s)^*}}.
$$
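The point source expansion can be checked against the closed-form Green's function in the same way. This is a sketch under the same assumption as before (the addition theorem collapses the sum over $m$ into $\frac{2n+1}{4\pi}P_n(\cos\gamma)$); the helper names and the positions are made up for illustration, with $r<r_s$ as the expansion requires.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn, eval_legendre


def green_exact(k, r_vec, rs_vec):
    """Free-field Green's function exp(ikR) / (4 pi R)."""
    dist = np.linalg.norm(r_vec - rs_vec)
    return np.exp(1j * k * dist) / (4 * np.pi * dist)


def green_series(k, r_vec, rs_vec, order):
    """Truncated spherical harmonic expansion, valid for |r| < |r_s|."""
    r, rs = np.linalg.norm(r_vec), np.linalg.norm(rs_vec)
    cos_gamma = r_vec @ rs_vec / (r * rs)
    n = np.arange(order + 1)
    h_n = spherical_jn(n, k * rs) + 1j * spherical_yn(n, k * rs)
    terms = ((2 * n + 1) / (4 * np.pi) * spherical_jn(n, k * r) * h_n
             * eval_legendre(n, cos_gamma))
    return 1j * k * terms.sum()


k = 2.0                                # assumed wavenumber
r_vec = np.array([0.3, 0.2, 0.1])      # observation point (inside)
rs_vec = np.array([1.0, -2.0, 0.5])    # source position (outside)
print(green_exact(k, r_vec, rs_vec))
print(green_series(k, r_vec, rs_vec, order=25))
```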

Kirchhoff-Helmholtz description

The Kirchhoff–Helmholtz integral shows that the sound pressure within an arbitrarily shaped volume of space may be calculated from the pressure and normal velocity on its surface

$$
p(\boldsymbol{r},k)=\int_S{\int{[ G(\boldsymbol{r}|\boldsymbol{r_s},k)\frac{\partial p(\boldsymbol{r_s},k)}{\partial n}-p(\boldsymbol{r_s},k)\frac{\partial G(\boldsymbol{r}|\boldsymbol{r_s},k)}{\partial n}]}dS}
$$

This shows that the sound field may be reproduced from an infinite distribution of monopole sources excited by the normal gradient of the pressure on the surface ($G(\boldsymbol{r}|\boldsymbol{r_s},k)\frac{\partial p(\boldsymbol{r_s},k)}{\partial n}$), and an infinite distribution of dipole sources excited by the pressure on the surface ($p(\boldsymbol{r_s},k)\frac{\partial G(\boldsymbol{r}|\boldsymbol{r_s},k)}{\partial n}$). In practice, since most loudspeakers are monopoles at low frequencies, the dipole sources are less practical to implement (requiring, for example, unbaffled drivers). It has been shown that the monopole and dipole sources are not independent and that in practice the dipole sources may be ignored.

3D sound field recording

The accurate recording of sound fields requires the synthesis of higher directivities than are available from first-order microphones. These directivities can be obtained using microphone arrays.

  • array beamforming
  • multiple discrete microphones
  • the combination of dipole responses
  • single with multiple radii

A first-order microphone has a response that is proportional to the pressure gradient, whereas a second-order microphone design has a response proportional to the gradient of the gradient.

Free-field sphere decomposition

$$
A_n^m(k) = \frac{1}{j_n(kr)} \int_0^{2\pi}{\int_0^{\pi}{p(r,\theta,\phi,k)Y_n^m(\theta,\phi)^*\sin(\theta)d\theta}d\phi}
$$

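The zeros of $j_n(kr)$ produce equalization problems: at those values of $kr$ the coefficient of order $n$ cannot be recovered, and close to them the required gain $1/j_n(kr)$ becomes very large.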
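To illustrate the equalization problem, the sketch below evaluates the gain $1/|j_n(kr)|$ in decibels for a pressure-sensing open sphere. It assumes SciPy; the function name `open_sphere_eq_gain_db` and the sample values of $kr$ are illustrative. For $n=0$ the first zero lies at $kr=\pi$, since $j_0(x)=\sin(x)/x$.

```python
import numpy as np
from scipy.special import spherical_jn


def open_sphere_eq_gain_db(n, kr):
    """Gain in dB needed to undo the j_n(kr) weighting of order n."""
    return -20 * np.log10(np.abs(spherical_jn(n, kr)))


# The gain grows without bound as kr approaches the zero of j_0 at pi
for kr in (1.0, 3.0, 3.14, 3.1415):
    print(f"kr={kr}: gain for n=0 is {open_sphere_eq_gain_db(0, kr):.1f} dB")
```

Any microphone noise at those frequencies is amplified by the same gain, which is the practical reason the zeros matter.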

One approach to removing the problem is to use first-order microphones facing radially outward. The general form of a radial first-order response is

$$
s_\alpha(r,\theta,\phi,k) = \alpha p(r,\theta,\phi,k)-(1-\alpha)\rho cv_R(r,\theta,\phi,k)
$$

where $\alpha$ is the first-order parameter, $v_R$ is the radial velocity, and $\rho c$ is the impedance of free space.

The first-order response then has the spherical harmonic expansion

$$
s_\alpha(r,\theta,\phi,k) = \sum_{n=0}^{\infty}{[\alpha j_n(kr)-\mathrm{i}(1-\alpha)j_n'(kr)]\sum_{m=-n}^{n}{A_n^m(k)Y_n^m}(\theta,\phi)}
$$

and the spherical harmonic coefficients are

$$
A_n^m(k) = \frac{1}{\alpha j_n(kr)-\mathrm{i}(1-\alpha)j_n'(kr)} \int_0^{2\pi}{\int_0^{\pi}{s_\alpha(r,\theta,\phi,k) Y_n^m(\theta,\phi)^*\sin(\theta)d\theta}d\phi}
$$
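The benefit of the first-order response can be seen by comparing the term that must be inverted for $\alpha=1$ (pressure only) and $\alpha=0.5$ (a cardioid, used here as an assumed example). Because $j_n$ and $j_n'$ are real and do not vanish at the same argument, the complex combination stays away from zero for $0<\alpha<1$. The sketch assumes SciPy; `radial_term` and the $kr$ range are illustrative.

```python
import numpy as np
from scipy.special import spherical_jn


def radial_term(n, kr, alpha):
    """alpha*j_n(kr) - i*(1-alpha)*j_n'(kr), the factor divided out of the integral."""
    return (alpha * spherical_jn(n, kr)
            - 1j * (1 - alpha) * spherical_jn(n, kr, derivative=True))


kr = np.linspace(0.5, 10.0, 2000)  # assumed range of interest
min_pressure = np.abs(radial_term(0, kr, 1.0)).min()
min_cardioid = np.abs(radial_term(0, kr, 0.5)).min()
print(f"smallest |term|, pressure (alpha=1):   {min_pressure:.2e}")
print(f"smallest |term|, cardioid (alpha=0.5): {min_cardioid:.2e}")
```

The pressure-only minimum is limited only by how close the grid comes to a zero of $j_0$, while the cardioid minimum stays bounded away from zero over the whole range.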

Solid sphere decomposition

A solid sphere containing flush mounted pressure microphones also allows the coefficients to be found without the risk of zeros in the response.

The scattering of sound around a sphere for a sound field is obtained by assuming that the resultant field is the sum of the original field and a scattered field that is radiating outward.

$$
p_t(r,\theta,\phi,k) = \sum_{n=0}^{\infty}{[j_n(kr)-\frac{j_n'(ka)}{h_n'(ka)}h_n(kr)]\sum_{m=-n}^{n}{A_n^m(k)Y_n^m}(\theta,\phi)}
$$

where $a$ is the radius of the sphere. This is the sum of the original wavefield without the sphere and a scattering field consisting of outgoing waves whose coefficients are modified by a ratio of spherical Bessel terms. The sound field coefficients may be found from the pressure at $r = a$,

$$
A_n^m(k) = -\mathrm{i}(ka)^2h_n'(ka) \int_0^{2\pi}{\int_0^{\pi}{p_t(a,\theta,\phi,k) Y_n^m(\theta,\phi)^*\sin(\theta)d\theta}d\phi}
$$
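The prefactor in this result follows from the Wronskian relationship stated earlier. As a short worked step, using only the equations already given in these notes, evaluating the bracketed radial term of the total pressure at $r=a$ gives

$$
j_n(ka)-\frac{j_n'(ka)}{h_n'(ka)}h_n(ka)=\frac{j_n(ka)h_n'(ka)-j_n'(ka)h_n(ka)}{h_n'(ka)}=\frac{\mathrm{i}}{(ka)^2h_n'(ka)}
$$

Dividing the surface integral by this term is the same as multiplying it by $(ka)^2h_n'(ka)/\mathrm{i}=-\mathrm{i}(ka)^2h_n'(ka)$. Since $h_n'(ka)=j_n'(ka)+\mathrm{i}y_n'(ka)$ and the real functions $j_n'$ and $y_n'$ do not vanish at the same argument, this factor has no zeros for real $ka>0$, which is why the solid sphere avoids the equalization problem of the open pressure sphere.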