Weird Stuff in High Dimensions
Tamanna Hossain-Kay / 2021-11-28
We’re usually putzing around in a \(3\) dimensional space, perceiving the \(4^{th}\) dimension of time as something we’re moving forward through. Attempts to explain our reality, however, have intriguingly and disturbingly given rise to the possibility of higher dimensions. Superstring Theory, for example, posits \(10\) dimensions to provide a unifying framework between general relativity and quantum mechanics. M-Theory, on the other hand, posits \(11\) dimensions, and Bosonic String Theory a daunting \(26\) dimensions! In machine learning we work in even higher dimensions to model data. For example, Google’s BERT projects words on to \(768\) or \(1024\) dimensional spaces in order to model human language.
What do these higher dimensions “look like”? The short answer is that to beings such as ourselves, i.e., mostly \(3D\) with some limited \(4D\) perception: very weird! Visualizing these higher dimensions in the traditional sense doesn’t seem possible for us, but we can still get insight into these spaces through mathematics. Frustratingly, many of these insights break from our usual \(2D\) or \(3D\) intuitions. As Thomas Banchoff wrote, “All of us are slaves to the prejudices of our own dimension.”
Read on to mathematically “see” some weird stuff in high dimensions.
Sources: This post is compiled using notes from CMU, Cornell, and UC Davis with some additional figures and exposition for my own understanding.
Warmup
Consider a \(d-\)dimensional unit square. If were to place \(n\) points inside the square such that each coordinate for each point is drawn uniformly at random from the interval [0,1], what can we say about the distribution of distances between points?
Our usual intuition might be that the points will be spread out somewhat evenly. This holds true in our familiar \(2D\) and \(3D\) spaces. However, something different happens in higher dimensions.
Consider that the distance between any two points \(\mathbf{X}=(x_i, \ldots, x_d)\) and \(\mathbf{Y}=(y_1, \ldots, y_d)\) is defined as ,
\[ D= |\mathbf{X}-\mathbf{Y}|=\sqrt{\sum_{i=1}^{d}\left(x_{i}-y_{i}\right)^{2}} \]
where for each dimension \(i \in \{ 1, \ldots, d \}\),
\[ x_i \overset{i.i.d}{\sim} Unif(0,1)\\ y_i \overset{i.i.d}{\sim} Unif(0,1) \]
Then \(\left(x_{i}-y_{i}\right)^{2}\) are i.i.d random variables for all \(i\), and \(D^2\) is a summation of independent random variables.
So in high dimensions, the Law of Large Numbers applies and \(D^2\) converges to its expected value. The distances are then concentrated around a mean! However, in lower dimensions (like the 2D and 3D we’re perceptually accustomed to), \(D^2\) is pre-convergence and the distances are still spread out.
Sphere
Consider a \(d-\) dimensional sphere of unit radius centered at the origin: \(\left\{\left(x_{1}, x_{2}, \cdots, x_{d}\right): \sum_{i=1}^{d} x_{i}^{2} \leq 1\right\}\)
Compute surface area and volume
Volume goes to zero
Volume is near the equator
Volume is a narrow annulus
Compute surface area and volume
The surface area \(A(d)\) and the volume \(V(d)\) of a unit-radius sphere in \(d\) dimensions are given by \[ A(d)=\frac{2 \pi^{\frac{d}{2}}}{\Gamma\left(\frac{d}{2}\right)} \quad \text { and } \quad V(d)=\frac{\pi^{\frac{d}{2}}}{\frac{d}{2} \Gamma\left(\frac{d}{2}\right)} \] where \(\Gamma\) is the Gamma functions.
Proof
Cartesian Coordinates
First, lets recall that in 3-dimensions a sphere of radius \(a\) is defined as \(x^{2}+y^{2}+z^{2}=a^{2}\). In order to compute its volume we consider the region \(x^{2}+y^{2}+z^{2}\leq a^{2}\).
- Solving for \(z\) gives: \(-\sqrt{a^{2}-x^{2}-y^{2}} \leq z \leq \sqrt{a^{2}-x^{2}-y^{2}}\)
- The projection of the sphere onto the \(xy-\)plane when \(z=0\) is the circle \(x^{2}+y^{2}=a^{2}\). Then solving for \(y\) within the desired region gives: \(-\sqrt{a^{2}-x^{2}} \leq y \leq \sqrt{a^{2}-x^{2}}\)
- \(-a \leq x \leq a\)
Thus, the volume of the sphere can be computed as:
\[ V(3)=\int_{-a}^{a} \int_{-\sqrt{a^{2}-x^{2}}}^{\sqrt{a^{2}-x^{2}}} \int_{-\sqrt{a^{2}-x^{2}-y^{2}}}^{\sqrt{a^{2}-x^{2}-y^{2}}} 1 d z d y d x \] Extending this formula to a \(d\)-dimensional sphere of radius 1 gives:
\[ V(d)=\int_{-1}^{1} \int_{-\sqrt{1-x_{1}^{2}}}^{\sqrt{1-x_{1}^{2}}} \cdots \int_{-\sqrt{1-x_{1}^{2}-\cdots-x_{d-1}^{2}}}^{\sqrt{1-x_{1}^{2}-\cdots-x_{d-1}^{2}}} \mathrm{~d} x_{d} \mathrm{~d} x_{d-1} \cdots \mathrm{d} x_{1} \]
Polar Coordinates
In 3-dimensions, a sphere of radius \(a\) is defined as all points where \(0 \leq \phi \leq \pi\), \(0 \leq \theta \leq 2 \pi\), and \(0 \leq r \leq a\). Then the volume of the sphere can be computed as,
\[ V(3)=\int_{0}^{\pi} \int_{0}^{2 \pi} \int_{0}^{a} r^{2} \sin (\phi) dr d \theta d \phi \] Using the concept of a solid angle to further simplify the integral by allowing us to write it over a coordinate free ‘direction’. Let \(d \Omega=\sin (\phi) d \theta d \phi\) be the surface area of the infinitismal piece of the solid angle \(S^3\) of the sphere.
Then,
\[ V(3)=\int_{S^{3}} \int_{r=0}^{a} r^{2} d \Omega d r \] Extending this formula to a d-dimensional sphere of unit radius gives:
\[ \begin{aligned} V(d) &=\int_{S^{d}} \int_{r=0}^{1} r^{d-1} \mathrm{~d} r \mathrm{~d} \Omega \\ &=\left(\int_{S^{d}} \mathrm{~d} \Omega\right)\left(\int_{r=0}^{1} r^{d-1} \mathrm{~d} r\right) \quad\quad \text{(Variables are independent)}\\ &=\frac{1}{d} \int_{S^{d}} \mathrm{~d} \Omega \\ &=\frac{1}{d} A(d) \quad\quad(i) \end{aligned} \]
A Trick
Consider a different problem: \[ I(d) \stackrel{\text { def }}{=} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \mathrm{e}^{-\left(x_{1}^{2}+x_{2}^{2}+\cdots+x_{d}^{2}\right)} \mathrm{d} x_{d} \cdots \mathrm{d} x_{2} \mathrm{~d} x_{1} \]
This can be computed in both Cartesian and Polar coordinates. They will be equated to solve our problem.
Cartesian: \[ \begin{aligned} I(d) &=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \ldots \int_{-\infty}^{\infty} \mathrm{e}^{-\left(x_{1}^{2}+x_{2}^{2}+\cdots+x_{d}^{2}\right)} \mathrm{d} x_{d} \cdots \mathrm{d} x_{2} \mathrm{~d} x_{1} \\ &=\left[\int_{-\infty}^{\infty} \mathrm{e}^{-\left(x_{1}\right)^{2}} \mathrm{~d} x_{1}\right]^{d} \quad\text{(independent coordinates across same range)}\\ &=(\sqrt{\pi})^{d} \\ &=\pi^{\frac{d}{2}} \end{aligned} \] Polar: \[\begin{aligned} I(d) &=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \ldots \int_{-\infty}^{\infty} \mathrm{e}^{-\left(x_{1}^{2}+x_{2}^{2}+\cdots+x_{d}^{2}\right)} \mathrm{d} x_{d} \cdots \mathrm{d} x_{2} \mathrm{~d} x_{1} \\ &=\int_{S^{d}} \int_{-\infty}^{\infty} \mathrm{e}^{-r^{2}} r^{d-1} \mathrm{~d} r \mathrm{~d} \Omega \quad\quad\text{(Definition of sphere+ change to polar)}\\ &=\left(\int_{S^{d}} \mathrm{~d} \Omega\right)\left(\int_{-\infty}^{\infty} \mathrm{e}^{-r^{2}} r^{d-1} \mathrm{~d} r\right) \quad\quad\text{(Independent variables)}\\ &=A(d) 2 \int_{0}^{\infty} e^{-t} t^{\frac{d-1}{2}} \frac{1}{2 \sqrt{t}} \mathrm{~d} t \quad\quad \text { (Let } t=r^{2} \geq 0, \text { so } r=\sqrt{t}, \mathrm{~d} r=\frac{1}{2 \sqrt{t}} \mathrm{~d} t \text { ) }\\ &=A(d) \int_{0}^{\infty} e^{-t} t^{\frac{d}{2}-1} \mathrm{~d} t \\ &=A(d) \Gamma\left(\frac{d}{2}\right) \quad\quad\text{(Definition of $\Gamma$ function)} \end{aligned}\].
Then by equating the solution for \(I(d)\) in cartesian and polar coordinates,
\[ A(d)=\frac{2 \pi^{\frac{d}{2}}}{\Gamma\left(\frac{d}{2}\right)} \]
Finally, using equation \((i)\),
\[ V(d)=\frac{\pi^{\frac{d}{2}}}{\frac{d}{2} \Gamma\left(\frac{d}{2}\right)} \]
More generally, the surface area and volume equations for a \(d-\)sphere of radius \(r\) are,
\[ A_{r}(d)=\frac{2 \pi^{\frac{d}{2}}}{\Gamma\left(\frac{d}{2}\right)} r^{d-1} = A(d)r^{d-1} \quad \text{and}\\ V_{r}(d)=\frac{\pi^{\frac{d}{2}}}{\Gamma\left(\frac{d}{2}+1\right)} r^{d}=V(d)r^d \]
Volume goes to zero
By Stirling’s Formula, \[ \Gamma(\frac{d}{2}) \approx \sqrt{\frac{2 \pi}{d/2}}\left(\frac{d/2}{e}\right)^{\frac{d}{2}}=\sqrt{\frac{4 \pi}{d}}\left(\frac{d}{2e}\right)^{\frac{d}{2}} \]
So, the denominator of \(V(d)\) grows faster than its numerator since \(d^\frac{d}{2}\) grows much faster than \(\pi^\frac{d}{2}\),. Thus,
\[ V(d) \rightarrow 0 \quad \text { as } \quad d \rightarrow \infty \]
Volume is near the equator
Without loss of generality, pick any of the \(d\) poles of the sphere as the North Pole, say the pole along \(x_1\). Then the North Pole is \(x_1=1\).
Define the equator as the \((d-1)\) dimensional sphere: \(\left\{\mathbf{x}|| \mathbf{x} \mid \leq 1, x_{1}=0\right\}\). This is formed by the intersection of the \(x_1=0\) plane with the \(d-\)sphere. It has volume \(V(d-1)\). Note that, in 3-dimensions this is (familiarly) a circle.
The equator divides the sphere into two hemispheres, or \(d\)-dimensional half spheres, one upper and one lower.
Consider a plane \(x_1=\epsilon\) slightly above the \(x_1=0\) plane, for some \(\epsilon>0\). Less than \(\frac{2}{\varepsilon \sqrt{(d-1)}} e^{-\frac{d-1}{2} \varepsilon^{2}}\) fraction of the volume in the upper hemisphere lies above the \(x_1=\epsilon\) plane. This value drops exponentially with \(\epsilon^2\) as \(\epsilon\) increases and \(d\) goes to infinity. Same is true for volume below \(x_1=-\epsilon\) in the lower hemisphere.
Proof
Without loss of generality we will show the bound for the upper hemisphere.
Define the region \(T\) as the portion of the sphere above the \(x_1=\epsilon\) plane: \(T=\left\{\mathbf{x}|| \mathbf{x} \mid \leq 1, x_{1} \geq \varepsilon\right\}\). Then,
\[ \begin{aligned} \text {Upper bound on proportion of volume above }\epsilon&=\frac{\text { Upper bound of } \operatorname{Vol}(T)}{\text { Lower bound of } \operatorname{Vol}(\text { hemisphere })}\\ &=\frac{\text { Upper bound of } \operatorname{Vol}(T)}{\text { Lower bound of } \frac{1}{2}\operatorname{Vol}(d)} \end{aligned} \]
\[ \begin{aligned} \operatorname{Vol}(T) &=\int_{\epsilon}^{1} \Delta V \mathrm{~d} x_1 \quad \quad(\Delta V \text { is a cross-sectional slice }) \\ &=\int_{\epsilon}^{1} V(d-1) r^{d-1} \mathrm{~d} x_1 \quad (\text{Volume of $(d-1)$ dimensional sphere of radius $r$}) \\ &=V(d-1) \int_{\epsilon}^{1}(1-x_1)^{\frac{d-1}{2}} \mathrm{~d} x_1 \quad\left(r=\sqrt{1-x_1^{2}}\right)\\ &\leq V(d-1) \int_{\epsilon}^{1} \mathrm{e}^{-x_1^{2} \frac{d-1}{2}} \mathrm{~d} x_1 \quad\text{(Lemma 1)}\\ &\leq V(d-1) \int_{\epsilon}^{\infty} \mathrm{e}^{-x_1^{2} \frac{d-1}{2}} \mathrm{~d} x_1 \quad\left(\mathrm{e}^{-x_1^{2} \frac{d-1}{2}} \geq 0 \forall x_1 \in\left[\epsilon, \infty\right]\right)\\ &\leq V(d-1) \int_{\epsilon}^{\infty} \frac{x_1}{\epsilon} \mathrm{e}^{-x_1^{2} \frac{d-1}{2}} \mathrm{~d} x_1 \quad\left(\frac{x_1}{\epsilon} \geq 1 \forall x_1 \in\left[\epsilon, \infty\right]\right)\\ &=\frac{V(d-1)}{-2 \epsilon \frac{d-1}{2}} \int_{\epsilon}^{\infty} \mathrm{e}^{-x_1^{2} \frac{d-1}{2}} \mathrm{~d}\left(-x_1^{2} \frac{d-1}{2}\right) \quad(\text{Subst: } u=-x_1^{2} \frac{d-1}{2}; du=-2\frac{d-1}{2}x_1 dx_1)\\ &=\left.\frac{V(d-1)}{\epsilon(d-1)} \mathrm{e}^{-x_1^{2} \frac{d-1}{2}}\right|_{\epsilon} ^{\infty} \\ &=\frac{1}{\epsilon (d-1) } \mathrm{e}^{-\frac{d-1}{2} \epsilon^{2}} V(d-1) \end{aligned} \] \[ \begin{aligned} \frac{1}{2} V(d) &=\int_{0}^{1} V(d-1) r^{d-1} \mathrm{~d} x_1 \\ &=V(d-1) \int_{0}^{1}\left(1-x_1^{2}\right)^{\frac{d-1}{2}} \mathrm{~d} x_1 \\ & \geq V(d-1) \int_{0}^{\frac{1}{\sqrt{d-1}}}\left(1-x_1^{2}\right)^{\frac{d-1}{2}} \mathrm{~d} x_1 \\ & \geq V(d-1) \int_{0}^{\frac{1}{\sqrt{d-1}}}\left(1-\frac{d-1}{2} x_1^{2}\right) \mathrm{d} x_1 \quad(\text { Lemma } 2)\\ & \geq V(d-1) \int_{0}^{\frac{1}{\sqrt{d-1}}}\left(1-\frac{d-1}{2} \frac{1}{d-1}\right) \mathrm{d} x_1 \quad \left(t^{2} \leq \frac{1}{d-1} \forall x_1 \in\left[0, \frac{1}{\sqrt{d-1}}\right]\right)\\ &=V(d-1) \int_{0}^{\frac{1}{\sqrt{d-1}}} \frac{1}{2}\\ &=\frac{1}{2 \sqrt{d-1}} V(d-1) \\ \end{aligned} \]
Thus,
\[ \begin{aligned} \text{Upper bound on proportion of volume above }\epsilon&=\frac{\frac{1}{\epsilon (d-1) } \mathrm{e}^{-\frac{d-1}{2} \epsilon^{2}} V(d-1)}{\frac{1}{2 \sqrt{d-1}} V(d-1) } \\ &= \frac{2}{\varepsilon \sqrt{(d-1)}} e^{-\frac{d-1}{2} \varepsilon^{2}} \end{aligned} \]
Volume is a narrow annulus
The ratio of the volume of a \(d-\)sphere of radius \(1-\epsilon\) to one with unit raduis is,
\[ \frac{V_{1-\epsilon}(d)}{V_1(d)}=\frac{(1-\varepsilon)^{d} V(d)}{V(d)}=(1-\varepsilon)^{d} \] This ratio \(\rightarrow 0\) as \(d \rightarrow \infty\). So in high dimensions, all of the volume of the sphere is concentrated in a narrow annulus at the surface.
Cube
Consider a \(d-\)dimensional cube with unit-length sides centered at the origin.
- As \(d\) increases, the volume remains constant at \(1^d=1\).
- However, the maximum distant between points, i.e. from one vertex to another, keeps increasing as \(\sqrt{d}\). Eg. the distance between the vertex at \((\frac{1}{2}, \ldots, \frac{1}{2})\) and \((-\frac{1}{2}, \ldots, -\frac{1}{2})\) is \(\sqrt{d* \left( \frac{1}{2}+ \frac{1}{2}\right)^2}=\sqrt{d}\)
This is in interesting contrast to what we saw of the \(d-\)dimensional sphere of unit radius we saw earlier where,
- As \(d\) increases the volume \(\rightarrow 0\), and
- The maximum distance between two points given by the diameter remains constant at \(2r=2\).
The relationship between the unit-length hypercube and unit-radius hypersphere as \(d\) increases is shown below.
For \(d=2\), the cube is completely inside the sphere. Distance from origin to a vertex\(=\sqrt{\left(\frac{1}{2}\right)^{2}+\left(\frac{1}{2}\right)^{2}}=\frac{\sqrt{2}}{2} \cong 0.707\)
For \(d=4\), the vertices of the cube lies exactly on the boundary of the sphere since the distance of from the origin to a vertex of the cube=\(\sqrt{\left(\frac{1}{2}\right)^{2}+\left(\frac{1}{2}\right)^{2}+\left(\frac{1}{2}\right)^{2}+\left(\frac{1}{2}\right)^{2}}=1\)
For any higher dimensions, the vertices of the cube exceed the confines of the sphere! However, there are still points of the cube that remain within the sphere (eg. \((\frac{1}{2},0,\ldots,0)\)). So the high dimensional hypercube ends up with a “spiky” shape with distance from origin to a vertex=\(\sqrt{d* \left( \frac{1}{2}\right)^2}=\sqrt{\frac{d}{4}}=\frac{\sqrt{d}}{2}\)
Other weird stuff
- Most of the volume is within a narrow O(1/d) annulus
- Most of the volume is concentrated about the equator, defined as
Gaussians
For a d-dimensional spherical Gaussian of variance 1 , all but \(\frac{4}{c^{2}} e^{-c^{2} / 4}\) fraction of its mass is within the annulus \(\sqrt{d-1}-c \leq r \leq \sqrt{d-1}+c\) for any \(c>0\)
\(\Omega\left(d^{1 / 4}\right)\) separation is required to separate means of d-dimensional spherical Gaussian of variance 1.