1. Functions
In Diffusion Geometry, a function on a point cloud $M$ is an assignment $f:M\to \mathbb{R}$. Since $M$ is finite (it has to fit into a computer), functions can be represented as vectors with one real value per point, $f\in\mathbb{R}^n$, where $n$ is the number of points. Returning to our point cloud from chapter 0, a simple example of a function is the $x$-coordinate of each point of $M$.
import sys
import os
sys.path.append(os.path.abspath('..')) # Add parent directory to path
from diffusion_geometry.visualisation import plot_scatter_2d
import numpy as np
from plotly.subplots import make_subplots
np.random.seed(0)
# Generate synthetic 2D point cloud M
n = 150
xx = np.linspace(0, 2*np.pi, n)
M = np.array([[np.cos(x), np.sin(x)] for x in xx]) + 0.2*np.random.randn(xx.shape[0], 2)
M = 1*np.array([[np.cos(x)/(1 + np.sin(x)**2), np.sin(x)*np.cos(x)/(1 + np.sin(x)**2)] for x in xx]) + 0.1*M + 0.0*np.random.randn(xx.shape[0], 2)
M = np.concatenate((M, 0.18*np.random.randn(200,2) - [0.2,0]))
# Define a point-wise function on M by projecting onto the first coordinate
f_point_wise = M[:,0]
fig = plot_scatter_2d(M, color = f_point_wise)
# Add a colorbar to the first trace
fig.data[0].marker.showscale = True
fig.data[0].marker.colorbar = dict(
    title=dict(
        text="Function",
        side="top"
    ),
    ticks="outside",
    lenmode="fraction",
    len=0.8
)
fig.show()
Compression of the Space of Functions
We denote the space of functions by $A$. It is an algebra: a vector space equipped with a multiplication, here the pointwise product of functions. For a finite dataset $M$ of $n$ points, $A$ can be identified with $\mathbb{R}^n$, with one basis element per point whose coefficient is the function value there.
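To make the algebra structure concrete, here is a minimal NumPy sketch. It assumes that the multiplication on $A$ is the pointwise product; the point cloud and the names `f`, `g`, `one` and `indicator_basis` are purely illustrative and are not part of the `diffusion_geometry` API.

```python
import numpy as np

# Illustrative point cloud with n = 5 points in the plane (arbitrary values)
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]])
n = points.shape[0]

# Two functions on the point cloud: one real value per point
f = points[:, 0]  # x-coordinate
g = points[:, 1]  # y-coordinate

# Vector space operations act entry-wise
h = 2.0 * f + g

# ASSUMPTION: the algebra product is pointwise multiplication
fg = f * g

# The constant function 1 is the multiplicative unit
one = np.ones(n)

# Standard basis of R^n: row i is the indicator function of point i
indicator_basis = np.eye(n)

# Any function is the sum of indicators weighted by its values
f_from_basis = f @ indicator_basis

print("f * g =", fg)
```

With $n$ points, every one of these vectors has $n$ entries, which is what motivates the compression discussed next.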
For large datasets, this can make the function space prohibitively large. One solution is to choose a compressed basis of the "smoothest" functions. To restrict the function space from an $n$-dimensional space to an $n_0$-dimensional one, we take the first $n_0$ eigenvectors $\phi_1, \dots, \phi_{n_0}$ of the $n\times n$ Markov chain matrix $P$ (defined in chapter 0), ordered by decreasing eigenvalue. We then define the compressed function space to be $$A=\mathrm{Span}\{ \phi_1, \dots, \phi_{n_0}\}.$$ We store these eigenvectors as the columns of the $n\times n_0$ matrix $U$. They are orthonormal with respect to the measure $\mu$, that is, $$U^T\mathrm{diag}(\mu) U = I_{n_0}.$$ Given a function $f\in\mathbb{R}^n$ in the point-wise basis, we project it into the compressed function space by $$f^*:= U^T\mathrm{diag}(\mu) f \in\mathbb{R}^{n_0}.$$ If $n_0<n$, the conversion back is lossy: $$f \approx U f^* = \sum_{i=1}^{n_0} f_i^* \phi_i \in\mathbb{R}^n.$$
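The projection formulas can be checked numerically with plain NumPy. The sketch below does not use the library. Since the definitions of $P$ and $\mu$ from chapter 0 are not repeated here, it assumes a stand-in construction: a Gaussian kernel with a hand-picked bandwidth, row-normalised into a Markov matrix, with $\mu$ taken to be the normalised kernel row sums. The library's actual construction may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n0 = 200, 20

# Illustrative point cloud: random points on the unit circle
theta = rng.uniform(0, 2 * np.pi, n)
X = np.column_stack([np.cos(theta), np.sin(theta)])

# ASSUMPTION: Gaussian kernel with bandwidth 0.1 (stand-in for chapter 0)
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq_dists / 0.1)
d = K.sum(axis=1)

# ASSUMPTION: P is the row-normalised kernel, mu the normalised row sums
P = K / d[:, None]
mu = d / d.sum()

# P is similar to the symmetric matrix S = D^{-1/2} K D^{-1/2},
# so we diagonalise S and transform its eigenvectors back
S = K / np.sqrt(np.outer(d, d))
evals, V = np.linalg.eigh(S)              # eigenvalues in ascending order
order = np.argsort(evals)[::-1][:n0]      # keep the n0 largest
evals = evals[order]
U = V[:, order] / np.sqrt(mu)[:, None]    # columns are eigenvectors of P

# Orthonormality with respect to mu: U^T diag(mu) U = I
gram = U.T @ (mu[:, None] * U)

# Project a pointwise function into the compressed basis and back
f = X[:, 0]
f_star = U.T @ (mu * f)                   # coefficients, shape (n0,)
f_rec = U @ f_star                        # lossy reconstruction, shape (n,)

# Relative reconstruction error in the mu-weighted norm
rel_err = np.sqrt(np.sum(mu * (f - f_rec) ** 2) / np.sum(mu * f ** 2))
print("max |gram - I|:", np.abs(gram - np.eye(n0)).max())
print("relative reconstruction error:", rel_err)
```

Under these assumptions the first column of `U` is constant (up to sign) with eigenvalue 1, and `f_rec` is the $\mu$-orthogonal projection of `f` onto the span of the columns of `U`.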
In our framework, the truncation parameter $n_0$ can be set explicitly. Moreover, functions can be defined pointwise, and the framework handles the conversion into the compressed function basis. This reveals an important pitfall, however: to visualise a function, one has to convert it back into the pointwise, "ambient", basis.
The class 'Function' is responsible for handling functions.
# In the code, n_function_basis plays the role of n_0. Here we restrict the
# compressed function space to the first 50 eigenfunctions.
from diffusion_geometry.core.geometry.diffusion_geometry import DiffusionGeometry
from diffusion_geometry.tensors.functions.function import Function
dg = DiffusionGeometry.from_point_cloud(M, n_function_basis=50)
f_point_wise = M[:,0]
# Construct an instance of a Function from its point-wise representation
f_function_basis = Function.from_pointwise_basis(f_point_wise, dg)
# The function is now represented in the compressed function basis
# with n_function_basis coefficients
print(f"Function basis shape: {f_function_basis.shape}")
# To reconstruct the plottable function in the ambient space
f_reconstructed = f_function_basis.to_ambient()
print(f"Reconstructed function shape: {f_reconstructed.shape}")
# How well this reconstruction approximates the original point-wise function
# depends on the number of function basis elements used and the regularity of the function.
plot_scatter_2d(M, color = f_reconstructed).show()
Function basis shape: (50,)
Reconstructed function shape: (350,)
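How the reconstruction error depends on the basis size and on the regularity of the function can be explored without the library. The following is a sketch under stated assumptions: it uses evenly spaced points on the unit circle and the same stand-in Markov chain as a Gaussian-kernel construction (bandwidth 0.1, $\mu$ given by normalised row sums), which is not necessarily what `DiffusionGeometry` does internally. The helper `relative_error` and the two test functions are hypothetical names chosen for illustration.

```python
import numpy as np

n = 200
theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta)])

# ASSUMPTION: Gaussian-kernel Markov chain as a stand-in for chapter 0
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq_dists / 0.1)
d = K.sum(axis=1)
mu = d / d.sum()

# Eigenvectors of the symmetrised chain, reordered by decreasing eigenvalue
evals, V = np.linalg.eigh(K / np.sqrt(np.outer(d, d)))
U_full = V[:, ::-1] / np.sqrt(mu)[:, None]


def relative_error(f, n0):
    """mu-weighted relative error after projecting f onto the first n0 eigenvectors."""
    U = U_full[:, :n0]
    f_star = U.T @ (mu * f)   # compressed coefficients
    f_rec = U @ f_star        # reconstruction in the pointwise basis
    return np.sqrt(np.sum(mu * (f - f_rec) ** 2) / np.sum(mu * f ** 2))


f_smooth = X[:, 0]                            # x-coordinate: very regular
f_rough = np.where(X[:, 0] >= 0, 1.0, -1.0)   # step function: discontinuous

basis_sizes = [1, 3, 5, 11, 21, 41]
errors_smooth = [relative_error(f_smooth, n0) for n0 in basis_sizes]
errors_rough = [relative_error(f_rough, n0) for n0 in basis_sizes]

for n0, e_s, e_r in zip(basis_sizes, errors_smooth, errors_rough):
    print(f"n0={n0:3d}  smooth: {e_s:.2e}  rough: {e_r:.2e}")
```

Because the subspaces are nested, the error can only decrease as the basis grows. In this setting the smooth function should be recovered up to round-off with a handful of eigenvectors, whereas the step function retains a noticeable error even with the largest basis tried.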
As functions are internally represented in terms of eigenfunctions of the diffusion process, we can visualise these eigenfunctions by manually setting the coefficients of the function basis representation. Since the diffusion process we use is defined via the heat kernel, it is closely linked to the Laplacian, and its eigenfunctions agree with those of the Laplacian $\Delta$.
from plotly.subplots import make_subplots
from diffusion_geometry.visualisation import clean_fig
# Generate perfect circle data
n = 50
xx = np.linspace(0, 2*np.pi, n)
M_circle = np.array([[np.cos(x), np.sin(x)] for x in xx])
n_function_basis = 40
dg_circle = DiffusionGeometry.from_point_cloud(M_circle, n_function_basis=n_function_basis)
fig = make_subplots(rows=3, cols=3)
# Generate the first 9 diffusion eigenfunctions and plot them
for i in range(9):
    # The i-th diffusion eigenfunction is the i-th basis vector of the function basis,
    # so its coefficient vector is the i-th row of the identity matrix
    eigenfunction = Function.from_coeffs(np.eye(n_function_basis)[i], dg_circle)
    # We always need to convert the functions to the ambient space for plotting
    fig1 = plot_scatter_2d(M_circle, color=eigenfunction.to_ambient())
    fig.add_trace(fig1.data[0], row=(i//3)+1, col=(i%3)+1)
clean_fig(fig)
fig.update_layout(
    title=dict(
        text="First 9 diffusion eigenfunctions on the circle",
        x=0.5,
        xanchor="center",
        yanchor="top"
    ),
    margin=dict(t=60)
)
fig.update_layout(width=800, height=600)
fig.show()
As one can see, the eigenfunctions, ordered by eigenvalue, oscillate more and more, so each additional one lets us express functions of higher variation. The first eigenfunction is always the constant one.
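For the unit circle this pattern can be written down explicitly. As a worked example, assume the sign convention $\Delta = -\partial_\theta^2$ in the angle coordinate $\theta$ (the convention used by the library is not stated here). The eigenfunctions of $\Delta$ are then the Fourier modes: $$\Delta\, 1 = 0, \qquad \Delta \cos(k\theta) = k^2 \cos(k\theta), \qquad \Delta \sin(k\theta) = k^2 \sin(k\theta), \qquad k = 1, 2, \dots$$ The heat operator $e^{-t\Delta}$ has the same eigenfunctions with eigenvalues $e^{-tk^2}$, which decrease as the frequency $k$ grows. Apart from the constant function, the eigenfunctions therefore come in pairs of equal eigenvalue, and a mode of frequency $k$ changes sign $2k$ times around the circle. On a finite sample of the circle this correspondence holds only approximately.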