Linear systems
This page is about solutions to linear systems [math]Ax=b[/math]. We call d the dimension of [math]x[/math].
Contents
Applications
Least square regression. PageRank.
Complexity
The complexity of solving Ax=b is that of matrix multiplication. The naive approach, say Gauss-Jordan, yields [math]O(d^3)[/math]. But using the Optimized CW-like algorithms yields [math]O(d^{2.373})[/math].
A lot of the work is about the parallelization of matrix multiplication using hardware like GPU.
Quantum algorithm for sparse linear systems
HHL09 propose quantum algorithms to compute the norm of the solution [math]x[/math] to [math]Ax=b[/math]. The complexity is [math]O(k^2 \log d)[/math], where [math]d[/math] is the dimension of [math]x[/math] and [math]k[/math] is the condition number ([math]= ||A|| \cdot ||A^{-1}||[/math] for operator norms). This is an exponential quantum speedup with respect to best known classical algorithm (in [math]O(kd)[/math]).
Multiplierless algorithm for Ax=b
ThaoRzepka19 (actually most of it seems to be in an unpublished paper) propose a multiplierless iteration algorithm to solve [math]Ax=b[/math]. The basic idea is to decompose it into equations [math]a_i x=b_i[/math]. Then we iterate for values of [math]i[/math], and adjust only [math]x_i[/math] to obtain a new value of [math]x[/math] for which [math]a_i x = b_i[/math]. It turns out that if [math]A[/math] is symmetric positive definite, then this algorithm converges. They show this by factorizing [math]A = MM^T[/math] and to show that, in terms of [math]M^Tx[/math], the dynamic is that of projections onto convex sets (POCS). By relaxing the computation of [math]x_i[/math] at each step up to some power of 1/2, they propose a sequence of multiplierless approximate projections that nevertheless converges. Further acceleration can be achieved by choosing the order in which to update i-th coordinates in a "weak greedy" manner.
It's not clear how critical this will be. It turns out that, so far, hardware has been so optimized for float multiplications. But there is a lot of research on specialized hardware. A big question mark is the parallelizability of the algorithm.