Cubic sub-spline for data with derivatives
S(x)|x∈[x_i,x_{i+1}] = S_i(x),
where S_i(x) = y_i + b_i(x-x_i) + c_i(x-x_i)² + d_i(x-x_i)³.
For each interval the three coefficients b_i, c_i, d_i are determined by the three conditions
S_i(x_{i+1}) = y_{i+1},
S'_i(x_i) = y'_i,
S'_i(x_{i+1}) = y'_{i+1}.
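Solving these three linear conditions gives the closed form (with h_i ≡ x_{i+1}-x_i and the divided difference Δ_i ≡ (y_{i+1}-y_i)/h_i)

b_i = y'_i ,
c_i = (3Δ_i - 2y'_i - y'_{i+1})/h_i ,
d_i = (y'_i + y'_{i+1} - 2Δ_i)/h_i² .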
See the subsection "Akima sub-spline interpolation" for inspiration.
The sub-spline can be further improved by instead taking
S_i(x) = y_i + b_i(x-x_i) + c_i(x-x_i)² + d_i(x-x_i)³ + e_i(x-x_i)²(x-x_{i+1})² ,
and choosing the coefficients e_i such that the spline has a continuous second derivative.
Yet another cubic sub-spline
Implement a sub-spline of the given data {x_i, y_i} using the following algorithm:
S_i(x) = y_i + b_i(x-x_i) + c_i(x-x_i)² + d_i(x-x_i)³ ,
where for each interval the three coefficients b_i, c_i, d_i are determined by the three conditions
S_i(x_{i+1}) = y_{i+1},
S'_i(x_i) = p_i,
S'_i(x_{i+1}) = p_{i+1},
where p_i are the derivatives estimated at the nodes (for example, as in the Akima sub-spline).
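A minimal sketch of an evaluator (the coefficients follow the same closed form as above with y'_i → p_i; the function and array names are illustrative):

double subspline_eval(int n, double* x, double* y, double* p, double z){
    int i=0, j=n-1;                      /* binary search for the interval containing z */
    while(j-i>1){ int m=(i+j)/2; if(z>x[m]) i=m; else j=m; }
    double h = x[i+1]-x[i];
    double D = (y[i+1]-y[i])/h;          /* divided difference */
    double b = p[i];
    double c = (3*D-2*p[i]-p[i+1])/h;
    double d = (p[i]+p[i+1]-2*D)/(h*h);
    double t = z-x[i];
    return y[i]+t*(b+t*(c+t*d));         /* Horner evaluation of the cubic */
}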
Bi-linear interpolation on a rectilinear grid in two dimensions
The signature of the interpolating subroutine can be
double bilinear(int nx, int ny, double* x, double* y, double** F, double px, double py)
or
double bilinear(gsl_vector* x, gsl_vector* y, gsl_matrix* F, double px, double py)
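For illustration, a minimal sketch with the first signature (assuming the grids are sorted in increasing order and F[i][j] holds the function value at the point (x[i],y[j])):

double bilinear(int nx, int ny, double* x, double* y, double** F,
                double px, double py){
    int i=0, I=nx-1;                      /* locate the grid cell in x ... */
    while(I-i>1){ int m=(i+I)/2; if(px>x[m]) i=m; else I=m; }
    int j=0, J=ny-1;                      /* ... and in y */
    while(J-j>1){ int m=(j+J)/2; if(py>y[m]) j=m; else J=m; }
    double tx=(px-x[i])/(x[i+1]-x[i]);    /* normalized coordinates inside the cell */
    double ty=(py-y[j])/(y[j+1]-y[j]);
    return F[i][j]*(1-tx)*(1-ty) + F[i+1][j]*tx*(1-ty)
         + F[i][j+1]*(1-tx)*ty   + F[i+1][j+1]*tx*ty;
}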
Generalized eigenvalue problem
A x = λ N x,
where A is a real symmetric matrix and N is a real symmetric positive-definite matrix (all eigenvalues of a positive-definite matrix are positive). First diagonalize N,
N = V D V^T,
where D is the diagonal matrix with the (positive) eigenvalues of the matrix N on the diagonal and V is the matrix with the corresponding eigenvectors as its columns. Now the original generalized eigenvalue problem can be represented as an ordinary eigenvalue problem (prove it analytically),
B y = λ y,
where the real symmetric matrix B = D^{-1/2} V^T A V D^{-1/2} and y = D^{1/2} V^T x.
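A sketch of the proof: substitute x = V D^{-1/2} y into A x = λ N x and multiply from the left by D^{-1/2} V^T; using N = V D V^T and V^T V = 1, the right-hand side becomes λ D^{-1/2} D D^{-1/2} y = λ y, while the left-hand side becomes B y.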
Jacobi eigenvalue algorithm: several smallest (largest) eigenvalues
Find out for how large n the full diagonalization becomes more effective.
Hessenberg factorization of a real square matrix using Jacobi transformations
If the constraints of a linear algebra problem do not allow a general matrix to be conveniently reduced to a triangular form, reduction to Hessenberg form is often the next best thing. Reduction of any matrix to a Hessenberg form can be achieved in a finite number of steps. Many linear algebra algorithms require significantly less computational effort when applied to Hessenberg matrices.
A lower Hessenberg matrix H is a square matrix that has zero elements above the first super-diagonal, {H_ij = 0} for j > i+1.
An upper Hessenberg matrix H is a square matrix that has zero elements below the first sub-diagonal, {H_ij = 0} for i > j+1.
Hessenberg factorization is a representation of a square real matrix A in the form A=QHQT where Q is an orthogonal matrix and H is a Hessenberg matrix.
If A is a symmetric matrix, its Hessenberg factorization produces a tridiagonal matrix H.
Consider a Jacobi transformation with the matrix J(p,q),
A → J(p,q)^T A J(p,q),
where the rotation angle is chosen not to zero the element A_{p,q} but rather to zero the element A_{p-1,q}. Argue that the following sequence of such Jacobi transformations,
J(2,3), J(2,4), ..., J(2,n); J(3,4), ..., J(3,n); ...; J(n-1,n)
reduces the matrix to the lower Hessenberg form.
Implement this algorithm. Remember to accumulate the total transformation matrix.
Calculate the determinant of the Hessenberg matrix.
An upper n×n Hessenberg matrix H can be LU-factorized, H=LU, with the L-matrix being bidiagonal: a main diagonal (containing ones) and a sub-diagonal with elements v_i, i=1,..,n-1.
In the 2×2 illustration,

[ 1  0 ] [ h11 h12 ]   [ h11          h12          ]
[ -v 1 ] [ h21 h22 ] = [ h21 - v*h11  h22 - v*h12  ] .

If we choose v=h21/h11 then the matrix becomes upper triangular,

[ 1  0 ] [ h11 h12 ]   [ h11  h12          ]
[ -v 1 ] [ h21 h22 ] = [ 0    h22 - v*h12  ] ≡ U .

Now the inverse of the left matrix is easy:

[ 1  0 ] [ 1 0 ]   [ 1 0 ]
[ -v 1 ] [ v 1 ] = [ 0 1 ] .

Therefore the sought operation, which eliminates one sub-diagonal element of matrix H, is

[ h11 h12 ]   [ 1 0 ] [ u11 u12 ]
[ h21 h22 ] = [ v 1 ] [ 0   u22 ] ,

where v=h21/h11, u11=h11, u12=h12, u22=h22-v*h12.
Now one simply has to apply this operation to all sub-diagonal elements of the upper Hessenberg matrix H, which leads to the following algorithm (see A note on the stability of the LU factorization of Hessenberg matrices):
for i=1:n
    u_{1,i} = h_{1,i}
endfor
for i=1:n-1
    v_i = h_{i+1,i} / u_{i,i}
    for j=i+1:n
        u_{i+1,j} = h_{i+1,j} - v_i * u_{i,j}
    endfor
endfor
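A direct C transcription might look like this (a sketch, assuming dense 0-based arrays; since det(L)=1, the determinant of H, cf. the task above, is simply the product of the diagonal elements of U):

double hessenberg_lu_det(int n, double** H, double** U, double* v){
    for(int j=0;j<n;j++) U[0][j]=H[0][j];            /* first row of U = first row of H */
    for(int i=0;i<n-1;i++){
        v[i]=H[i+1][i]/U[i][i];                      /* sub-diagonal element of L */
        for(int j=i+1;j<n;j++) U[i+1][j]=H[i+1][j]-v[i]*U[i][j];
    }
    double det=1;
    for(int i=0;i<n;i++) det*=U[i][i];               /* det(H)=det(L)det(U)=Π u_ii */
    return det;
}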
Two-sided Jacobi algorithm for Singular Value Decomposition (SVD)
A = U D V^T ,
where matrix D is diagonal with non-negative elements and matrices U and V are orthogonal. The diagonal elements of matrix D can always be chosen non-negative by multiplying the relevant columns of matrix U by (-1). SVD can be used to solve a number of problems in linear algebra.
Recall that the Jacobi eigenvalue algorithm applies the transformation
A → J^T A J ,
where J ≐ J(θ,p,q) is the Jacobi matrix with the angle θ chosen to eliminate the off-diagonal elements A_{pq}=A_{qp}, to all upper off-diagonal matrix elements in cyclic sweeps until the matrix becomes diagonal.
The two-sided Jacobi SVD algorithm for general real square matrices is the same as the Jacobi eigenvalue algorithm, except that the elementary transformation is slightly different,
A → J^T G^T A J ,
where G ≐ G(θ,p,q) is an extra rotation whose angle,
tan(θ) = (A_{pq} - A_{qp})/(A_{qq} + A_{pp})
(check that this is correct), is chosen such that the matrix G^T A has equal (p,q) and (q,p) elements, and J is then the usual Jacobi rotation that eliminates them. The matrices U and V are updated after each transformation as U → U G J and V → V J.
Of course you should not actually multiply Jacobi matrices, which costs O(n³) operations, but rather perform updates which only cost O(n) operations.
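The O(n) update of, say, V after each rotation touches only two columns. A sketch (assuming the convention J_pp = J_qq = cos θ, J_pq = sin θ, J_qp = -sin θ; your convention may differ):

#include <math.h>
void update_V(int n, double** V, int p, int q, double theta){
    double c=cos(theta), s=sin(theta);
    for(int i=0;i<n;i++){                 /* V ← V·J(θ,p,q): only columns p and q change */
        double vip=V[i][p], viq=V[i][q];
        V[i][p]=c*vip-s*viq;
        V[i][q]=s*vip+c*viq;
    }
}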
One-sided Jacobi algorithm for Singular Value Decomposition (SVD)
A → A J(θ,p,q)
where the indices (p,q) are swept cyclically (p=1..n, q=p+1..n) and where the angle θ is chosen such that columns number p and q of the matrix AJ(θ,p,q) are orthogonal. One can show that the angle should be taken from the following equation,
tan(2θ) = 2 a_p^T a_q / (a_q^T a_q - a_p^T a_p) ,
where a_i is the i-th column of matrix A. After the iterations converge and the matrix A' = AJ (where J is the accumulation of the individual rotations) has orthogonal columns, the SVD is simply given as
A = U D V^T ,
where V = J, D_ii = ||a'_i||, u_i = a'_i/||a'_i||, with a'_i the i-th column of matrix A' and u_i the i-th column of matrix U.
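A sketch of one elementary rotation, updating both A and the accumulated V (dense arrays; atan2 conveniently solves the angle equation):

#include <math.h>
void onesided_rotation(int n, double** A, double** V, int p, int q){
    double apq=0, app=0, aqq=0;
    for(int i=0;i<n;i++){                     /* scalar products of columns p and q */
        apq+=A[i][p]*A[i][q];
        app+=A[i][p]*A[i][p];
        aqq+=A[i][q]*A[i][q];
    }
    double theta=0.5*atan2(2*apq, aqq-app);   /* from tan(2θ)=2 a_p^T a_q/(a_q^T a_q - a_p^T a_p) */
    double c=cos(theta), s=sin(theta);
    for(int i=0;i<n;i++){                     /* A ← A·J and V ← V·J */
        double aip=A[i][p], aiq=A[i][q];
        A[i][p]=c*aip-s*aiq; A[i][q]=s*aip+c*aiq;
        double vip=V[i][p], viq=V[i][q];
        V[i][p]=c*vip-s*viq; V[i][q]=s*vip+c*viq;
    }
}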
Golub-Kahan-Lanczos bidiagonalization
Bidiagonalization can be used on its own to solve linear systems and ordinary least-squares problems, and to calculate the determinant and the (pseudo-)inverse of a matrix; mostly, however, it is used as the first step in SVD.
Inverse iteration algorithm for eigenvalues (and eigenvectors)
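As a reminder of the idea: repeatedly solve (A - s·1)x_new = x and normalize; x converges to the eigenvector whose eigenvalue is closest to the shift s. A minimal sketch, assuming some linear solver solve(n,M,b,x) of your own (hypothetical helper):

#include <math.h>
void solve(int n, double** M, double* b, double* x); /* assumed: your own linear solver (e.g. QR) */

void inverse_iteration(int n, double** A, double** M, double s,
                       double* x, double* y, int iterations){
    for(int i=0;i<n;i++)
        for(int j=0;j<n;j++) M[i][j]=A[i][j]-(i==j? s:0);  /* M = A - s*1 */
    for(int k=0;k<iterations;k++){
        solve(n,M,x,y);                                     /* y = M^{-1} x */
        double norm=0; for(int i=0;i<n;i++) norm+=y[i]*y[i];
        norm=sqrt(norm);
        for(int i=0;i<n;i++) x[i]=y[i]/norm;                /* normalize */
    }
    /* the Rayleigh quotient x^T A x then estimates the eigenvalue closest to s */
}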
Lanczos tridiagonalization algorithm
Implement the Lanczos algorithm (Lanczos iteration) for real symmetric matrices.
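A sketch of the three-term recurrence (the Lanczos vectors are stored as rows of Q for brevity; Q[0] must be a normalized start vector, and in practice one re-orthogonalizes w against the previous vectors to fight the loss of orthogonality):

#include <math.h>
void lanczos(int n, int m, double** A, double** Q,
             double* alpha, double* beta, double* w){
    for(int k=0;k<m;k++){
        for(int i=0;i<n;i++){                        /* w = A q_k */
            w[i]=0;
            for(int j=0;j<n;j++) w[i]+=A[i][j]*Q[k][j];
        }
        alpha[k]=0;                                  /* alpha_k = q_k·w */
        for(int i=0;i<n;i++) alpha[k]+=Q[k][i]*w[i];
        for(int i=0;i<n;i++){                        /* orthogonalize against q_k and q_{k-1} */
            w[i]-=alpha[k]*Q[k][i];
            if(k>0) w[i]-=beta[k-1]*Q[k-1][i];
        }
        if(k==m-1) break;
        beta[k]=0;
        for(int i=0;i<n;i++) beta[k]+=w[i]*w[i];
        beta[k]=sqrt(beta[k]);                       /* beta_k = ||w|| */
        for(int i=0;i<n;i++) Q[k+1][i]=w[i]/beta[k];
    }
}

The arrays alpha and beta then form the diagonal and sub-diagonal of the symmetric tridiagonal matrix T.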
Symmetric rank-1 update of a size-n symmetric eigenvalue problem
The matrix A to diagonalize is given in the form
A = D + u u^T ,
where D is a diagonal matrix and u is a column-vector.
Given the diagonal elements of the matrix D and the elements of the vector u, find the eigenvalues of the matrix A using only O(n²) operations (see the section "Eigenvalues of updated matrix" in the book).
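For reference, λ is an eigenvalue of A = D + uu^T exactly when 1 + ∑_k u_k²/(d_k - λ) = 0 (the secular equation); with sorted d_k (and all u_k ≠ 0) the roots interlace the poles, so each of the n roots can be bisected in its own interval at O(n) per function evaluation, O(n²) in total. A sketch:

double secular(int n, double* d, double* u, double lambda){
    double s=1;
    for(int k=0;k<n;k++) s+=u[k]*u[k]/(d[k]-lambda);
    return s;
}

double eigenvalue_rank1(int n, double* d, double* u, int i){ /* i-th eigenvalue, i=0..n-1 */
    double unorm2=0; for(int k=0;k<n;k++) unorm2+=u[k]*u[k];
    double a=d[i];                              /* the root lies in (d_i, d_{i+1}) ... */
    double b=(i<n-1? d[i+1] : d[n-1]+unorm2);   /* ... or in (d_{n-1}, d_{n-1}+u^T u) */
    for(int it=0;it<64;it++){                   /* bisection: secular is increasing here */
        double m=0.5*(a+b);
        if(secular(n,d,u,m)>0) b=m; else a=m;
    }
    return 0.5*(a+b);
}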
Symmetric row/column update of a size-n symmetric eigenvalue problem
The matrix to diagonalize is given in the form
A = D + e^{(p)} u^T + u (e^{(p)})^T ,
where D is a diagonal matrix with diagonal elements {d_k, k=1,...,n}, u is a given column-vector, and the vector e^{(p)} with components e^{(p)}_i = δ_{ip} is the unit vector in the direction p, where 1 ≤ p ≤ n.
Given the diagonal elements of the matrix D, the vector u, and the integer p, calculate the eigenvalues of the matrix A using O(n²) operations (see the section "Eigenvalues of updated matrix" in the book).
ODE: a two-step method
Implement a two-step stepper for solving ODEs (as in the book). Use your own step-size-controlling driver.
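One possible two-step scheme (a sketch of the idea; the book's exact formulation may differ): approximate the solution locally as y(x) ≈ y_k + f_k·(x-x_k) + c·(x-x_k)², fix c by requiring the parabola to pass through the previous point, and use the quadratic term as the error estimate. In scalar form:

void twostep_step(double f(double x, double y),
                  double xprev, double yprev, double x, double y, double h,
                  double* ynext, double* err){
    double fx=f(x,y);
    double hprev=x-xprev;
    double c=(yprev-y+fx*hprev)/(hprev*hprev); /* parabola through the previous point */
    *ynext=y+fx*h+c*h*h;                       /* advance by the step h */
    *err=c*h*h;                                /* the quadratic term as error estimate */
}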
ODE with complex-valued functions of a complex variable
Generalize the ODE solver of your choice to solve ODEs with complex-valued functions of a complex variable along a straight line between the given start and end points.
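One possible reduction (an assumption, but a natural one): parametrize the line as z(t) = z_start + t·(z_end - z_start) with t ∈ [0,1]; by the chain rule the equation becomes du/dt = (z_end - z_start)·f(z(t),u), an ODE in the real variable t with a complex-valued unknown. A sketch of the wrapped right-hand side using C99 complex numbers:

#include <complex.h>
static double complex zstart, zend;    /* endpoints of the line (file scope for brevity) */
static double complex (*fz)(double complex z, double complex y); /* the original right-hand side */

static double complex rhs(double t, double complex u){ /* u'(t) = (zend-zstart)*f(z(t),u) */
    double complex z = zstart + t*(zend - zstart);
    return (zend - zstart)*fz(z,u);
}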
Adaptive 1D integrator with random nodes
Implement an adaptive one-dimensional integrator with random abscissas. Reuse points; note that you don't need to pass the points themselves to the next level of recursion, only their statistics.
Adaptive integration of complex-valued functions
Implement an adaptive integrator which calculates the integral of a complex-valued function f(z) of a complex variable z along a straight line between two points in the complex plane.
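Note that with the parametrization z(t) = a + t·(b-a), t ∈ [0,1], the line integral becomes ∫f(z)dz = (b-a)·∫_0^1 f(a+t(b-a))dt, so one can either generalize the integrator to complex values directly or apply an ordinary real adaptive integrator separately to the real and imaginary parts of (b-a)·f(a+t(b-a)).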
Adaptive integrator with subdivision into three subintervals
Implement a (one-dimensional) adaptive integrator which at each iteration subdivides the interval not into two but into three sub-intervals.
Two-dimensional integrator
Implement a two-dimensional integrator for integrals in the form
∫_a^b dx ∫_{d(x)}^{u(x)} dy f(x,y)
which consecutively applies your favourite adaptive one-dimensional integrator along each of the two dimensions. The signature might be something like
double integ2D(double a, double b, double d(double x), double u(double x),
double f(double x, double y), double acc, double eps)
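A sketch of the nested construction, assuming a one-dimensional adaptive integrator integrate1D(f,a,b,acc,eps) of your own (hypothetical name) and GCC nested functions to capture x (otherwise pass the context through a struct):

double integrate1D(double f(double), double a, double b, double acc, double eps);

double integ2D(double a, double b, double d(double x), double u(double x),
               double f(double x, double y), double acc, double eps){
    double F(double x){                       /* the inner integral along y at fixed x */
        double g(double y){ return f(x,y); }
        return integrate1D(g, d(x), u(x), acc, eps);
    }
    return integrate1D(F, a, b, acc, eps);    /* the outer integral along x */
}

In practice one may want to tighten the inner tolerances relative to the outer ones, since the outer integrand is itself computed only approximately.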
Multidimensional pseudo-random (plain Monte Carlo) vs quasi-random (Halton and/or lattice sequence) integrators
Investigate the convergence rates (of some interesting integrals in different dimensions) as functions of the number of sample points.
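For the quasi-random part, a Halton point is made of van der Corput sequences in mutually prime bases, one per dimension. A minimal sketch (the first ten primes suffice for up to ten dimensions):

static double corput(int n, int base){     /* the n-th van der Corput number in the given base */
    double q=0, bk=1.0/base;
    while(n>0){ q+=(n%base)*bk; n/=base; bk/=base; }
    return q;
}

void halton(int n, int dim, double* x){    /* the n-th point of the dim-dimensional Halton sequence */
    int base[]={2,3,5,7,11,13,17,19,23,29};
    for(int i=0;i<dim;i++) x[i]=corput(n,base[i]);
}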
Rootfinding: 1D complex vs. 2D real
Implement a (quasi-)Newton method for rootfinding for a complex function f of a complex variable z, f(z) = 0.
Hints: do it the ordinary way, just with complex numbers:
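For example, with a numerical derivative (a sketch; C99 complex numbers):

#include <complex.h>
double complex newton(double complex f(double complex z),
                      double complex z, double tolerance){
    while(cabs(f(z))>tolerance){
        double complex dz=(cabs(z)+1)*1e-8;   /* small step for the derivative */
        double complex df=(f(z+dz)-f(z))/dz;  /* numerical f'(z) */
        z-=f(z)/df;                           /* the Newton step z → z - f(z)/f'(z) */
    }
    return z;
}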
Compare the effectiveness of your complex implementation with your homework multi-dimensional implementation of real rootfinding applied to the equivalent 2D system of two real equations in two real variables x and y, Re f(x+iy) = 0 and Im f(x+iy) = 0.
Artificial neural network for solving ODE
At the lectures we have constructed an ANN that can be trained to approximate a given tabulated function. Here you should build a topologically similar network that can be trained to approximate a solution to a one-dimensional first-order ordinary differential equation,
y'=f(x,y) ,
on a given interval x ∈ [a,b] with a given initial condition y(x_0) = y_0. After training, the response of the network to the input parameter x should approximate the sought solution y(x). Solve the following equations:
y' = y*(1-y), x ∈ [-5,5], y(0) = 0.5 (the logistic function)
y' = -x*y, x ∈ [-5,5], y(0) = 1 (the Gaussian function)
Hints:
The trained network should satisfy
F_p'(x) = f(x, F_p(x)) ,
and
F_p(x_0) = y_0 ,
where F_p(x) is the response of the network to the input x, and F_p'(x) is its derivative with respect to x.

#include <math.h>
double g(double x){ return exp(-x*x/2); }     /* a gaussian activation function (but you better
                                                 use a gaussian wavelet instead) */
double dg(double x){ return -x*exp(-x*x/2); } /* its derivative */

typedef struct { int n; double *a, *b, *w; } ann; /* parameter layout assumed as in the lectures */

void feed_forward(ann* network, double x, double* F, double* dF){
    *F = 0; *dF = 0;
    for(int i=0; i < network->n; i++){  /* *F = ∑_i w_i g((x-a_i)/b_i) */
        double t = (x - network->a[i]) / network->b[i];
        *F  += network->w[i]*g(t);
        *dF += network->w[i]*dg(t)/network->b[i];  /* *dF = ∑_i w_i g'((x-a_i)/b_i)/b_i */
    }
}
δ(p) = ∑_{k=1..N} |F_p'(x_k) - f(x_k, F_p(x_k))|² + N·|F_p(x_0) - y_0|² ,
where p is the vector of parameters of the network; x_k = a + (b-a)·(k-1)/(N-1), k=1..N, is a mesh of points spanning the interval [a,b]; {F_p(x_k), F_p'(x_k)} is the response of the network and its derivative at the input signal x_k.
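Putting the pieces together, the cost function could be evaluated like this (a sketch; ann and feed_forward as above, f is the right-hand side of the ODE):

double cost(ann* network, double f(double x, double y),
            double a, double b, int N, double x0, double y0){
    double sum=0;
    for(int k=0;k<N;k++){
        double xk=a+(b-a)*k/(N-1);        /* the mesh point x_k */
        double F,dF;
        feed_forward(network,xk,&F,&dF);
        double r=dF-f(xk,F);              /* deviation from the ODE at x_k */
        sum+=r*r;
    }
    double F0,dF0;
    feed_forward(network,x0,&F0,&dF0);
    sum+=N*(F0-y0)*(F0-y0);               /* penalty for the initial condition */
    return sum;
}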