Sunday, 17 December 2023

Blog Moved!

Thank you for visiting this blog. This blog has been moved to a LaTeXML-generated site at:

A Blog on Applied Mathematics (aravindh-krishnamoorthy.github.io)

This site will no longer be maintained and is kept for archival purposes only. The posts on this site are also available on the new site (so there's no reason not to go there).

Aravindh Krishnamoorthy 17.12.2023



Saturday, 13 August 2022

Simultaneous Triangularization of Two Arbitrary Wide Matrices via the QR Decomposition

Simultaneous triangularization (ST) of an arbitrary number of matrices that satisfy certain algebraic properties is well known; see, e.g., [1]. However, for arbitrary wide matrices with the same number of columns, directly applying the QR decomposition leads to a "pseudo-triangular" structure, i.e., there are more non-zero columns than rows. For just two arbitrary wide matrices, however, a "true" ST scheme can be obtained by exploiting their null spaces. This technique was presented in [2] and used for interference mitigation in non-orthogonal multiple access (NOMA) downlink communication systems. In the following post, an overview of the ST scheme is presented, along with a MATLAB example.

The statement of the theorem is as follows.

[2, Theorem 2] Let $\boldsymbol{A} \in \mathbb{C}^{m_1\times n}$ and $\boldsymbol{B} \in \mathbb{C}^{m_2\times n},$ $m_1,m_2 \leq n,$ be complex-valued matrices with full row rank. Then, there exist unitary matrices $\boldsymbol{U}_1 \in \mathbb{C}^{m_1\times m_1}, \boldsymbol{U}_2 \in \mathbb{C}^{m_2\times m_2},$ and an invertible matrix $\boldsymbol{X} \in \mathbb{C}^{n\times n}$ such that

\begin{align}\boldsymbol{U}_1\boldsymbol{A}\boldsymbol{X} &= \begin{bmatrix}\boldsymbol{R}_1 & \boldsymbol{0}\end{bmatrix}, \label{eqn:std1}\\\boldsymbol{U}_2\boldsymbol{B}\boldsymbol{X} &= \begin{bmatrix}\boldsymbol{R}_2' & \boldsymbol{0} & \boldsymbol{R}_2''\end{bmatrix}, \label{eqn:std2}\end{align}

where $\boldsymbol{R}_1 \in \mathbb{C}^{m_1\times m_1}$ and $\boldsymbol{R}_2 = \begin{bmatrix}\boldsymbol{R}_2'& \boldsymbol{R}_2''\end{bmatrix} \in \mathbb{C}^{m_2\times m_2}$ are upper-triangular matrices with real-valued entries on their main diagonals.

As seen from the theorem, the ST scheme requires, in addition to the unitary matrices $\boldsymbol{U}_1$ and $\boldsymbol{U}_2,$ an invertible matrix $\boldsymbol{X}.$ Moreover, for $\boldsymbol{B},$ the triangularized form includes $n-m_2$ columns of zeros in the middle, which may be dropped to construct the matrix $\boldsymbol{R}_2.$ See [2] for generalizations of the theorem.

Below is the proof of the theorem.

Let $\bar{\boldsymbol{A}} \in \mathbb{C}^{n\times (n-m_1)}$ and $\bar{\boldsymbol{B}} \in \mathbb{C}^{n\times (n-m_2)}$ be matrices containing a basis for the null space of $\boldsymbol{A}$ and $\boldsymbol{B},$ respectively. Let $\boldsymbol{K} \in \mathbb{C}^{n\times \max(0,m_1+m_2-n)}$ denote the matrix containing a basis for the null space of $\begin{bmatrix} \bar{\boldsymbol{A}} & \bar{\boldsymbol{B}}\end{bmatrix}^\mathrm{H}.$ Furthermore, via QR decomposition, let

\begin{align}\boldsymbol{\mathcal{U}}_1 \boldsymbol{R}_1 &= \boldsymbol{A} \begin{bmatrix}\boldsymbol{K} & \bar{\boldsymbol{B}}\end{bmatrix}, \label{eqn:qr1}\\ \boldsymbol{\mathcal{U}}_2 \boldsymbol{R}_2 &= \boldsymbol{B} \begin{bmatrix}\boldsymbol{K} & \bar{\boldsymbol{A}}\end{bmatrix}.\label{eqn:qr2}\end{align}

Then, (\ref{eqn:std1}) and (\ref{eqn:std2}) are satisfied by setting $\boldsymbol{X}$ to

\begin{align}\boldsymbol{X} = \begin{bmatrix}\boldsymbol{K} & \bar{\boldsymbol{B}} & \bar{\boldsymbol{A}}\end{bmatrix}, \label{eqn:x}\end{align}

and choosing $\boldsymbol{U}_1 = \boldsymbol{\mathcal{U}}_1^\mathrm{H}$ and $\boldsymbol{U}_2 = \boldsymbol{\mathcal{U}}_2^\mathrm{H}$ from (\ref{eqn:qr1}) and (\ref{eqn:qr2}) above, to obtain

\begin{align}\boldsymbol{U}_1\boldsymbol{A}\boldsymbol{X} = \begin{bmatrix}\underbrace{\boldsymbol{\mathcal{U}}_1^\mathrm{H} \boldsymbol{A} \begin{bmatrix}\boldsymbol{K} & \bar{\boldsymbol{B}}\end{bmatrix}}_{\boldsymbol{R}_1} &  \underbrace{\boldsymbol{\mathcal{U}}_1^\mathrm{H} \boldsymbol{A} \bar{\boldsymbol{A}}}_{\boldsymbol{0}}\end{bmatrix}, \\ \boldsymbol{U}_2\boldsymbol{B}\boldsymbol{X} \overset{(a)}{=} \begin{bmatrix} \underbrace{\boldsymbol{\mathcal{U}}_2^\mathrm{H} \boldsymbol{B} \boldsymbol{K}}_{\boldsymbol{R}_2'} & \underbrace{\boldsymbol{\mathcal{U}}_2^\mathrm{H} \boldsymbol{B} \bar{\boldsymbol{B}}}_{\boldsymbol{0}} &  \underbrace{\boldsymbol{\mathcal{U}}_2^\mathrm{H} \boldsymbol{B} \bar{\boldsymbol{A}}}_{\boldsymbol{R}_2''}\end{bmatrix},\end{align}

where (a) holds because the QR decomposition in (\ref{eqn:qr2}) is unaffected by the zero columns introduced in the middle. QED.

Next, a MATLAB-based example is provided.

% Random wide matrices with full row rank (m1 = m2 = 3, n = 4)
A = complex(randn(3,4),randn(3,4)) ;
B = complex(randn(3,4),randn(3,4)) ;
% Bases for the null spaces of A and B
A_bar = null(A) ;
B_bar = null(B) ;
% Basis for the null space of [A_bar, B_bar]^H
K = null([A_bar, B_bar]') ;
% Invertible matrix X from the proof
X = [K, B_bar, A_bar] ;
% QR decompositions yielding the unitary factors
[U_cal1, ~] = qr(A*X) ;
[U_cal2, ~] = qr(B*X) ;
U1 = U_cal1' ;
U2 = U_cal2' ;


U1*A*X =

-2.0667 + 0.0000i -1.4910 + 1.2510i  0.1708 - 0.4198i  0.0000 + 0.0000i
 0.0000 - 0.0000i -1.0789 + 0.0000i  0.4117 - 0.4322i  0.0000 - 0.0000i
 0.0000 - 0.0000i  0.0000 + 0.0000i -1.5056 - 0.0000i -0.0000 + 0.0000i

U2*B*X =

-2.0667 + 0.0000i -1.1795 + 1.0534i -0.0000 + 0.0000i -0.7368 - 0.0662i
-0.0000 - 0.0000i -2.5202 - 0.0000i  0.0000 - 0.0000i  0.9798 - 1.9193i
-0.0000 + 0.0000i  0.0000 + 0.0000i  0.0000 + 0.0000i  0.8765 + 1.6384i
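
The structure can be verified numerically, e.g., as follows (for the dimensions above, $m_1 = m_2 = 3$ and $n = 4,$ so the zero block of the first factor is the last column, and that of the second factor is the single middle column):

T1 = U1*A*X ;
T2 = U2*B*X ;
norm(T1(:,4))              % zero block of U1*A*X
norm(tril(T1(:,1:3),-1))   % R_1 is upper triangular
norm(T2(:,3))              % middle zero column of U2*B*X
R2 = T2(:,[1,2,4]) ;       % drop the zero column to construct R_2
norm(tril(R2,-1))          % R_2 is upper triangular

All of the above norms should be on the order of machine precision.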



See also the example provided here: https://gitlab.com/aravindh.krishnamoorthy/mimo-noma

ark

References

[1] Heydar Radjavi and Peter Rosenthal, Simultaneous Triangularization, Springer, 2012.
[2] Aravindh Krishnamoorthy and Robert Schober, "Uplink and downlink MIMO-NOMA with simultaneous triangularization," IEEE Transactions on Wireless Communications, vol. 20, no. 6, pp. 3381-3396, 2021.

Saturday, 9 July 2022

Ordered densities of squared singular values of a Gaussian matrix product

This rather simple result is a humble acknowledgement of the great work in finite-size random matrix theory (RMT) by Prof. Gernot Akemann and his team at Uni Bielefeld. To find the ordered densities, a straightforward recursive formulation in terms of the MeijerG function, based on the work of Alberto Zanella at CNR in Italy, is utilized.

Note: Several integration formulas for the MeijerG function are known, e.g., see the MeijerG function reference.

Although the expressions are complex, they can be evaluated numerically quite easily via Mathematica or MATLAB. It amazes me that these finite-size RMT densities are analytically tractable at all, although, undoubtedly, the asymptotic RMT theory is "more elegant."

The theorem is as follows. See below for Mathematica code and numerical simulations.

Theorem



Second projection notation: Let $s := \{(1,2), (3,4), (5,6)\}$ be a set of three tuples. Then, $\pi_2(s)$ is the second projection of $s$ given by $\pi_2(s) = \{2, 4, 6\}.$

Sunday, 12 December 2021

Distribution of the Determinant of a Complex-Valued Sample Correlation Matrix

In this post, we look at the distribution of the determinant of the sample correlation matrix of the realizations of a complex-valued Gaussian random vector. The distribution for a real-valued Gaussian random vector was developed in [1], and we largely follow that framework. Thanks to Prashant Karunakaran for bringing this problem and its many applications to my attention in late 2017/early 2018.

Let $\boldsymbol{x}$ be a Gaussian random vector of length $p$ with mean $\boldsymbol{\mu} \in \mathbb{C}^p$ and covariance $\boldsymbol{\Sigma} \in \mathbb{C}^{p\times p}.$ Let $\boldsymbol{x}^{(1)}, \boldsymbol{x}^{(2)}, \dots, \boldsymbol{x}^{(n)}$ denote $n$ realizations, $n \geq p,$ of $\boldsymbol{x}.$ In the terminology of [1], the adjusted sample covariance matrix is given by $$\boldsymbol{S} = \frac{1}{n}\sum_{i = 1}^{n}(\boldsymbol{x}^{(i)} - \bar{\boldsymbol{x}})(\boldsymbol{x}^{(i)} - \bar{\boldsymbol{x}})^\mathrm{H},$$ where $\bar{\boldsymbol{x}}$ is the sample mean given by $$\bar{\boldsymbol{x}} = \frac{1}{n}\sum_{i = 1}^{n}\boldsymbol{x}^{(i)}.$$ Note that the adjusted sample covariance matrix is positive semi-definite.

The correlation matrix $\boldsymbol{R}$ is defined as: $$\boldsymbol{R} = \boldsymbol{D}^{-\frac{1}{2}} \boldsymbol{S} \boldsymbol{D}^{-\frac{1}{2}},$$ where $\boldsymbol{D} = \mathrm{Diag}(\boldsymbol{S})$ is a diagonal matrix with the diagonal elements of $\boldsymbol{S}$ on the main diagonal. Hence, $\boldsymbol{R}$ has unit diagonal elements and is independent of the variance of the elements of $\boldsymbol{x}.$ 
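As a side note, these quantities are straightforward to compute numerically. Below is a minimal MATLAB sketch (with illustrative dimensions $p=4,$ $n=6$):

p = 4 ; n = 6 ;
X = complex(randn(p,n), randn(p,n)) ;     % columns are realizations of x
x_bar = mean(X, 2) ;                      % sample mean
S = (X - x_bar)*(X - x_bar)'/n ;          % adjusted sample covariance
D_m12 = diag(1./sqrt(real(diag(S)))) ;    % D^(-1/2)
R = D_m12*S*D_m12 ;                       % correlation matrix, unit diagonal
det(R)                                    % real up to rounding, in [0, 1]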

Now, for real-valued $\boldsymbol{x},$ the determinant of $\boldsymbol{R},$ denoted by $|\boldsymbol{R}|,$ is shown in [1, Theorem 2] to be a product of $p-1$ Beta-distributed scalar variables $\mathrm{Beta}(\frac{n-i}{2},\frac{i-1}{2}),$  $i=2,\dots,p.$ The density of the product can be given in terms of the $\mathrm{MeijerG}$ function [2] as follows [1, Theorem 2]:

$$g_\mathbb{R}(x;n,p) = \frac{\left[\Gamma(\frac{n-1}{2})\right]^{(p-1)} }{\Gamma(\frac{n-2}{2})\dots\Gamma(\frac{n-p}{2})} \mathrm{MeijerG}^{\begin{bmatrix}p-1 & 0 \\ p-1 & p-1\end{bmatrix}}\left(x\middle|\begin{matrix}\frac{n-3}{2},\dots,\frac{n-3}{2}\\ \frac{n-4}{2},\dots,\frac{n-(p+2)}{2}\end{matrix}\right).$$

Analogously, for complex-valued $\boldsymbol{x},$ $|\boldsymbol{R}|$ is a product of $p-1$ Beta-distributed scalar variables $\mathrm{Beta}(n-i,i),$  $i=1,\dots,p-1.$ The density of the product can now be given in terms of the $\mathrm{MeijerG}$ function, in a straightforward manner, as follows.

$$g_\mathbb{C}(x;n,p) = \frac{\left[\Gamma(n)\right]^{p-1} }{\Gamma(n-1)\dots\Gamma(n-p+1)} \mathrm{MeijerG}^{\begin{bmatrix}p-1 & 0 \\ p-1 & p-1\end{bmatrix}}\left(x\middle|\begin{matrix}n-1,\dots,n-1\\ n-2,\dots,n-p\end{matrix}\right).$$

In the following, a Mathematica program for numerical simulation and the corresponding output are provided.

gC[x_, n_, p_] := (Gamma[n])^(p - 1)/Product[Gamma[n - i], {i, 1, p - 1}]*
   MeijerG[{{}, Table[n - 1, {i, 1, p - 1}]}, {Table[n - i, {i, 2, p}], {}}, x]

p = 4; n = 6; (* example dimensions, cf. the figure below *)
r[x_] := Module[{d}, d = DiagonalMatrix[Diagonal[x]];
   MatrixPower[d, -1/2] . x . MatrixPower[d, -1/2]]
\[ScriptCapitalD] = MatrixPropertyDistribution[
    Det[r[(xr + I xi) . ConjugateTranspose[xr + I xi]]],
    {xr \[Distributed] MatrixNormalDistribution[IdentityMatrix[p], IdentityMatrix[n]],
     xi \[Distributed] MatrixNormalDistribution[IdentityMatrix[p], IdentityMatrix[n]]}] ;

data = Re[RandomVariate[\[ScriptCapitalD], 100000]] ;
\[ScriptCapitalD]1 = SmoothKernelDistribution[data] ;

Plot[{PDF[\[ScriptCapitalD]1, x], gC[x, n, p]}, {x, 0, 1},
 PlotLabels -> {"Numerical", "Analytical"},
 AxesLabel -> {"u", "p(u)"}]

The following figure shows that the numerical and analytical results match perfectly for the example case $p=4,$ $n=6.$



[1] Pham-Gia, T. and Choulakian, V. (2014) Distribution of the Sample Correlation Matrix and Applications. Open Journal of Statistics, 4, 330-344. doi: 10.4236/ojs.2014.45033.

[2] Weisstein, Eric W. "Meijer G-Function." From MathWorld--A Wolfram Web Resource. https://mathworld.wolfram.com/MeijerG-Function.html

Tuesday, 19 October 2021

Distribution of A Simple Function of Gamma Variables

In this post, we look at a nifty result presented in [Krishnamoorthy2019], where the probability density function (pdf) of the ratio $\dfrac{r_1}{c+r_2}$ of two independent Gamma-distributed random variables $r_1 \sim \mathcal{G}(k_1,\theta_1)$ and $r_2 \sim \mathcal{G}(k_2, \theta_2)$ is derived.

The derivation is an exercise (for the scalar case) in computing the pdf of a function of random variables by constructing the joint pdf and marginalizing based on the Jacobian. A similar approach can also be used for matrix-variate distributions (which will probably be a good topic for another post).

Theorem


Let $c > 0,$ and let $r_1 \sim \mathcal{G}(k_1,\theta_1)$ and $r_2 \sim \mathcal{G}(k_2, \theta_2)$ be independent random variables. Then, the pdf of $r = \dfrac{r_1}{c+r_2},$ denoted by $p_r(r;k_1,\theta_1,k_2,\theta_2,c),$ is given by

$\begin{equation}K_\mathrm{r} r^{k_1-1} \exp\left(-\frac{rc}{\theta_1}\right) U\left(k_2,k_1+k_2+1;c\left(\frac{r}{\theta_1}+\frac{1}{\theta_2}\right)\right),\end{equation}$ 

where $$U(a,b;z) = \frac{1}{\Gamma(a)} \int_{0}^{+\infty} \exp(-zx) x^{a-1} (1+x)^{b-a-1} \mathrm{d}x$$ is the hypergeometric U-function [Olver2010, Chapter 13, Kummer function], and $K_\mathrm{r}$ is a constant ensuring that the integral over the pdf equals one.

Proof


As $r_1$ and $r_2$ are independent, their joint pdf, denoted by $p_{r_1,r_2}(r_1,r_2;k_1,\theta_1,k_2,\theta_2),$ is given by

$\begin{equation}\frac{1}{\Gamma(k_1) \theta_1^{k_1} \Gamma(k_2) \theta_2^{k_2}} r_1^{k_1-1} r_2^{k_2-1} \exp\left(-\frac{r_1}{\theta_1}-\frac{r_2}{\theta_2}\right).\end{equation}$

Applying the transformation $r = \frac{r_1}{c+r_2},$ whose Jacobian is $\frac{\mathrm{d} r_1}{\mathrm{d} r} = c+r_2,$ we obtain the transformed pdf, $p_{r,r_2}(r,r_2;k_1,\theta_1,k_2,\theta_2,c),$ as

$\begin{align}K_\mathrm{r}' r^{k_1-1} (c+r_2)^{k_1} r_2^{k_2-1}\exp\left(-\frac{rc}{\theta_1}-\left(\frac{r}{\theta_1} +\frac{1}{\theta_2}\right)r_2\right),\end{align}$

where $K_\mathrm{r}'$ is a constant ensuring that the integral over the pdf equals one. Next, 
$p_r(r;k_1,\theta_1,k_2,\theta_2,c)$ is obtained by marginalization as

$\begin{equation}p_r(r;k_1,\theta_1,k_2,\theta_2,c) = \int_{0}^{+\infty} \hspace{-0.25cm} p_{r,r_2}(r,r_2;k_1,\theta_1,k_2,\theta_2,c) \mathrm{d} r_2,\end{equation}$

where the integration is conducted using the integral representation of the hypergeometric U-function from [Olver2010, Chapter 13, Kummer function] to obtain the expression in the theorem statement.
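
As a quick numerical check of the theorem, the analytical pdf can be compared against a histogram of samples in MATLAB. Below is a minimal sketch with illustrative parameter values; gamrnd requires the Statistics and Machine Learning Toolbox, $K_\mathrm{r}$ is obtained by numerical normalization, and the U-function is evaluated via its integral representation above:

k1 = 2.5 ; th1 = 1.2 ; k2 = 1.5 ; th2 = 0.8 ; c = 0.7 ; M = 1e6 ;
r = gamrnd(k1, th1, M, 1)./(c + gamrnd(k2, th2, M, 1)) ;
% Integral representation of the hypergeometric U-function
U = @(a, b, z) arrayfun(@(zz) integral(@(x) exp(-zz*x).*x.^(a-1) ...
        .*(1+x).^(b-a-1), 0, Inf), z)/gamma(a) ;
% Unnormalized pdf from the theorem, and the constant K_r
f = @(x) x.^(k1-1).*exp(-x*c/th1).*U(k2, k1+k2+1, c*(x/th1 + 1/th2)) ;
Kr = 1/integral(f, 0, Inf) ;
x = linspace(0.01, 10, 200) ;
histogram(r, 'Normalization', 'pdf', 'BinLimits', [0, 10]) ; hold on ;
plot(x, Kr*f(x), 'LineWidth', 1.5) ;
legend('Empirical', 'Analytical') ;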

References


[Krishnamoorthy2019] A. Krishnamoorthy and R. Schober, “Precoder design for two-user uplink MIMO-NOMA with simultaneous triangularization,” in Proc. IEEE Global Commun. Conf., Dec. 2019, pp. 1–6. https://doi.org/10.1109/GLOBECOM38437.2019.9014161

[Olver2010] F. W. J. Olver, D. W. Lozier, R. F. Boisvert, and C. W. Clark (eds.), NIST Handbook of Mathematical Functions, Cambridge University Press, 2010.


Tuesday, 14 May 2019

Offset Non-Uniform Discrete Cosine Decomposition of Toeplitz Matrices

The diagonalization of a Toeplitz matrix (or the related Hankel matrix) has been known for about 100 years, due to Carathéodory. Recently, the methods and techniques were formalized and shown to be related to the non-uniform discrete Fourier transform (DFT). In this post, we look at a straightforward extension to the non-uniform discrete cosine transform (DCT) for real-valued Toeplitz matrices.

Introduction


Let $\boldsymbol{T}$ be an $N\times N$ real, symmetric, Toeplitz matrix with simple eigenvalues. The matrix admits a Vandermonde decomposition [2], [3] given by \[\boldsymbol{T} = \boldsymbol{V}^\text{H} \boldsymbol{D} \boldsymbol{V},\] where $\boldsymbol{V}$ is an $N\times N$ complex-valued Vandermonde matrix whose rows contain the powers of the unit-modulus numbers $\nu_0, \dots, \nu_{N-1},$ hereafter referred to as modes (shown below), and $\boldsymbol{D}$ is an $N\times N$ non-negative diagonal matrix.

\[
\boldsymbol{V} = \begin{bmatrix} 1 & \nu_0 & \nu_0^2 & \cdots & \nu_0^{N-1} \\
1 & \nu_1 & \nu_1^2 & \cdots & \nu_1^{N-1} \\
\vdots & \vdots &  \vdots & \ddots & \vdots \\
1 & \nu_{N-1} & \nu_{N-1}^2 & \cdots & \nu_{N-1}^{N-1}
\end{bmatrix}
\]

Absence of a Non-Uniform Cosine Decomposition


Theorem 2.1 (Absence of a general NUCD). There exists no non-uniform cosine decomposition \[\boldsymbol{T} = \boldsymbol{C}^\text{T} \boldsymbol{D} \boldsymbol{C}\] of the matrix $\boldsymbol{T}$ for $N > 2.$

Note: $\boldsymbol{C}$ is an $N\times N$ real, cosine matrix and $\boldsymbol{D}$ an $N\times N$ positive diagonal matrix. The elements of $\boldsymbol{C}$ are given by  \[ [\boldsymbol{C}]_{mn} = \cos(n \theta_m), \]
where $m, n = 0,\dots, N - 1,$ and $\theta_m \in (-\pi,\pi].$

Proof. Let $\tau_{m,n}$ and $\delta_{m,n}$ denote the $(m,n)$ elements of the matrices $\boldsymbol{T}$ and $\boldsymbol{D},$ respectively. Expanding $\boldsymbol{C}^\text{T} \boldsymbol{D} \boldsymbol{C}$ and equating the diagonal elements, we have \[ \tau_{n,n} = \sum_{m=0}^{N-1} \delta_{m,m} \cos^2 (n \theta_m),\]
where $n = 0,\dots,N-1.$ Therefore, the first element $\tau_{0,0}$ is \[\tau_{0,0} = \sum_{m=0}^{N-1} \delta_{m,m}.\] Due to the Toeplitz structure of $\boldsymbol{T},$ the diagonal elements are equal, i.e., $\tau_{0,0} = \tau_{1,1} = \cdots = \tau_{N-1,N-1}.$ However, since $0 \leq \cos^2 (\theta) \leq 1$ for any $\theta \in (-\pi,\pi],$ with equality in the upper bound only for $\theta \in \{0,\pi\},$ no angles other than $\{0,\pi\}$ can satisfy the equality of the diagonal elements. Hence, such a decomposition does not exist. $\blacksquare$


The result of the above theorem motivates us to explore an offset non-uniform cosine decomposition, where the decomposition is scaled as $\boldsymbol{T} = \gamma^2 \boldsymbol{C}^\text{T} \boldsymbol{D} \boldsymbol{C}$ for some $\gamma > 1,$ whereby equality of the diagonal elements may be met.

Offset Non-Uniform Cosine Decomposition


Lemma 3.1. When computed using the algorithm given in [2] with a real initial value $\nu_0,$ the modes $\nu_1, \dots, \nu_{N-1}$ are either real or appear in complex conjugate pairs.

Note: The only admissible real initial values are $\nu_0 = \pm 1.$

Proof. The modes are the roots of the polynomial $A(z) = \sum_{k=0}^{N-1} \alpha_k z^k,$ where $\alpha_k$ is the $k$-th element of the vector $\boldsymbol{a}$ obtained by solving $\boldsymbol{Ta} = \boldsymbol{v}$ [2]. Here, $\boldsymbol{v}$ is the initial value vector given by $\boldsymbol{v} = [1, \nu_0, \nu_0^2,\dots,\nu_0^{N-1}]^\text{T}.$

For real initial value vectors, the coefficients $\alpha_k$ of the polynomial $A(z)$ are real. Therefore, the roots of the polynomial are either real or appear in complex conjugate pairs. $\blacksquare$

Now we present the main result of this post.

Theorem 3.2 (O-NUCD). For real initial values, there exists an offset non-uniform cosine decomposition $\boldsymbol{T} = 2 \boldsymbol{C}^\text{T} \boldsymbol{D} \boldsymbol{C}$ for the matrix $\boldsymbol{T}.$

Note: $\boldsymbol{C}$ is an $N\times N$ real, cosine matrix and $\boldsymbol{D}$ an $N\times N$ positive diagonal matrix. The elements of the matrix $\boldsymbol{C}$ are given by \[
[\boldsymbol{C}]_{m,n} = \cos\left(\mp \frac{\pi}{4} + n \theta_m\right),\] where $m,n = 0,\dots,N-1,$  and $\theta_m = \angle \nu_m.$

Proof. Let $\boldsymbol{V}_R$ and $\boldsymbol{V}_I$ denote the real and imaginary parts of the Vandermonde matrix $\boldsymbol{V}.$ As the initial values are real, from Lemma 3.1, the roots are either real or have complex conjugate symmetry. Therefore, we have the following orthogonality properties: \[\boldsymbol{V}_R^\text{T} \boldsymbol{V}_I = \boldsymbol{V}_R^\text{T} \boldsymbol{D} \boldsymbol{V}_I = \boldsymbol{V}_I^\text{T} \boldsymbol{V}_R = \boldsymbol{V}_I^\text{T} \boldsymbol{D} \boldsymbol{V}_R =\boldsymbol{0}.\] Using these properties (twice), we have \[\boldsymbol{T} =  \boldsymbol{V}_R^\text{T} \boldsymbol{D} \boldsymbol{V}_R +  \boldsymbol{V}_I^\text{T} \boldsymbol{D} \boldsymbol{V}_I = (\boldsymbol{V}_R \pm \boldsymbol{V}_I)^\text{T} \boldsymbol{D} (\boldsymbol{V}_R \pm \boldsymbol{V}_I) = 2 \boldsymbol{C}^\text{T} \boldsymbol{D} \boldsymbol{C},\] with the elements of the matrix $\boldsymbol{C}$ as shown above. $\blacksquare$

O-NUCD in MATLAB using the Vandermonde Tools [1]

As an example, O-NUCD ($-\frac{\pi}{4}$) may be found using the Vandermonde Tools from AudioLabs [1] as follows:

>> [v,d] = find_vand('xcorr', T) ;
>> V = vandermonde_fast(v) ;
>> D = diag(d) ;
>> norm(V'*D*V - T)
ans = 2.5403e-15

>> C = cos(-pi/4 + angle(v)*(0:N-1)) ;
>> norm(2*C'*D*C - T)
ans = 2.0304e-15
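
In the above, T is a real, symmetric Toeplitz matrix and N its dimension, which were assumed to be in the workspace. A hypothetical setup (xcorr requires the Signal Processing Toolbox) is, e.g.:

N = 8 ;
x = randn(1, 4*N) ;
rho = xcorr(x, N-1, 'biased') ;   % autocorrelation at lags -(N-1),...,N-1
T = toeplitz(rho(N:end)) ;        % N x N real, symmetric, PSD Toeplitz matrix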

References

[1] T. Backstrom, Vandermonde Tools. http://www.audiolabs-erlangen.de/resources/vandermonde-tools. Accessed: 13 March 2015.
[2] T. Backstrom, "Vandermonde factorization of Toeplitz matrices and applications in filtering and warping," IEEE Transactions on Signal Processing, vol. 61, pp. 6257-6263, 2013.
[3] D. L. Boley, F. T. Luk, and D. Vandevoorde, "Vandermonde factorization of a Hankel matrix," 1998.

Friday, 14 December 2018

Determinant Form for KummerU Function of a Matrix Argument

The KummerU function is the confluent hypergeometric function of the second kind. In this post, a straightforward method for expressing the KummerU function of a matrix argument in terms of the determinant of a matrix of scalar KummerU functions is presented.

The KummerU function of a matrix argument is given by [1, Definition 1.3.6] $$\mathfrak{U}(a,b;\boldsymbol{Z}) = \frac{1}{\Gamma_p(a)}\int_{\boldsymbol{X} \succ 0} \text{etr}(-\boldsymbol{Z}\boldsymbol{X}) \det(\boldsymbol{X})^{a-n} \det(\boldsymbol{I} + \boldsymbol{X})^{b-a-n} \text{d}\boldsymbol{X},$$ where $\boldsymbol{Z}, \boldsymbol{X}$ are $n\times n$ complex-valued symmetric positive-definite matrices, $\text{Re}(a) \geq n$, and $\Gamma_p(a)$ is the multivariate Gamma function [2]. The definition of $\mathfrak{U}(a,b;\boldsymbol{Z})$ closely corresponds to that of the scalar KummerU function $U(a,b;z)$ [3, Eqn. 3] given by $$U(a,b;z) = \frac{1}{\Gamma(a)} \int_{0}^{+\infty} e^{-zx} x^{a-1} (1+x)^{b-a-1} \text{d}x.$$

We begin by noting that determinant forms for generalized hypergeometric functions are known due to Orlov [4, Eqn. 34]. Hence, the KummerU function of a matrix argument may be expressed in its determinant form in a straightforward manner using the relation [1, Definition 1.3.6]: $$\lim_{c\to +\infty} {}_2\mathfrak{F}_{1}(a,b;c;\boldsymbol{I}-c\boldsymbol{Z}^{-1}) = \det(\boldsymbol{Z})^b \mathfrak{U}(b,b-a+n; \boldsymbol{Z}),$$ where ${}_2\mathfrak{F}_{1}$ is the Gaussian hypergeometric function of a matrix argument [5]. A corresponding relation for the scalar KummerU function is given in [3, Eqn. 2].

Using the relation above and [4, Eqn. 34], the determinant form for the KummerU function of a matrix argument can be simplified to $$\mathfrak{U}(a,b;\boldsymbol{Z}) = \frac{1}{\prod_{1\leq i < j \leq n}(\lambda_i-\lambda_j)}\det(\boldsymbol{\Omega}),$$ where $\lambda_i, i=1,\dots,n,$ are the distinct eigenvalues of $\boldsymbol{Z}$, and $\boldsymbol{\Omega}$ is an $n\times n$ matrix whose $(i,j)$-th element is given by $$[\boldsymbol{\Omega}]_{ij} = U(a-j+1,a-b+1;\lambda_i),$$ and $U(a,b;z)$ is the scalar KummerU function as mentioned earlier.
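
As a sanity check, the determinant form may be evaluated numerically. Below is a minimal MATLAB sketch with hypothetical values for $a,$ $b,$ and $\boldsymbol{Z}$; the scalar KummerU function is computed via its integral representation, and the eigenvalues of the random $\boldsymbol{Z}$ below are distinct almost surely:

n = 3 ; a = 5 ; b = 2 ;
A0 = complex(randn(n), randn(n)) ;
Z = A0*A0' + n*eye(n) ; Z = (Z + Z')/2 ;   % Hermitian positive definite
lam = eig(Z) ;
% Scalar KummerU function via its integral representation (Re(a) > 0)
U = @(aa, bb, z) integral(@(x) exp(-z*x).*x.^(aa-1).*(1+x).^(bb-aa-1), ...
        0, Inf)/gamma(aa) ;
Omega = zeros(n) ;
for i = 1:n
    for j = 1:n
        Omega(i,j) = U(a-j+1, a-b+1, lam(i)) ;
    end
end
pref = 1 ;                                 % prod_{i<j} (lambda_i - lambda_j)
for i = 1:n
    for j = i+1:n
        pref = pref*(lam(i) - lam(j)) ;
    end
end
Ufrak = det(Omega)/pref                    % \mathfrak{U}(a,b;Z)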

While the above formula allows us to express $\mathfrak{U}(a,b;\boldsymbol{Z})$ elegantly in terms of the determinant of a matrix of scalar KummerU functions, computing $\mathfrak{U}(a,b;\boldsymbol{Z})$ in terms of zonal polynomials [6] may be more efficient. For an example of using zonal polynomials to evaluate generalized hypergeometric functions in MATLAB, see [7].

[1] Gupta, Arjun K., and Daya K. Nagar. Matrix variate distributions. Chapman and Hall/CRC, 2018.
[2] Multivariate gamma function, web: Wikipedia.
[3] KummerU function, web: Wolfram Mathworld.
[4] A. Yu. Orlov, "New solvable matrix integrals," International Journal of Modern Physics A, 19.supp02 (2004): 276-293, web: https://arxiv.org/abs/nlin/0209063.
[5] Hypergeometric function of a matrix argument, web: Wikipedia.
[6] Zonal polynomials, web: Wikipedia.
[7] Koev, Plamen, MHG for MATLAB, web: https://math.mit.edu/~plamen/software/mhgref.html.

Friday, 29 December 2017

Stochastic Properties of Random Matrix Factors - 3

In this third post on QR and related Cholesky matrix decomposition techniques, we look at the probability density function (pdf) of a diagonal element of the R-factor of the QR decomposition of a complex-valued random matrix with circular Gaussian i.i.d. elements.

Let $\mathbf{H}$ denote an $N\times N$ matrix with circular Gaussian independent and identically distributed (i.i.d.) elements with zero mean and variance two, i.e., $\mathcal{CN}(0,2).$ Furthermore, let $\mathbf{R}$ denote the upper-triangular matrix obtained from the QR decomposition ($\mathbf{H} = \mathbf{Q}\mathbf{R}$) such that the diagonal elements of $\mathbf{R}$ are non-negative. It is well known that such a QR decomposition (QRP, or QR with positive diagonal elements) of a matrix is unique. In case a QR decomposition algorithm results in negative diagonal elements, non-negativity may be arranged by multiplying those rows of $\mathbf{R}$ and the corresponding columns of $\mathbf{Q}$ by $-1.$

It is well known that the elements of $\mathbf{R}$ are distributed as:
$$
\begin{bmatrix}
\chi_{2N} & \mathcal{CN}(0,2) & \cdots &  \mathcal{CN}(0,2)\\
 & \chi_{2(N-1)} & \ddots & \vdots\\
 & & \ddots & \mathcal{CN}(0,2)\\
 & & & \chi_{2}
\end{bmatrix},
$$
where $\chi_{k}$ denotes a Chi-distributed random variable with $k$ degrees of freedom.

From the above, it follows that the pdf of a (uniformly chosen) diagonal element $r$ of the matrix $\mathbf{R}$ is the equally weighted mixture of the individual $\chi_{2n}$ pdfs:
$$
f_r(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{x^{2n-1}}{2^{n-1}\,\Gamma(n)} e^{-\frac{x^2}{2}},
$$
where $\Gamma(n)$ is the Gamma function.

The following graph shows the results of a MATLAB simulation for $N=4$ over 100000 iterations where the diagonal values of $\mathbf{R}$ were collected. The blue graph shows the histogram of the diagonal values. The red graph shows the pdf as obtained using the equation above. We notice that the two correspond well.
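
For reference, below is a minimal sketch of such a simulation; the diagonal elements are made non-negative by taking magnitudes, which is analogous to the sign correction mentioned above:

N = 4 ; iters = 100000 ; d = zeros(iters, N) ;
for it = 1:iters
    H = complex(randn(N), randn(N)) ;  % i.i.d. CN(0,2) elements
    [~, R] = qr(H) ;
    d(it,:) = abs(diag(R)).' ;         % non-negative diagonal elements
end
x = linspace(0, max(d(:)), 200) ;
fr = zeros(size(x)) ;
for n = 1:N                            % equally weighted mixture of chi_{2n} pdfs
    fr = fr + x.^(2*n-1).*exp(-x.^2/2)/(2^(n-1)*gamma(n))/N ;
end
histogram(d(:), 'Normalization', 'pdf') ; hold on ;
plot(x, fr, 'r', 'LineWidth', 1.5) ;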


ARK

Thursday, 23 November 2017

Generating and Sampling Two Signed Bernoulli Random Variables with Given Correlation

The following first constructs two signed $\{-1, +1\}$ Bernoulli variables $Y_1,$ $Y_2$ such that $p(Y_1 = -1) = p_1,$ $p(Y_2 = -1) = p_2,$ and $E[Y_1Y_2] = r.$ Constraints: $0 < p_1, p_2 < 1,$ $-1 < r < 1.$

Next, it draws $N$ realizations of the two variables into $Y(m,:),$ $m = 1,2.$ The sample pdf and cross correlation converge to the specified values as $N \to \infty.$

Example: to generate $N$ realizations of variables with mean $0,$ variance $1,$ and correlation $0.3$: Y = signed_bernoulli2(N, 0.5, 0.5, 0.3).


function Y = signed_bernoulli2(N, p1, p2, r)
% Resolve the joint pdf given by
%                 / p(1) if (+1, +1)
% p(Y_1, Y_2) =  |  p(2) if (+1, -1)
%                |  p(3) if (-1, +1)
%                 \ p(4) if (-1, -1)
%
% Conditions (i.e., how the matrix A and vector b are generated):
% row 1: p(1) + p(2) + p(3) + p(4) = 1
% row 2: p(3) + p(4) = p1 -> marginal for Y_1
% row 3: p(2) + p(4) = p2 -> marginal for Y_2
% row 4: p(1) - p(2) - p(3) + p(4) = r -> cross correlation E[Y_1 Y_2]
A = [1,1,1,1;0,0,1,1;0,1,0,1;1,-1,-1,1] ;
b = [1;p1;p2;r] ;
p = pinv(A)*b ;
assert(all(p >= 0), sprintf(['The given p1=%f and p2=%f cannot result in'...
    ' a correlation r=%f with the given algorithm.\np1=p2=0.5 is a safe value.'],...
    p1,p2,r)) ;
% Now that we have the joint pdf in "p" we can draw from this distribution
% as follows (based on conditional pdfs and inverse transform sampling):
% 1. Draw samples of Y_1 from Signed_Bernoulli(p1) in a standard fashion using
%    an auxiliary uniformly distributed variable U_1 = U(1,:).
% 2. If the n-th sample of Y_1 is "-1", then draw Y_2 from the conditional pdf
%    p(Y_2|Y_1 = -1) using a second auxiliary uniformly distributed
%    variable U_2 = U(2,:).
% 3. If the n-th sample of Y_1 is "+1", then draw Y_2 from the conditional pdf
%    p(Y_2|Y_1 = +1) using the second auxiliary uniformly distributed
%    variable U_2 = U(2,:).
% The conditional pdfs are given as follows:
% p(Y_2|Y_1 = -1) = p(4)/p1 for -1, and 1-(p(4)/p1) for +1.
% p(Y_2|Y_1 = +1) = p(2)/(1-p1) for -1, and 1-(p(2)/(1-p1)) for +1.
%
% Note: for the rather straightforward proofs of the above steps, please contact
% the author.

% Auxiliary variable U
U = rand(2,N) ;
% Draw samples from p(Y_1)
Y(1,:) = (-1).^(U(1,:) < p1) ;
% Indices to help choose the conditional pdf
I1 = Y(1,:) == -1 ;
I2 = Y(1,:) == +1 ;
% Draw samples from the conditional pdfs
Y(2,I1) = (-1).^(U(2,I1) < p(4)/p1) ;
Y(2,I2) = (-1).^(U(2,I2) < p(2)/(1-p1)) ;
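
A minimal usage sketch (assuming the code above is saved as signed_bernoulli2.m), checking the sample statistics against the targets:

N = 1e6 ;
Y = signed_bernoulli2(N, 0.5, 0.5, 0.3) ;
mean(Y, 2)              % sample means, approx. [0; 0]
var(Y, 0, 2)            % sample variances, approx. [1; 1]
mean(Y(1,:).*Y(2,:))    % sample cross correlation, approx. 0.3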