Category: Mathematics

  • More about Matrix – Differentiation

    It was not until recently that I realized how important matrix differentiation is when using matrix representations to do computation. After searching through some relevant materials and lecture notes, I found more useful formulas than I expected. Below I list some of them that are pretty handy for me and may be helpful for you one day.

    Definition

    The notation \(\frac{\partial y}{\partial x}\) =

    \[\begin{bmatrix}\frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n} \\\vdots & \vdots & \ddots & \vdots \\\frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n} \\\end{bmatrix}\]

    denotes the $m \times n$ matrix of first-order partial derivatives of the transformation from $x$ to $y$. Such a matrix is called the Jacobian matrix of the transformation $\varphi()$, if $y=\varphi(x)$, where $y$ is an $m\times1$ vector and $x$ is an $n\times1$ vector.
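
    For a quick example of my own (not part of the original notes): with \(m=n=2\) and \(y=(x_1^2,\; x_1x_2)^T\), the definition gives

    \[\frac{\partial y}{\partial x} = \begin{bmatrix} 2x_1 & 0 \\ x_2 & x_1 \end{bmatrix}\]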

    Proposition

    (1) Let $y=Ax$, where $A$ does not depend on $x$; then:

    \[\frac{\partial y}{\partial x}=A\]

    (2) Let $y=Ax$, where $x$ is a function of $z$ and $A$ does not depend on $z$; then:

    \[\frac{\partial y}{\partial z}=A\frac{\partial x}{\partial z}\]

    (3) For the scalar $\alpha = y^TAx$, where $A$ is independent of $x$ and $y$:

    \[\frac{\partial \alpha}{\partial x}=y^TA\]

    \[\frac{\partial \alpha}{\partial y}=x^TA^T\]

    (4) For the scalar $\alpha = x^TAx$, where $A$ is independent of $x$:

    \[\frac{\partial \alpha}{\partial x}=x^T(A+A^T)\]

    (5) Based on (4), if $A$ is a symmetric matrix, then:

    \[\frac{\partial \alpha}{\partial x}=2x^TA\]

    (6) For the scalar $\alpha = y^Tx$, where $x$ and $y$ are functions of $z$:

    \[\frac{\partial \alpha}{\partial z}= x^T\frac{\partial y}{\partial z} + y^T\frac{\partial x}{\partial z}\]

    (7) For the scalar $\alpha = x^Tx$, where $x$ is a function of $z$:

    \[\frac{\partial \alpha}{\partial z}=2x^T\frac{\partial x}{\partial z}\]

    (8) Let the scalar $\alpha = y^TAx$, where $x$ and $y$ are functions of $z$ and $A$ does not depend on $z$; then:

    \[\frac{\partial \alpha}{\partial z} = x^TA^T\frac{\partial y}{\partial z} + y^TA\frac{\partial x}{\partial z}\]

    (9) Let the scalar $\alpha = x^TAx$, where $x$ is a function of $z$ and $A$ does not depend on $z$; then:

    \[\frac{\partial \alpha}{\partial z} = x^T(A+A^T)\frac{\partial x}{\partial z}\]

    (10) Based on (9), if $A$ is a symmetric matrix, then:

    \[\frac{\partial \alpha}{\partial z} = 2x^TA\frac{\partial x}{\partial z}\]
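
    As a quick numerical sanity check of these identities (my own addition, not part of the original notes), the following sketch verifies proposition (4) with finite differences in NumPy:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    A = rng.standard_normal((n, n))
    x = rng.standard_normal(n)

    def alpha(x):
        return x @ A @ x  # the scalar x^T A x from proposition (4)

    # Numerical gradient via central differences.
    eps = 1e-6
    num_grad = np.array([
        (alpha(x + eps * e) - alpha(x - eps * e)) / (2 * eps)
        for e in np.eye(n)
    ])

    analytic_grad = x @ (A + A.T)  # proposition (4): d(alpha)/dx = x^T (A + A^T)
    print(np.allclose(num_grad, analytic_grad))  # expect True
    ```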

    Definition

    \[\frac{\partial A}{\partial \alpha} = \begin{bmatrix}\frac{\partial a_{11}}{\partial \alpha} & \frac{\partial a_{12}}{\partial \alpha} & \cdots & \frac{\partial a_{1n}}{\partial \alpha} \\\vdots & \vdots & \ddots & \vdots \\\frac{\partial a_{m1}}{\partial \alpha} & \frac{\partial a_{m2}}{\partial \alpha} & \cdots & \frac{\partial a_{mn}}{\partial \alpha} \\\end{bmatrix}\]

    Proposition

    Let $A$ be a nonsingular $m\times m$ matrix whose elements are functions of the scalar parameter $\alpha$; then:

    \[\frac{\partial A^{-1}}{\partial \alpha} = -A^{-1}\frac{\partial A}{\partial \alpha}A^{-1}\]
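
    A similar numerical sketch for this identity (again my own check; the parameterization \(A(\alpha)=A_0+\alpha A_1\) with fixed random matrices is an assumption made just for the test):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    m = 4
    A0, A1 = rng.standard_normal((m, m)), rng.standard_normal((m, m))

    def A(a):
        return A0 + a * A1  # A depends on the scalar parameter alpha (assumed nonsingular here)

    a, eps = 0.3, 1e-6
    dA = (A(a + eps) - A(a - eps)) / (2 * eps)  # dA/dalpha (equals A1)
    dAinv_num = (np.linalg.inv(A(a + eps)) - np.linalg.inv(A(a - eps))) / (2 * eps)
    dAinv_formula = -np.linalg.inv(A(a)) @ dA @ np.linalg.inv(A(a))
    print(np.allclose(dAinv_num, dAinv_formula, atol=1e-4))  # expect True
    ```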

     

    Summary

    This article lists some useful matrix differentiation formulas that helped me understand the least squares approximation of linear systems. Everything here is still within the scope of linear algebra and calculus.

  • Understanding Classical Methods of Image Interpolation using Linear Algebra based on Numerical Analysis

    This semester I am taking two courses related to images and computer vision, and I have started to better appreciate that some amazing applications require only the basic mathematical tools we learned in the first two years of college. Although many open source projects now let us do lots of image analysis without a deep understanding of how the basic mathematics works underneath, it is still fruitful and inspiring to suddenly connect all the dots that we once collected.

    Here, I want to introduce one of the examples – Image Interpolation.


    What is image interpolation and why is it important?

    It is convenient for us to zoom an image (say, transform a \(500\times500\) image to \(1000\times1000\)) thanks to digital devices. But when an image is zoomed in, extra pixels are actually inserted into the original one, and how to color those extra pixels is a critical task called image interpolation.


    A naive method

    A reasonable image is always locally continuous. That is to say, we expect neighboring pixels to have similar colors. Thus, a naive way to fill one extra pixel is to copy the color from the nearest pixel. Notice that there is still one more problem left undefined – how do we find the nearest pixel?

    Fig1

    Let’s assume we have two pixel coordinate systems (see Fig1): \(s\) for the original image and \(S\) for the zoomed image, with the image scaled by a factor of \(k\).

    \(\forall (u, v) \in S\), the corresponding point in \(s\) is \((\frac{u}{k}, \frac{v}{k})\), and the four candidate neighboring pixels in \(s\) are \((i, j), (i, j+1), (i+1, j), (i+1,j+1)\), where \(i = \lfloor\frac{u}{k}\rfloor\) and \(j = \lfloor\frac{v}{k}\rfloor\). Define \(\Delta = (\Delta_i, \Delta_j) = (\frac{u}{k} - i, \frac{v}{k} - j)\).

    Finally, the distance we are discussing here is the Euclidean distance. Now we can find the pixel nearest to our point of interest (POI).
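
    To make this concrete, here is a minimal nearest-neighbor sketch in Python (my own illustration; the function name `nearest_neighbor_zoom` and the NumPy-based implementation are assumptions, not part of the original discussion):

    ```python
    import numpy as np

    def nearest_neighbor_zoom(img, k):
        """Zoom a grayscale image by a factor k using nearest-neighbor interpolation."""
        h, w = img.shape
        out = np.empty((int(h * k), int(w * k)), dtype=img.dtype)
        for u in range(out.shape[0]):
            for v in range(out.shape[1]):
                # Map (u, v) back to the original coordinate system s and round
                # to the closest pixel (equivalent to comparing Euclidean
                # distances to the four candidate neighbors).
                i = min(int(round(u / k)), h - 1)
                j = min(int(round(v / k)), w - 1)
                out[u, v] = img[i, j]
        return out
    ```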

    An improvement

    Since we know how close each candidate pixel is to the POI, why not use a weighted average to get a better estimate?

    If so, how do we select the weights wisely? Intuitively, we can use areas as weights (which requires less computation than Euclidean distances, and all the weights (the total area) magically sum up to 1!).

    Fig2

    Bilinear Image Interpolation

    In other words, the technique we obtain from the improvement above is:

    $$f(i+\Delta_i, j+\Delta_j) = \sum_{i, j} f(i,j)S_{i,j}$$

    And it also has a fancy name – bilinear image interpolation.

    Mathematical Foundation Behind

    The idea of the weighted average is natural and intuitive, yet we can find a solid mathematical foundation for it in numerical analysis – interpolation.

    Considering one dimension, we have multiple ways to find the interpolating function given a set of points. For instance, if we are given two points \((x_i, f_i)\), \((x_{i+1}, f_{i+1})\), then using a linear polynomial we can estimate \(f(x), \forall x \in [x_i, x_{i+1}]\). The Lagrange interpolation form can be written as:

    $$f(x) = f_{i}\frac{x-x_{i+1}}{x_{i} - x_{i+1}} + f_{i+1}\frac{x-x_{i}}{x_{i+1} - x_{i}}$$

    In two dimensions, we could use the same trick three times to estimate our POI (See Fig3).

    1. estimate \( f_{i, j+\Delta_{j}} \) based on \(f_{i, j} , f_{i,j+1}\)
    2. estimate \(f_{i+1, j+\Delta_j}\) based on \(f_{i+1, j} , f_{i+1,j+1}\)
    3. estimate \(f_{i+\Delta_i, j+\Delta_j}\) based on \(f_{i, j+\Delta_j} , f_{i+1,j+\Delta_j}\)
    Fig3

    Consider using a linear algebra form to represent the Lagrange interpolation steps above:

    $$f_{i, j+\Delta_j} =\begin{bmatrix}f_{i, j} & f_{i,j+1}\end{bmatrix}\begin{bmatrix}1-\Delta_j \\ \Delta_j\end{bmatrix}$$

    $$f_{i+1, j+\Delta_j} =\begin{bmatrix}f_{i+1, j} & f_{i+1,j+1}\end{bmatrix}\begin{bmatrix}1-\Delta_j \\ \Delta_j\end{bmatrix}$$

    $$f_{i+\Delta_i, j+\Delta_j} =\begin{bmatrix}1-\Delta_i & \Delta_i\end{bmatrix}\begin{bmatrix}f_{i, j+\Delta_j} \\ f_{i+1,j+\Delta_j}\end{bmatrix}$$

    To sum up, the pixel value \(f_{i+\Delta_i, j+\Delta_j}\) can be written as:

    \[f_{i+\Delta_i, j+\Delta_j} = \begin{bmatrix}1-\Delta_i & \Delta_i\end{bmatrix}\begin{bmatrix}f_{i,j} & f_{i,j+1} \\ f_{i+1, j} & f_{i+1, j+1} \end{bmatrix}\begin{bmatrix}1-\Delta_j \\ \Delta_j\end{bmatrix}\]

    Notice how nicely the four pixel intensities are arranged in the middle matrix (call it \(F\)) exactly as they sit in the \(s\) coordinate system (see Fig4).

    Fig4

    Now it is relatively easy to see that:

    \[f_{i+\Delta_i, j+\Delta_j}=(1-\Delta_i)(1-\Delta_j)\cdot f_{i, j}+(1-\Delta_{i})\Delta_{j}\cdot f_{i, j+1}+\Delta_{i}(1-\Delta_{j})\cdot f_{i+1,j}+\Delta_{i}\Delta_{j}\cdot f_{i+1, j+1}\\=\sum_{i,j}S_{i,j}f(i,j)\]

    which proves our intuitive idea at the beginning.
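
    As a sketch of how the matrix form above translates into code (my own illustration; the function name `bilinear_at` is assumed and boundary handling is simplified):

    ```python
    import numpy as np

    def bilinear_at(img, u, v, k):
        """Bilinear estimate of zoomed pixel (u, v) for an image scaled by factor k."""
        h, w = img.shape
        x, y = u / k, v / k                      # position in the original grid s
        i, j = int(np.floor(x)), int(np.floor(y))
        i, j = min(i, h - 2), min(j, w - 2)      # keep the 2x2 neighborhood inside
        di, dj = x - i, y - j                    # (Delta_i, Delta_j)
        F = img[i:i + 2, j:j + 2].astype(float)  # [[f(i,j), f(i,j+1)], [f(i+1,j), f(i+1,j+1)]]
        wi = np.array([1 - di, di])              # weights along i
        wj = np.array([1 - dj, dj])              # weights along j
        return wi @ F @ wj                       # the matrix form derived above
    ```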


    Bicubic Image Interpolation

    Fig5

    Bicubic image interpolation considers more neighboring pixels (\(4\times 4\)) when estimating our POI. Looking at the matrix form in the bilinear interpolation section above, without loss of generality, the universal model we are adopting is actually:

    $$f_{i+\Delta_i, j+\Delta_j}=W(\Delta_i)^{T}FW(\Delta_j)$$

    In one dimension, consider the Lagrange interpolation function based on a cubic polynomial given 4 points:

    $$f(x)=\sum_{k=i-1}^{i+2}f_{k}\frac{\prod_{l=i-1, l\neq k}^{i+2}(x-x_{l})}{\prod_{l=i-1, l\neq k}^{i+2}(x_{k}-x_{l})}$$

    Taking \(x=i+\Delta_i\), \(x_{k} = k\) and simplifying the right-hand side of the equation above, we can see that the weight vector \(W(\Delta_i)\) should be:

    \[\begin{bmatrix}-\frac{1}{6}\Delta_i(\Delta_i-1)(\Delta_i-2) \\ \frac{1}{2}(\Delta_i+1)(\Delta_i-1)(\Delta_i-2)\\-\frac{1}{2}(\Delta_i+1)\Delta_i(\Delta_i-2) \\ \frac{1}{6}(\Delta_i+1)\Delta_i(\Delta_i-1)\end{bmatrix}\]

    Now, we have a clear and neat equation for estimating POI using the bicubic method.
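
    Here is a short sketch of these weights and the resulting estimate (again my own code; `cubic_weights` and `bicubic_at` are assumed names, and image-boundary handling is omitted for brevity):

    ```python
    import numpy as np

    def cubic_weights(d):
        """Lagrange cubic weights for samples at offsets -1, 0, 1, 2 (d in [0, 1])."""
        return np.array([
            -d * (d - 1) * (d - 2) / 6,
            (d + 1) * (d - 1) * (d - 2) / 2,
            -(d + 1) * d * (d - 2) / 2,
            (d + 1) * d * (d - 1) / 6,
        ])

    def bicubic_at(img, u, v, k):
        """Bicubic estimate of zoomed pixel (u, v) using a 4x4 neighborhood."""
        x, y = u / k, v / k
        i, j = int(np.floor(x)), int(np.floor(y))
        di, dj = x - i, y - j
        F = img[i - 1:i + 3, j - 1:j + 3].astype(float)  # the 4x4 block f_{i-1..i+2, j-1..j+2}
        return cubic_weights(di) @ F @ cubic_weights(dj)
    ```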

    Summary

    This article shows a new way to understand image interpolation through the numerical analysis behind the scenes. A unified matrix form of pixel interpolation is presented, along with a feasible algorithm that uses the Lagrange interpolation formula to obtain the weights for neighboring pixels.

  • Magnetic Field Intensity Around a Uniformly Current-Carrying Cylindrical Surface

    In general, by utilizing Ampère’s Circuital Law, it is straightforward to derive the formula for the magnetic induction intensity around a uniformly current-carrying cylindrical surface, except on the surface of the cylinder itself:

    $$B(r)=\begin{cases}
    0 & \text{if } 0 < r < R \\
    \frac{\mu_0 I}{2\pi r} & \text{if } r > R
    \end{cases}$$

    For the case where \(r=R\), Ampère’s Circuital Law alone cannot determine the value.

    Instead, we use the Biot–Savart Law, treating the infinitely long cylindrical surface as a bundle of infinite straight line currents. A strip of width \(ds\) carries the current

    $$dI=\frac{ds}{2\pi R}I$$

    and, at a distance \(d\), produces a field of magnitude

    $$dB=\frac{\mu_0\, dI}{2\pi d}$$

    It is intuitive to place the field point on the cylindrical surface at the coordinate origin and describe the circular cross-section in polar coordinates centered there, so that \(\rho(\theta)=2R\cos\theta\) for \(\theta\in[-\pi/2,\pi/2]\). Then:

    $$ds = \sqrt{\rho^2 + \rho'^2}\,d\theta = 2R\,d\theta$$

    $$d=\rho=2R\cos\theta$$

    Each strip contributes a field perpendicular to the line joining it to the field point; by symmetry only the component tangential to the surface survives, which introduces a factor \(\cos\theta\) that exactly cancels the \(\cos\theta\) in the distance. Thus, the differential magnetic field \(dB\) is given by:

    $$dB = \frac{\mu_0\, dI}{2\pi d}\cos\theta = \frac{\mu_0 I}{4\pi^2 R}\,d\theta$$

    Integrating from \(-\pi/2\) to \(\pi/2\), we find:

    $$B = \int_{-\pi/2}^{\pi/2}\frac{\mu_0 I}{4\pi^2 R}d\theta = \frac{\mu_0 I}{4\pi R}$$
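
    As a numerical cross-check (my own sketch, not part of the original derivation), one can superpose many infinite straight line currents distributed over the surface and evaluate the field at a point on the surface; the result should match \(\frac{\mu_0 I}{4\pi R}\):

    ```python
    import numpy as np

    mu0 = 4e-7 * np.pi   # vacuum permeability [T*m/A]
    I, R = 1.0, 1.0      # total current [A] and cylinder radius [m] (assumed values)
    N = 100_000          # number of line currents representing the surface

    phi = (np.arange(N) + 0.5) * 2 * np.pi / N     # strip positions on the circle
    xs, ys = R * np.cos(phi), R * np.sin(phi)
    dI = I / N                                     # current carried by each strip

    # Field point on the surface; midpoint sampling never coincides with it.
    px, py = R, 0.0
    rx, ry = px - xs, py - ys
    d2 = rx**2 + ry**2
    # Infinite straight wire: |B| = mu0*dI/(2*pi*d), direction z_hat x r_hat.
    bx = mu0 * dI / (2 * np.pi * d2) * (-ry)
    by = mu0 * dI / (2 * np.pi * d2) * rx
    B = np.hypot(bx.sum(), by.sum())

    print(B, mu0 * I / (4 * np.pi * R))  # the two values should agree closely
    ```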