Matrix-chain multiplication ~ ASHRAFEDU

Matrix-chain multiplication

Given a sequence (chain) <A₁, A₂, …, A_n> of n matrices to be multiplied, we wish to compute the product A₁A₂…A_n.

Given a sequence of matrices, find the most efficient way to multiply these matrices together. The problem is not actually to perform the multiplications, but merely to decide in which order to perform the multiplications.

Matrix multiplication is associative, so all parenthesizations yield the same product.

For example, the product A₁A₂A₃A₄ can be fully parenthesized in five distinct ways:

• (A₁(A₂(A₃A₄)))

• ((A₁A₂)(A₃A₄))

• (((A₁A₂)A₃)A₄)

• ((A₁(A₂A₃))A₄)

• (A₁((A₂A₃)A₄))

For example, let A₁ be a 10 by 100 matrix, A₂ a 100 by 5 matrix, A₃ a 5 by 50 matrix, and A₄ a 50 by 1 matrix, so that A₁A₂A₃A₄ is a 10 by 1 matrix.

• (A₁(A₂(A₃A₄)))

– A₃₄ = A₃A₄ , 250 multiplications, result is 5 by 1

– A₂₄ = A₂A₃₄ , 500 multiplications, result is 100 by 1

– A₁₄ = A₁A₂₄ , 1000 multiplications, result is 10 by 1

– Total is 1750

• ((A₁A₂)(A₃A₄))

– A₁₂ = A₁A₂ , 5000 mults, result is 10 by 5

– A₃₄ = A₃A₄ , 250 mults, result is 5 by 1

– A₁₄ = A₁₂A₃₄ , 50 mults, result is 10 by 1

– Total is 5300

• (((A₁A₂)A₃)A₄)

– A₁₂ = A₁A₂ , 5000 mults, result is 10 by 5

– A₁₃ = A₁₂A₃ , 2500 mults, result is 10 by 50

– A₁₄ = A₁₃A₄ , 500 mults, result is 10 by 1

– Total is 8000

• ((A₁(A₂A₃))A₄)

– A₂₃ = A₂A₃ , 25000 mults, result is 100 by 50

– A₁₃ = A₁A₂₃ , 50000 mults, result is 10 by 50

– A₁₄ = A₁₃A₄ , 500 mults, result is 10 by 1

– Total is 75500

• (A₁((A₂A₃)A₄))

– A₂₃ = A₂A₃ , 25000 mults, result is 100 by 50

– A₂₄ = A₂₃A₄ , 5000 mults, result is 100 by 1

– A₁₄ = A₁A₂₄ , 1000 mults, result is 10 by 1

– Total is 31000

As seen above, the way a chain of matrices is parenthesized can have a dramatic impact on the cost of evaluating the product. Here (A₁(A₂(A₃A₄))) has the minimum cost of 1750 scalar multiplications, about 43 times fewer than the 75500 needed by ((A₁(A₂A₃))A₄).
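
The five totals can be checked mechanically. The following is a minimal sketch, assuming each parenthesization is written as nested pairs of matrix numbers; the names DIMS, ORDERS, COSTS, and chain_cost are illustrative and not part of the original text.

```python
# Dimensions from the example: A1 is 10x100, A2 is 100x5, A3 is 5x50, A4 is 50x1.
DIMS = {1: (10, 100), 2: (100, 5), 3: (5, 50), 4: (50, 1)}


def chain_cost(expr):
    """Return (rows, cols, scalar_multiplications) for a parenthesization.

    `expr` is either an int naming a matrix or a pair (left, right).
    """
    if isinstance(expr, int):
        rows, cols = DIMS[expr]
        return rows, cols, 0
    left, right = expr
    r1, c1, cost1 = chain_cost(left)
    r2, c2, cost2 = chain_cost(right)
    assert c1 == r2, "inner dimensions must agree"
    # An r1 x c1 matrix times a c1 x c2 matrix takes r1 * c1 * c2 multiplications.
    return r1, c2, cost1 + cost2 + r1 * c1 * c2


# The five full parenthesizations of A1 A2 A3 A4, as nested pairs.
ORDERS = {
    "(A1(A2(A3A4)))": (1, (2, (3, 4))),
    "((A1A2)(A3A4))": ((1, 2), (3, 4)),
    "(((A1A2)A3)A4)": (((1, 2), 3), 4),
    "((A1(A2A3))A4)": ((1, (2, 3)), 4),
    "(A1((A2A3)A4))": (1, ((2, 3), 4)),
}

COSTS = {name: chain_cost(expr)[2] for name, expr in ORDERS.items()}
for name, cost in COSTS.items():
    print(name, cost)
```

Running it prints one line per parenthesization with its total number of scalar multiplications, which should agree with the totals listed above.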

The matrix-chain multiplication problem is stated as follows: given a chain <A₁, A₂, …, A_n> of n matrices, where for i = 1, 2, …, n, matrix A_i has dimension p_{i-1} × p_i, fully parenthesize the product A₁A₂…A_n in a way that minimizes the number of scalar multiplications.

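Let P(n) be the number of ways to fully parenthesize a chain of n matrices. For n = 1 there is only one way, and for n ≥ 2 the outermost split can fall after any of the first n − 1 matrices, so P(n) = P(1)P(n − 1) + P(2)P(n − 2) + … + P(n − 1)P(1). These are the Catalan numbers, which grow exponentially in n (on the order of 4^n / n^{3/2}).

Trying all possible parenthesizations is therefore a bad idea.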
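
To get a feel for how quickly P(n) grows, here is a small sketch that evaluates it. It assumes the standard counting argument (split the chain after the k-th matrix and parenthesize the two sides independently); the function name num_parenthesizations is an illustrative choice.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def num_parenthesizations(n):
    """Number of full parenthesizations of a chain of n matrices."""
    if n == 1:
        return 1
    # Split between the k-th and (k+1)-th matrix; the two sides are independent.
    return sum(num_parenthesizations(k) * num_parenthesizations(n - k)
               for k in range(1, n))


for n in (4, 10, 20):
    print(n, num_parenthesizations(n))
```

For n = 4 this gives 5, matching the five parenthesizations listed earlier; the values for larger n show why brute-force enumeration does not scale.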

Applying dynamic programming

The dynamic-programming method is used to determine how to optimally parenthesize a matrix chain.

Step 1: The structure of an optimal parenthesization

For our first step in the dynamic-programming paradigm, we find the optimal substructure and then use it to construct an optimal solution to the problem from optimal solutions to subproblems.

Structure of an optimal solution: if the outermost parenthesization is ((A₁A₂…A_i)(A_{i+1}…A_n)), then the optimal solution consists of solving A₁..A_i and A_{i+1}..A_n optimally and then combining the solutions. If either half were parenthesized suboptimally, substituting a cheaper parenthesization of that half would lower the total cost.

Let x be the number of multiplications needed to solve A₁..A_i, y the number needed to solve A_{i+1}..A_n, and z the number needed in the final step, which multiplies the two resulting matrices.

The total number of multiplications is therefore x + y + z.

We must ensure that when we search for the correct place to split the product, we have considered all possible places, so that we are sure of having examined the optimal one.

Step 2: A recursive solution

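Next, we define the cost of an optimal solution recursively in terms of the optimal solutions to subproblems. Let m[i, j] be the minimum number of scalar multiplications needed to compute A_{i..j} = A_i A_{i+1} … A_j.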

Computing the matrix product A_{i..k} A_{k+1..j} takes p_{i-1} p_k p_j scalar multiplications, since A_{i..k} is a p_{i-1} × p_k matrix and A_{k+1..j} is a p_k × p_j matrix.

If the final multiplication for A_{i..j} is A_{i..k} A_{k+1..j}, then the minimum number of scalar multiplications needed is m[i, j] = m[i, k] + m[k+1, j] + p_{i-1} p_k p_j.

Thus, our recursive definition for the minimum cost of parenthesizing the product A_i A_{i+1} … A_j becomes

m[i, j] = 0, if i = j

m[i, j] = min { m[i, k] + m[k+1, j] + p_{i-1} p_k p_j : i ≤ k < j }, if i < j

The m[i, j] values give the costs of optimal solutions to subproblems, but they do not provide all the information we need to construct an optimal solution: we also need to know where each product is split.
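
The recursive definition of m[i, j] can be transcribed almost directly. The sketch below is a top-down version with memoization, so each pair (i, j) is solved only once. It assumes the dimensions are given as a list p with p[0..n], as in the problem statement; the name min_mults_recursive is illustrative.

```python
from functools import lru_cache


def min_mults_recursive(p):
    """Minimum scalar multiplications for a chain with dimensions p[0..n]."""
    n = len(p) - 1

    @lru_cache(maxsize=None)
    def m(i, j):
        if i == j:
            return 0  # a single matrix needs no multiplication
        # Try every split point k with i <= k < j and keep the cheapest.
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return m(1, n)


print(min_mults_recursive([10, 100, 5, 50, 1]))  # 1750, the minimum found in the example
```

Like m[i, j] itself, this returns only the cost, not the parenthesization that achieves it.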

Step 3: Computing the optimal costs

Optimal cost is computed by using a tabular, bottom-up approach.

The procedure MATRIX-CHAIN-ORDER assumes that matrix A_i has dimensions p_{i-1} × p_i for i = 1, 2, …, n. Its input is a sequence p = <p₀, p₁, …, p_n>, where p.length = n + 1.

The procedure uses an auxiliary table m[1..n, 1..n] for storing the m[i, j] costs and another auxiliary table s[1..n − 1, 2..n] that records which index k achieved the optimal cost in computing m[i, j]. Table s is used to construct an optimal solution. Each entry s[i, j] records a value of k such that an optimal parenthesization of A_i A_{i+1} … A_j splits the product between A_k and A_{k+1}.
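
A minimal Python sketch of the tabular, bottom-up computation described above follows. It is one way to realize the description rather than the procedure itself. It assumes 1-based indices as in the text (row and column 0 of each table are unused), and the names matrix_chain_order and optimal_parens are illustrative choices.

```python
def matrix_chain_order(p):
    """Build tables m (costs) and s (split points) for dimensions p[0..n]."""
    n = len(p) - 1
    # (n+1) x (n+1) tables so the 1-based indices of the text can be used directly.
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):           # length of the subchain A_i..A_j
        for i in range(1, n - length + 2):   # start of the subchain
            j = i + length - 1               # end of the subchain
            m[i][j] = float("inf")
            for k in range(i, j):            # split between A_k and A_{k+1}
                cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k
    return m, s


def optimal_parens(s, i, j):
    """Rebuild an optimal parenthesization of A_i..A_j from the split table s."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return "(" + optimal_parens(s, i, k) + optimal_parens(s, k + 1, j) + ")"


m, s = matrix_chain_order([10, 100, 5, 50, 1])
print(m[1][4], optimal_parens(s, 1, 4))  # 1750 (A1(A2(A3A4)))
```

Shorter subchains are filled in before longer ones, so every m[i][k] and m[k + 1][j] is already available when m[i][j] is computed.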

A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n³) for the algorithm. The loops are nested three deep, and each loop index (l, i, and k) takes on at most n − 1 values. The algorithm requires Θ(n²) space to store the m and s tables.

Thus, MATRIX-CHAIN-ORDER is much more efficient than the exponential-time method of enumerating all possible parenthesizations and checking each one.

Saturday, November 18, 2023
