# BCH code

In coding theory, the BCH codes form a class of cyclic error-correcting codes that are constructed using finite fields. BCH codes were invented in 1959 by French mathematician Alexis Hocquenghem, and independently in 1960 by Raj Bose and D. K. Ray-Chaudhuri.[1] The acronym BCH comprises the initials of these inventors' names.

One of the key features of BCH codes is that during code design, there is precise control over the number of symbol errors correctable by the code. In particular, it is possible to design binary BCH codes that can correct multiple bit errors. Another advantage of BCH codes is the ease with which they can be decoded, namely, via an algebraic method known as syndrome decoding. This simplifies the design of the decoder for these codes, using small low-power electronic hardware.

BCH codes are used in applications such as satellite communications,[2] compact disc players, DVDs, disk drives, solid-state drives[3] and two-dimensional bar codes.

## Definition and illustration

### Primitive narrow-sense BCH codes

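Given a prime power ${\displaystyle q}$ and positive integers ${\displaystyle m}$ and ${\displaystyle d}$ with ${\displaystyle d\leq q^{m}-1,}$ a primitive narrow-sense BCH code over the finite field ${\displaystyle \mathrm {GF} (q)}$ with code length ${\displaystyle n=q^{m}-1}$ and distance at least ${\displaystyle d}$ is constructed by the following method.

Let ${\displaystyle \alpha }$ be a primitive element of ${\displaystyle \mathrm {GF} (q^{m}).}$ For any positive integer ${\displaystyle i,}$ let ${\displaystyle m_{i}(x)}$ be the minimal polynomial of ${\displaystyle \alpha ^{i}}$ over ${\displaystyle \mathrm {GF} (q).}$ The generator polynomial of the BCH code is defined as the least common multiple ${\displaystyle g(x)=\operatorname {lcm} (m_{1}(x),\ldots ,m_{d-1}(x)).}$ It can be seen that ${\displaystyle g(x)}$ is a polynomial with coefficients in ${\displaystyle \mathrm {GF} (q)}$ and divides ${\displaystyle x^{n}-1.}$ Therefore, the polynomial code defined by ${\displaystyle g(x)}$ is a cyclic code.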

#### Example

Let q=2 and m=4 (therefore n=15). We will consider different values of ${\displaystyle d.}$ There is a primitive root α in GF(16) satisfying

${\displaystyle \alpha ^{4}+\alpha +1=0;}$

its minimal polynomial over GF(2) is

${\displaystyle m_{1}(x)=x^{4}+x+1.}$

The minimal polynomials of the first seven powers of α are

${\displaystyle m_{1}(x)=m_{2}(x)=m_{4}(x)=x^{4}+x+1,\,}$
${\displaystyle m_{3}(x)=m_{6}(x)=x^{4}+x^{3}+x^{2}+x+1,\,}$
${\displaystyle m_{5}(x)=x^{2}+x+1,\,}$
${\displaystyle m_{7}(x)=x^{4}+x^{3}+1.\,}$

The BCH code with ${\displaystyle d=2,3}$ has generator polynomial

${\displaystyle g(x)=\operatorname {lcm} (m_{1}(x),m_{2}(x))=m_{1}(x)=x^{4}+x+1.\,}$

It has minimal Hamming distance at least 3 and corrects up to one error. Since the generator polynomial is of degree 4, this code has 11 data bits and 4 checksum bits.

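The BCH code with ${\displaystyle d=4,5}$ has generator polynomial

${\displaystyle g(x)=\operatorname {lcm} (m_{1}(x),m_{2}(x),m_{3}(x),m_{4}(x))=m_{1}(x)m_{3}(x)=x^{8}+x^{7}+x^{6}+x^{4}+1.\,}$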

It has minimal Hamming distance at least 5 and corrects up to two errors. Since the generator polynomial is of degree 8, this code has 7 data bits and 8 checksum bits.

The BCH code with ${\displaystyle d=8}$ and higher has generator polynomial

${\displaystyle g(x)=m_{1}(x)m_{3}(x)m_{5}(x)m_{7}(x)=x^{14}+x^{13}+x^{12}+\cdots +x^{2}+x+1.\,}$

This code has minimal Hamming distance 15 and corrects 7 errors. It has 1 data bit and 14 checksum bits. In fact, this code has only two codewords: 000000000000000 and 111111111111111.
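The generator polynomials in this example can be recomputed mechanically. The following is a minimal sketch, not a reference implementation: it assumes GF(16) is represented by 4-bit integers built from ${\displaystyle x^{4}+x+1,}$ stores binary polynomials as integer bitmasks (bit k is the coefficient of ${\displaystyle x^{k}}$), and the helper names `min_poly` and `generator` are chosen here for illustration only.

```python
# GF(16) as 4-bit integers, built from x^4 + x + 1 (alpha = 0b0010).
EXP = [1] * 30            # EXP[k] = alpha**k, doubled so index sums need no modulo
LOG = [0] * 16
for k in range(1, 30):
    v = EXP[k - 1] << 1
    EXP[k] = v ^ 0x13 if v & 0x10 else v
for k in range(15):
    LOG[EXP[k]] = k

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def min_poly(i):
    """Minimal polynomial of alpha**i over GF(2), returned as a bitmask."""
    exps, e = [], i % 15
    while e not in exps:              # conjugates: i, 2i, 4i, ... (mod 15)
        exps.append(e)
        e = (2 * e) % 15
    poly = [1]                        # GF(16) coefficients, lowest degree first
    for e in exps:                    # multiply by (x + alpha**e)
        new = [0] * (len(poly) + 1)
        for k, c in enumerate(poly):
            new[k + 1] ^= c
            new[k] ^= gf_mul(c, EXP[e])
        poly = new
    assert all(c in (0, 1) for c in poly)   # coefficients must lie in GF(2)
    return sum(c << k for k, c in enumerate(poly))

def gf2_mul(a, b):
    """Carry-less product of two binary polynomials (bitmasks)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def generator(d):
    """lcm(m_1, ..., m_{d-1}): product of the distinct minimal polynomials."""
    g, seen = 1, set()
    for i in range(1, d):
        m = min_poly(i)
        if m not in seen:
            seen.add(m)
            g = gf2_mul(g, m)
    return g

for d in (3, 5, 8):
    g = generator(d)
    print(d, bin(g), "data bits:", 15 - (g.bit_length() - 1))
```

Because the minimal polynomials are irreducible, their least common multiple is the product of the distinct ones, which is what `generator` relies on.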

### General BCH codes

General BCH codes differ from primitive narrow-sense BCH codes in two respects.

First, the requirement that ${\displaystyle \alpha }$ be a primitive element of ${\displaystyle \mathrm {GF} (q^{m})}$ can be relaxed. By relaxing this requirement, the code length changes from ${\displaystyle q^{m}-1}$ to ${\displaystyle \mathrm {ord} (\alpha ),}$ the order of the element ${\displaystyle \alpha .}$

Second, the consecutive roots of the generator polynomial may run from ${\displaystyle \alpha ^{c},\ldots ,\alpha ^{c+d-2}}$ instead of ${\displaystyle \alpha ,\ldots ,\alpha ^{d-1}.}$

Note: if ${\displaystyle n=q^{m}-1}$ as in the simplified definition, then ${\displaystyle {\rm {gcd}}(n,q)}$ is automatically 1, and the order of ${\displaystyle q}$ modulo ${\displaystyle n}$ is automatically ${\displaystyle m.}$ Therefore, the simplified definition is indeed a special case of the general one.

### Special cases

The generator polynomial ${\displaystyle g(x)}$ of a BCH code has coefficients from ${\displaystyle \mathrm {GF} (q).}$ In general, a cyclic code over ${\displaystyle \mathrm {GF} (q^{p})}$ with ${\displaystyle g(x)}$ as the generator polynomial is called a BCH code over ${\displaystyle \mathrm {GF} (q^{p}).}$ The BCH code over ${\displaystyle \mathrm {GF} (q^{m})}$ with ${\displaystyle g(x)}$ as the generator polynomial is called a Reed–Solomon code. In other words, a Reed–Solomon code is a BCH code where the decoder alphabet is the same as the channel alphabet.[4]

## Properties

1. The generator polynomial of a BCH code has degree at most ${\displaystyle (d-1)m.}$ Moreover, if ${\displaystyle q=2}$ and ${\displaystyle c=1,}$ the generator polynomial has degree at most ${\displaystyle dm/2.}$

Proof: each minimal polynomial ${\displaystyle m_{i}(x)}$ has degree at most ${\displaystyle m.}$

Therefore, the least common multiple of ${\displaystyle d-1}$ of them has degree at most ${\displaystyle (d-1)m.}$ Moreover, if ${\displaystyle q=2,}$ then ${\displaystyle m_{i}(x)=m_{2i}(x)}$ for all ${\displaystyle i.}$ Therefore, ${\displaystyle g(x)}$ is the least common multiple of at most ${\displaystyle d/2}$ minimal polynomials ${\displaystyle m_{i}(x)}$ for odd indices ${\displaystyle i,}$ each of degree at most ${\displaystyle m.}$

2. A BCH code has minimal Hamming distance at least ${\displaystyle d.}$ Proof: Suppose that ${\displaystyle p(x)}$ is a code word with fewer than ${\displaystyle d}$ non-zero terms. Then

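${\displaystyle p(x)=b_{1}x^{k_{1}}+\cdots +b_{d-1}x^{k_{d-1}},{\text{ where }}k_{1}<k_{2}<\cdots <k_{d-1}.}$

Recall that ${\displaystyle \alpha ^{c},\ldots ,\alpha ^{c+d-2}}$ are roots of ${\displaystyle g(x),}$ hence of ${\displaystyle p(x).}$ This implies that ${\displaystyle b_{1},\ldots ,b_{d-1}}$ satisfy the following equations, for each ${\displaystyle i\in \{c,\ldots ,c+d-2\}}$:

${\displaystyle p(\alpha ^{i})=b_{1}\alpha ^{ik_{1}}+b_{2}\alpha ^{ik_{2}}+\cdots +b_{d-1}\alpha ^{ik_{d-1}}=0.}$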

In matrix form, we have

${\displaystyle {\begin{bmatrix}\alpha ^{ck_{1}}&\alpha ^{ck_{2}}&\cdots &\alpha ^{ck_{d-1}}\\\alpha ^{(c+1)k_{1}}&\alpha ^{(c+1)k_{2}}&\cdots &\alpha ^{(c+1)k_{d-1}}\\\vdots &\vdots &&\vdots \\\alpha ^{(c+d-2)k_{1}}&\alpha ^{(c+d-2)k_{2}}&\cdots &\alpha ^{(c+d-2)k_{d-1}}\\\end{bmatrix}}{\begin{bmatrix}b_{1}\\b_{2}\\\vdots \\b_{d-1}\end{bmatrix}}={\begin{bmatrix}0\\0\\\vdots \\0\end{bmatrix}}.}$

The determinant of this matrix equals

${\displaystyle \left(\prod _{i=1}^{d-1}\alpha ^{ck_{i}}\right)\det {\begin{pmatrix}1&1&\cdots &1\\\alpha ^{k_{1}}&\alpha ^{k_{2}}&\cdots &\alpha ^{k_{d-1}}\\\vdots &\vdots &&\vdots \\\alpha ^{(d-2)k_{1}}&\alpha ^{(d-2)k_{2}}&\cdots &\alpha ^{(d-2)k_{d-1}}\\\end{pmatrix}}=\left(\prod _{i=1}^{d-1}\alpha ^{ck_{i}}\right)\det(V).}$

The matrix ${\displaystyle V}$ is seen to be a Vandermonde matrix, and its determinant is

${\displaystyle \det(V)=\prod _{1\leq i<j\leq d-1}\left(\alpha ^{k_{j}}-\alpha ^{k_{i}}\right),}$

which is non-zero. It therefore follows that ${\displaystyle b_{1},\ldots ,b_{d-1}=0,}$ hence ${\displaystyle p(x)=0.}$

3. A BCH code is cyclic.

Proof: A polynomial code of length ${\displaystyle n}$ is cyclic if and only if its generator polynomial divides ${\displaystyle x^{n}-1.}$ Since ${\displaystyle g(x)}$ is the minimal polynomial with roots ${\displaystyle \alpha ^{c},\ldots ,\alpha ^{c+d-2},}$ it suffices to check that each of ${\displaystyle \alpha ^{c},\ldots ,\alpha ^{c+d-2}}$ is a root of ${\displaystyle x^{n}-1.}$ This follows immediately from the fact that ${\displaystyle \alpha }$ is, by definition, an ${\displaystyle n}$th root of unity.
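For the small binary codes of length 15 from the example, properties 2 and 3 can be checked by exhaustive computation. The sketch below is illustrative only: it assumes binary polynomials are stored as integer bitmasks (bit k is the coefficient of ${\displaystyle x^{k}}$), takes the three generator polynomials of the example as given, and uses the hypothetical helper names `gf2_mul`, `gf2_mod` and `min_distance`.

```python
def gf2_mul(a, b):
    """Carry-less product of two binary polynomials (bitmasks)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def gf2_mod(a, b):
    """Remainder of a divided by b over GF(2)."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a

def min_distance(g, n=15):
    """Smallest weight of a nonzero codeword m(x) g(x) with deg m < k."""
    k = n - (g.bit_length() - 1)
    return min(bin(gf2_mul(m, g)).count("1") for m in range(1, 1 << k))

X15_MINUS_1 = (1 << 15) | 1          # x^15 - 1 equals x^15 + 1 over GF(2)
GENERATORS = {3: 0b10011, 5: 0b111010001, 8: 0x7FFF}   # designed distance -> g(x)

for d, g in GENERATORS.items():
    cyclic = gf2_mod(X15_MINUS_1, g) == 0       # property 3: g(x) divides x^n - 1
    print(d, "cyclic:", cyclic, "minimum distance:", min_distance(g))
```

Exhaustive enumeration is only feasible because these codes have at most ${\displaystyle 2^{11}}$ codewords; it is a sanity check, not a general method.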

## Decoding

There are many algorithms for decoding BCH codes. The most common ones follow this general outline:

1. Calculate the syndromes ${\displaystyle s_{j}}$ for the received vector
2. Determine the number of errors ${\displaystyle t}$ and the error locator polynomial ${\displaystyle \Lambda (x)}$ from the syndromes
3. Calculate the roots of the error locator polynomial to find the error locations ${\displaystyle X_{i}}$
4. Calculate the error values ${\displaystyle Y_{i}}$ at those error locations
5. Correct the errors

During some of these steps, the decoding algorithm may determine that the received vector has too many errors and cannot be corrected. For example, if an appropriate value of t is not found, then the correction would fail. In a truncated (not primitive) code, an error location may be out of range. If the received vector has more errors than the code can correct, the decoder may unknowingly produce an apparently valid message that is not the one that was sent.

### Calculate the syndromes

The received vector ${\displaystyle R}$ is the sum of the correct codeword ${\displaystyle C}$ and an unknown error vector ${\displaystyle E.}$ The syndrome values are formed by considering ${\displaystyle R}$ as a polynomial and evaluating it at ${\displaystyle \alpha ^{c},\ldots ,\alpha ^{c+d-2}.}$ Thus the syndromes are[5]

${\displaystyle s_{j}=R(\alpha ^{j})=C(\alpha ^{j})+E(\alpha ^{j})}$

for ${\displaystyle j=c}$ to ${\displaystyle c+d-2.}$ Since ${\displaystyle \alpha ^{j}}$ are the zeros of ${\displaystyle g(x),}$ of which ${\displaystyle C(x)}$ is a multiple, ${\displaystyle C(\alpha ^{j})=0.}$ Examining the syndrome values thus isolates the error vector so one can begin to solve for it.

If there is no error, ${\displaystyle s_{j}=0}$ for all ${\displaystyle j.}$ If the syndromes are all zero, then the decoding is done.
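As an illustration, the syndromes of the binary (15, 7) code from the example (c = 1, d = 5) can be computed as follows. This is a sketch under stated assumptions: GF(16) is built from ${\displaystyle x^{4}+x+1,}$ the received word is an integer bitmask whose bit i is the coefficient of ${\displaystyle x^{i},}$ and the function name `syndromes` is hypothetical.

```python
# GF(16) as 4-bit integers, built from x^4 + x + 1 (alpha = 0b0010).
EXP = [1] * 30
for k in range(1, 30):
    v = EXP[k - 1] << 1
    EXP[k] = v ^ 0x13 if v & 0x10 else v

def syndromes(received, c=1, d=5, n=15):
    """Return [s_c, ..., s_{c+d-2}] with s_j = R(alpha**j) for a binary word."""
    out = []
    for j in range(c, c + d - 1):
        s = 0
        for i in range(n):
            if (received >> i) & 1:
                s ^= EXP[(i * j) % 15]     # add alpha**(i*j)
        out.append(s)
    return out

codeword = 0b111010001                      # g(x) itself is a codeword
print(syndromes(codeword))                  # all zero: no error detected
print(syndromes(codeword ^ (1 << 3)))       # one bit flipped at position 3
```

With a single flipped bit at position 3 the syndromes reduce to ${\displaystyle s_{j}=\alpha ^{3j},}$ which shows how the codeword contribution cancels and only the error remains.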

### Calculate the error location polynomial

If there are nonzero syndromes, then there are errors. The decoder needs to figure out how many errors and the location of those errors.

If there is a single error, write this as ${\displaystyle E(x)=e\,x^{i},}$ where ${\displaystyle i}$ is the location of the error and ${\displaystyle e}$ is its magnitude. Then the first two syndromes are

${\displaystyle s_{c}=e\,\alpha ^{c\,i}}$
${\displaystyle s_{c+1}=e\,\alpha ^{(c+1)\,i}=\alpha ^{i}s_{c}}$

so together they allow us to calculate ${\displaystyle e}$ and provide some information about ${\displaystyle i}$ (completely determining it in the case of Reed–Solomon codes).
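The single-error case can be sketched in a few lines. The example assumes a code of length 15 over GF(16) with ${\displaystyle \alpha }$ primitive (so the location is determined uniquely), GF(16) built from ${\displaystyle x^{4}+x+1,}$ and a hypothetical helper name `single_error`.

```python
# GF(16) as 4-bit integers, built from x^4 + x + 1 (alpha = 0b0010).
EXP = [1] * 30
LOG = [0] * 16
for k in range(1, 30):
    v = EXP[k - 1] << 1
    EXP[k] = v ^ 0x13 if v & 0x10 else v
for k in range(15):
    LOG[EXP[k]] = k

def single_error(s_c, s_c1, c=1):
    """Recover (location i, magnitude e) from s_c and s_{c+1}, both nonzero."""
    i = (LOG[s_c1] - LOG[s_c]) % 15        # alpha**i = s_{c+1} / s_c
    e = EXP[(LOG[s_c] - c * i) % 15]       # e = s_c / alpha**(c*i)
    return i, e

# Error of magnitude alpha**5 at position 9 with c = 1:
e, i = EXP[5], 9
s1 = EXP[(LOG[e] + 1 * i) % 15]            # s_1 = e * alpha**i
s2 = EXP[(LOG[e] + 2 * i) % 15]            # s_2 = e * alpha**(2i)
print(single_error(s1, s2))                # recovers (9, alpha**5)
```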

If there are two or more errors,

${\displaystyle E(x)=e_{1}x^{i_{1}}+e_{2}x^{i_{2}}+\cdots \,}$

It is not immediately obvious how to begin solving the resulting syndromes for the unknowns ${\displaystyle e_{k}}$ and ${\displaystyle i_{k}.}$ The first step is to find the locator polynomial

${\displaystyle \Lambda (x)=\prod _{j=1}^{t}(x\alpha ^{i_{j}}-1)}$

that is compatible with the computed syndromes and has the minimal possible ${\displaystyle t.}$

Two popular algorithms for this task are the Peterson–Gorenstein–Zierler algorithm and the Berlekamp–Massey algorithm.

#### Peterson–Gorenstein–Zierler algorithm

Peterson's algorithm is step 2 of the generalized BCH decoding procedure. It is used to calculate the error locator polynomial coefficients ${\displaystyle \lambda _{1},\lambda _{2},\dots ,\lambda _{v}}$ of the polynomial

${\displaystyle \Lambda (x)=1+\lambda _{1}x+\lambda _{2}x^{2}+\cdots +\lambda _{v}x^{v}.}$

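Now the procedure of the Peterson–Gorenstein–Zierler algorithm.[6] Suppose we have at least ${\displaystyle 2t}$ syndromes ${\displaystyle s_{c},\ldots ,s_{c+2t-1}.}$ Let ${\displaystyle v=t.}$

• Start by generating the matrix ${\displaystyle S_{v\times v}}$ whose elements are syndrome values:
${\displaystyle S_{v\times v}={\begin{bmatrix}s_{c}&s_{c+1}&\dots &s_{c+v-1}\\s_{c+1}&s_{c+2}&\dots &s_{c+v}\\\vdots &\vdots &\ddots &\vdots \\s_{c+v-1}&s_{c+v}&\dots &s_{c+2v-2}\end{bmatrix}}.}$
• Generate the vector ${\displaystyle C_{v\times 1}}$ with elements
${\displaystyle C_{v\times 1}={\begin{bmatrix}s_{c+v}\\s_{c+v+1}\\\vdots \\s_{c+2v-1}\end{bmatrix}}.}$
• Let ${\displaystyle \Lambda }$ denote the unknown polynomial coefficients, which are given by
${\displaystyle \Lambda _{v\times 1}={\begin{bmatrix}\lambda _{v}\\\lambda _{v-1}\\\vdots \\\lambda _{1}\end{bmatrix}}.}$
• Form the matrix equation
${\displaystyle S_{v\times v}\Lambda _{v\times 1}=-C_{v\times 1}.}$
• If the determinant of ${\displaystyle S_{v\times v}}$ is nonzero, the matrix can be inverted and the equation solved for the unknown ${\displaystyle \Lambda }$ values.
• If ${\displaystyle \det(S_{v\times v})=0,}$ then follow
       if ${\displaystyle v=0}$
       then
           declare an empty error locator polynomial
           stop Peterson procedure.
       end
       set ${\displaystyle v\leftarrow v-1}$
       continue from the beginning of Peterson's decoding by making smaller ${\displaystyle S_{v\times v}}$
• Once the values of ${\displaystyle \Lambda }$ are known, the error locator polynomial is determined and the Peterson procedure stops.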
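The Peterson–Gorenstein–Zierler procedure can be sketched over GF(16) as follows. This is an illustrative sketch, not a production decoder: it assumes GF(16) is built from ${\displaystyle x^{4}+x+1}$ (characteristic 2, so ${\displaystyle -C=C}$), replaces the determinant test with Gaussian elimination that reports a singular matrix, and uses the hypothetical names `solve` and `pgz`.

```python
# GF(16) as 4-bit integers, built from x^4 + x + 1 (alpha = 0b0010).
EXP = [1] * 30
LOG = [0] * 16
for k in range(1, 30):
    v = EXP[k - 1] << 1
    EXP[k] = v ^ 0x13 if v & 0x10 else v
for k in range(15):
    LOG[EXP[k]] = k

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gf_inv(a):
    return EXP[15 - LOG[a]]               # a must be nonzero

def solve(A, b):
    """Solve A x = b over GF(16); return None if A is singular."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col]), None)
        if piv is None:
            return None                   # det(A) = 0
        M[col], M[piv] = M[piv], M[col]
        inv = gf_inv(M[col][col])
        M[col] = [gf_mul(inv, x) for x in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [x ^ gf_mul(f, y) for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def pgz(synd, t):
    """synd = [s_c, ..., s_{c+2t-1}]; returns [1, lambda_1, ..., lambda_v]."""
    for v in range(t, 0, -1):
        S = [[synd[r + k] for k in range(v)] for r in range(v)]
        C = [synd[v + r] for r in range(v)]
        sol = solve(S, C)                 # sol = [lambda_v, ..., lambda_1]
        if sol is not None:
            return [1] + sol[::-1]
    return [1]                            # v = 0: empty error locator

print(pgz([15, 10, 11, 8], 2))            # two errors (positions 3 and 10)
print(pgz([8, 12, 10, 15], 2))            # one error: the 2x2 matrix is singular
```

The second call shows the fallback path: with a single error the ${\displaystyle 2\times 2}$ syndrome matrix is singular, so ${\displaystyle v}$ is reduced to 1 before a solution is found.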


### Factor error locator polynomial

Now that the polynomial ${\displaystyle \Lambda (x)}$ is known, its roots can be found in the form ${\displaystyle \Lambda (x)=(\alpha ^{i_{1}}x-1)(\alpha ^{i_{2}}x-1)\cdots (\alpha ^{i_{v}}x-1)}$ by brute force, for example using the Chien search algorithm. The exponents of the primitive element ${\displaystyle \alpha }$ give the positions where errors occur in the received word; hence the name 'error locator' polynomial.

The zeros of ${\displaystyle \Lambda (x)}$ are ${\displaystyle \alpha ^{-i_{1}},\ldots ,\alpha ^{-i_{v}}.}$
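A brute-force root search over GF(16) can be sketched as below. It evaluates ${\displaystyle \Lambda }$ at every ${\displaystyle \alpha ^{-i}}$ with Horner's rule; a real Chien search organizes the same evaluations incrementally, which this sketch does not attempt. The field construction (${\displaystyle x^{4}+x+1}$), the coefficient order (lowest degree first) and the name `error_positions` are assumptions made for illustration.

```python
# GF(16) as 4-bit integers, built from x^4 + x + 1 (alpha = 0b0010).
EXP = [1] * 30
LOG = [0] * 16
for k in range(1, 30):
    v = EXP[k - 1] << 1
    EXP[k] = v ^ 0x13 if v & 0x10 else v
for k in range(15):
    LOG[EXP[k]] = k

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def poly_eval(p, x):
    """Horner evaluation; p lists coefficients from lowest to highest degree."""
    y = 0
    for c in reversed(p):
        y = gf_mul(y, x) ^ c
    return y

def error_positions(lam, n=15):
    """All i in 0..n-1 such that Lambda(alpha**(-i)) = 0."""
    return [i for i in range(n) if poly_eval(lam, EXP[(n - i) % n]) == 0]

# Lambda(x) = 1 + 15 x + 13 x^2 = (1 + alpha^3 x)(1 + alpha^10 x)
print(error_positions([1, 15, 13]))
```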

### Calculate error values

Once the error locations are known, the next step is to determine the error values at those locations. The error values are then used to correct the received values at those locations to recover the original codeword.

For binary BCH codes (with all characters readable) this is trivial: flip the bits of the received word at these positions to obtain the corrected codeword. In the more general case, the error weights ${\displaystyle e_{j}}$ can be determined by solving the linear system

${\displaystyle {\begin{aligned}s_{c}&=e_{1}\alpha ^{c\,i_{1}}+e_{2}\alpha ^{c\,i_{2}}+\cdots \\s_{c+1}&=e_{1}\alpha ^{(c+1)\,i_{1}}+e_{2}\alpha ^{(c+1)\,i_{2}}+\cdots \\&{}\ \vdots \end{aligned}}}$

#### Forney algorithm

However, there is a more efficient method known as the Forney algorithm.

Let ${\displaystyle S(x)=s_{c}+s_{c+1}x+\cdots +s_{c+d-2}x^{d-2}}$ be the syndrome polynomial and let ${\displaystyle \Omega (x)=S(x)\,\Lambda (x){\pmod {x^{d-1}}}}$ be the error evaluator polynomial.[7]

Then, if the syndromes can be explained by an error word that is nonzero only at the positions ${\displaystyle i_{k},}$ the error values are

${\displaystyle e_{k}=-{\alpha ^{i_{k}}\Omega (\alpha ^{-i_{k}}) \over \alpha ^{c\cdot i_{k}}\Lambda '(\alpha ^{-i_{k}})}.}$

For narrow-sense BCH codes, c = 1, so the expression simplifies to:

${\displaystyle e_{k}=-{\Omega (\alpha ^{-i_{k}}) \over \Lambda '(\alpha ^{-i_{k}})}.}$

#### Explanation of Forney algorithm computation

It is based on Lagrange interpolation and techniques of generating functions.

${\displaystyle S(x)=\sum _{i=0}^{d-2}\sum _{j=1}^{v}e_{j}\alpha ^{(c+i)\cdot i_{j}}x^{i}=\sum _{j=1}^{v}e_{j}\alpha ^{c\,i_{j}}\sum _{i=0}^{d-2}(\alpha ^{i_{j}})^{i}x^{i}=\sum _{j=1}^{v}e_{j}\alpha ^{c\,i_{j}}{(x\alpha ^{i_{j}})^{d-1}-1 \over x\alpha ^{i_{j}}-1}.}$
${\displaystyle S(x)\Lambda (x)=S(x)\lambda _{0}\prod _{\ell =1}^{v}(\alpha ^{i_{\ell }}x-1)=\lambda _{0}\sum _{j=1}^{v}e_{j}\alpha ^{c\,i_{j}}{(x\alpha ^{i_{j}})^{d-1}-1 \over x\alpha ^{i_{j}}-1}\prod _{\ell =1}^{v}(\alpha ^{i_{\ell }}x-1).}$

Cancelling the denominator against the matching factor of the product gives the polynomial form:

${\displaystyle S(x)\Lambda (x)=\lambda _{0}\sum _{j=1}^{v}e_{j}\alpha ^{c\,i_{j}}((x\alpha ^{i_{j}})^{d-1}-1)\prod _{\ell \in \{1,\dots ,v\}\setminus \{j\}}(\alpha ^{i_{\ell }}x-1).}$

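We want to compute the unknowns ${\displaystyle e_{j},}$ and we can simplify the expression by removing the ${\displaystyle (x\alpha ^{i_{j}})^{d-1}}$ terms, which only contribute to powers of ${\displaystyle x}$ of degree at least ${\displaystyle d-1.}$ This leads to the error evaluator polynomial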

${\displaystyle \Omega (x)=S(x)\,\Lambda (x){\pmod {x^{d-1}}}.}$

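Since ${\displaystyle v\leq d-1,}$ the remaining terms have degree less than ${\displaystyle d-1}$ and are unaffected by the reduction, so we have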

${\displaystyle \Omega (x)=-\lambda _{0}\sum _{j=1}^{v}e_{j}\alpha ^{c\,i_{j}}\prod _{\ell \in \{1,\dots ,v\}\setminus \{j\}}(\alpha ^{i_{\ell }}x-1).}$

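Now consider ${\displaystyle \Omega (\alpha ^{-i_{k}}).}$ Every product in the sum except the ${\displaystyle k}$-th contains the factor ${\displaystyle (\alpha ^{i_{k}}x-1),}$ which vanishes at ${\displaystyle x=\alpha ^{-i_{k}}}$ (the Lagrange interpolation trick), so the sum degenerates to a single summand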

${\displaystyle \Omega (\alpha ^{-i_{k}})=-\lambda _{0}e_{k}\alpha ^{c\cdot i_{k}}\prod _{\ell \in \{1,\dots ,v\}\setminus \{k\}}(\alpha ^{i_{\ell }}\alpha ^{-i_{k}}-1).}$

To get ${\displaystyle e_{k}}$ we only need to get rid of the product. We could compute the product directly from the already computed roots ${\displaystyle \alpha ^{-i_{j}}}$ of ${\displaystyle \Lambda ,}$ but a simpler form is available via the formal derivative of ${\displaystyle \Lambda }$:

${\displaystyle \Lambda '(\alpha ^{-i_{k}})=\lambda _{0}\alpha ^{i_{k}}\prod _{\ell \in \{1,\dots ,v\}\setminus \{k\}}(\alpha ^{i_{\ell }}\alpha ^{-i_{k}}-1).}$

So finally

${\displaystyle e_{k}=-{\alpha ^{i_{k}}\Omega (\alpha ^{-i_{k}}) \over \alpha ^{c\cdot i_{k}}\Lambda '(\alpha ^{-i_{k}})}.}$

This formula is advantageous when one computes the formal derivative of ${\displaystyle \Lambda }$ from its ${\displaystyle \Lambda (x)=\sum _{i=0}^{v}\lambda _{i}x^{i}}$ form, yielding

${\displaystyle \Lambda '(x)=\sum _{i=1}^{v}i\cdot \lambda _{i}x^{i-1},}$

where ${\displaystyle i\cdot x}$ here denotes the repeated sum ${\displaystyle \textstyle \sum _{k=1}^{i}x}$ rather than multiplication in the field.
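The Forney formula can be exercised numerically. The sketch below assumes a code of length 15 over GF(16) (built from ${\displaystyle x^{4}+x+1}$) with ${\displaystyle d=5,}$ takes the error positions as already known, and normalizes ${\displaystyle \Lambda }$ to constant term 1; in characteristic 2 all minus signs disappear and ${\displaystyle i\cdot \lambda _{i}}$ is ${\displaystyle \lambda _{i}}$ for odd ${\displaystyle i}$ and 0 otherwise. The helper names are hypothetical.

```python
# GF(16) as 4-bit integers, built from x^4 + x + 1 (alpha = 0b0010).
EXP = [1] * 30
LOG = [0] * 16
for k in range(1, 30):
    v = EXP[k - 1] << 1
    EXP[k] = v ^ 0x13 if v & 0x10 else v
for k in range(15):
    LOG[EXP[k]] = k

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gf_div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 15]   # b must be nonzero

def poly_mul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] ^= gf_mul(a, b)
    return r

def poly_eval(p, x):
    y = 0
    for c in reversed(p):
        y = gf_mul(y, x) ^ c
    return y

def error_syndromes(errors, c, d):
    """[s_c, ..., s_{c+d-2}] of an error word given as {position: value}."""
    out = []
    for j in range(c, c + d - 1):
        s = 0
        for i, e in errors.items():
            s ^= gf_mul(e, EXP[(i * j) % 15])
        out.append(s)
    return out

def locator(positions):
    lam = [1]
    for i in positions:
        lam = poly_mul(lam, [1, EXP[i % 15]])      # times (1 + alpha^i x)
    return lam

def forney(synd, lam, positions, c=1):
    omega = poly_mul(synd, lam)[:len(synd)]        # S(x) Lambda(x) mod x^(d-1)
    dlam = [lam[i] if i % 2 else 0 for i in range(1, len(lam))]  # formal derivative
    values = []
    for i in positions:
        x_inv = EXP[(15 - i) % 15]                 # alpha^(-i)
        num = gf_mul(EXP[i % 15], poly_eval(omega, x_inv))
        den = gf_mul(EXP[(c * i) % 15], poly_eval(dlam, x_inv))
        values.append(gf_div(num, den))
    return values

errors = {3: 6, 10: 9}                             # position -> error value
synd = error_syndromes(errors, c=1, d=5)
print(forney(synd, locator([3, 10]), [3, 10], c=1))
```

Because the syndromes depend only on the error word, the example builds them directly from `errors` rather than from a full received word.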

### Decoding based on extended Euclidean algorithm

The process of finding both the polynomial Λ and the error values can be based on the extended Euclidean algorithm. Correction of unreadable characters can easily be incorporated into the algorithm as well.

Let ${\displaystyle k_{1},...,k_{k}}$ be the positions of unreadable characters. One creates a polynomial localising these positions, ${\displaystyle \Gamma (x)=\prod _{i=1}^{k}(x\alpha ^{k_{i}}-1).}$ Set the values at unreadable positions to 0 and compute the syndromes.

As we have already defined for the Forney formula let ${\displaystyle S(x)=\sum _{i=0}^{d-2}s_{c+i}x^{i}.}$

Let us run the extended Euclidean algorithm for locating the greatest common divisor of the polynomials ${\displaystyle S(x)\Gamma (x)}$ and ${\displaystyle x^{d-1}.}$ The goal is not to find the greatest common divisor itself, but a polynomial ${\displaystyle r(x)}$ of degree at most ${\displaystyle \lfloor (d+k-3)/2\rfloor }$ and polynomials ${\displaystyle a(x),b(x)}$ such that ${\displaystyle r(x)=a(x)S(x)\Gamma (x)+b(x)x^{d-1}.}$ The low degree of ${\displaystyle r(x)}$ guarantees that ${\displaystyle a(x)}$ satisfies the defining conditions for ${\displaystyle \Lambda ,}$ extended by ${\displaystyle \Gamma .}$

Defining ${\displaystyle \Xi (x)=a(x)\Gamma (x)}$ and using ${\displaystyle \Xi }$ in place of ${\displaystyle \Lambda (x)}$ in the Forney formula gives the error values.

The main advantage of the algorithm is that it simultaneously computes ${\displaystyle \Omega (x)=S(x)\Xi (x){\bmod {x^{d-1}}}=r(x),}$ which is required in the Forney formula.
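The Euclidean approach can be sketched for the simplest case with no unreadable characters (${\displaystyle k=0,}$ ${\displaystyle \Gamma (x)=1}$). The code below is a minimal illustration under several assumptions: GF(16) is built from ${\displaystyle x^{4}+x+1,}$ polynomials are coefficient lists with the lowest degree first, the number of errors does not exceed the correction capability, and the result is rescaled so that the locator has constant term 1. All function names are hypothetical.

```python
# GF(16) as 4-bit integers, built from x^4 + x + 1 (alpha = 0b0010).
EXP = [1] * 30
LOG = [0] * 16
for k in range(1, 30):
    v = EXP[k - 1] << 1
    EXP[k] = v ^ 0x13 if v & 0x10 else v
for k in range(15):
    LOG[EXP[k]] = k

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gf_inv(a):
    return EXP[15 - LOG[a]]                 # a must be nonzero

def trim(p):
    """Copy of p without zero coefficients at the high-degree end."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) ^ (q[i] if i < len(q) else 0)
                 for i in range(n)])

def poly_mul(p, q):
    if not p or not q:
        return []
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] ^= gf_mul(a, b)
    return trim(r)

def poly_divmod(a, b):
    a, b = trim(a), trim(b)                 # b must be nonzero
    quo = [0] * max(len(a) - len(b) + 1, 1)
    inv = gf_inv(b[-1])
    while len(a) >= len(b):
        shift = len(a) - len(b)
        f = gf_mul(a[-1], inv)
        quo[shift] = f
        for k, c in enumerate(b):
            a[shift + k] ^= gf_mul(f, c)    # cancels the leading term of a
        a = trim(a)
    return trim(quo), a

def euclid_decode(synd):
    """Return (Lambda, Omega) from synd = [s_c, ..., s_{c+d-2}]."""
    d = len(synd) + 1
    bound = (d - 3) // 2                    # maximum allowed degree of r(x)
    r_prev, r = [0] * (d - 1) + [1], trim(synd)    # x^(d-1) and S(x)
    a_prev, a = [], [1]                     # invariant: r = a*S + b*x^(d-1)
    while len(r) - 1 > bound:
        q, rem = poly_divmod(r_prev, r)
        r_prev, r = r, rem
        a_prev, a = a, poly_add(a_prev, poly_mul(q, a))
    scale = gf_inv(a[0])                    # normalize so that Lambda(0) = 1
    return [gf_mul(scale, c) for c in a], [gf_mul(scale, c) for c in r]

print(euclid_decode([15, 10, 11, 8]))       # syndromes of errors at 3 and 10
```

Stopping as soon as the remainder degree drops to the bound yields both polynomials at once, which is the advantage described above.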

#### Explanation of the decoding process

The goal is to find a codeword that differs from the received word as little as possible on readable positions. When expressing the received word as the sum of the nearest codeword and an error word, we are trying to find the error word with the minimal number of non-zeros on readable positions. The syndrome ${\displaystyle s_{i}}$ restricts the error word by the condition ${\displaystyle s_{i}=\sum _{j=0}^{n-1}e_{j}\alpha ^{ij}.}$ We could write these conditions separately, or we could create the polynomial ${\displaystyle S(x)=\sum _{i=0}^{d-2}s_{c+i}x^{i}}$ and compare the coefficients of the powers ${\displaystyle 0}$ to ${\displaystyle d-2}$: ${\displaystyle S(x){\textstyle {\{0,\ldots ,\,d-2\} \atop =}}E(x)=\sum _{i=0}^{d-2}\sum _{j=0}^{n-1}e_{j}\alpha ^{ij}\alpha ^{cj}x^{i}.}$

Suppose there is an unreadable letter at position ${\displaystyle k_{1}.}$ We could replace the set of syndromes ${\displaystyle \{s_{c},\ldots ,s_{c+d-2}\}}$ by the set of syndromes ${\displaystyle \{t_{c},\ldots ,t_{c+d-3}\}}$ defined by the equation ${\displaystyle t_{i}=\alpha ^{k_{1}}s_{i}-s_{i+1}.}$ Suppose that for an error word all restrictions given by the original set ${\displaystyle \{s_{c},\ldots ,s_{c+d-2}\}}$ of syndromes hold; then ${\displaystyle t_{i}=\alpha ^{k_{1}}s_{i}-s_{i+1}=\alpha ^{k_{1}}\sum _{j=0}^{n-1}e_{j}\alpha ^{ij}-\sum _{j=0}^{n-1}e_{j}\alpha ^{j}\alpha ^{ij}=\sum _{j=0}^{n-1}e_{j}(\alpha ^{k_{1}}-\alpha ^{j})\alpha ^{ij}.}$ The new set of syndromes restricts the error vector ${\displaystyle f_{j}=e_{j}(\alpha ^{k_{1}}-\alpha ^{j})}$ in the same way that the original set of syndromes restricted the error vector ${\displaystyle e_{j}.}$ Note that, except for the coordinate ${\displaystyle k_{1},}$ where ${\displaystyle f_{k_{1}}=0,}$ an ${\displaystyle f_{j}}$ is zero if and only if ${\displaystyle e_{j}}$ is zero. For the purpose of locating error positions we could change the set of syndromes in a similar way to reflect all unreadable characters. This shortens the set of syndromes by ${\displaystyle k.}$
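The effect of this syndrome substitution can be checked numerically. The sketch below assumes GF(16) built from ${\displaystyle x^{4}+x+1}$ (so subtraction is XOR), an arbitrary illustrative error word with one unreadable position ${\displaystyle k_{1}=4}$ and one ordinary error at position 9, and the hypothetical helper name `syn`.

```python
# GF(16) as 4-bit integers, built from x^4 + x + 1 (alpha = 0b0010).
EXP = [1] * 30
LOG = [0] * 16
for k in range(1, 30):
    v = EXP[k - 1] << 1
    EXP[k] = v ^ 0x13 if v & 0x10 else v
for k in range(15):
    LOG[EXP[k]] = k

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def syn(errors, i):
    """s_i = sum_j e_j alpha^(i*j) for an error word {position: value}."""
    s = 0
    for j, e in errors.items():
        s ^= gf_mul(e, EXP[(i * j) % 15])
    return s

k1 = 4                                   # unreadable position
errors = {k1: 7, 9: 11}                  # unknown value at k1 plus one real error

# Modified syndromes t_i = alpha^k1 * s_i - s_{i+1}
t = [gf_mul(EXP[k1], syn(errors, i)) ^ syn(errors, i + 1) for i in range(1, 4)]

# Prediction: only f_9 = e_9 (alpha^k1 - alpha^9) contributes, as f_9 alpha^(9 i)
f9 = gf_mul(errors[9], EXP[k1] ^ EXP[9])
predicted = [gf_mul(f9, EXP[(9 * i) % 15]) for i in range(1, 4)]

print(t == predicted)                    # the erased position has dropped out
```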

In polynomial formulation, the replacement of syndromes set