Just Stuff

Just another WordPress.com weblog

Archive for the ‘Maths and Stuff’ Category

Peer Rejected


It seems that some problems with abuse of peer review have been highlighted by the stem cell community [1].

My own experience of this was with a paper that I wrote for SIAM Review [2] essentially criticising something they had published (on the speed of the domino effect). The criticism was that [3] presented a mathematical model of the domino effect whose results were counter-intuitive, and that no attempt had been made to compare the model predictions with any experimental data, despite there being a number of data sets in the literature. In fact the principal references in [3] did contain experimental data (slightly flawed in my opinion, but that is not relevant here) which, if used, would have invalidated the model.

The problem with my paper, in the eyes of the editors (and possibly the referees, if the same referees were used as for the original paper), might be that I was critical not only of the original paper but also of the reviewers for not picking up that no attempt at validation had been made. This might also be construed as a criticism of the whole refereeing and editorial policy at SIAM Review.

It took SIAM Review a year to decide that they would not publish this paper. IIRC the reason for rejection was that they did not accept papers that were critical of material they had already published. I am not able to recover the email to quote the exact wording, as my work email system seems to have deleted all emails older than a few months. Why it took them a year to tell me that they were not going to publish I don't know; they could have determined that it did not satisfy their policy in about five minutes.

Peer review is not some panacea that guarantees that what is published is correct, methodologically sound and/or important. It is influenced by the foibles of the editorial staff, by editorial policy and by any number of other things. The need of researchers to publish guarantees that a large proportion of published papers are trivial (assuming they have no other defects).

It is also the case that a paper or report that has not been published in a peer reviewed journal is not necessarily worthless. There has been a piece in the news recently about the IPCC using a report from mountaineers in the Alps about the state of glaciers. This is not a controlled experimental result, nor has it been published in a peer reviewed journal (IIRC it was published in some mountaineering journal), but it is expert testimony and has some value in the absence of other data.

Of course I might have broken an academic taboo in sending [2] to SIAM Review without first informing the authors of [3], but in this case I consider that an irrelevance: they had published, and there is nothing they could say that would retrospectively introduce validation into their paper; also the criticism was not just of what they wrote but of the refereeing process itself. I'm afraid that I am not a very courteous person, and I am not going to tell you privately that I think what you have written is a heap of crap before saying so publicly, which is why the internet is such a wonderful (virtual) place: I can write what I want without asking anyone's permission.

References

1. “Journal stem cell work ‘blocked'”, BBC News, http://news.bbc.co.uk/1/hi/sci/tech/8490291.stm

2. "Validation of a Model of the Domino Effect?", Ron Larham, arXiv:0803.2898, http://arxiv.org/abs/0803.2898

3. "Domino Waves", C. J. Efthimiou, M. D. Johnson, SIAM Review 49 (2007) 111

Written by CaptainBlack

February 3, 2010 at 08:30

Posted in Maths and Stuff


Unfitness Flows Between the 1991 EHCS and the 1996 EHCS


Background: the EHCS

In the mid-1990s I worked on the English House Condition Survey (EHCS) at the Building Research Establishment for the UK Department of the Environment (DoE) and its successors. This work was mainly on physical disrepair, but also covered the analysis of errors in the fitness assessment element of the surveys. The reports on these surveys can be found at the National Archives [1]. Additional material can be found in two conference papers that I published on this work [2][3].

The English House Condition Survey is a periodic survey of the condition of the English housing stock; in the 1990s this comprised a large scale survey conducted every five years. The survey covered a number of aspects of the housing stock, but those that I was particularly concerned with were disrepair and unfitness. The Physical Survey involved sending surveyors to examine the condition of a large-ish sample (ca. 20,000) of dwellings selected from the English stock. From their observations a number of estimated repair costs were computed. In addition the surveyors, who were specially trained to assess against the then current government standards, reported on the fitness for human habitation of each dwelling.

Flows From Fitness to Unfitness and From Unfitness to Fitness

The thing that raises my ire is illustrated by figure 1, reproduced from the 1996 EHCS report. This shows that about 30% of the dwellings unfit in 1991 had been rendered fit by 1996, and about the same number of dwellings fit in 1991 had become unfit by 1996. Nothing too outrageous in that, we might think. Until, that is, we realise that these flows between fitness states are derived from those dwellings common to both surveys, and that we know something about the reliability of the fitness assessments from the call-back surveys. I don't have the relevant data for 1996 to hand, and if I did its analysis would be difficult because of the over-sampling of properties in poor condition in the 1996 call-back survey. The data for 1991 is not complicated in this manner and is available in [2]; the relevant table is reproduced here as table 1. What we see in table 1 is that when a dwelling is found unfit on a survey there is something like a 30% chance that it will be found fit if resurveyed, even if no work has been done on the dwelling. We may assume that a similar number of dwellings initially found fit will be found unfit on resurvey. Such apparent flows, which are just the signature of the fitness assessment uncertainties, account for all the flows between unfit and fit states between the surveys.

Figure 1: Change in unfitness 1991-96 from [1]

Now I could extract the apparent flows in figure 1 (or from the tables in the 1996 EHCS report) and use the data from table 1 to estimate the underlying cross flows (a toy sketch of such a correction is given after table 1). Given that the apparent cross flows are comparable to what we would expect just from surveyor variability, the results would be very noisy if not meaningless.

Table 1: Numbers of fit and unfit dwellings on initial survey and call-back for the 1991 EHCS from [2]
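
As an illustration of the kind of correction involved, here is a minimal sketch (Python/NumPy, with purely illustrative numbers rather than the actual EHCS figures; the 30% entry is the misclassification rate suggested by table 1, everything else is made up) of how an observed fit/unfit flow table could be adjusted for surveyor misclassification.

--------------------------------------------------------------
Python sketch: correcting apparent flows for misclassification
import numpy as np

# Misclassification matrix: entry [i, j] = P(observed state i | true state j),
# state order (fit, unfit). The 0.30 is the roughly 30% chance, suggested by
# table 1, that a genuinely unfit dwelling is recorded as fit on a repeat
# visit; the other entries are purely illustrative placeholders.
M = np.array([[0.98, 0.30],
              [0.02, 0.70]])

# Observed 1991 -> 1996 flow table (rows = 1991 state, cols = 1996 state).
# These counts are invented for the sketch, not taken from the EHCS reports.
F_obs = np.array([[18500.0, 450.0],
                  [420.0, 630.0]])

# If the same assessment variability operates at both surveys then
# F_obs ~ M @ F_true @ M.T, so the underlying flows can be estimated by
# applying the inverse of M on both sides.
M_inv = np.linalg.inv(M)
F_true_est = M_inv @ F_obs @ M_inv.T
print(F_true_est)
--------------------------------------------------------------

With apparent cross flows no bigger than the misclassification noise, the estimated off-diagonal (cross-flow) entries from a calculation like this can easily come out tiny or even negative, which is exactly the "very noisy if not meaningless" problem referred to above.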

Why do I take exception to figure 1? It is because the problem of flows between fitness states was known at the time that figure was produced, and its originator was so informed. The author in question ignored the warnings, presumably because what was presented told a better story. But it is a story pretending to be scientific and based on good statistics. It suggests to the present author that all derived statistics emanating from government departments are to be distrusted.

Other Difficulties with the Customer

This is in some ways typical of the problems I had while working on the EHCS. It always seemed a struggle to get acceptance of what the data was saying against the preconceptions of the DoE side of the EHCS team.

I first encountered this when the initial results for the 1991 EHCS indicated that disrepair was down on that of the 1986 EHCS. It proved necessary to marshal other building statistics, showing from the building industry activity statistics that the amount of work being done on housing repairs and maintenance had risen by a considerable margin over the period. I also spent an age with survey forms from 86 and 91 spread over my office floor while I counted the reported disrepair quantities from the raw survey forms of the two surveys.

Another problem that I had was the censorship of [2]. Our DoE clients were not happy about my prediction that the 1996 EHCS would see a further small reduction in disrepair compared with the 1991 survey. The reason for this, if I recall correctly, was that they had already told the Treasury that public sector disrepair was up and did not want the Treasury to see something that they might interpret as contradicting that. This was despite my prediction being for the stock as a whole: the majority of the English housing stock is under owner occupation, and the overall disrepair in the stock is dominated by this sector. It is very difficult to tell from the 1996 EHCS report what the relative movements in disrepair within particular tenure types were, because the disrepair model had changed. The report states that after calibration the disrepair in 96 appeared to be between 0 and 5% down on 1991, but this kind of comparison within tenure types does not seem to have been possible. Attempting it now from the tables in the report seems to indicate that the disrepair in the local authority stock was probably up, but this may have been due to flows of the best LA stock out of local authority ownership into other tenures, rather than a deterioration in the condition of specific dwellings (at least for dwellings not in high-rise blocks).

In general I always felt that the DoE part of the EHCS team were not very happy with a lot of the work that I did on the error characteristics of the surveys, nor for that matter with any stock modelling that involved differential equations. If I have given, then or now, the impression that I had or have a poor opinion of the EHCS, that is wrong. I have always thought highly of it, but you can use the data more effectively if you understand its idiosyncrasies.

More to Do on This Post:

Some more needs to be said about the fitness flows, including the actual numbers for the flows in figure 1 and something about the peculiarities of the data in table 1. Also something needs to be said about the problems in [2] (which are not in the copy on the RICS website but are in my own PDF, which can be found here, in red near the end).

References

[1] 1991 and 1996 EHCS reports http://www.ndad.nationalarchives.gov.uk/CRDA/51/DD/1/quickref.html

[2] R. Larham, "Error Characteristics of the English House Condition Survey", COBRA 95 conference of the RICS, 1996.

[3] R. Larham, "Modelling the Evolution of the Condition of a Housing Stock", COBRA 96 conference of the RICS, 1997.

Written by CaptainBlack

January 18, 2010 at 15:04

Posted in Maths and Stuff


Draft Paper for Citizen Scientist


Auto-Correlation of the Rectified Signal re DEMON Spectral Analysis

(Slightly modified version to be published in Citizen Scientist in Feb or March 2010)

It has been pointed out (reference 1) that the speed of a domino wave can be estimated from peaks in the auto-correlation of the rectified recording of the noise of the domino array collapse (note that here I am ignoring filtering etc., which is useful with the auto-correlation method as it is with the DEMON method, references 2 and 3). There can be no argument with this in principle, since for a periodic phenomenon the auto-correlation and the spectrum of the rectified signal both encode the frequency and/or period of the phenomenon. The auto-correlation will contain a series of peaks at intervals corresponding to the period of the phenomenon (the time delay of the first such peak will be at a value corresponding to the period), while the DEMON spectrum will have a peak at a frequency corresponding to the frequency (1 over the period) of the phenomenon.

Time Domain Auto-Correlation Complexity

One option for computing the auto-correlation of a data vector is to do the calculations in the time domain. The pseudo-code below shows how this could be done.

--------------------------------------------------------------
Pseudo-Code for Time Domain Auto-Correlation:
// Assume the (real) input data is in data(1..Npoints) and the
// output is in ss(1..Ndelays), with the first point of ss
// corresponding to zero delay.

ss=zeros(1,Ndelays)
for delay=0 to Ndelays-1
  // stop at Npoints-delay so that idx+delay never runs past Npoints
  for idx=1 to Npoints-delay
    ss(delay+1)=ss(delay+1)+data(idx)*data(idx+delay);
  endfor
endfor
--------------------------------------------------------------

This allows us to compute the number of floating point add and multiply operations needed for this time domain auto-correlation (one multiply and one add for each product term, with each successive delay using one product fewer), which is:

N_{op}=2 N_{points}~N_{delays}-N_{delays}(N_{delays}-1)\ \ \ \ \ ... (Eq 1)

Now if as is often the case N_{points} \gg N_{delays}, then we can write:

N_{op} \approx 2 N_{points}~N_{delays}\ \ \ \ \ ... (Eq 1a)

Note: loop control variable increments and array index arithmetic operations are not counted as these are generally much faster than the (usually) double precision floating point operations.

Frequency Domain Auto-Correlation Complexity

The main alternative to the above time domain method for computing the auto-correlation is to use the convolution theorem to convert the auto-correlation calculation from what is a discrete convolution in the time domain to a pointwise product in the frequency domain, and then to transform back into the time domain. To do the conversion from time to frequency domain we will use the Fast Fourier Transform (FFT) (see reference 4 for some general background on the FFT), which is an efficient implementation of the Discrete Fourier Transform (DFT). The DFT is the Fourier transform for a discrete periodic signal, so to avoid wrap-around effects we need to pad the data with sufficient zeros that the wrap-around does not affect the part of the result we are interested in (that is, the FD auto-correlation we compute using the DFT will be a circular auto-correlation). This means that if we want N_{delays} points of the auto-correlation we will need to pad our data vector with N_{delays} zeros, giving an effective data vector length of M=N_{points}+N_{delays}.

Now the equivalent of the convolution theorem for correlations tells us that:

(data \star data)_i =\sum_j \overline{data}_j ~data_{i+j}=\left[\mathfrak{F}^{-1}\left( \mathfrak{F}(data).\overline{\mathfrak{F}(data)}\right)\right]_i

Where \star denotes the (circular) correlation operator (not to be confused with the convolution operator *), \mathfrak{F}^{-1} the inverse DFT operator, \mathfrak{F} the DFT operator, "." the pointwise product operator, i is the correlation delay, which starts at 0, and index arithmetic is modulo the length of data. For convenience it is best to regard the indices as following the C/C++ convention, starting at 0 and running up to M-1, where M is the length of data, rather than from 1 to M.

That this works can be demonstrated on zero-mean random complex simulated data (2048 points, 0-19 sample delays); the absolute value of the difference between the time and frequency domain results is shown in figure 1 below, where we can see that the differences are of the order of 10^{-14} (all calculations done in double precision), which is pretty good agreement.


Figure 1: Plot of the Absolute Value of the Time and Frequency Domain Auto-Correlations for a Random Complex Data Set
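
For anyone wanting to reproduce the check behind figure 1, here is a minimal NumPy sketch (my own illustration, not the Euler code used to produce the figures in this post) that computes the first few points of the auto-correlation both ways and prints the largest difference; the zero padding by N_{delays} samples is what stops the circular wrap-around contaminating the delays we keep.

--------------------------------------------------------------
Python sketch: time domain vs frequency domain auto-correlation
import numpy as np

def autocorr_td(data, n_delays):
    """Time-domain auto-correlation for delays 0 .. n_delays-1
    (conjugated first factor, matching the correlation formula above)."""
    n = len(data)
    ss = np.zeros(n_delays, dtype=complex)
    for delay in range(n_delays):
        ss[delay] = np.sum(np.conj(data[:n - delay]) * data[delay:])
    return ss

def autocorr_fd(data, n_delays):
    """Frequency-domain auto-correlation via a zero-padded FFT."""
    m = len(data) + n_delays              # pad so wrap-around misses the kept delays
    spec = np.fft.fft(data, n=m)
    ss = np.fft.ifft(spec * np.conj(spec))
    return ss[:n_delays]

rng = np.random.default_rng(0)
data = rng.standard_normal(2048) + 1j * rng.standard_normal(2048)  # zero-mean complex noise
td = autocorr_td(data, 20)
fd = autocorr_fd(data, 20)
print(np.max(np.abs(td - fd)))            # tiny: double-precision rounding only
--------------------------------------------------------------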

The operations count for the frequency domain auto-correlation computation is a bit more difficult to estimate than for the time domain equivalent. This is mainly because we will be using the FFT to compute the DFTs, and the efficiency of an FFT varies with its length. Other things being equal, and with a good FFT code, the number of operations to compute an FFT is:

N_{op} \approx k N \log_2 (N)

where N is the length of the transform, and k is a constant of the order of 10 for a complex transform and half that for a real transform.

So if we wish to compute N_{delays} points of the auto-correlation of a signal of length N_{points} we will need two FFTs of length N_{delays}+N_{points} (one forward transform and one inverse transform), plus an additional N_{delays}+N_{points} complex multiplies, each of which is equivalent to three or four real multiplies and a few real adds; roughly half of this work can be eliminated if the original signal is real, because of the symmetries in the FFT of a real signal.

So, to a handwaving approximation for the coefficient and with a conservative allowance for the number of multiplies, the floating point operations count for a frequency domain auto-correlation is:

N_{op} \approx (N_{delays}+N_{points})\left[10\log_2(N_{delays}+N_{points})+4\right] \ ... (Eq 2)

Comparison Of Operations Counts for Time and Frequency Domain Auto-Correlation

We may plot the operations counts corresponding to equations 1 and 2; doing so gives the plot in figure 2, where we see a crossover beyond which the frequency domain auto-correlation becomes more efficient than the time domain algorithm, somewhere in the region of 50 delays (this is for a data sample of 30000 points, a value typical of a recording of a domino wave of duration slightly over 1 second). For a domino wave we know that we are looking for impact frequencies down to something between 15 and 20 Hz; at a sampling frequency of the order of 20 kHz we will therefore need to look at auto-correlation delays of up to something over 1000 sample intervals, in which case the frequency domain auto-correlation is significantly more efficient.


Figure 2: Plot Comparing the Floating Point Operations Counts for Time and Frequency Domain Auto-Correlations as a Function of the Number of Delays Required for a Signal with 30000 Sample Points
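
Equations 1 and 2 are easy to evaluate directly, which gives a quick way of estimating where the crossover sits for a given record length. The sketch below (Python, using the same handwaving constants as Eq 2, so the exact crossover it reports should not be taken too literally) does this for a 30000-point signal.

--------------------------------------------------------------
Python sketch: operations-count crossover
import numpy as np

def ops_time_domain(n_points, n_delays):
    # Eq 1: one multiply and one add per product term
    return 2 * n_points * n_delays - n_delays * (n_delays - 1)

def ops_freq_domain(n_points, n_delays):
    # Eq 2: two padded FFTs plus the pointwise products
    m = n_points + n_delays
    return m * (10 * np.log2(m) + 4)

n_points = 30000
delays = np.arange(1, 201)
td = ops_time_domain(n_points, delays)
fd = ops_freq_domain(n_points, delays)
crossover = delays[np.argmax(td > fd)]    # first delay count where the TD method costs more
print(crossover)                          # the precise value depends on the constants assumed
--------------------------------------------------------------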

That this is not just a theoretical result can be seen from the timings for the two auto-correlations on random complex signals shown in figure 3. Here the crossover is at something around 20 sample-interval delays. The exact point of the crossover depends on the relative efficiency of the implementations of the two auto-correlation algorithms, and here the time domain auto-correlation is relatively inefficient (it is coded in Euler, and so is partially interpreted code, while the FD algorithm is almost entirely library calls to compiled code).

Figure 3: Plot of Timings for TD and FD Auto-Correlations for a 30000 (complex) Random Signal.

Also of note in figure 3 is the variation in the timing of the FD auto-correlation; this is because a mixed-radix FFT is in use and its efficiency varies somewhat with the length of the FFT.

Complexity of DEMON analysis

In the previous section we saw that if we wish to analyse a signal using the auto-correlation of the rectified signal we should use a frequency domain implementation of the auto-correlation. This is more or less equivalent in operations count to two FFTs of length equal to the signal length plus the number of points of the auto-correlation we require. However, when analysing a recording of a domino collapse wave we can just as well look for the frequency of the peak in the DEMON spectrum, the spectrum of the absolute value of the signal (references 2 and 3). This involves just one FFT, so it requires no more than half the operations that looking at the auto-correlation requires.
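
As a rough illustration of how little processing the DEMON route needs, here is a minimal sketch (NumPy again, my own illustration rather than the exact processing used in references 2 and 3): rectify the signal, remove the mean, take one real FFT and look for the low-frequency peak. The synthetic clicks below merely stand in for a real recording such as a .wav file of a collapse.

--------------------------------------------------------------
Python sketch: DEMON spectrum of a rectified signal
import numpy as np

def demon_spectrum(signal, sample_rate):
    """Spectrum of the rectified signal (one real FFT)."""
    rectified = np.abs(signal)
    rectified = rectified - rectified.mean()     # suppress the large DC component
    spec = np.abs(np.fft.rfft(rectified))
    freqs = np.fft.rfftfreq(len(rectified), d=1.0 / sample_rate)
    return freqs, spec

# Fake a train of short 3 kHz clicks repeating at about 24.5 Hz.
sample_rate = 22050
t = np.arange(0, 1.5, 1.0 / sample_rate)
signal = np.sin(2 * np.pi * 3000 * t) * (np.sin(2 * np.pi * 24.5 * t) > 0.95)

freqs, spec = demon_spectrum(signal, sample_rate)
band = freqs < 100.0                              # impact rates of interest are well below 100 Hz
peak = freqs[band][np.argmax(spec[band][1:]) + 1] # skip the zero-frequency bin
print(peak)                                       # close to the 24.5 Hz click rate
--------------------------------------------------------------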

This difference is largely academic if we are doing a one-off analysis, but it becomes significant when a large number of analyses are to be performed (and even more so in an embedded system analysing signals in real time).

References

1. Hannon J., “Domino Speed”, Citizen Scientist, May 2009,  http://www.sas.org/tcs/weeklyIssues_2009/2009-05-01/project1/index.html

2. Larham R., Measuring the Speed of the Domino Effect Using the Windows Sound Recorder, Part 1, Citizen Scientist, Nov 2007,  http://www.sas.org/tcs/weeklyIssues_2007/2007-11-02/project2/index.html

3. Larham R., Measuring the Speed of the Domino Effect Using the Windows Sound Recorder, Part 2, Citizen Scientist, Dec 2007,  http://www.sas.org/tcs/weeklyIssues_2007/2007-12-07/project2/index.html

4. Press W. H., Teukolsky S. A., et al., "Numerical Recipes in C" (or any other of the series), CUP, 2nd ed., 1992.

Written by CaptainBlack

May 9, 2009 at 13:03

Posted in DSP


More on Bounds on the Roots of Polynomials


In an earlier article/post (reference 1) I showed the Cauchy upper bound for the absolute value of the roots of a monic polynomial. The restriction to a monic polynomial was purely for convenience, but here we will require the bound for an arbitrary polynomial of degree n, which takes the form: if x is a root of the polynomial:

P(x)=a_nx^n+a_{n-1}x^{n-1}+ ... + a_0

then:

|x| \le 1+ \frac{\max_{i=0, .. , n-1} |a_i|}{|a_n| }

Now I will extend this to give a lower bound for the absolute value of the non-zero roots of P(x). Without any real loss of generality we can assume that both a_n and a_0 are non-zero (if the lowest power of x appearing in P(x) is x^{\rho} with \rho \ne 0, we divide through by x^{\rho} to get a polynomial of the required form whose non-zero roots are the same as those of the polynomial we started with).

Let B(a_0, a_1, ..., a_n) be an upper bound on the absolute value of the roots of the polynomial P(x)=a_nx^n+a_{n-1}x^{n-1}+ ... + a_0, where a_n and a_0 are both non-zero. Now let x be a root of P(x), so:

P(x)=a_nx^n+a_{n-1}x^{n-1}+ ... + a_0=0

Now divide through by x^n and put y=1/x, so we have:

Q(y)=a_0 y^n + a_1 y^{n-1}+ ... + a_n=0

Hence:

|y| \le B(a_n, a_{n-1}, ..., a_0)

or:

|x| \ge 1/B(a_n, a_{n-1}, ..., a_0).

So, combining this lower bound with the upper bound, we have for any root x of an n-th degree polynomial with only non-zero roots (that is, with a_0 \ne 0):

\frac{1}{B(a_n, a_{n-1}, ..., a_0) } \le |x| \le B(a_0, a_1, ..., a_n)

and for the Cauchy bound we have:

B(a_0, a_1, ... , a_n) =1+\frac{\max_{i=0, .. , n-1} |a_i|}{|a_n| }

So:

\frac{|a_0|}{|a_0|+\max_{i=1,..,n}|a_i|}\le |x| \le \frac{|a_n|+\max_{i=0,..,n-1}|a_i|}{|a_n|}
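
As a quick numerical sanity check of this two-sided bound, the sketch below (my own illustration, using an arbitrarily chosen cubic) evaluates B for the polynomial and for its coefficient-reversed counterpart, and compares the resulting interval with the root moduli returned by numpy.roots.

--------------------------------------------------------------
Python sketch: checking the two-sided Cauchy bound
import numpy as np

def cauchy_upper_bound(coeffs):
    """Cauchy bound on |roots| of a_n x^n + ... + a_0, coeffs = [a_n, ..., a_0]."""
    a_n, rest = coeffs[0], coeffs[1:]
    return 1.0 + max(abs(c) for c in rest) / abs(a_n)

coeffs = [2.0, -3.0, 5.0, -7.0]        # 2x^3 - 3x^2 + 5x - 7, a_n and a_0 non-zero

upper = cauchy_upper_bound(coeffs)
lower = 1.0 / cauchy_upper_bound(coeffs[::-1])  # reversed coefficients give Q(y), y = 1/x

print(lower, sorted(np.abs(np.roots(coeffs))), upper)  # every |root| lies in [lower, upper]
--------------------------------------------------------------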

References:

1. https://captainblack.wordpress.com/2009/03/08/cauchys-upper-bound-for-the-roots-of-a-polynomial/

Written by CaptainBlack

April 27, 2009 at 07:10

Posted in Maths and Stuff


More Domino Wave Speed Stuff


Discussions in the Backscatter column of Citizen Scientist (ref 4) of my paper and experiments on measuring the speed of the domino effect (refs 1 and 2), and of an attempt to repeat the experiments also reported in CS (ref 3), seem to miss the point of using DEMON spectra.

I may be partially responsible for this, since I have in the past commented that the spikes of noise, which I presume to be the clicks of domino impacts, on closer examination appear to be tonals modulated by a narrow envelope. I now believe that this was partially a misinterpretation of the data and partially a result of over-fierce band-pass filtering on my part.

Whatever the truth of the above, we can observe directly from a plot (fig 1) of the auto-correlation of a recording of the domino wave that there is no clear signal at the expected delay (of about 0.04 s for this data file):


Fig 1: Auto-Correlation, data file d9p2.wav

Fig 2 shows the DEMON spectrum for this data file, and here we see a "sore thumb" signature at about 24.5 Hz, which would correspond to a delay of about 0.04 seconds in the auto-correlation for this data set, where there is no clear feature. (The 24.5 Hz frequency corresponds to a wave speed close to 1 m/s on a 0.04 m pitch domino array, or a normalised wave speed of ~1.4 for dominoes of height ~0.052 m and thickness ~0.008 m; the width does not affect the wave speed, but here it is ~0.025 m.)

Fig 2: DEMON Spectrum,  data file d9p2.wav

Note: the data file d9p2.wav is the recording of one of the repeat runs, not reported in refs 1 and 2, that I have done to look at the variability in the speed measurements. In this case the speed measured is within about 2% of that previously reported.
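
The speed conversion quoted above is just arithmetic, and the short sketch below reproduces it. The normalisation by \sqrt{gh} (g the acceleration due to gravity, h the domino height) is my assumption about the scaling intended here, made because it reproduces the ~1.4 figure; treat the last line as illustrative.

--------------------------------------------------------------
Python sketch: wave speed from the DEMON peak
import math

f_peak = 24.5      # Hz, DEMON peak frequency
pitch = 0.04       # m, domino spacing
height = 0.052     # m, domino height
g = 9.81           # m/s^2

speed = f_peak * pitch                    # one impact per pitch length
normalised = speed / math.sqrt(g * height)
print(speed, normalised)                  # roughly 0.98 m/s and ~1.4
--------------------------------------------------------------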

References:

1.  Larham R., Measuring the Speed of the Domino Effect Using the Windows Sound Recorder, Part 1, Citizen Scientist, Nov 2007, http://www.sas.org/tcs/weeklyIssues_2007/2007-11-02/project2/index.html

2.  Larham R., Measuring the Speed of the Domino Effect Using the Windows Sound Recorder, Part 2, Citizen Scientist, Dec 2007, http://www.sas.org/tcs/weeklyIssues_2007/2007-12-07/project2/index.html

3. Smith R., Attempted Recreation of Ron Larham’s Domino Wave Experiment, Citizen Scientist, March 2009, http://www.sas.org/tcs/weeklyIssues_2009/2009-03-06/project1/index.html

4. Hannon J., Still More About Domino Waves, Backscatter Column, Citizen Scientist, April 2009, http://www.sas.org/tcs/weeklyIssues_2009/2009-04-03/backscatter/index.html

Written by CaptainBlack

April 8, 2009 at 13:50

Posted in DSP, Maths and Stuff


Blocking Matrices


Proof that the Blocking Matrices do Indeed Block Signals from Specified Directions.

For certain signal processing tasks we wish to generate beam patterns with nulls in specified directions (or, equivalently, filters with narrow notches at specified frequencies). Typically these can be used to null out stationary interference sources, directions to multipath images of a desired source, etc.

We assume that we have a number of sensors (M), which may be array elements, subarrays or formed beams, and that we wish to null signals from directions \theta_1, ..., \theta_k, with k < M. The response to a source in direction \theta_i is {\bf{b}}(\theta_i), a column vector of complex signals with one element for each sensor; in a narrowband system these elements are simply complex numbers representing each sensor's amplitude and phase response. We let {\bf{\widehat{b}}}(\theta_i) denote the normalised versions of these vectors.

It is obvious (when pointed out anyway) that the matrix:

{\bf{A}}={\bf{I}}_{M\times M} - {\bf{B}} ({\bf{B}}^H{\bf{B}})^{^{-1}} {\bf{B}}^H

is a blocking matrix for directions \theta_1, ..., \theta_k (where {\bf{B}} is the matrix with \widehat{\bf{b}}(\theta_i) for its columns). The space of possible sensor outputs constitutes a (complex) vector space of dimension M; the subspace \text{span}[{\bf{b}}(\theta_i), i=1,\dots ,k] is the null space of {\bf{A}}, and {\bf{A}} behaves like the identity transformation on the orthogonal complement of that null space.

To see that this is a blocking matrix for \widehat{\bf{b}}(\theta_i) we need just consider:

{\bf{AB}} = \left[{\bf{I}}_{M\times M} - {\bf{B}} ({\bf{B}}^H{\bf{B}})^{^{-1}} {\bf{B}}^H\right] {\bf{B}}

so:

{\bf{AB}} = {\bf{B}} - {\bf{B}} ({\bf{B}}^H{\bf{B}})^{^{-1}} {\bf{B}}^H {\bf{B}} ={\bf{0}}

Now the columns of {\bf{AB}} are the vectors {\bf{A\widehat{b}}}(\theta_i), and so these are zero vectors as expected.

To see that {\bf{A}} behaves like the identity on the orthogonal complement, consider {\bf{x}} in the orthogonal complement of the space spanned by the columns of {\bf{B}}. Then {\bf{B}}^H{\bf{x}}={\bf{0}}, since each component of this is the dot product of a column of {\bf{B}} with {\bf{x}} and so is zero. Hence:

{\bf{Ax}} = \left[{\bf{I}}_{M\times M} - {\bf{B}} ({\bf{B}}^H{\bf{B}})^{^{-1}} {\bf{B}}^H\right] {\bf{x}}

={\bf{x}}-{\bf{B}} ({\bf{B}}^H{\bf{B}})^{^{-1}} {\bf{B}}^H {\bf{x}}

={\bf{x}}
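
The algebra above is easy to check numerically. Here is a minimal NumPy sketch (my own illustration; the half-wavelength line-array steering vectors are just a convenient choice of the {\bf{b}}(\theta_i)) that builds {\bf{B}}, forms {\bf{A}}, and confirms both properties.

--------------------------------------------------------------
Python sketch: verifying the blocking matrix properties
import numpy as np

M = 8                                    # number of sensors
d_over_lambda = 0.5                      # half-wavelength element spacing
angles = np.deg2rad([20.0, -35.0])       # directions to block

n = np.arange(M)
# Normalised steering vectors as the columns of B
B = np.exp(2j * np.pi * d_over_lambda * np.outer(n, np.sin(angles))) / np.sqrt(M)

# Blocking matrix A = I - B (B^H B)^{-1} B^H
A = np.eye(M) - B @ np.linalg.inv(B.conj().T @ B) @ B.conj().T

print(np.max(np.abs(A @ B)))             # ~1e-16: the blocked directions are nulled

# A acts as the identity on anything orthogonal to the columns of B
rng = np.random.default_rng(1)
x = rng.standard_normal(M) + 1j * rng.standard_normal(M)
x_perp = x - B @ np.linalg.inv(B.conj().T @ B) @ (B.conj().T @ x)  # project out span(B)
print(np.max(np.abs(A @ x_perp - x_perp)))   # ~1e-16: x_perp passes through unchanged
--------------------------------------------------------------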

(I hope that is all clear, WordPress LaTeX seems to have started deleting all the \ characters from the LaTeX strings again so there may still be some omissions from when I have tried to restore them)

Written by CaptainBlack

March 25, 2009 at 12:23

Posted in DSP, Maths and Stuff


Cauchy’s Bound for the Roots of a Polynomial


Preamble

This post serves a number of purposes: the first is to demonstrate the use of LaTeX in WordPress, the second to provide a convenient location for the proof of this result, which, while elementary, seems difficult to find clearly stated and proven on-line.

Background

When searching for the roots of a polynomial it is advantageous to have some idea of the part of the real line or complex plane in which they lie. Once the roots are localised in this manner numerical techniques can be employed to find them.

There are a number of ways of localising roots, particularly real roots: the rational roots theorem, which provides a list of candidates for the rational roots of a polynomial; Descartes' rule of signs, which provides bounds on the numbers of positive and negative roots; various bounds on the size of the roots, of which the Cauchy bound is an example; and Sturm sequences. All of these methods have their uses, but here I want to give the Cauchy bound.

Cauchy’s Bound on the Roots of a Polynomial

Let:

P(x)=x^n + a_{n-1}x^{n-1}+ ... + a_1 x+a_0

be a monic polynomial of degree n, then if x is a root of P(x), that is a solution of P(x)=0:

|x| \le 1+\max_{i=0,..,n-1} |a_i|

Proof

We will start by assuming that x is a solution of P(x)=0 that has absolute value greater than 1, since otherwise the bound holds trivially. Now rewrite P(x)=0 as:

x^n=-a_0-a_1x- .. -a_{n-1}x^{n-1}

Now write h=\max_{i=0,..,n-1} |a_i|. Taking absolute values and applying the triangle inequality gives:

|x|^n \le |a_0|+|a_1||x|+ ... + |a_{n-1}||x|^{n-1}

and since each |a_i| \le h this means:

|x|^n \le h [1+|x|+ .. +|x|^{n-1}]

The expression inside the square brackets is a finite geometric series, which we can sum to obtain:

|x|^n \le h \frac{|x|^n-1}{|x|-1}

Therefore as |x|>1:

|x|-1 \le h \frac{|x|^n-1}{|x^n|} = h \left( 1-\frac{1}{|x|^n} \right).

Then:

|x|-1 \le h.

and so |x| \le 1+h, which proves what was to be proven.

Example

Consider P(x)=x^2+3x+14; the Cauchy bound for its roots is 15, and the roots are (approximately) x=-1.5\pm 3.428 i, with |x|\approx 3.742 \le 15 as expected.
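
The example is also easy to check numerically; here is a minimal sketch (my own illustration) using numpy.roots.

--------------------------------------------------------------
Python sketch: checking the bound for x^2 + 3x + 14
import numpy as np

coeffs = [1.0, 3.0, 14.0]                       # monic: x^2 + 3x + 14
bound = 1.0 + max(abs(c) for c in coeffs[1:])   # Cauchy bound = 1 + max|a_i| = 15
print(np.abs(np.roots(coeffs)), bound)          # root moduli ~ 3.742, well inside the bound
--------------------------------------------------------------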

[Conclusions on LaTeX on WordPress: the results look OK, but the system is temperamental, at least twice having stripped out the backslashes from the LaTeX; also on occasion it fails to parse LaTeX code that it parsed earlier.]

Written by CaptainBlack

March 8, 2009 at 11:52

Posted in Maths and Stuff
