Biographical Information
Date of Birth: 26/12/1934
Married: Gloria Mary
Qualifications: B.A. (Melb.), Ph.D (Lond.), FAA.
Contact Information
Work Address |
Home Address |
School of Mathematical Sciences |
1 Weld Street |
Australian National University |
Yarralumla |
Canberra |
A.C.T. 2600 |
A.C.T. 0200 |
|
Australia |
|
Work URL |
Home URL |
Electronic mail address
Office phone |
Home phone |
(061)-(02)- 6125 3449 |
(02)- 6281 1648 |
Fax (CMA)
(061)-(02)- 6125 5549
Work Information
Job title
Professor Emeritus
Key responsibilities
Visiting Fellow
Department
Centre for Mathematics and its Applications (CMA),
School of Mathematical Sciences.
Membership of Societies
Australian Mathematical Society, Emeritus member American Mathematical Society,
Society for Industrial and Applied Mathematics (Fellow 2009).
Employment
1957-1959 |
Scientific Officer, Royal Australian Navy. |
1959 |
Research Student, Imperial College, London University |
1960-1962 |
Assistant Lecturer and Lecturer in Mathematics, Reading University |
1962-1963 |
Lecturer in Mathematics, Imperial College. |
1963-1965 |
Assistant Director, Computer Unit, Edinburgh University. |
1966-1976 |
Head, Computer Centre, Australian National University |
1971 |
U.K. Science Research Council Senior Visiting Fellow at University of Dundee and Visiting Professor, Computer Science Department, Stanford University. |
1976-1977 |
Director, Computer Centre, Australian National University. |
1977 |
U.K. Science Research Council Senior Visiting Fellow, DAMTP, Cambridge University. |
1977-1980 |
Head, Computing Research Group, Australian National University |
1980 |
U.K. Science Research Council Senior Visiting Fellow, Mathematics Department, Manchester University |
1980-1991 |
Professorial Fellow, Department of Statistics, Institute of Advanced Studies, Australian National University |
1992- |
Professor, Statistics Research Section (incorporated in Centre for Mathematics and its Applications March 31,1993), School of Mathematical Sciences, Australian National University |
1981-1982 |
Visiting Professor, Mathematics Department, University of Utah |
1983 |
Visiting Professor, Mathematics Department, University of Western Australia. |
1984-1985 |
Visiting Professor, Department of Mathematics and Statistics, University of New Mexico. |
1989 |
Visiting Professor, Department of Statistics, Texas A&M University. |
1990-1999. |
Convenor, Program in Advanced Computation, School of Mathematical Sciences, Australian National University |
1994 |
Visiting Professor, Departments of Computer Science and Mathematics, Texas A&M University. |
1998-1999 |
Acting Head, Centre for Mathematics and it's Applications |
2000 |
Visiting Fellow, Centre for Mathematics and it's Applications |
2002 |
Visiting Professor, I.W.R., Heidelberg |
.
Other Activities
Member, IFIP Technical Area 1 Committee, 1971 Congress.
Course Advisory Committee, School of Computing Studies, Canberra College of Advanced Education, 1969-1977.
Chairman, Assessment Panel for Numerical Analysis courses in A.C.T. Senior
Secondary Colleges, 1975-1979.
Member of Organising and Program Committees 1976, Program Committee 1978, Division of Applied Mathematics Conference, Australian Mathematical Society.
Associate Editor for Numerical Analysis, Journal of the Australian Mathematical Society, series B. 1975-2000
Visiting Consultant on Optimization Problems in CSIRO Division of Mathematics and Statistics, February - April, 1987.
Chairman, Australian Mathematical Society Program Committee for the (bicentennial) National Mathematical Sciences Congress, (1988).
Member Section 1. (Mathematical Sciences) subcommittee, Australian Academy of Science, 1984-8, 1991-4.
Member, Steering Committee, Australian Consortium for Advanced Computation 1990-
Director, CTAC93, the biennial conference of the special interest group in computational mathematics and applications of the Applied Mathematics Division of the Australian Mathematical Society held at ANU in July 1993.
Chairman of CTAC 1993-5
ANZIAM99: member Invited Speakers Selection Committee, Committee Outstanding Young Researcher Award
Director, CTAC99
Chairman CTAC 1999-2001, Committee ex officio 2001 - 2004
Member, ANZIAM Executive Committee 2000-2001
Member, Organizing Committee HPSC06 Hanoi
Member, Organizing Committee HPSC09 Hanoi
Major Invited Lectures
Applied Numerical Analysis, Dundee 1971.
Numerical Methods in Nonlinear Optimization, Dundee 1971.
Numerical Methods in Approximation Theory, Oberwolfach 1971.
Gatlinburg V, Los Alamos 1972.
Metodos Numericos Modernos, Caracas 1974.
Royal Irish Academy Conference on Numerical Analysis, Dublin 1974.
IFIP Congress, Stockholm 1974.
Computers in Education and Research, Sperry Univac Research Conference, Rome 1975.
Dundee Numerical Analysis Conference, 1977.
IMA Conference in honour of J.H. Wilkinson (one of four speakers including J.H. Wilkinson), London 1977.
Gatlinburg VII, Asilomar 1977.
Codes for Boundary Value Problems in ODE's, Houston 1978.
Liverpool and Manchester Summer School on Nonlinear Problems in Numerical Analysis, Liverpool 1980.
Workshop on Boundary Value Problems in ODE's, Vancouver 1980.
New Zealand Mathematical Society Conference, Wellington 1984.
University of New Mexico Fund Lecturer, Albuquerque 1985.
Statistical Data Analysis based on the l1-norm and Related Methods, Neuchatel 1987.
CTAC91, Adelaide,1991.
15'th International Symposium in Mathematical Programming, Ann Arbor, 1994.
TIMS XXXIII (INFORMS), Singapore 1996.
International Workshop on Optimization, Sydney, 1999.
International Conference on High Performance Scientific Computing, Hanoi 2003.
PARAOPE 2004, University of Heidelberg, 2004.
CTAC06, Townsville, 2006.
Workshop on Statistical Methods for Modelling Dynamical Systems, Montreal, 2007.
Research and Development Grants
1988 CISR grant for computing equipment ($15,000).
1989 IAS Special Initiative grant ($100,000 pa for three years).
1991 Australia United States Exchange ($5000).
1991 Fujitsu Area 4 contract (Joint Director) ($200,000 1992, $260,000 1993-1997, $160,000 1998, $50,000 1999).
1994 Joint Director ACSys (Cooperative Research Centre) AC1 project.
Research Activities
Numerical linear algebra, and eigenvalue problems
Use of wrap-around partitioning to vectorize the solution of block bidiagonal matrices
Inverse iteration for the solution of eigenvalue problems
Parallel numerical libraries
Solution of boundary value problems in ODE's
Collocation methods for the discretization of boundary value problems
Factorisation of operators, dichotomy, and the stability of computations
Cyclic reduction and its relation to representation theorems
Estimation of ODE's, variable selection, and the suitability of bases
Kalman filters, stochastic ODE's, and smoothing problems
ODE eigenvalue problems, stability, bifurcation, and inverse problems
Estimation and approximation
Fisher's method of scoring for maximising likelihoods
Convergence rates in large sample problems.
Rank regression and l1 estimation
Robust methods
Methods of variable selection
Trust region and (piecewise linear) homotopy methods
Polyhedral convex functions
Structure functional representation
Simplicial methods for minimization
Resolution of degeneracy
Polyhedral constrained optimization
Optimization
Methods for constrained and unconstrained problems
Barrier and penalty methods
Interior point methods
Discrete simulation
Event oriented model descriptions
Object oriented programming implementation
Major Research Contributions
First application of spectral methods to the problem of sound propagation in a layered fluid medium such as a deep ocean (the North Atlantic for example).
The standard method involved contour integration and the assumption of constant layer properties. The spectral approach has the advantages that asymptotic properties can be used to deduce the spectral density, ambiguities in contour selection are avoided, continuous layer dependence is easily treated, and convenient formulae are available for computing associated quantities such as group velocities. The discrete spectrum explains sound channelling, while the continuous spectrum can be interpreted by geometric optics as an outgoing spherical wave corresponding to rays that are not channelled [5], [7]. The approach can also be used in inverse problems [53], [64], [66], [149]. The problem that appears feasible is estimating the velocity depth dependence given observation of group velocities of individual modes (eigensolutions) as functions of frequency. This raises an interesting question. The parametrization of the velocity depth curve is a parametrisation of its spacial dependence, while the fitting is done in the frequency domain after this parametrization has been filtered by the differential system. Under what conditions is independence of the parameters preserved? Current investigations indicate that if p parameters are to be estimated then independent observations are required on at least p distinct modes. This condition combines with considerations of stochastic convergence which then require the number of observations on each individual mode to grow without limit [135].
Development of special methods for separable partial differential equations.
This work was stimulated by the late Professor Bickley who derived the basic formalism used. My most important contribution was a characterization of the optimal overrelaxation parameter for the successive overrelaxation iterative method as the solution of a multiparameter eigenvalue problem. An effective method for solving this problem was also given. It is believed to be the first algorithm for this class of problem. [9], [16], [17], [18], [25].
First to provide a general connection between inverse iteration and Newton's method for (linear and nonlinear) eigenvalue problems.
This formulation of a class of iterative methods based on Newton's method was one of the first applications of Wilkinson's results on the stability of inverse iteration. [7], [13], [23]. It includes within its scope just about all superlinearly convergent methods for finding or accelerating convergence to a single eigenvalue so it provides a generalisation of estimates like the Rayleigh quotient to general (even nonlinear) eigenvalue problems. The first application considered was to discretizations of second order differential equations that permitted uniformly good estimates to be found for relatively large numbers of eigenvalues, and this led to nonlinear problems. Another early application was to compute the neutral curve and curves of time amplification for studies of the hydrodynamic stability of flow over a flat plate [23]. My student Ralph Jordinson studied the related space amplification problem, a nonlinear eigenvalue problem, by the same methods in his Ph.D thesis. Application to the inverse eigenvalue problem is considered in [53]. The achievable rates of convergence (often>2) are explored in [84]. Methods suitable for large scale problems on parallel and vector computers are treated in [144], [163], [167], [168], [174], [175]. A method with rate 3.56 and using a deflation technique applicable with very general weighting matrices is given in [167]. This method has been implemented using wrap-around partitioning under the Area 4 contract with Fujitsu.
First collocation code for boundary value problems in ordinary differential equations.
I don't think that anything like this - the use of the computer to develop the discretization of the differential system - had been considered before. I was invited to do the development at A.E.R.E. Harwell. They even punched up the tapes for the Mercury computer [11]. The main modern codes (eg COLNEW) can show a direct line of development to this precursor - for example, in the choice of local basis. These codes are the principal methods now used for the solution of boundary value problems. I have been using one of the explicit original formulae (an O(h4) accurate formula for first order systems) to redo some hydrodynamic neutral curve calculations to illustrate a vectorizing block bidiagonal eigensolver using the 3.56 algorithm developed for Fujitsu [174], [175].
First treatment of superconvergence in collocation methods for ordinary differential equations.
The key question asks how to pick the collocation points? This paper was the first to establish the fundamental result that the optimal choice of collocation points are Gaussian points for quadrature with respect to suitable weight functions [22], [69]. For first order systems these are the Legendre points. This choice is central to the modern developments of collocation codes. This latter reference was the first to identify noncompact collocation schemes which would appear to provide a collocation analogue to upwinding for singular perturbation problems[148].
Multiple shooting for boundary value problems - first to identify the factors underlying the stability of this class of methods. Computations carried out demonstrated its capability.
Shooting methods apply initial value techniques by guessing an initial condition, integrating forward to check if the boundary conditions are satisfied, and adjusting the initial guess if needed. The main shortcoming of the method is instability in the initial value problem. The key ideas in multiple shooting are: construct a valid discrete analogue of the differential system by constructing a sequence of local fundamental bases for the solution (instability can be controlled because the calculation is local), and then use a stable method for solving the resulting linear system. The variant `compactification’ suggested by some other workers was explicitly and correctly rejected. I was able to solve satisfactorily all the problems then reported as difficult in the literature on shooting methods using only single precision (32 bits) on an IBM 360/50 [40]. The correct order of the condition number is estimated in [67] where connections with finite difference methods and collocation are established.
First development and analysis of the generalised Gauss-Newton method for norms other than Euclidean. First proofs of second order convergence in l1 and l¥ norms (result is not true in l2 ).
The original interest was in constructing best approximations to functions by nonlinear families with the aim of developing subroutines for evaluating special functions [20]. Linear programming can be used in formulating these problems on discrete point sets in the linear case[24]. Watson and I were the first to use it iteratively to construct nonlinear approximations [34], [35], [52], and the algorithm is often referred to under our names. Second order convergence was first demonstrated in [54]. The most complete description of the convergence properties in the l¥ norm is given in [91]. A key result is that the useful sufficient condition of strong uniqueness is not necessary. The dependence of the rate of convergence of these generalised Newton schemes is analysed as a function of the norm used in [83], [91], [95].
First proof of "almost" global convergence of the Levenberg algorithm for nonlinear least squares.
This result [75] partly explains the observed good behaviour of this algorithm forms of which provide the most widely used algorithms for nonlinear least squares problems. The rest of the story comes later [115], [135]. The published code became the basis of Allan Miller’s excellent package.
Trajectory analysis of barrier function methods in nonlinear programming. One of my lines of approach looks at developing barrier methods with higher rates of convergence as the barrier parameter r tends to 0. The other has been to understand the behaviour of standard methods in difficult cases.
Methods of convergence acceleration for barrier methods are discussed in [48], [59]. It is shown that barrier functions giving arbitrarily fast convergence in r are possible. For the case of the log barrier function the exact rate of convergence is shown to be order r 1/ 2 in the case of non-strict complementarity [87]. In [90] it is shown how to restore order r convergence in these cases. The procedure is applied to extrapolation methods of convergence acceleration.
First proof of the stability of Godonov's method of the stabilized march for boundary value problems with separated boundary conditions.
Godonov suggested the stabilized march in 1943 – a most surprising algorithm to come out of wartime Russia. It became popular in computer codes in the '60's and '70's. This first stability analysis used a backward argument to show that if multiple shooting was stable then so was the stabilized march [88], [89]. The method is important because it reduces significantly the number of integrations of the differential system required in comparison with those required by multiple shooting in the separated boundary conditions case.
Newton's method for singular systems of nonlinear equations. The work of Andreas Griewank and I includes the first study of irregular singular points which are basically singular points which do not have one dimensional analogues. They correspond to the generic type of singularity when Newton’s method is applied to optimization problems. We were able to establish connections between the iterates of Newton's method, strange attractors, and chaos in the irregular case.
This is highly original work. We were able to capture the essential geometry of the process, and in an important special case to reduce the asymptotic behaviour of the Newton iterates to the study of a simple recurrence relation. The conclusion was that methods using only first derivatives would potentially be unsuitable at irregular singular points. Higher order methods are needed if there is to be a progression from theory to practice. These methods need to imbed the nonlinear system into an overdetermined but consistent system that is built using higher order tensors constructed using higher order derivatives of the original system. I suspect this explains in large part why Griewank was motivated to go on and develop automatic differentiation systems! [94], [99].
Homotopy methods with piecewise linear and continuous trajectories can provide complete solutions in a number of problems of practical importance. A number of such applications have been made:
Solution of the Huber M-estimation problem for the complete range of the scale parameter c classifying residuals as large or small [105] (the method depends only on the product s c so strategies for variance estimation are also available);
Computation of all regression quantiles as a function of the quantile parameter [136];
Implementation of Tibshirani's Lasso in which a sum of squares is minimized subject to an l1 constraint with the constraint parameter controlling the number of variables entering the least squares problem. Here the constraint bound is the homotopy parameter [182]. Recent work at Stanford predicts that the entire range of the homotopy parameter can be covered by few more steps than the number of variables. Thus this approach is extremely competitive. Other applications have led to the suggested use of other objective functions such as the l_1 norm. Turlach and I have shown that a homotopy approach is feasible. But that it has the interesting property that continuation must be applied alternately to the constraint bound and the Lagrange multiplier. The resulting algorithm, in this form, does not share the computational efficiency noted for the least squares objective and has much of the character of a small step l_1 minimization algorithm. Thus it becomes of interest to seek a large step mechanism for the l_1 objective homotopy analogous to the use of a line search in l_1 optimisation.
Piecewise linear homotopy can be applied also to Vapnik's support vector regression. Here the best strategy is probably to use an application of an active set algorithm to start the computation.
The interesting question in the implementation of the homotopy is what happens at the join of adjacent linear pieces, with special considerations being necessary in each of the above cases to show continuity. The algebra is not unlike that of the pivoting steps in linear and quadratic programming. This approach extends to optimization of a positive definite quadratic form subject to a polyhedral constraint where now the parameter in the Lagrangian formulation becomes the homotopy parameter. The Lasso has suggested a fresh look also at trust region and Levenberg algorithms for nonlinear least squares [179].
Resolution of degeneracy in simplicial methods for linear programming and related problems - first complete characterization of requirements and development of general algorithms.
Degeneracy now seems an odd problem to have bedeviled much of the theoretical and practical work on simplicial methods for linear programming! But the understanding necessary is quite recent, and is built on my resolution of the problem [103]. My idea was to connect the ability to make progress in the simplicial algorithm with the construction of a descent direction for a reduced problem. Such a direction exists if and only if the current point is not optimal and so provides a complete characterisation. It is an important step to identify the direction as a direction of recession for the reduced problem (there is a sense in which this observation goes back to Wolfe and this is noted in [120]), and such directions are independent of the constraint right hand sides. Constructive methods are suggested immediately and have minimal cost implications. The basic result generalises to the problem of resolving degeneracies in minimizing polyhedral functions [103], [137], [139], [141].
Development of concise representation of subgradients of polyhedral functions based on the new concept of structure functionals introduced in [103]. Application to the general formulation of simplicial algorithms for minimizing polyhedral functions. Development of the first practical exact method for the rank regression problem.
These new approaches apply convex analysis to the problem of characterizing extreme points of the epigraph of a polyhedral convex function. They have made possible practical algorithms for important problems previously considered intractable [103], [106], [187]. Previous attempts had been based on the idea of transforming the problem into a linear programming problem, but this can have exponentially large numbers of constraints (or worse) in the obvious approach. Sets of structure functionals are used to define vertices of the epigraph in a parsimonious fashion, and simplicial methods generate descending paths along edges linking adjacent vertices. Rank and signed rank regression, estimation methods of remarkably high statistical efficiency, in which discontinuities in derivative are linked to ties in a ranking set, provide the most complicated examples so far attempted. Implementation is considered in [187]. This reference considers aspects of nonconvex problems (for example, rank regression with redescending scores to increase robustness), and shows that simplicial methods can be surprisingly computationally robust in this more complex situation. An extension to develop active set methods for functions which are the sum of a smooth convex function and a polyhedral function has been made with Alistair Watson. This work is an extension of the descent algorithm for the Lasso.
Proof of the strong stability associated with the correct identification of the dichotomy of the Riccati factorization of boundary value problems.
Factorizing boundary value problems into reduced systems with appropriate stability properties has been a recurrent theme (for example, in the work on so called invariant imbedding methods). The use of the Riccati equation which is a key aspect of this work was known to be problematical (there is an analogy with the elimination solution of linear equations without interchanges). The key result here leads to a computational strategy that has analogies with the use of interchanges to stabilize the linear equations case [107], [122], [123].
Introduction of analysis based on the law of large numbers into numerical analysis of least squares problems, Gauss-Newton and Levenberg algorithms, variable projection methods for separable nonlinear least squares problems, Fisher scoring for maximum likelihood and quasilikelihood problems.
The numerical analysis literature is significantly deficient in its description of the limitations of these methods precisely in the context in which they are most important. Small residuals are not important in data analysis practice! What is important is the right kind of cancellation, and the right tool for describing this phenomenon is the law of large numbers [113], [115], [135], [158], [202]. The results give order of magnitude better results than previously known in the case that the assumed model is correct and the amount of data is adequate. For example, the linear least squares problem has asymptotically a rounding error dependence on the condition number (not the condition number squared) as the number of observations grows without bound. Under the same conditions the convergence rate of the nonlinear algorithms is asymptotically second order.
The classic method of Prony has been adapted for the estimation of the exponents in problems of exponential and trigonometric fitting. Prony's method is inconsistent in its original setting, but a modification works for trigonometric estimation. Efficient methods are possible by various devices.
Rescuing Prony's method in the case of trigonometric estimation comes down to picking a uniquely determined scaling condition, and this turns the algorithm into an eigenvalue problem (Pisarenko’s method). A related nonlinear eigenvalue problem provides the maximum likelihood estimator. Computation of this problem in the exponential fitting case can be simplified without loss of eficiency provided a correct scaling condition is implemented. This algorithm shows promise also in trigonometric estimation problems when not only can the scaling conditions be relaxed, but also it proves reliable in cases where maximum likelihood methods have problems in locating the consistent estimator in a cluster of many competing maxima [70], [134], [140], [153]. The pattern that emerges of the trigonometric problem being easier than the exponential one is noteworthy. The methods apply whenever the basis functions satisfy a difference equation with constant coefficients (these become the problem unknowns). However, the methods appear significantly more effective in the case of exponential and trigonometric functions than in other cases (rational functions being one example). The reason for this is still unclear.
Showed that wrap-around partitioning in combination with orthogonal factorization or partial pivoting elimination can be applied recursively to vectorize the solution of block bidiagonal systems of linear equations even in the case that the partitioning constants are not exact factors of the block order. This permits the effective vectorization of stable methods for the solution of structured, narrow band systems of linear equations with the small extra cost of embedding the system in a block bidiagonal system. Partitioning constants can be chosen to minimize memory access contention.
Cyclic reduction is the classic technique for applying recursive halving in the solution of tridiagonal linear systems. Up til now this has been the most widely used technique for vectorizing these systems. Restrictive conditions (no pivoting or orthogonal reduction is allowed so usually the matrix is required to be positive definite and system order a power of two) are imposed. Markus Hegland introduced wrap-around partitioning - a more general reordering transformation - for a one-step reduction of tridiagonal systems to improve vectorization while using stabilized factorization techniques. I looked at the corresponding transformation for block bidiagonal systems and discovered the restrictions just melted away. In this context it contains cyclic reduction as a special case. We believe the method will become the standard method for vectorizing general, regularly structured, narrow band linear systems [172]. These can readily be transformed into block bidiagonal form. The method is already in the standard library on Fujitsu supercomputers, and routines to automatically translate band matrices into block bidiagonal form will be added this year.
Cyclic reduction (a special case of wrap-around partitioning) has been applied to vectorize the numerical solution of boundary value problems. I have been able to connect this with a whole new class of representations of the solutions of the system of differential equations. These are derived from problems of twice the order, but include stable members for which the dichotomy is known a priori.
This is very exciting! Why should cyclic reduction (a purely algebraic device) applied to discretized ordinary differential equations connect with a whole class of representations of solutions corresponding to boundary value problems of twice the order of the original system of differential equations? Twice the order may not sound necessarily positive, but the process has the important advantages that the resulting class includes stable problems with known dichotomy structure, and, of course, with solutions that can be computed as a by-product of the cyclic reduction procedure [163].
Estimation of Ordinary Differential Equations. The basic problem is to estimate the parametric dependence in a system of ordinary differential equations from (noisy) observations made on system trajectories. Two main classes of approach are available. The "embedding method" embeds the differential system in a boundary formulation in which boundary matrices are chosen a priori and the parameters and boundary forcing term then chosen to minimize an objective function which expresses the fit to the observations. The second, or "simultaneous method", introduces a discretization of the differential system as a system of explicit constraints on the objective function. Typically variants of sequential quadratic programming algorithms provide the solution methods.
The first problem with the embedding method is that of choosing boundary conditions that provide a stably posed boundary value problem. A scheme for choosing essentially optimum conditions automatically has been formulated [196,204]. The embedding method is almost a standard maximum likelihood estimation problem. The only difference is that the likelihood values can only be computed approximately by integrating the differential equations. However, the error incurred here is small compared with that arising from stochastic sources and is readily easily accounted for. This permits consistency of the estimates to be demonstrated [205]. Gauss-Newton is the preferred method for solution, and fast convergence can be demonstrated on large sample problems. The embedding method has the advantage that it can make use of standard differential equation and nonlinear least squares software without major modification. It has a computational cost in that the differential system must be solved for every function value required. The simultaneous method avoids the somewhat arbitrary choices of boundary data, also avoids repeated differential equation solution, but is otherwise an apparently a more complex mathematical problem. However, equivalence of local optima with those of the embedding method is shown readily enough [204]. An iteration suggested by Georg Bock, which is analogous to Gauss-Newton in omitting calculation of second derivatives of the differential equation forcing functions, has been shown to have a similar large sample convergence rate to Gauss-Newton but under more restrictive conditions on the distribution governing the measurement errors. The equivalence results have been extended by deducing the necessary conditions on the embedding method from the Kuhn-Tucker conditions satisfied by the simultaneous method. This enables the consistency results for the estimation problem to be put in a more pleasing form.
Research Exposition
Methods for unconstrained optimization problems [29]: Kowalik invited me to join him in authoring a text on unconstrained optimization problems. It was clearly a good time to bring together the burgeoning developments in this area (the famous Fletcher Powell paper was the most cited mathematics reference according to the Library at the Rutherford Laboratory). At the time I had developing interests in least squares problems but had no other special qualifications. In the end I finished up writing the text and contributing substantially to other areas. I was pleased with my chapter on direct search methods and referees agreed. The book was remarkably successful.
Lecture notes on optimization [59]: One consequence of the unconstrained optimization book was an invitation from George Forsythe to give a graduate course on optimization at Stanford. The optimization area had moved on, and I had developed an interest in applications to penalty and barrier methods. Thus these notes provided an opportunity to bring up to date the material in the earlier book. It was also positioned much closer to the frontiers of current research. I was able to develop rate of convergence estimates for barrier and penalty algorithms, to address the question of the relation between the structure of barrier methods and their rate of convergence, to base the treatment of conjugate direction algorithms for unconstrained optimization on the product form updating methods, and, with Michael Saunders, make a detailed study of the use of smart updating (surgery) in the implementation of the product form methods [58]. Students of the course who have gone on to distinguished careers in the area include Margaret Wright and Michael Saunders. Forsythe requested the notes be published as a technical report. It would have been very suitable for publication on the web.
Finite Algorithms in Optimization and Data Analysis [103]: This book went through an important phase of its evolution as a graduate course at the University of Utah. The original ideas had been to develop a treatment of algorithms for minimizing polyhedral functions (linear programming is the best known of these problems) based on modern methods of convex analysis, and to exploit systematically the idea that tableau data structures could be used in implementing generic versions of stabilized algorithms such as Bartels-Golub and projected gradient methods. However, the challenge to develop a general formalism for treating the necessary conditions in these optimization problems, motivated in particular by the difficult rank based estimation problems, led me to the concept of structure functionals and the associated compact forms for the subdifferential which is where the convex analysis enters as a unifying technique. A coherent treatment of degeneracy in these polyhedral minimization problems did not exist and was clearly desirable, and one of the most important aspects of the book is that I was able to do this by way of a necessary and sufficient condition that leads to effective algorithms for the class of polyhedral problems. Finiteness is interpreted as the existence of rational algorithms. These include data analysis problems in least squares, quadratic programming, and robust estimation, and problems of this kind are also discussed. A highlight is the inclusion of work from David Clark's thesis to deal with rational methods for the Huber M-estimator (see also [105]), and an account of results obtained with Alistair Watson on a generalisation of the `errors in variables' model (see also [102]).
Simplicial Algorithms for Minimizing Polyhedral Functions [187]: Since the development of the `Finite Algorithms' book there has been significant developments in the area. Most notably, Karen George was able to make a full implementation of the rank regression algorithm before she fell ill. This book aims as an important component of its coverage to provide a reporting of her contributions as part of the task of bringing the previous one up to date. This put the onus on me to provide a working rank regression program which turned out to commit me to several months of intensive work. I made use of object oriented programming techniques which I believe helped significantly - this is argued in detail in the book. Also, I had the advantage of joint work with Alistair Watson on line search methods for these problems [159] which highlights the utility of a modified secant algorithm, and I had completed implementations of degeneracy resolving codes for linear programming [141]. The work on structure functional descriptions of these problems has been revised to take account of developments (for example, [106]), and Rob Womersley's fundamental work on censored l1 estimation as an example of a nonconvex problem put in a more general setting. A different kind of extension applies the structure functional framework to a smooth function subject to a polyhedral function constraint to develop the analog of active set methods. There is a corresponding development for the associated Lagrangian function.
Computational Methods in Least Squares and Maximum Likelihood: These notes have been used for a lecture course at the University of Heidelberg. They cover a number of the research areas of current interest indicated above, especially the application of scoring to maximum likelihood problems which arise in the estimation of differential equations and related problems of variable selection. They are quite regularly updated and/or corrected.
Graduate Students
W.McLewin |
Lecturer in Mathematics, Manchester University (ret) |
R.Jordinson |
Senior Lecturer in Mathematics, Edinburgh University (ret.) |
G.A.Watson |
Professor of Mathematics , University of Dundee |
D.M.Ryan |
Professor and Deputy Dean of Engineering, University of Auckland |
L.S.Jennings |
Associate Professor , University of Western Australia |
F. De Hoog |
Chief Scientist, CSIRO |
D.H.Anderson |
C.A.S.A. |
G.X.Woodford |
Consultant |
D.C.C.Bover |
Mission Critical Systems |
Krisorn Jittorntrum |
Director of Computer Services, Cheng Mai University |
Andreas Griewank |
Professor of Mathematics, Humboldt University Berlin |
D.I.Clark |
Senior Lecturer, University of Canberra |
Linping Sun |
Professor of Mathematics, University of Nanjing |
G.K.Smyth |
Senior Research Scientist, WEHI |
Tania Prvan |
Senior Lecturer in Statistics, Macquarie University |
Karen George |
Lecturer, University of Canberra (ret) |
J.H.Taylor |
Research Scientist, Department of Defence |
Sergey Bakin |
Suncorp Metway |
Zengfeng Li |
Department of Health |