Biographical Information
Date of Birth: 26/12/1934
Married: Gloria Mary
Qualifications: B.A. (Melb.), Ph.D (Lond.), FAA.
Contact Information
Work Address |
Home Address |
School of Mathematical Sciences |
1 Weld Street |
Australian National University |
Yarralumla |
Canberra |
A.C.T. 2600 |
A.C.T. 0200 |
|
Australia |
|
Work URL |
Home URL |
| http://wwwmaths.anu.edu.au/~mike/ | http://mngo.members.veritel.com.au/public/ |
Electronic mail address
| Mike.Osborne@maths.anu.edu.au |
Office phone |
Home phone |
(061)-(02)- 6125 3449 |
(02)- 6281 1648 |
Fax (CMA)
(061)-(02)- 6125 5549
Work Information
Job title
Professor Emeritus
Key responsibilities
Visiting Fellow
Department
Centre for Mathematics and its Applications (CMA),
School of Mathematical Sciences.
Membership of Societies
Australian Mathematical Society, Emeritus member American Mathematical Society,
Society for Industrial and Applied Mathematics.
Employment
1957-1959 |
Scientific Officer, Royal Australian Navy. |
1959 |
Research Student, Imperial College, London University |
1960-1962 |
Assistant Lecturer and Lecturer in Mathematics, Reading University |
1962-1963 |
Lecturer in Mathematics, Imperial College. |
1963-1965 |
Assistant Director, Computer Unit, Edinburgh University. |
1966-1976 |
Head, Computer Centre, Australian National University |
1971 |
U.K. Science Research Council Senior Visiting Fellow at University of Dundee and Visiting Professor, Computer Science Department, Stanford University. |
1976-1977 |
Director, Computer Centre, Australian National University. |
1977 |
U.K. Science Research Council Senior Visiting Fellow, DAMTP, Cambridge University. |
1977-1980 |
Head, Computing Research Group, Australian National University |
1980 |
U.K. Science Research Council Senior Visiting Fellow, Mathematics Department, Manchester University |
1980-1991 |
Professorial Fellow, Department of Statistics, Institute of Advanced Studies, Australian National University |
1992- |
Professor, Statistics Research Section (incorporated in Centre for Mathematics and its Applications March 31,1993), School of Mathematical Sciences, Australian National University |
1981-1982 |
Visiting Professor, Mathematics Department, University of Utah |
1983 |
Visiting Professor, Mathematics Department, University of Western Australia. |
1984-1985 |
Visiting Professor, Department of Mathematics and Statistics, University of New Mexico. |
1989 |
Visiting Professor, Department of Statistics, Texas A&M University. |
1990-1999. |
Convenor, Program in Advanced Computation, School of Mathematical Sciences, Australian National University |
1994 |
Visiting Professor, Departments of Computer Science and Mathematics, Texas A&M University. |
1998-1999 |
Acting Head, Centre for Mathematics and it's Applications |
| 2000 | Visiting Fellow, Centre for Mathematics and it's Applications |
| 2002 | Visiting Professor, I.W.R., Heidelberg |
.
Other
ActivitiesMember, IFIP Technical Area 1 Committee, 1971 Congress.
Course Advisory Committee, School of Computing Studies, Canberra College of Advanced Education, 1969-1977.
Chairman, Assessment Panel for Numerical Analysis courses in A.C.T. Senior
Secondary Colleges, 1975-1979.
Member of Organising and Program Committees 1976, Program Committee 1978, Division of Applied Mathematics Conference, Australian Mathematical Society.
Associate Editor for Numerical Analysis, Journal of the Australian Mathematical Society, series B. 1975-2000
Visiting Consultant on Optimization Problems in CSIRO Division of Mathematics and Statistics, February - April, 1987.
Chairman, Australian Mathematical Society Program Committee for the (bicentennial) National Mathematical Sciences Congress, (1988).
Member Section 1. (Mathematical Sciences) subcommittee, Australian Academy of Science, 1984-8, 1991-4.
Member, Steering Committee, Australian Consortium for Advanced Computation 1990-
Director, CTAC93, the biennial conference of the special interest group in computational mathematics and applications of the Applied Mathematics Division of the Australian Mathematical Society held at ANU in July 1993.
Chairman of CTAC 1993-5
ANZIAM99: member Invited Speakers Selection Committee, Committee Outstanding Young Researcher Award
Director, CTAC99
Chairman CTAC 1999-2001, Committee ex officio 2001 - 2004
Member, ANZIAM Executive Committee 2000-2001
Member, Organizing Committee HPSC06 Hanoi
Major Invited Lectures
Applied Numerical Analysis, Dundee 1971.
Numerical Methods in Nonlinear Optimization, Dundee 1971.
Numerical Methods in Approximation Theory, Oberwolfach 1971.
Gatlinburg V, Los Alamos 1972.
Metodos Numericos Modernos, Caracas 1974.
Royal Irish Academy Conference on Numerical Analysis, Dublin 1974.
IFIP Congress, Stockholm 1974.
Computers in Education and Research, Sperry Univac Research Conference, Rome 1975.
Dundee Numerical Analysis Conference, 1977.
IMA Conference in honour of J.H. Wilkinson (one of four speakers including J.H. Wilkinson), London 1977.
Gatlinburg VII, Asilomar 1977.
Codes for Boundary Value Problems in ODE's, Houston 1978.
Liverpool and Manchester Summer School on Nonlinear Problems in Numerical Analysis, Liverpool 1980.
Workshop on Boundary Value Problems in ODE's, Vancouver 1980.
New Zealand Mathematical Society Conference, Wellington 1984.
University of New Mexico Fund Lecturer, Albuquerque 1985.
Statistical Data Analysis based on the l1-norm and Related Methods, Neuchatel 1987.
CTAC91, Adelaide,1991.
15'th International Symposium in Mathematical Programming, Ann Arbor, 1994.
TIMS XXXIII (INFORMS), Singapore 1996.
International Workshop on Optimization, Sydney, 1999.
International Conference on High Performance Scientific Computing, Hanoi 2003.
PARAOPE 2004, University of Heidelberg, 2004
CTAC06, Townsville, 2006.
Research and Development Grants
1988 CISR grant for computing equipment ($15,000).
1989 IAS Special Initiative grant ($100,000 pa for three years).
1991 Australia United States Exchange ($5000).
1991 Fujitsu Area 4 contract (Joint Director) ($200,000 1992, $260,000 1993-1997, $160,000 1998, $50,000 1999).
1994 Joint Director ACSys (Cooperative Research Centre) AC1 project.
Research Activities
Numerical linear algebra, and eigenvalue problems
Solution of boundary value problems in ODE's
Estimation and approximation
Polyhedral convex functions
Optimization
Discrete simulation
Major Research Contributions
The standard method involved contour integration and the assumption of constant layer properties. The spectral approach has the advantages that asymptotic properties can be used to deduce the spectral density, ambiguities in contour selection are avoided, continuous layer dependence is easily treated, and convenient formulae are available for computing associated quantities such as group velocities. The discrete spectrum explains sound channelling, while the continuous spectrum can be interpreted by geometric optics as an outgoing spherical wave corresponding to rays that are not channelled [5], [7]. The approach can also be used in inverse problems [53], [64], [66], [149]. The problem that appears feasible is estimating the velocity depth dependence given observation of group velocities of individual modes (eigensolutions) as functions of frequency. This raises an interesting question. The parametrization of the velocity depth curve is a parametrisation of its spacial dependence, while the fitting is done in the frequency domain after this parametrization has been filtered by the differential system. Under what conditions is independence of the parameters preserved? Current investigations indicate that if p parameters are to be estimated then independent observations are required on at least p distinct modes. This condition combines with considerations of stochastic convergence which then require the number of observations on each individual mode to grow without limit [135].
This work was stimulated by the late Professor Bickley who derived the basic formalism used. My most important contribution was a characterization of the optimal overrelaxation parameter for the successive overrelaxation iterative method as the solution of a multiparameter eigenvalue problem. An effective method for solving this problem was also given. It is believed to be the first algorithm for this class of problem. [9], [16], [17], [18], [25].
This formulation of a class of iterative methods based on Newton's method was one of the first applications of Wilkinson's results on the stability of inverse iteration. [7], [13], [23]. It includes within its scope just about all superlinearly convergent methods for finding or accelerating convergence to a single eigenvalue so it provides a generalisation of estimates like the Rayleigh quotient to general (even nonlinear) eigenvalue problems. The first application considered was to discretizations of second order differential equations that permitted uniformly good estimates to be found for relatively large numbers of eigenvalues, and this led to nonlinear problems. Another early application was to compute the neutral curve and curves of time amplification for studies of the hydrodynamic stability of flow over a flat plate [23]. My student Ralph Jordinson studied the related space amplification problem, a nonlinear eigenvalue problem, by the same methods in his Ph.D thesis. Application to the inverse eigenvalue problem is considered in [53]. The achievable rates of convergence (often>2) are explored in [84]. Methods suitable for large scale problems on parallel and vector computers are treated in [144], [163], [167], [168], [174], [175]. A method with rate 3.56 and using a deflation technique applicable with very general weighting matrices is given in [167]. This method has been implemented using wrap-around partitioning under the Area 4 contract with Fujitsu.
I don't think that anything like this - the use of the computer to develop the discretization of the differential system - had been considered before. I was invited to do the development at A.E.R.E. Harwell. They even punched up the tapes for the Mercury computer [11]. The main modern codes (eg COLNEW) can show a direct line of development to this precursor - for example, in the choice of local basis. These codes are the principal methods now used for the solution of boundary value problems. I have been using one of the explicit original formulae (an O(h4) accurate formula for first order systems) to redo some hydrodynamic neutral curve calculations to illustrate a vectorizing block bidiagonal eigensolver using the 3.56 algorithm developed for Fujitsu [174], [175].
The key question asks how to pick the collocation points? This paper was the first to establish the fundamental result that the optimal choice of collocation points are Gaussian points for quadrature with respect to suitable weight functions [22], [69]. For first order systems these are the Legendre points. This choice is central to the modern developments of collocation codes. This latter reference was the first to identify noncompact collocation schemes which would appear to provide a collocation analogue to upwinding for singular perturbation problems[148].
Shooting methods apply initial value techniques by guessing an initial condition, integrating forward to check if the boundary conditions are satisfied, and adjusting the initial guess if needed. The main shortcoming of the method is instability in the initial value problem. The key ideas in multiple shooting are: construct a valid discrete analogue of the differential system by constructing a sequence of local fundamental bases for the solution (instability can be controlled because the calculation is local), and then use a stable method for solving the resulting linear system. The variant `compactification suggested by some other workers was explicitly and correctly rejected. I was able to solve satisfactorily all the problems then reported as difficult in the literature on shooting methods using only single precision (32 bits) on an IBM 360/50 [40]. The correct order of the condition number is estimated in [67] where connections with finite difference methods and collocation are established.
The original interest was in constructing best approximations to functions by nonlinear families with the aim of developing subroutines for evaluating special functions [20]. Linear programming can be used in formulating these problems on discrete point sets in the linear case[24]. Watson and I were the first to use it iteratively to construct nonlinear approximations [34], [35], [52], and the algorithm is often referred to under our names. Second order convergence was first demonstrated in [54]. The most complete description of the convergence properties in the l
¥ norm is given in [91]. A key result is that the useful sufficient condition of strong uniqueness is not necessary. The dependence of the rate of convergence of these generalised Newton schemes is analysed as a function of the norm used in [83], [91], [95].This result [75] partly explains the observed good behaviour of this algorithm forms of which provide the most widely used algorithms for nonlinear least squares problems. The rest of the story comes later [115], [135]. The published code became the basis of Allan Millers excellent package.
Methods of convergence acceleration for barrier methods are discussed in [48], [59]. It is shown that barrier functions giving arbitrarily fast convergence in r are possible. For the case of the log barrier function the exact rate of convergence is shown to be order r 1/ 2 in the case of non-strict complementarity [87]. In [90] it is shown how to restore order r convergence in these cases. The procedure is applied to extrapolation methods of convergence acceleration.
Godonov suggested the stabilized march in 1943 a most surprising algorithm to come out of wartime Russia. It became popular in computer codes in the '60's and '70's. This first stability analysis used a backward argument to show that if multiple shooting was stable then so was the stabilized march [88], [89]. The method is important because it reduces significantly the number of integrations of the differential system required in comparison with those required by multiple shooting in the separated boundary conditions case.
This is highly original work. We were able to capture the essential geometry of the process, and in an important special case to reduce the asymptotic behaviour of the Newton iterates to the study of a simple recurrence relation. The conclusion was that methods using only first derivatives would potentially be unsuitable at irregular singular points. Higher order methods are needed if there is to be a progression from theory to practice. These methods need to imbed the nonlinear system into an overdetermined but consistent system that is built using higher order tensors constructed using higher order derivatives of the original system. I suspect this explains in large part why Griewank was motivated to go on and develop automatic differentiation systems! [94], [99].
The interesting question in the implementation of the homotopy is what happens at the join of adjacent linear pieces, with special considerations being necessary in each of the above cases to show continuity. The algebra is not unlike that of the pivoting steps in linear and quadratic programming. This approach extends to optimization of a positive definite quadratic form subject to a polyhedral constraint where now the parameter in the Lagrangian formulation becomes the homotopy parameter. The Lasso has suggested a fresh look also at trust region and Levenberg algorithms for nonlinear least squares [179].
Degeneracy now seems an odd problem to have bedeviled much of the theoretical and practical work on simplicial methods for linear programming! But the understanding necessary is quite recent, and is built on my resolution of the problem [103]. My idea was to connect the ability to make progress in the simplicial algorithm with the construction of a descent direction for a reduced problem. Such a direction exists if and only if the current point is not optimal and so provides a complete characterisation. It is an important step to identify the direction as a direction of recession for the reduced problem (there is a sense in which this observation goes back to Wolfe and this is noted in [120]), and such directions are independent of the constraint right hand sides. Constructive methods are suggested immediately and have minimal cost implications. The basic result generalises to the problem of resolving degeneracies in minimizing polyhedral functions [103], [137], [139], [141].
These new approaches apply convex analysis to the problem of characterizing extreme points of the epigraph of a polyhedral convex function. They have made possible practical algorithms for important problems previously considered intractable [103], [106], [180]. Previous attempts had been based on the idea of transforming the problem into a linear programming problem, but this can have exponentially large numbers of constraints (or worse) in the obvious approach. Sets of structure functionals are used to define vertices of the epigraph in a parsimonious fashion, and simplicial methods generate descending paths along edges linking adjacent vertices. Rank and signed rank regression, estimation methods of remarkably high statistical efficiency, in which discontinuities in derivative are linked to ties in a ranking set, provide the most complicated examples so far attempted. Implementation is considered in [180]. This reference considers aspects of nonconvex problems (for example, rank regression with redescending scores to increase robustness), and shows that simplicial methods can be surprisingly computationally robust in this more complex situation. An extension to develop active set methods for functions which are the sum of a smooth convex function and a polyhedral function has been made with Alistair Watson. This work is an extension of the descent algorithm for the Lasso.
Factorizing boundary value problems into reduced systems with appropriate stability properties has been a recurrent theme (for example, in the work on so called invariant imbedding methods). The use of the Riccati equation which is a key aspect of this work was known to be problematical (there is an analogy with the elimination solution of linear equations without interchanges). The key result here leads to a computational strategy that has analogies with the use of interchanges to stabilize the linear equations case [107], [122], [123].
The numerical analysis literature is significantly deficient in its description of the limitations of these methods precisely in the context in which they are most important. Small residuals are not important in data analysis practice! What is important is the right kind of cancellation, and the right tool for describing this phenomenon is the law of large numbers [113], [115], [134], [157]. The results give order of magnitude better results than previously known in the case that the assumed model is correct and the amount of data is adequate. For example, the linear least squares problem has asymptotically a rounding error dependence on the condition number (not the condition number squared) as the number of observations grows without bound. Under the same conditions the convergence rate of the nonlinear algorithms is asymptotically second order.
Rescuing Prony's method in the case of trigonometric estimation comes down to picking a uniquely determined scaling condition, and this turns the algorithm into an eigenvalue problem (Pisarenkos method). A related nonlinear eigenvalue problem provides the maximum likelihood estimator. Computation of this problem in the exponential fitting case can be simplified without loss of eficiency provided a correct scaling condition is implemented. This algorithm shows promise also in trigonometric estimation problems when not only can the scaling conditions be relaxed, but also it proves reliable in cases where maximum likelihood methods have problems in locating the consistent estimator in a cluster of many competing maxima [70], [133], [139], [153]. The pattern that emerges of the trigonometric problem being easier than the exponential one is noteworthy. The methods apply whenever the basis functions satisfy a difference equation with constant coefficients (these become the problem unknowns). However, the methods appear significantly more effective in the case of exponential and trigonometric functions than in other cases (rational functions being one example). The reason for this is still unclear.
Cyclic reduction is the classic technique for applying recursive halving in the solution of tridiagonal linear systems. Up til now this has been the most widely used technique for vectorizing these systems. Restrictive conditions (no pivoting or orthogonal reduction is allowed so usually the matrix is required to be positive definite and system order a power of two) are imposed. Markus Hegland introduced wrap-around partitioning - a more general reordering transformation - for a one-step reduction of tridiagonal systems to improve vectorization while using stabilized factorization techniques. I looked at the corresponding transformation for block bidiagonal systems and discovered the restrictions just melted away. In this context it contains cyclic reduction as a special case. We believe the method will become the standard method for vectorizing general, regularly structured, narrow band linear systems [177]. These can readily be transformed into block bidiagonal form. The method is already in the standard library on Fujitsu supercomputers, and routines to automatically translate band matrices into block bidiagonal form will be added this year.
This is very exciting! Why should cyclic reduction (a purely algebraic device) applied to discretised ordinary differential equations connect with a whole class of representations of solutions corresponding to boundary value problems of twice the order of the original system of differential equations? Twice the order may not sound necessarily positive, but the process has the important advantages that the resulting class includes stable problems with known dichotomy structure, and, of course, with solutions that can be computed as a byproduct of the cyclic reduction procedure [164]. The application which is the subject of current research is to the estimation of systems of ordinary differential equations. In this application it is usual to imbed the problem in a boundary value problem format. Then the parameters of the boundary conditions are adjusted in the fitting process at the same time as the original problem parameters. The new approach avoids the explicit imbedding, and the extra degrees of freedom in the problem are taken up by Lagrange multipliers. It is interesting that the multipliers are statistically orthogonal to the problem parameters. This property is not shared by the imposed boundary conditions in the explicit imbedding technique [154].
Research Exposition
Methods for unconstrained optimization problems
[29]: Kowalik invited me to join him in authoring a text on unconstrained optimization problems. It was clearly a good time to bring together the burgeoning developments in this area (the famous Fletcher Powell paper was the most cited mathematics reference according to the Library at the Rutherford Laboratory). At the time I had developing interests in least squares problems but had no other special qualifications. In the end I finished up writing the text and contributing substantially to other areas. I was pleased with my chapter on direct search methods and referees agreed. The book was remarkably successful.Lecture notes on optimization [59]: One consequence of the unconstrained optimization book was an invitation from George Forsythe to give a graduate course on optimization at Stanford. The optimization area had moved on, and I had developed an interest in applications to penalty and barrier methods. Thus these notes provided an opportunity to bring up to date the material in the earlier book. It was also positioned much closer to the frontiers of current research. I was able to develop rate of convergence estimates for barrier and penalty algorithms, to address the question of the relation between the structure of barrier methods and their rate of convergence, to base the treatment of conjugate direction algorithms for unconstrained optimization on the product form updating methods, and, with Michael Saunders, make a detailed study of the use of smart updating (surgery) in the implementation of the product form methods [58]. Students of the course who have gone on to distinguished careers in the area include Margaret Wright and Michael Saunders. Forsythe requested the notes be published as a technical report. It would have been very suitable for publication on the web.
Finite Algorithms in Optimization and Data Analysis [103]: This book went through an important phase of its evolution as a graduate course at the University of Utah. The original ideas had been to develop a treatment of algorithms for minimizing polyhedral functions (linear programming is the best known of these problems) based on modern methods of convex analysis, and to exploit systematically the idea that tableau data structures could be used in implementing generic versions of stabilized algorithms such as Bartels-Golub and projected gradient methods. However, the challenge to develop a general formalism for treating the necessary conditions in these optimization problems, motivated in particular by the difficult rank based estimation problems, led me to the concept of structure functionals and the associated compact forms for the subdifferential which is where the convex analysis enters as a unifying technique. A coherent treatment of degeneracy in these polyhedral minimization problems did not exist and was clearly desirable, and one of the most important aspects of the book is that I was able to do this by way of a necessary and sufficient condition that leads to effective algorithms for the class of polyhedral problems. Finiteness is interpreted as the existence of rational algorithms. These include data analysis problems in least squares, quadratic programming, and robust estimation, and problems of this kind are also discussed. A highlight is the inclusion of work from David Clark's thesis to deal with rational methods for the Huber M-estimator (see also [105]), and an account of results obtained with Alistair Watson on a generalisation of the `errors in variables' model (see also [102]).
Simplicial Algorithms for Minimizing Polyhedral Functions [180]: Since the development of the `Finite Algorithms' book there has been significant developments in the area. Most notably, Karen George was able to make a full implementation of the rank regression algorithm before she fell ill. This book aims as an important component of its coverage to provide a reporting of her contributions as part of the task of bringing the previous one up to date. This put the onus on me to provide a working rank regression program which turned out to commit me to several months of intensive work. I made use of object oriented programming techniques which I believe helped significantly - this is argued in detail in the book. Also, I had the advantage of joint work with Alistair Watson on line search methods for these problems [160] which highlights the utility of a modified secant algorithm, and I had completed implementations of degeneracy resolving codes for linear programming [141]. The work on structure functional descriptions of these problems has been revised to take account of developments (for example, [106]), and Rob Womersley's fundamental work on censored l1 estimation as an example of a nonconvex problem put in a more general setting. A different kind of extension applies the structure functional framework to a smooth function subject to a polyhedral function constraint to develop the analog of active set methods. There is a corresponding development for the associated Lagrangian function.
Computational Methods in Least Squares and Maximum Likelihood: These notes have been used for a lecture course at the University of Heidelberg. They cover a number of the research areas of current interest indicated above, especially the application of scoring to maximum likelihood problems which arise in the estimation of differential equations and related problems of variable selection. They are quite regularly updated.
Graduate Students
W.McLewin |
Lecturer in Mathematics, Manchester University (ret) |
R.Jordinson |
Senior Lecturer in Mathematics, Edinburgh University (ret.) |
G.A.Watson |
Professor of Mathematics , University of Dundee |
D.M.Ryan |
Professor and Deputy Dean of Engineering, University of Auckland |
L.S.Jennings |
Associate Professor , University of Western Australia |
F. De Hoog |
Chief Scientist, CSIRO |
D.H.Anderson |
C.A.S.A. |
G.X.Woodford |
Consultant |
D.C.C.Bover |
Mission Critical Systems |
Krisorn Jittorntrum |
Director of Computer Services, Cheng Mai University |
Andreas Griewank |
Professor of Mathematics, Humboldt University Berlin |
D.I.Clark |
Senior Lecturer, University of Canberra |
Linping Sun |
Professor of Mathematics, University of Nanjing |
G.K.Smyth |
Senior Research Scientist, WEHI |
Tania Prvan |
Senior Lecturer in Statistics, Macquarie University |
Karen George |
Lecturer, University of Canberra (ret) |
J.H.Taylor |
Research Scientist, Department of Defence |
Sergey Bakin |
Suncorp Metway |
Zengfeng Li |
Department of Health |