![]() |
Mathematical Sciences Institute (MSI)
Events - Special Year 2005
|
|
2005 Special Year on Computational MathematicsWorkshop on High-dimensional ApproximationThe talks were held in the Manning Clark Centre, building #26a. On Tuesday the talks were in JD101 of the John Dedman Building building #27.
Some of the presentations can be found in the list of abstracts.Monday, 14th of FebruaryManning Clark Centre, Theatre T5
Tuesday, 15th of FebruaryJohn Dedman Building, JD101
Wednesday, 16th of FebruaryManning
Clark Centre, Theatre T5
Thursday, 17th of
February
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Chair | Garry Newsam | |
| 9:00 | Michael
Collins Australian National University |
Constructing molecular potential energy surfaces by interpolation |
| 9:45 | Ralf
Wildenhues Universität Bonn |
Separable approximation for the Schrödinger equation |
| 10:30 | Morning Tea | |
| 11:15 | Jan
Hamaekers Universität Bonn |
Fast Fourier Transformation with hierarchical bases on Sparse Grids for the Schrödinger equation |
| 12:00 | Lunch break | |
| Chair | Markus Hegland | |
| 14:00 | Tianhai
Tian University of Queensland |
Discrete stochastic simulation and its applications |
| 14:45 | Andrej
Nitsche
ETH Zürich |
Tensor Product Approximation Spaces & Elliptic PDEs |
| 15:30 | Afternoon Tea | |
| 16:00 | Michael
Griebel Universität Bonn |
Sparse Grids |
| 19:00 | MSI colloquium dinner | |
Olivier Chapelle,
MPI Tübingen
Feature selection for Support Vector Machines
Presentation (PDF)
Michael Collins,
Australian National University
Constructing Molecular Potential Energy Surfaces by Interpolation
Presentation (PDF)
At some molecular configurations two (or more) electronic states can become energetically degenerate. The motion of the atoms is then coupled with that of the electrons. We must then construct, by interpolation, surfaces for all the elements of a potential energy matrix which describes the atomic motion on all the surfaces and the forces which produce a change in electronic state. The problem is complicated by symmetry considerations and an unknown phase.
Jochen Garcke,
Australian National University
Semi-supervised Learning with
Sparse Grids
Presentation (PDF)
Recently several algorithms were proposed how to use unlabeled data in
the
learning phase of classification methods to improve the generalization
properties. They exploit the geometry of the marginal distribution and
learn the manifold formed by the data. One assumes here, that nearby
data
points have the same class, also known as the cluster assumption, which
states that the decision boundary should not cross high density
regions.
The learning task is then formulated as a minimization problem in a
Reproducing Kernel Hilbert Space and generalisations of
Support Vector Machines are used to obtain the classifier.
In this paper, we apply the sparse grid combination method to the regularization problem. The problem is discretized and solved on a sequence of conventional grids with uniform mesh sizes in each coordinate direction. Thus the classifier is build on sparse grid points and not on data points. For standard classification problems it was shown that this approach scales only linear with the number of data to be classified. We will present first experimental results and discuss the complexity of this approach for the semi-supervised learning case.
Thomas Gerstner,
Universität Bonn
Valuation of performance-dependent options
Part I: Framework and computation using sparse grid quadrature
Performance-dependent options are financial derivatives whose payoff depends on the performance of one asset in comparison to other assets. The martingale approach yields a fair price of such options as a multidimensional integral whose dimension is the number of assets under consideration. Usually, the integrand is discontinuous which makes accurate solutions difficult to achieve by numerical approaches. However, in the case of performance-dependent options is possible to reformulate the problem such that a solution can also be achieved by integrating several multivariate normal distributions and thus smooth integrands. The number of of integration problems to be solved depends exponentially on the dimension, though. Still, for moderate-dimensional problems sparse grid quadrature methods can provide highly accurate option prices as is shown in part one of this talk.
The numerical treatment of high(er) dimensional problems suffers in general from the so-called curse of
dimensionality. In special cases, i.e. for special function classes, this exponential dependence
of O(n-r/d) of the achieved accuracy on the invested work n can be substantially reduced.
Here, r denotes smoothness and d dimensionality. This is e.g. the case for spaces of functions
with bounded mixed derivatives. The associated numerical schemes involve a series expansion
in a multiscale basis for the one-dimensional problem. Then, a product construction and a
proper truncation of the resulting d-dimensional expansion result in a so-called sparse grid
approximation.
Here, depending on the respective problem and the 1-dimensional multiscale basis used, a variety of
algorithms for higher dimensional problems result which allow to break the curse of dimensionality, at
least to some extent, and result in complexities of the order O(n-rlog(n)α(d)
. In special cases even
α(d)
= 0 can be achieved. This is possible if the error is measured in the H1 seminorm or if the different
dimensions as well as their interactions are not equally important and dimension-adaptive strategies are
used.
The constant in these order estimates, however, is still dependent on d. It also reflects subtile details of
the implementation of the respective numerical scheme.
We discuss such sparse grid algorithms for the numerical treatment of integration problems, partial differential equations and data analysis.
The electronic Schrödinger equation forms the basis of most quantum-chemical computations. Here, a high-dimensional eigenvalue problem has to be solved. Due to the Pauli principle, the eigenfunctions which corresponds to fermions have to be antisymmetric. Recently, it has been shown that these wavefunctions possess certain square integrable mixed weak derivatives, which are important for approximation methods that are based on sparse grids or hyperbolic cross spaces (Yserentant, 2002). The talk will first discuss the Fast Fourier Transformation with hierarchical bases on sparse grids or hyperbolic cross spaces. Then, we demonstrate the application of this FFT method within a scheme to calculate the minimal eigenvalue which corresponds to an antisymmetric eigenfunction of a many-body periodic Hamilton operator.
A simple but stunning consequence (demonstrated by Levy in 1920) of the concentration of measure is that the function values of high-dimensional continuous functions are concentrated around the median value mf. Consequently, these functions are well approximated by the constant function f(x) = mf. In our research we investigated if this observation could be generalised to other than constant function approximation. Results were found for piecewise constant approximation, radial basis function approximation and additive modelling.
This is collaborative work with Paola Pozzi, University of Freiburg, and Vladimir Pestov, University of Ottawa.
Markus Holtz,
Universität Bonn
Valuation of performance-dependent options
Part II: Dimension reduction by orthant decomposition of hyperplane arrangements
Presentation (PDF)
In the second part of this talk we show how a dimension reduction of the multidimensional model can be achieved. In addition, a corresponding pricing formula for performance-dependent options is derived. However, the dimension reduction of the model initially does not imply a reduction in dimension and complexity for numerical valuation. A suitable dimension reduction leads to integration problems over so-called hyperplane arrangements in a lower- dimensional space. This reduces the number of cells to be integrated substantially. However, the topology of the cells changes from orthants to general polyhedra requiring much more involved integration rules. We show that any hyperplane arrangement can be decomposed into the same number of orthants as original polyhedral cells using suitable cell combinations. Furthermore, we give an almost optimal algorithm for the construction of an optimal orthant decomposition. Numerical results show the efficiency of the overall algorithm for the valuation of performance-dependent options also for high dimensional problems.
Statistical learning from high dimensional data often requires the reduction to lower dimensional data as a first step of a deeper analysis. The choice of method for dimension reduction affects the performance of the subsequent analysis. Furthermore there is a compromise between preserving important information in the data and reducing the complexity of the data.
Motivated by real high dimensional data sets we examine the use of principal component analysis (PCA) and independent component analysis (ICA) as means of extracting lower dimensional data. A common problem of many dimension reduction methods or feature selection methods - including PCA and ICA -- is the choice of stopping criteria or critical dimension, that is, the dimension which provides the best compromise between dimension reduction and lack of loss of information. We propose some criteria for choosing the critical dimension and show how they work on real applications.
In
this paper we introduce and analyse examples of supervised
learning tasks, where machine learning algorithms yield good
performance on
the training set but worse than random performance on the
independent test set. We call this phenomenon anti-learning.
Examples of real life, biological classification tasks, are given
for motivation, but analysis concentrates on synthetic examples. For
some of these datasets anti-learning is both, demonstrated
empirically and formally proven. The proofs hold for broad classes
of linear and non-linear machines, including linear and kernel
support vector machines, ridge regression, perceptron and k-nearest
neighbours.
Integrals with hundreds or even thousands of dimensions occur in many practical applications. Lattice rules are a family of equal-weight quadrature rules which can be used to approximate these high-dimensional integrals. By now it is well established that for a given n, the total number of quadrature points, good lattice rules can be constructed component-by-component for integrands belonging to certain function spaces. However, the lattice rules constructed this way are not "extensible" in sense that if you wish to change the number of points you need to carry out the construction from scratch. In this talk I will start with a review of recent advances in this research area and then introduce a new algorithm for constructing semi-extensible lattice sequences.
Studies of a wide variety of real networks (e.g. the Internet, citation lists or networks of friends) have identified a shortlist of key features that appear to identify their overall properties: sparsity (the number of edges is O(N) where N is the number of vertices); power law distributions of vertex degrees; and the "small world" effect (diameters of networks are typically o(logN). Recently a new class of random graph models has been proposed with the aim of reproducing these properties that is based on embedding the graph in some finite (often high) dimensional metric space. Embeddings of this nature are a form of multi-dimensional scaling and are an increasingly common phenomena: the talk will analyse their strengths and weaknesses in the context of network modelling.
Tensor Product
Approximation
Spaces & Elliptic PDEs
Presentation (PDF)
We discuss
approximation theory for
tensor product bases. For linear approximation (uniform and
non-uniform sparse grid methods), the approximation spaces are well
known to be (weighted) Sobolev
spaces of mixed highest derivatives. In the nonlinear case (best N term
approximation, adaptive
approximation), the approximation spaces turn out to be q-tensor
products of suitable Besov spaces
Bqs(Lq),
which might be coined Besov spaces of mixed highest derivatives.
In the second part of the talk, we apply these theoretical results to the numerical solution of elliptic partial differential equations. We discuss Besov regularity of solutions to elliptic PDEs in the tensor product setting as well as in the more standard setting using isotropically supported bases. The results are quite different, showing an unlimited degree of anisotropic Besov regularity (for the tensor product setting) but only a limited degree of isotropic Besov regularity (for the setting using isotropically supported basis functions) in dimension 3 and higher.
We present a method for estimating probability density functions in high dimensions based on the sparse grid space, which scales linearly with data size and also has a weak dependence on the dimension of the space.
In particular we present an investigation of discrete histosplines using sparse grids spaces as the discretisation space. We provide a convergence analysis, which shows that for smooth enough underlying probability density functions, the complexity of the space is O(log(h)d-1h-1) and the approximation error is O(log(h)d-1h2) $O(|\log(h)^{d-1} h2)$. The use of classical sparse grids allows us to practically deal with data sets with up to 15 dimensions.
The method is demonstrated on a 10 dimensional data collection containing information on vegetation cover and various geographical features. We use a sparse grid histospline to generate a predictive model of vegetation type in terms of geographical data.
Nowadays integration problems with very large numbers of variables arise in many applications, including mathematical finance. While Monte Carlo methods are available for problems with any number of integration variables, increasingly attention has turned to quasi-Monte Carlo methods. These have the appearance of Monte Carlo methods, but with the integration points chosen deterministically instead of randomly. This talk will review the history of quasi-Monte Carlo methods, with an emphasis on latice methods for non-periodic functions, finishing with recent step-by-step constructions of rules with very large numbers of points and hundreds of variables.
Stochastic models have been developed recently for describing biological networks containing species with small molecular numbers. The stochastic simulation algorithm (SSA) is an essentially exact procedure for simulating the time evolution of biological reaction systems. However, it is difficult to apply the SSA to simulate biological networks including elementary reactions differing by many orders of magnitude in the number of molecules involved due to the huge computational time. In this talk I will first present the most recent progress in designing efficient numerical methods for improving the efficiency of the SSA. The application of these methods will be also discussed for genetic regulatory networks, ecological systems and epidemic systems.
We study the problem of multivariate integration on the unit cube for unbounded integrands. Our study is motivated by problems in mathematical finance, where unbounded integrands can arise as a result of using the cumulative inverse normal transformation to map the integral to the unit cube. We define a new space of functions which possess the boundary behavior of those unbounded integrands arising from finance applications. Our new function space is a weighted tensor-product reproducing-kernel Hilbert space. We carry out a worst-case analysis in this space and show that good randomly-shifted lattice rules can be constructed component-by-component to achieve a worst-case error smaller than the worst-case Monte Carlo error. Numerical experiments indicate that our lattice rules are reasonably robust and perform well in pricing Asian options.
Recently, algorithms have been proposed to approximate functions with low separation rank in high dimensions. These methods provide the opportunity to overcome the curse of dimensionality, provided that the involved functions and operators can be well-represented with low rank. We review some of these methods and try to extend them to a weak formulation, where the goal is to ultimately apply them to the Schrödinger equation.
Multivariate linear problems for spaces of functions of many variables d occur in many applications. Examples of such problems include integration, approximation, and the solution of differential equations. The number d of variables is sometimes in the hundreds or thousands as it is the case for some problems in financial mathematics.
Tractability of multivariate problems has been intensively studied in recent years. This concept is defined in terms of the minimal number n(ε,d) of function values needed to compute an ε-approximation in the worst case or in the average case setting. Tractability means that n(ε,d) can be bounded by a polynomial in ε-1 and d. Strong tractability means that n(ε,d) is independent of d and polynomially dependent on ε-1.
For many classical spaces all variables play the same role, and n(ε,d) depends exponentially on d. This is called the curse of dimensionality, and leads to intractability. The first such an example was given by Bahkvalov in 1959 for multivariate integration of r times continuously differentiable functions. This is also the case for multivariate integration for tensor product Sobolev spaces, and this problem is strictly related to the L2-discrepancy.
To vanquish the curse of dimensionality, we need to treat variables of functions with diminishing importance. This leads to weighted spaces of functions in which the influence of each variable or a group of variables is moderated by the corresponding weight.
In many applications, although d is huge, functions can be well approximated by sums of functions that depend on groups of just a few variables up to a given order k, with the order defined as the number of variables in a group. We can model this situation by finite-order weights.
Necessary and sufficient conditions on weights to obtain tractability or strong tractability of linear multivariate problems have been obtained. In particular, for finite-order weights we have tractability or even strong tractability of many multivariate problems. This holds even in the worst case setting for multivariate integration for tensor product Sobolev spaces. Tractability bounds can be achieved by shifted lattice rules with generators computed by the component-by-component algorithm, and by well-known low discrepancy sequences such as Halton, Sobol and Niederreiter sequences. For other multivariate problems such as approximation and the solution of differential equations, tractability bounds are achieved by weighted Smolyak-type algorithms.
|
Page last updated: 3 November, 2006 Please direct all enquiries to: MSI webmaster Page authorised by: Director, MSI |
| The Australian National University - CRICOS Provider Number 00120C |