2015 Joint Telematics Group/IEEE Information Theory Society Summer School on Signal Processing, Communications and Networks.

IISc Bangalore, July 20-23, 2015.

Invited Workshop

IISc Bangalore, July 24, 2015.

In the modern big-data era, technological innovations have enabled the collection and storage of data at a previously unimaginable scale; at the same time, the growth in the number of samples is constantly outpaced by the growth in the number of features, which makes the statistical inference task highly non-trivial and computationally challenging. Accordingly, in contemporary data-analytic applications, increasing attention has focused on high-dimensional statistics, which concerns problems in which the ambient dimension is finite but comparable to or substantially larger than the sample size. The holy grail is to design statistical procedures that are both computationally efficient and information-theoretically optimal.

The interplay between information theory and statistics is a constant theme in the development of both fields. In this lecture series I will illustrate how techniques rooted in information theory play a key role in understanding the fundamental limits of high-dimensional statistical problems. We will discuss foundational topics in information-theoretic methods, such as Fano's inequality, Le Cam's methods, metric entropy and volume methods, and aggregation, as well as their applications to specific problems, such as sparse linear regression, estimation of high-dimensional matrices, principal component analysis, functional estimation, large-alphabet problems, and community detection. I will also discuss the recent trend of combining the statistical and computational perspectives, and the computational barriers that arise in a series of statistical problems on large matrices and random graphs.

The course begins with a review of basic information-theoretic
quantities such as entropy, informational divergence, and mutual information.
Typical sequences are treated next, followed by Shannon's rate-distortion and
capacity-cost functions. The multi-terminal problems treated include distributed
source coding, multi-access channels, and interference channels. Time permitting,
topics related to large networks will be discussed, e.g., relaying and the cut-set
bound.