HAL: latest deposits from SAMM
Wednesday 18 May 2016

[hal01308517] On the Krein-Milman theorem for convex compact metrizable sets
The Krein-Milman theorem states that every convex compact subset of a Hausdorff locally convex topological space is the closed convex hull of its extreme points. We prove that, in the metrizable case, the situation is rather better. Indeed, we introduce a concept of "affine exposed points", which is intermediate between the notions of exposed points and extreme points. Then, we prove that every convex compact metrizable subset of a Hausdorff locally convex topological space is the closed convex hull of its affine exposed points. This fails in general for non-metrizable compact convex subsets.
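The two statements in this abstract can be put side by side in symbols. The notation below is assumed for illustration only (the abstract fixes none): ext(K), exp(K) and aexp(K) stand for the extreme, exposed and affine exposed points of a convex compact set K, and the chain of inclusions is our reading of "intermediate".

```latex
% Notation assumed for this sketch, not taken from the paper:
% ext(K) = extreme points, exp(K) = exposed points, aexp(K) = affine exposed points.
\[
  \operatorname{exp}(K) \subseteq \operatorname{aexp}(K) \subseteq \operatorname{ext}(K)
\]
% Krein-Milman theorem (K convex compact in a Hausdorff locally convex space):
\[
  K = \overline{\operatorname{conv}}\bigl(\operatorname{ext}(K)\bigr)
\]
% Result announced in the abstract (K additionally metrizable):
\[
  K = \overline{\operatorname{conv}}\bigl(\operatorname{aexp}(K)\bigr)
\]
```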
Wednesday 11 May 2016

[hal01310409] Bayesian Variable Selection for Globally Sparse Probabilistic PCA
With the flourishing development of high-dimensional data, sparse versions of principal component analysis (PCA) have established themselves as simple yet powerful ways of selecting relevant features in an unsupervised manner. However, when several sparse principal components are computed, the interpretation of the selected variables may be difficult since each axis has its own sparsity pattern and has to be interpreted separately. To overcome this drawback, we propose a Bayesian procedure that yields several sparse components with the same sparsity pattern. This allows the practitioner to identify the original variables which are relevant to describe the data. To this end, using Roweis' probabilistic interpretation of PCA and an isotropic Gaussian prior on the loading matrix, we provide the first exact computation of the marginal likelihood of a Bayesian PCA model. In order to avoid the drawbacks of discrete model selection, we propose a simple relaxation of our framework which makes it possible to find a path of models using a variational expectation-maximization algorithm. The exact marginal likelihood can eventually be maximized over this path, relying on Occam's razor to select the relevant variables. Since the sparsity pattern is common to all components, we call this approach globally sparse probabilistic PCA (GSPPCA). Its usefulness is illustrated on synthetic data sets and on several real unsupervised feature selection problems.

[hal01122393] The dynamic random subgraph model for the clustering of evolving networks
In recent years, many clustering methods have been proposed to extract information from networks. The principle is to look for groups of vertices with homogeneous connection profiles. Most of these techniques are suitable for static networks, that is to say, they do not take the temporal dimension into account. This work is motivated by the need to analyze evolving networks where a decomposition of the networks into subgraphs is given. Therefore, in this paper, we consider the random subgraph model (RSM), which was proposed recently to model networks through latent clusters built within known partitions. Using a state space model to characterize the cluster proportions, RSM is then extended in order to deal with dynamic networks. We call the latter the dynamic random subgraph model (dRSM). A variational expectation-maximization (VEM) algorithm is proposed to perform inference. We show that the variational approximations lead to an update step which involves a new state space model from which the parameters along with the hidden states can be estimated using the standard Kalman filter and Rauch-Tung-Striebel (RTS) smoother. Simulated data sets are considered to assess the proposed methodology. Finally, dRSM along with the corresponding VEM algorithm are applied to an original maritime network built from printed Lloyd's voyage records.
Tuesday 10 May 2016

[hal01312596] Exact ICL maximization in a nonstationary temporal extension of the stochastic block model for dynamic networks
The stochastic block model (SBM) is a flexible probabilistic tool that can be used to model interactions between clusters of nodes in a network. However, it does not account for interactions of time-varying intensity between clusters. The extension of the SBM developed in this paper addresses this shortcoming through a temporal partition: assuming interactions between nodes are recorded on fixed-length time intervals, the inference procedure associated with the model we propose makes it possible to cluster simultaneously the nodes of the network and the time intervals. The number of clusters of nodes and of time intervals, as well as the memberships to clusters, are obtained by maximizing an exact integrated complete-data likelihood, relying on a greedy search approach. Experiments on simulated and real data are carried out in order to assess the proposed methodology.

[hal01312590] Mean Absolute Percentage Error for regression models
We study in this paper the consequences of using the Mean Absolute Percentage Error (MAPE) as a measure of quality for regression models. We prove the existence of an optimal MAPE model and we show the universal consistency of Empirical Risk Minimization based on the MAPE. We also show that finding the best model under the MAPE is equivalent to doing weighted Mean Absolute Error (MAE) regression, and we apply this weighting strategy to kernel regression. The behavior of the MAPE kernel regression is illustrated on simulated data.
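The link between the MAPE and weighted MAE mentioned in this abstract can be checked numerically. The sketch below is only an illustration under our own assumptions: the function names and the toy data are hypothetical, the weights are taken to be 1/|y|, and the constant-model case is used as the simplest regression problem (a weighted L1 loss over constants is minimized by a weighted median).

```python
def mape(y_true, y_pred):
    """Mean Absolute Percentage Error (assumes no target is zero)."""
    return sum(abs(y - p) / abs(y) for y, p in zip(y_true, y_pred)) / len(y_true)


def weighted_mae(y_true, y_pred, weights):
    """Mean of weighted absolute errors."""
    return sum(w * abs(y - p) for y, p, w in zip(y_true, y_pred, weights)) / len(y_true)


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value


# Toy targets (hypothetical data, chosen for illustration only).
y_true = [1.0, 2.0, 10.0]
y_pred = [2.0, 2.0, 5.0]
weights = [1.0 / abs(y) for y in y_true]  # assumed weighting: 1 / |y|

# The MAPE coincides with the MAE weighted by 1 / |y|.
same_loss = abs(mape(y_true, y_pred) - weighted_mae(y_true, y_pred, weights)) < 1e-12

# Best constant prediction under the MAPE: a weighted median of the targets.
best_constant = weighted_median(y_true, weights)
```

On this toy data the weighted median falls on the smallest target, which illustrates how the 1/|y| weights pull a MAPE-optimal model towards small values.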
Wednesday 13 April 2016

[halshs01301794] Pay policy and remuneration system in the French civil service since the early 2000s: changes and challenges.
The French State's pay policy has shifted significantly over the past decade. Parametric adjustments (freezing of the index point, de facto indexation of low wages to the SMIC) and partial measures (regrading of certain categories) have been adopted, but more structural reforms of the remuneration system, although desired by the State, have not really come to fruition. At the same time, the State's pay policy has become more category-specific. Beyond their limited effects on average purchasing power, these changes have had significant consequences for pay hierarchies and careers, and help explain the rise of substantial discontent over pay. Taken together, these developments challenge the trade unions, whose strategies, at various levels (central or local), range from opposition to accommodation.
Friday 8 April 2016

[hal01299161] The Stochastic Topic Block Model for the Clustering of Networks with Textual Edges
Due to the significant increase in communications between individuals via social media (Facebook, Twitter) or electronic formats (email, web, co-authorship) in the past two decades, network analysis has become an unavoidable discipline. Many random graph models have been proposed to extract information from networks based on person-to-person links only, without taking into account information on the contents. In this paper, we develop the stochastic topic block model (STBM), a probabilistic model for networks with textual edges. We address here the problem of discovering meaningful clusters of vertices that are coherent with respect to both the network interactions and the text contents. A classification variational expectation-maximization (C-VEM) algorithm is proposed to perform inference. Simulated data sets are considered in order to assess the proposed approach and highlight its main features. Finally, we demonstrate the effectiveness of our model on two real-world data sets: a communication network and a co-authorship network.

[hal01207009] Weighted interpolation inequalities: a perturbation approach
We study optimal functions in a family of Caffarelli-Kohn-Nirenberg inequalities with a power-law weight, in a regime for which standard symmetrization techniques fail. We establish the existence of optimal functions, study their properties and prove that they are radial when the power in the weight is small enough. Radial symmetry up to translations is true for the limiting case where the weight vanishes, a case which corresponds to a well-known subfamily of Gagliardo-Nirenberg inequalities. Our approach is based on a concentration-compactness analysis and on a perturbation method which uses a spectral gap inequality. As a consequence, we prove that optimal functions are explicit and given by Barenblatt-type profiles in the perturbative regime.
Saturday 27 February 2016

[hal01279327] Weighted fast diffusion equations (Part II): Sharp asymptotic rates of convergence in relative error by entropy methods
This paper is the second part of the study. In Part I, self-similar solutions of a weighted fast diffusion equation (WFD) were related to optimal functions in a family of subcritical Caffarelli-Kohn-Nirenberg inequalities (CKN) applied to radially symmetric functions. For these inequalities, the linear instability (symmetry breaking) of the optimal radial solutions relies on the spectral properties of the linearized evolution operator. Symmetry breaking in (CKN) was also related to large-time asymptotics of (WFD), at a formal level. A first purpose of Part II is to give a rigorous justification of this point, that is, to determine the asymptotic rates of convergence of the solutions to (WFD) in the symmetry range of (CKN) as well as in the symmetry breaking range, and even in regimes beyond the supercritical exponent in (CKN). Global rates of convergence with respect to a free energy (or entropy) functional are also investigated, as well as uniform convergence to self-similar solutions in the strong sense of the relative error. Differences with large-time asymptotics of fast diffusion equations without weights will be emphasized.

[hal01279326] Weighted fast diffusion equations (Part I): Sharp asymptotic rates without symmetry and symmetry breaking in Caffarelli-Kohn-Nirenberg inequalities
In this paper we consider a family of Caffarelli-Kohn-Nirenberg interpolation inequalities (CKN), with two radial power law weights and exponents in a subcritical range. We address the question of symmetry breaking: are the optimal functions radially symmetric, or not? Our intuition comes from a weighted fast diffusion (WFD) flow: if symmetry holds, then an explicit entropy - entropy production inequality which governs the intermediate asymptotics is indeed equivalent to (CKN), and the self-similar profiles are optimal for (CKN). We establish an explicit symmetry breaking condition by proving the linear instability of the radial optimal functions for (CKN). Symmetry breaking in (CKN) also has consequences on entropy - entropy production inequalities and on the intermediate asymptotics for (WFD). Even when no symmetry holds in (CKN), asymptotic rates of convergence of the solutions to (WFD) are determined by a weighted Hardy-Poincaré inequality which is interpreted as a linearized entropy - entropy production inequality. All our results rely on the study of the bottom of the spectrum of the linearized diffusion operator around the self-similar profiles, which is equivalent to the linearization of (CKN) around the radial optimal functions, and on variational methods. Consequences for the (WFD) flow will be studied in Part II of this work.
Saturday 13 February 2016

[hal01270963] On combining wavelets expansion and sparse linear models for Regression on metabolomic data and biomarker selection
Wavelet thresholding of spectra has to be handled with care when the spectra are the predictors of a regression problem. Indeed, a blind thresholding of the signal followed by a regression method often leads to deteriorated predictions. The aim of this article is to show that sparse regression methods, applied in the wavelet domain, perform an automatic thresholding: the most relevant wavelet coefficients are selected to optimize the prediction of a given target of interest. This approach can be seen as a joint thresholding designed for a predictive purpose. The method is illustrated on a real-world problem where metabolomic data are linked to poison ingestion. This example shows the usefulness of wavelet expansion and the good behavior of sparse and regularized methods. A comparison study is performed between the two-step approach (wavelet thresholding and regression) and the one-step approach (selection of wavelet coefficients with a sparse regression). The comparison includes two types of wavelet bases, various thresholding methods, and various regression methods, and is evaluated by calculating prediction performances. Information about the location of the most important features on the spectra was also obtained and used to identify the most relevant metabolites involved in the mice poisoning.

[hal01265147] Limited operators and differentiability
We characterize the limited operators by differentiability of convex continuous functions. Given Banach spaces $Y$ and $X$ and a linear continuous operator $T: Y \longrightarrow X$, we prove that $T$ is a limited operator if and only if, for every convex continuous function $f: X \longrightarrow \mathbb{R}$ and every point $y \in Y$, $f \circ T$ is Fréchet differentiable at $y$ whenever $f$ is Gâteaux differentiable at $T(y) \in X$.
Wednesday 10 February 2016

[hal01263540] Modelling time evolving interactions in networks through a non stationary extension of stochastic block models
The stochastic block model (SBM) describes interactions between nodes of a network following a probabilistic approach. Nodes belong to hidden clusters and the probabilities of interactions only depend on these clusters. Interactions of time-varying intensity are not taken into account. By partitioning the whole time horizon in which interactions are observed, we develop a non-stationary extension of the SBM, allowing us to simultaneously cluster the nodes of a network and the fixed time intervals in which interactions take place. The number of clusters as well as memberships to clusters are finally obtained through the maximization of the complete-data integrated likelihood, relying on a greedy search approach. Experiments are carried out in order to assess the proposed methodology.
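The generative idea of the SBM described above (hidden clusters, interaction probabilities that depend only on the clusters) can be sketched as follows. This is a minimal illustration of the standard static, binary, directed SBM, not the non-stationary extension proposed in the paper; the function name `sample_sbm` and its parameters are hypothetical.

```python
import random


def sample_sbm(n_nodes, proportions, connect_prob, rng):
    """Sample a directed binary graph from a static stochastic block model.

    proportions[k]     : probability that a node belongs to hidden cluster k
    connect_prob[k][l] : probability of an edge from a cluster-k node to a cluster-l node
    """
    n_clusters = len(proportions)
    # Hidden cluster of each node, drawn independently.
    clusters = rng.choices(range(n_clusters), weights=proportions, k=n_nodes)
    adjacency = [[0] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        for j in range(n_nodes):
            if i == j:
                continue  # no self-loops in this sketch
            # The edge probability depends only on the clusters of i and j.
            if rng.random() < connect_prob[clusters[i]][clusters[j]]:
                adjacency[i][j] = 1
    return clusters, adjacency


if __name__ == "__main__":
    rng = random.Random(42)
    # Two communities: dense inside, sparse across (illustrative values).
    clusters, adjacency = sample_sbm(8, [0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], rng)
    print(clusters)
    for row in adjacency:
        print(row)
```

Inference runs in the opposite direction: only the adjacency matrix is observed, and the clusters have to be recovered from it.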
Tuesday 9 February 2016

[hal01270293] Is the corporate elite disintegrating? Interlock boards and the Mizruchi hypothesis
This paper proposes an approach for comparing interlocked board networks over time to test for statistically significant change. In addition to contributing to the conversation about whether the Mizruchi hypothesis (that a disintegration of power is occurring within the corporate elite) holds or not, we propose novel methods to handle a longitudinal investigation of a series of social networks where the nodes undergo a few modifications at each time point. Methodologically, our contribution is twofold: we extend a Bayesian model hitherto applied to compare two time periods to a longer time period, and we define and employ the concept of a hull of a sequence of social networks, which makes it possible to circumvent the problem of changing nodes over time.
Wednesday 3 February 2016
Wednesday 27 January 2016

[hal01261122] Country-scale Exploratory Analysis of Call Detail Records through the Lens of Data Grid Models
Call Detail Records (CDRs) are data recorded by telecommunications companies, consisting of basic information on several dimensions of the calls made through the network: the source, destination, date and time of calls. CDR data analysis has received much attention in recent years since it may reveal valuable information about human behavior. It has shown high added value in many application domains, such as community analysis or network planning. In this paper, we suggest a generic methodology based on data grid models for summarizing the information contained in CDR data. The method is based on a parameter-free estimation of the joint distribution of the variables that describe the calls. We also suggest several well-founded criteria that allow one to browse the summary at various granularities and to explore it by means of insightful visualizations. The method handles network graph data, temporal sequence data as well as user mobility data stemming from the original CDR data. We show the relevance of our methodology on real-world CDR data from Ivory Coast for various case studies, such as network planning strategy and yield management pricing strategy.
Friday 22 January 2016

[hal01259983] Semiparametric stationarity and fractional unit roots tests based on data-driven multidimensional increment ratio statistics
In this paper, we show that the central limit theorem (CLT) satisfied by the data-driven Multidimensional Increment Ratio (MIR) estimator of the memory parameter d, established in Bardet and Dola (2012) for d ∈ (−0.5, 0.5), can be extended to a semiparametric class of Gaussian fractionally integrated processes with memory parameter d ∈ (−0.5, 1.25). Since the asymptotic variance of this CLT can be estimated, data-driven MIR tests are constructed for the two cases of stationarity and nonstationarity: two tests distinguishing the hypotheses d < 0.5 and d ≥ 0.5, as well as a fractional unit roots test distinguishing the case d = 1 from the case d < 1. Simulations on numerous kinds of short-memory, long-memory and nonstationary processes show both the high accuracy and the robustness of this MIR estimator compared to those of usual semiparametric estimators. They also attest to the reasonable efficiency of MIR tests compared to other usual stationarity tests or fractional unit roots tests. Keywords: Gaussian fractionally integrated processes; semiparametric estimators of the memory parameter; long-memory test; stationarity test; fractional unit roots test.
Wednesday 20 January 2016
Wednesday 13 January 2016

[hal01254346] Maxima of Two Random Walks: Universal Statistics of Lead Changes
We investigate statistics of lead changes of the maxima of two discrete-time random walks in one dimension. We show that the average number of lead changes grows as π^(−1) ln t in the long-time limit. We present theoretical and numerical evidence that this asymptotic behavior is universal. Specifically, this behavior is independent of the jump distribution: the same asymptotic behavior underlies standard Brownian motion and symmetric Lévy flights. We also show that the probability to have at most n lead changes behaves as t^(−1/4) (ln t)^n for Brownian motion and as t^(−β(µ)) (ln t)^n for symmetric Lévy flights with index µ. The decay exponent β ≡ β(µ) varies continuously with the Lévy index when 0 < µ < 2, while β = 1/4 for µ > 2.
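The quantity studied in this abstract can be explored with a small Monte Carlo experiment. The sketch below is an illustration under our own assumptions, not the authors' numerical procedure: the steps are Gaussian, both walks start at 0, a lead change is counted whenever the strict ordering of the two running maxima flips, and all function names are hypothetical.

```python
import math
import random


def count_lead_changes(steps, rng):
    """Count lead changes between the running maxima of two independent walks."""
    x1 = x2 = 0.0  # current positions
    m1 = m2 = 0.0  # running maxima (both walks start at 0)
    leader = 0     # 0 = no leader yet, otherwise 1 or 2
    changes = 0
    for _ in range(steps):
        x1 += rng.gauss(0.0, 1.0)
        x2 += rng.gauss(0.0, 1.0)
        m1 = max(m1, x1)
        m2 = max(m2, x2)
        if m1 > m2:
            current = 1
        elif m2 > m1:
            current = 2
        else:
            current = leader  # tie (both maxima still at 0): keep previous state
        if leader != 0 and current != leader:
            changes += 1
        leader = current
    return changes


def average_lead_changes(steps, runs, seed=0):
    """Monte Carlo estimate of the mean number of lead changes after `steps` steps."""
    rng = random.Random(seed)
    return sum(count_lead_changes(steps, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    steps = 1000
    estimate = average_lead_changes(steps, runs=200)
    # Leading-order growth quoted in the abstract, shown for comparison only.
    print(estimate, math.log(steps) / math.pi)
```

Since π^(−1) ln t is only the leading long-time behavior, the estimate and the reference value are not expected to coincide at moderate t; it is the logarithmic growth with t that should become visible when `steps` is varied.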
Saturday 9 January 2016

[hal01253191] Pontryagin principle for a Mayer problem governed by a delay functional differential equation
We establish Pontryagin principles for a Mayer optimal control problem governed by a functional differential equation. The control functions are piecewise continuous and the state functions are piecewise continuously differentiable. To do so, we follow the method created by Philippe Michel for systems governed by ordinary differential equations, and we use properties of the resolvent of a linear functional differential equation.

[hal01253186] Pontryagin principle for a Mayer problem governed by a delay functional differential equation
We establish Pontryagin principles for a Mayer optimal control problem governed by a functional differential equation. The control functions are piecewise continuous and the state functions are piecewise continuously differentiable. To do so, we follow the method created by Philippe Michel for systems governed by ordinary differential equations, and we use properties of the resolvent of a linear functional differential equation.
Sunday 3 January 2016
Tuesday 22 December 2015