On the Web, the last 365 days
Saturday 3 June 2017

[hal01528540] The TransiMOOC experiment: active and digital pedagogy for an "encommun" that re-engages students
The question of young people who break away or drop out first raises the question of "breaking away from what" or "with what" and therefore, by contrast, of re-engaging "with what". Claudia Barrantes (2015) argued that young people are often trapeze artists: they let go of one thing to catch hold of another, most often at the last moment. Studies of school dropout show that young people break with several things at once: poor grades, for instance, are not enough to sever the link between pupils and school if those pupils like going to school to see their friends (Millet Thin 2005, for example). On the other hand, as many studies have already shown, including the previous CLEPT conference (Réussir à l'école, mais réussir quoi 2012), succeeding begins with re-engaging with something. Several experiments, including the one presented in this paper, show that building the group (and hence re-engaging with the group) is a relevant prerequisite for re-engagement with school in its broader sense (a positive relationship to learning, to knowledge, etc.). After a theoretical framework on what the "encommun" is and how it both fosters school engagement and is supported by digital tools, this paper describes the TransiMOOC project, observed during its first year (Epstein, Beauchamps 2016), to show that digital tools and active pedagogies can create an "encommun" that helps pupils re-engage.

[hal01528387] From consumption to creation: a study of the TransiMOOC project. How can digital technology contribute to the renewal of active pedagogies?
By making it easier to create and publish audiovisual content of all kinds, digital tools turn mobile phone owners into potential creators. How can the logic of creation be given priority through an active-pedagogy approach that makes learners the main actors of their own learning? This is one of the challenges taken up by the interdisciplinary Transapi team, which set out to test the potential of active pedagogies to combat school dropout. This article therefore offers a critical review of the first lessons of the TransiMOOC project, run experimentally by Transapi in 2013-2014. TransiMOOC is a project of online courses made by young people (preferably those at risk of dropping out of school) for young people. The young people are drawn in as "consumers" of tablets and get caught up in creating drawings, lessons, etc.
Friday 2 June 2017

[hal01530763] Estimation of parameters of regularly varying distributions with an application on planetary perturbations on comets
An important class of heavy-tailed distributions is the regularly varying distributions, of which the stable distributions are a subclass. Like the stable distributions, the regularly varying distributions can be described by four parameters that determine the tail heaviness, asymmetry, scale and position respectively. In this talk we first present a method for estimating these parameters using order statistics. The method is then applied to a set of planetary perturbations of a very large number of comets during close encounters with planets, in order to give a statistical description of these perturbations.
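The abstract does not say which order-statistics estimator is used. As an illustration only, the sketch below uses the classical Hill estimator, a well-known way to estimate the tail-heaviness parameter from the largest order statistics. It is not the authors' method; the function names and the Pareto test sample are assumptions made for this example.

```python
import math
import random


def hill_tail_index(sample, k):
    """Hill estimate of the tail index alpha from the k largest order statistics.

    Illustrative stand-in: the abstract does not specify its estimator.
    """
    if not 0 < k < len(sample):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    ordered = sorted(sample, reverse=True)   # largest observations first
    threshold = ordered[k]                   # the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    # Average log-excess of the k largest values over the threshold.
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


def pareto_sample(alpha, n, seed=0):
    """Hypothetical test data: n draws from a Pareto law with tail index alpha."""
    rng = random.Random(seed)
    # Inverse-transform sampling; 1 - random() lies in (0, 1].
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


if __name__ == "__main__":
    data = pareto_sample(alpha=2.0, n=20000, seed=1)
    print(hill_tail_index(data, k=2000))
```

The choice of `k` trades bias against variance; in practice it is usually tuned by inspecting the estimate over a range of `k` values.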
Thursday 1 June 2017

[hal01530761] Statistics of regularly varying distributions with an application to perturbations of comets
An important family of heavy-tailed probability distributions is the alpha-stable distributions. They are characterized by four parameters that determine, respectively, the tail heaviness, the asymmetry, the scale and the location of the centre. In this talk, we present another family of heavy-tailed distributions, the regularly varying distributions, which can also be described using these four parameters. When this family is used to represent data, it therefore accounts not only for the behaviour of the distribution's tail but also for the shape of its body. An application to data on planetary perturbations of comets is presented.
Tuesday 30 May 2017

[hal01527749] Nonparametric estimation of time varying AR(1)–processes with local stationarity and periodicity
Extending the ideas of [7], this paper aims at providing a kernel-based nonparametric estimation of a new class of time-varying AR(1) processes $(X_t)$ with local stationarity and periodic features (with a known period $T$), defined by $X_t = a_t(t/nT)\,X_{t-1} + \xi_t$ for $t \in \mathbb{N}$, with $a_{t+T} \equiv a_t$. Central limit theorems are established for kernel estimators of $a_s(u)$ that reach classical minimax rates and only require low-order moment conditions on the white noise $(\xi_t)_t$, up to the second order.
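To make the model definition concrete, here is a minimal simulation sketch of a periodic, locally stationary AR(1) recursion. The coefficient function `coeff`, the Gaussian noise and all parameter values are assumptions chosen for illustration (the abstract specifies none of them), and the rescaled time is read as t divided by the total length n times T.

```python
import math
import random


def coeff(s, u, T):
    """Hypothetical coefficient a_s(u): T-periodic in s, smooth in u, |a| <= 0.8."""
    return 0.8 * math.cos(2.0 * math.pi * s / T) * (0.5 + 0.5 * u)


def simulate_tvar1(n, T, seed=0, x0=0.0):
    """Simulate X_t = a_t(t / (n*T)) * X_{t-1} + xi_t for t = 1, ..., n*T."""
    rng = random.Random(seed)
    total = n * T
    path = [x0]
    for t in range(1, total + 1):
        a = coeff(t % T, t / total, T)   # t % T enforces a_{t+T} = a_t exactly
        xi = rng.gauss(0.0, 1.0)         # white noise (assumed Gaussian here)
        path.append(a * path[-1] + xi)
    return path


if __name__ == "__main__":
    x = simulate_tvar1(n=50, T=12, seed=42)
    print(len(x), x[:5])
```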
Thursday 25 May 2017
Wednesday 24 May 2017

[hal01526873] Multifractal random walk driven by a Hermite process
We introduce a Multifractal Random Walk (MRW) defined as a stochastic integral of an infinitely divisible noise with respect to a Hermite process. Hermite processes are self-similar stochastic processes with stationary increments and exhibit long-range dependence. We study the existence of the Hermite MRW and its properties. We propose a continuous-time financial model that captures the multifractal properties observed in the empirical data. We also present a numerical analysis of our results.

[hal01525475] A convex extension of lower semicontinuous functions defined on normal Hausdorff space
We prove that any problem of minimization of a proper lower semicontinuous function defined on a normal Hausdorff space is canonically equivalent to a problem of minimization of a proper weak* lower semicontinuous convex function defined on a weak* convex compact subset of some dual Banach space. We establish the existence of a bijective operator between the two classes of functions which preserves the problems of minimization.
Monday 22 May 2017

[hal01525491] A network model for the propagation of Hepatitis C with HIV coinfection
We define and examine a model of epidemic propagation for a virus such as Hepatitis C (with HIV coinfection) on a network of networks, namely the network of French urban areas. One network level is that of the individual interactions inside each urban area. The second level is that of the areas themselves, linked by individuals traveling between these areas and potentially helping the epidemic spread from one city to another. We choose to encode the second level of the network as extra, special nodes in the first level. We observe that such an encoding leads to sensible results in terms of the extent and speed of propagation of an epidemic, depending on its source point.
Friday 19 May 2017

[hal01524112] Discrete time Pontryagin principles in Banach spaces
The aim of this paper is to establish Pontryagin's principles in a discrete-time infinite-horizon setting when the state variables and the control variables belong to infinite-dimensional Banach spaces. In comparison with previous results on this question, we remove the conditions of finiteness of codimension of subspaces. To this end, the main idea is the introduction of new recursive assumptions and the use of consequences of the Baire category theorem and of the Banach isomorphism theorem.
Thursday 18 May 2017

[hal01523971] On the asymptotic behaviour of M/G/1 retrial queues with batch arrivals and impatience phenomenon
In this work, we consider an M/G/1 retrial queue with batch arrivals and impatient customers. By using the method of supplementary variables, we obtain the partial generating functions of the steady state joint distribution of the server state and the number of customers in the retrial group. To complete the analysis of the considered model, we find the steady state distribution of the embedded Markov chain. Although the generating function of the steady state distribution of the number of customers in the retrial group can be obtained in explicit form, it is cumbersome and does not reveal the nature of the distribution in question. Therefore, we investigate the asymptotic behaviour of the random variable representing the number of customers in the retrial group under limit values of various parameters.

[hal01522068] A Sharp Uniform Bound for the Distribution of Sums of Bernoulli Trials
In this note we establish a uniform bound for the distribution of a sum $S_n=X_1+\cdots+X_n$ of independent nonhomogeneous Bernoulli trials. Specifically, we prove that $\sigma_n\,\mathbb{P}(S_n=j)\leq \eta$ where $\sigma_n$ denotes the standard deviation of $S_n$ and $\eta$ is a universal constant. We compute the best possible constant $\eta\sim 0.4688$ and we show that the bound also holds for limits of sums and differences of Bernoullis, including the Poisson laws which constitute the worst case and attain the bound. We also investigate the optimal bounds for $n$ and $j$ fixed. An application to estimate the rate of convergence of Mann's fixed point iterations is presented.
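The bound can be checked numerically on small examples. The sketch below computes the exact distribution of a sum of independent Bernoulli trials by dynamic programming and evaluates $\sigma_n \max_j \mathbb{P}(S_n=j)$. The function names and the sample probabilities are assumptions for illustration; the constant 0.4688 is the value quoted in the abstract.

```python
import math

ETA = 0.4688  # universal constant quoted in the abstract (rounded)


def bernoulli_sum_pmf(probs):
    """Exact pmf of S_n = X_1 + ... + X_n with independent X_i ~ Bernoulli(p_i)."""
    pmf = [1.0]                          # distribution of the empty sum
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for j, mass in enumerate(pmf):
            nxt[j] += mass * (1.0 - p)   # X_i = 0
            nxt[j + 1] += mass * p       # X_i = 1
        pmf = nxt
    return pmf


def scaled_peak(probs):
    """Return sigma_n * max_j P(S_n = j), the quantity bounded by eta."""
    sigma = math.sqrt(sum(p * (1.0 - p) for p in probs))
    return sigma * max(bernoulli_sum_pmf(probs))


if __name__ == "__main__":
    for probs in ([0.5], [0.5] * 10, [0.1, 0.7, 0.3, 0.9], [0.02] * 50):
        print(probs[:4], round(scaled_peak(probs), 4), "<=", ETA)
```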
Wednesday 17 May 2017

[hal01519875] An integrative approach based on probabilistic modelling and statistical inference for morphostatistical characterization of astronomical data
This paper describes several applications in astronomy and cosmology that are addressed using probabilistic modelling and statistical inference.

[hal01523751] Invited talk “Statistical methods applied to innovation and business”
Statistical methods applied to innovation and business: we present factorial methods of data analysis, clustering methods and self-organizing maps, with applications to real problems.

[hal01523304] Robustness of fractal dimension estimators for vector talweg network characterization
The fractal approach is often used to characterize natural objects. Numerous studies have focused on fractal analysis of river networks in particular. However, only a few papers discuss the estimation methods and the uncertainty of the main fractal indicator, the fractal dimension. Firstly, the distinction between infinite mathematical fractals and natural fractals should be taken into account when estimating the fractal dimension. Moreover, the networks are most of the time integrated in a GIS database and represented by vector objects. This type of representation has its own properties, and we think that its impact on the fractal measure should be evaluated. In this context, the work we propose aims at testing the robustness of different fractal dimension estimators for the characterization of vector talweg networks. We focus on the two most popular estimators: a classical estimator for river networks, based on a topological approach with the Horton-Strahler ratios, and the box-counting dimension, based on a geometric approach. A third estimator, the lesser-known correlation dimension, also based on a geometric approach, offers an interesting possibility for calculating a stable fractal indicator, in particular when the number of stream segments is small. These methods are applied to both virtual networks (such as the Scheidegger network) and actual vector networks. The actual case is a network extracted from a high-resolution DTM of the Draix badlands in the French Alps. Three main methodological results can be highlighted: (1) the study of virtual networks contributes to the assessment of the estimators' relevance, according to the network branching structure; (2) an empirical fractal domain must be determined on the log-log curve with an objective method in order to estimate fractal dimensions that can be compared; (3) the observation of the uncertainty of the fractal dimension is necessary for any valid comparison.
Saturday 13 May 2017
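As a rough illustration of the geometric approach, the sketch below estimates a box-counting dimension as the least-squares slope of log N(eps) against log(1/eps) for a set of 2-D points. It is not the paper's implementation: the function names are hypothetical, the input is a plain point cloud rather than a GIS vector network, and the range of box sizes (the "fractal domain" on the log-log curve) is simply supplied by the caller.

```python
import math


def box_count(points, eps):
    """Number of grid boxes of side eps that contain at least one point."""
    return len({(math.floor(x / eps), math.floor(y / eps)) for x, y in points})


def box_counting_dimension(points, sizes):
    """Least-squares slope of log N(eps) versus log(1/eps) over the given sizes."""
    xs = [math.log(1.0 / eps) for eps in sizes]
    ys = [math.log(box_count(points, eps)) for eps in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Sanity check on a straight segment, whose dimension should be close to 1.
    segment = [(i / 1024, i / 1024) for i in range(1024)]
    sizes = [2.0 ** -k for k in range(1, 7)]
    print(box_counting_dimension(segment, sizes))
```

On real networks the estimate depends strongly on which box sizes are included, which is why the abstract insists on an objective choice of the fractal domain.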

[hal01522093] Monte Carlo Methods
[...]

[hal01522083] On the stochastic NLS equation on compact Riemannian manifolds
[...]

[hal01519780] Inferring structure in bipartite networks using the latent blockmodel and exact ICL
We consider the task of simultaneous clustering of the two node sets involved in a bipartite network. The approach we adopt is based on use of the exact integrated complete likelihood for the latent blockmodel. Using this allows one to infer the number of clusters as well as cluster memberships using a greedy search. This gives a model-based clustering of the node sets. Experiments on simulated bipartite network data show that the greedy search approach is vastly more scalable than competing Markov chain Monte Carlo based methods. Applications to a number of real observed bipartite networks demonstrate the algorithms discussed.

[hal01519750] Variational Bayes model averaging for graphon functions and motif frequencies inference in W-graph models
W-graph refers to a general class of random graph models that can be seen as a random graph limit. It is characterized by both its graphon function and its motif frequencies. In this paper, relying on an existing variational Bayes algorithm for the stochastic block models along with the corresponding weights for model averaging, we derive an estimate of the graphon function as an average of stochastic block models with increasing number of blocks. In the same framework, we derive the variational posterior frequency of any motif. A simulation study and an illustration on a social network complete our work.

[hal01519743] The stochastic topic block model for the clustering of vertices in networks with textual edges
Due to the significant increase of communications between individuals via social media (Facebook, Twitter, LinkedIn) or electronic formats (email, web, e-publication) in the past two decades, network analysis has become an unavoidable discipline. Many random graph models have been proposed to extract information from networks based on person-to-person links only, without taking into account information on the contents. This paper introduces the stochastic topic block model (STBM), a probabilistic model for networks with textual edges. We address here the problem of discovering meaningful clusters of vertices that are coherent from both the network interactions and the text contents. A classification variational expectation-maximization (CVEM) algorithm is proposed to perform inference. Simulated data sets are considered in order to assess the proposed approach and to highlight its main features. Finally, we demonstrate the effectiveness of our methodology on two real-world data sets: a directed communication network and an undirected co-authorship network.
Friday 12 May 2017

[hal01519891] Pattern detection and characterization for astronomical data through probabilistic modelling and statistical inference
The paper presents several problems from astronomy that may be tackled using probability theory and statistics. Given the nature of the data, the question common to these problems is which pattern is hidden in the data. The probabilistic framework allows a statistical and morphological description of these patterns.

[hal01521530] Old people, video games and French press: a topic model approach on a study about discipline, entertainment and self-improvement.
Over the past few years, the French mainstream press has covered more and more consistently "silver gamers", those adults over sixty who play videogames. This article investigates the discursive and normative paradigms that underlie the unexpected enthusiasm of French mainstream press for older adults who play videogames. We use a topic model approach on a corpus of French articles that mention both older people and video games in order to identify topics, that is, sets of words related by their meanings and identified with a Bayesian statistical algorithm. We preface the topic modeling's conclusions with a discussion of the representations of older people and video games in French mainstream media. Our analysis explores how the French press' coverage of older people who play videogames simultaneously erases the moral panic about video games and reinforces the "successful ageing" discourse.

[hal01519850] Choosing the number of groups in a latent stochastic block model for dynamic networks
Latent stochastic block models are flexible statistical models that are widely used in social network analysis. In recent years, efforts have been made to extend these models to temporal dynamic networks, whereby the connections between nodes are observed at a number of different times. In this paper we extend the original stochastic block model by using a Markovian property to describe the evolution of nodes' cluster memberships over time. We recast the problem of clustering the nodes of the network into a model-based context, and show that the integrated completed likelihood can be evaluated analytically for a number of likelihood models. Then, we propose a scalable greedy algorithm to maximise this quantity, thereby estimating both the optimal partition and the ideal number of groups in a single inferential framework. Finally we propose applications of our methodology to both real and artificial datasets.

[hal01519713] Hidden-Markov models for time series of continuous proportions with excess zeros
Bounded time series and time series of continuous proportions are often encountered in statistical modeling. Usually, they are addressed either by a logistic transformation of the data, or by specific probability distributions, such as the Beta distribution. Nevertheless, these approaches may become quite tricky when the data show an overdispersion at 0 and/or 1. In these cases, the zero-and/or-one Beta-inflated distributions, ZOIB, are preferred. This manuscript combines ZOIB distributions with hidden-Markov models and proposes a flexible model, able to capture several regimes controlling the behavior of a time series of continuous proportions. To illustrate the practical interest of the proposed model, several examples on simulated data are given, as well as a case study on historical data, involving the military logistics of the Duchy of Savoy during the XVIth and the XVIIth centuries.
Thursday 11 May 2017
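For readers unfamiliar with ZOIB distributions, the sketch below draws from a zero-and-one inflated Beta law: a point mass at 0, a point mass at 1, and a Beta distribution on the open interval in between. The parameterisation (`p0`, `p1`, `a`, `b`) and the function name are assumptions for illustration; the sketch covers only a single emission distribution, not the hidden-Markov model of the manuscript.

```python
import random


def sample_zoib(p0, p1, a, b, rng):
    """Draw one value from a zero-and-one inflated Beta distribution.

    p0, p1: probabilities of the point masses at 0 and 1 (assumed names).
    a, b:   shape parameters of the Beta component used otherwise.
    """
    if p0 < 0 or p1 < 0 or p0 + p1 > 1:
        raise ValueError("need p0 >= 0, p1 >= 0 and p0 + p1 <= 1")
    u = rng.random()
    if u < p0:
        return 0.0                       # excess zero
    if u < p0 + p1:
        return 1.0                       # excess one
    return rng.betavariate(a, b)         # continuous proportion in (0, 1)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_zoib(0.2, 0.1, 2.0, 5.0, rng) for _ in range(10)]
    print(draws)
```

In a hidden-Markov setting, each regime would carry its own set of such parameters, with the regime sequence following a Markov chain.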

[hal01520204] Asymptotics for Regression Models Under Loss of Identifiability
This paper discusses the asymptotic behavior of regression models under general conditions, especially if the dimensionality of the set of true parameters is larger than zero and the true model is not identifiable. Firstly, we give a general inequality for the difference between the sum of square errors (SSE) of the estimated regression model and the SSE of the theoretical true regression function in our model. A set of generalized derivative functions is a key tool in deriving such an inequality. Under a suitable Donsker condition for this set, we provide the asymptotic distribution for the difference of SSE. We show how to get this Donsker property for parametric models even though the parameters characterizing the best regression function are not unique. This result is applied to neural network regression models with redundant hidden units when loss of identifiability occurs, and gives some hints on how to penalize such models to avoid overfitting.

[hal01520201] Assessment of the influence of education level on voting intention for the extreme right in France.
In France, the Front National has been a growing political party over the last 30 years. After years of a stagnant economy, French voters have come to mistrust the political elite and have been increasingly receptive to the Front National's straight-talking approach. One of the most consistent findings in social research on ethnic attitudes is the negative association between educational attainment and ethnic prejudice: people with higher education are less prejudiced toward ethnic out-groups than those with lower education. We might expect, then, that high education will generally prevent people from voting for the extreme right, regardless of their position in the labor market. This point of view, however, is not unanimous, and there is an intellectual elite within the extreme right, especially among angry academic white males who are uneasy with the gains of feminism and believe in a left-wing media conspiracy. Hence the link between extreme-right voting and education is not so obvious. The aim of this paper is first to build a model that takes into account the causality structure of the variables (a Bayesian network) and second to assess the influence of education on the voting intention for extreme right-wing parties, taking into consideration all the possible confounding variables.
Wednesday 10 May 2017

[hal01519723] Multidimensional urban segregation: an exploratory case study
Segregation phenomena have long been a concern for policy makers and urban planners, and much attention has been devoted to their study, especially in the fields of quantitative sociology and geography. Perhaps the most common example of urban segregation corresponds to different groups living in different neighbourhoods across a city, with very few neighbourhoods where all groups are represented in roughly the same proportions as in the whole city itself. The social groups in question are usually defined according to one variable: ethnic group, income category, religious group, electoral group, age... In this paper, we introduce a novel, multidimensional approach based on the Self-Organizing Map algorithm (SOM). Working with public data available for the city of Paris, we illustrate how this method allows one to describe the complex interplay between social groups’ residential patterns and the geography of metropolitan facilities and services.

[hal01519718] Unequal time series clustering applied on flight data
Multiple signals are measured by sensors during a flight or on a test bench, and their analysis is of great interest to engineers. These signals are actually multivariate time series created by the sensors present on the aircraft engines. Each of them can be decomposed into a series of stabilized phases, well known to the experts, and transient phases. Transient phases are rarely explored, but they reveal a lot of information when the engine is running. The aim of our project is to convert these time series into a succession of labels designating transient and stabilized phases. This transformation of the data opens several perspectives: on the one hand, tracking similar behaviours or patterns seen during a flight; on the other, discovering hidden structures. Labelling signals coming from the engines of the aircraft also helps in the detection of frequent or rare sequences during a flight. Statistical analysis and scoring are more convenient with this new representation. This manuscript proposes a methodology for automatically indexing all engine transient phases. First, the algorithm computes the start and end points of each phase and builds a new database of transient patterns. Second, the transient patterns are clustered into a small number of typologies, which provide the labels. The clustering is implemented with Self-Organizing Maps (SOM). All algorithms are applied to real flight measurements, and the results are validated against expert knowledge.

[hal01519710] Unsupervised learning for panel data
[...]

[hal01519707] Using SOMbrero for clustering and visualizing complex data
Over the years, the self-organizing map (SOM) algorithm has proven to be a powerful and convenient tool for clustering and visualizing data. While the original algorithm was initially designed for numerical vectors, the data available in applications have become more and more complex, frequently being too rich to be described by a fixed set of numerical attributes only. This is the case, for example, when the data are described by relations between objects (individuals involved in a social network) or by measures of resemblance/dissemblance. This presentation will illustrate how the SOM algorithm can be used to cluster and visualize complex data such as graphs, categorical time series or panel data. In particular, it will focus on the use of the R package SOMbrero, which implements an online version of the relational self-organizing map, able to process any dissimilarity data. The package offers many graphical outputs and diagnostic tools, and comes with a user-friendly web graphical interface based on R-Shiny. Several examples on various real-world datasets will be given to highlight the functionalities of the package.
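For orientation, the sketch below shows the original online SOM for numerical vectors on a one-dimensional grid of units. It is a generic textbook-style illustration in Python, not the relational variant and not SOMbrero's code or API; the function name, the linear decay schedules and all default values are assumptions.

```python
import math
import random


def train_som_1d(data, n_units, steps=2000, lr0=0.5, radius0=1.0, seed=0):
    """Online SOM on a 1-D grid of units for numerical vectors (illustrative)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each prototype at a randomly chosen observation.
    prototypes = [list(rng.choice(data)) for _ in range(n_units)]
    for t in range(steps):
        frac = 1.0 - t / steps
        lr = lr0 * frac                       # learning rate decays to 0
        radius = max(radius0 * frac, 1e-3)    # neighbourhood shrinks over time
        x = rng.choice(data)
        # Best matching unit: the prototype closest to x.
        bmu = min(
            range(n_units),
            key=lambda j: sum((x[d] - prototypes[j][d]) ** 2 for d in range(dim)),
        )
        for j in range(n_units):
            # Gaussian neighbourhood on the grid, centred on the BMU.
            h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
            for d in range(dim):
                prototypes[j][d] += lr * h * (x[d] - prototypes[j][d])
    return prototypes


if __name__ == "__main__":
    points = [(0.0,), (10.0,)] * 50
    print(train_som_1d(points, n_units=2))
```

The relational version described in the abstract replaces the vector updates with operations on a dissimilarity matrix, so that no numerical coordinates are needed.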