Statistics and Data Science Colloquium of the Pontificia Universidad Católica de Chile
The Department of Statistics of the Pontificia Universidad Católica de Chile has one of the largest and most distinguished faculties among Chilean and Latin American universities. To strengthen regional ties among researchers in statistics and related fields, the department seeks to organize a local seminar that offers a close look at the work of regional researchers, as well as at the current research problems of UC academics, with the aim of building bridges for future collaborations.
Daira Velandia. Universidad de Valparaíso. Estimation methods for a Gaussian process under fixed domain asymptotics. Room 2. Abstract: This talk addresses some inference tools for Gaussian random fields under the increasing domain and fixed domain asymptotic approaches. First, concepts and previous results are presented. Then, we discuss the results obtained from studying some extensions of the problem of estimating covariance parameters under these two asymptotic approaches.
Manuel González. Universidad de la Frontera. Regularization Methods Applied to Chemometrics Problems. Room 2.
Christian Caamaño. Universidad del Bio-Bio. A flexible Clayton-like spatial copula with application to bounded support data. Room 2. Abstract: The Gaussian copula is a powerful tool that has been widely used to model spatially and/or temporally correlated data with arbitrary marginal distributions. However, this kind of model can be too restrictive, since it expresses a reflection symmetric dependence. In this work, we propose a new spatial copula model that yields random fields with arbitrary marginal distributions and a type of dependence that may or may not be reflection symmetric. In particular, we propose a new random field with uniform marginal distribution that can be viewed as a spatial generalization of the classical Clayton copula model. It is obtained through a power transformation of a specific instance of a beta random field, which in turn is obtained by transforming two independent Gamma random fields. For the proposed random field, we study the second-order properties and provide analytic expressions for the bivariate distribution and its correlation. In the reflection symmetric case, we also study the associated geometrical properties. As an application of the proposed model, we focus on spatial modeling of data with bounded support, specifically on spatial regression models with beta-type marginal distributions. In a simulation study, we investigate the use of the weighted pairwise composite likelihood method for estimating this model. Finally, the effectiveness of our methodology is illustrated by analyzing point-referenced vegetation index data, using the Gaussian copula as a benchmark. Our developments have been implemented in an open-source package for the R statistical environment.
Keywords: Archimedean Copula, Beta random fields, Composite likelihood, Reflection Asymmetry.
Felipe Osorio. UTFSM. Robust estimation in generalized linear models based on maximum Lq-likelihood procedure. Multipurpose Room, 1st floor, Felipe Villanueva Building. Abstract: In this talk we propose a procedure for robust estimation in generalized linear models based on the maximum Lq-likelihood method. Alongside this, we present an estimation algorithm that is a natural extension of the usual iteratively weighted least squares method for generalized linear models. We discuss the asymptotic distribution of the proposed estimator and a set of statistics for testing linear hypotheses, which make it possible to define standardized residuals using the mean-shift outlier model. In addition, robust versions of the deviance function and the Akaike information criterion are defined to provide tools for model selection. Finally, the performance of the proposed methodology is illustrated through a simulation study and the analysis of a real dataset.
Hamdi Raissi. Pontificia Universidad Católica de Valparaíso. Analysis of stocks with time-varying illiquidity levels. Multipurpose Room, 1st floor/2. Abstract: The first and higher order serial correlations of illiquid stocks' price changes are studied, allowing for unconditional heteroscedasticity and a time-varying zero-return probability. The dependence structure of the categorical trade/no trade sequence is also studied. Depending on the setup, we investigate how the dependence measures can be adapted to deliver an accurate representation of the serial correlations of price changes. We shed some light on the properties of the different tools by means of Monte Carlo experiments. The theoretical arguments are illustrated using shares from the Chilean stock market and the intraday returns of the Facebook stock.
2023-11-03, 15:00 hrs.
Ramsés Mena. Universidad Nacional Autónoma de México. Random probability measures via dependent stick-breaking priors. Multipurpose Room, 1st floor/2. Abstract: I will present a general class of stick-breaking processes with either exchangeable or Markovian length variables. This class generalizes well-known Bayesian nonparametric priors in an unexplored direction. An appealing feature of this new family of nonparametric priors is that we can modulate the stochastic ordering of the weights and recover Dirichlet and Geometric priors as extreme cases. A general formula for the distribution of the latent allocation variables is derived, and an MCMC algorithm is proposed for density estimation.
Alfredo Alegria. Universidad Técnica Federico Santa María. Simulation Algorithms and Covariance Modeling for Random Fields on Spheres. Multipurpose Room, 2nd floor, Felipe Villanueva Building. Abstract: Random fields on spheres play a fundamental role in various natural sciences. This presentation addresses two key aspects in an integrated manner: simulation algorithms and covariance modeling for random fields defined on the d-dimensional unit sphere. We introduce a simulation algorithm inspired by the spectral turning bands method used in Euclidean spaces. This algorithm efficiently generates Gaussian random fields on the sphere using Gegenbauer waves. We also explore the modeling of the covariance function, focusing on the challenges of modeling global data on the Earth's surface. The conventional Matérn family of isotropic covariance functions, although widely used, faces limitations when modeling smooth data on the sphere because of restrictions on the smoothness parameter. To address this, we propose a new family of isotropic covariance functions tailored to spherical random fields. This flexible family introduces a parameter that governs mean-square differentiability and allows for a variety of fractal dimensions. The presentation will highlight the practical implications of these advances through simulation experiments and applications to real data.
2023-09-29, 11:00 hrs.
Fernando Quintana. Pontificia Universidad Católica de Chile. Childhood obesity in Singapore: A Bayesian nonparametric approach. Room 1. Abstract: Overweight and obesity in adults are known to be associated with increased risk of metabolic and cardiovascular diseases. Obesity has now reached epidemic proportions, increasingly affecting children. Therefore, it is important to understand whether this condition persists from early life to childhood and whether different patterns can be detected to inform intervention policies. Our motivating application is a study of temporal patterns of obesity in children from Southeast Asia. Our main focus is on clustering obesity patterns after adjusting for the effect of baseline information. Specifically, we consider a joint model for height and weight over time. Measurements are taken every six months from birth. To allow for data-driven clustering of trajectories, we assume a vector autoregressive sampling model with a dependent logit stick-breaking prior. Simulation studies show that the proposed model captures overall growth patterns well compared with other alternatives. We also fit the model to the motivating dataset and discuss the results, highlighting in particular cluster differences and their interpretation.
Pedro Ramos. Pontificia Universidad Católica de Chile. A generalized closed-form maximum likelihood estimator. Multipurpose Room, 1st floor/2. Abstract: The maximum likelihood estimator plays a fundamental role in statistics. However, for many models, the estimators do not have closed-form expressions. This limitation can be significant in situations where estimates and predictions need to be computed in real time, such as in applications based on embedded technology, in which numerical methods cannot be implemented. Here we provide a generalization of the maximum likelihood estimator that allows us to obtain the estimators in closed form under some conditions. Under mild conditions, the estimator is invariant under one-to-one transformations, strongly consistent, and asymptotically normal. The proposed generalized version of the maximum likelihood estimator is illustrated on the Gamma, Nakagami, and Beta distributions and compared with the standard maximum likelihood estimator.
2023-09-01, 12:30 hrs.
Jonathan Acosta Salazar. Pontificia Universidad Católica de Chile. Assessing the Estimation of Nearly Singular Covariance Matrices for Modeling Spatial Variables. Multipurpose Room, 1st floor/2, Facultad de Matemáticas. Abstract: Spatial analysis commonly relies on the estimation of a covariance matrix associated with a random field. This estimation strongly impacts prediction at locations where the process has not been observed, which in turn influences the construction of more sophisticated models. If some of the distances between pairs of observations in the plane are small, then we may have an ill-conditioned problem that results in a nearly singular covariance matrix. In this paper, we suggest a covariance matrix estimation method that works well even when there are very close pairs of locations in the plane. Our method extends to the spatial case a method based on the estimation of eigenvalues of the unitary matrix decomposition of the covariance matrix. Several numerical examples are conducted to provide evidence of good performance in estimating the range parameter of the correlation structure of a spatial regression process. In addition, an application to macroalgae estimation in a restricted area of the Pacific Ocean is developed to determine a suitable estimate of the effective sample size associated with the transect sampling scheme.
Riccardo Corradin. University of Nottingham. A journey through model-based clustering with intractable distributions. Room 2. Abstract: Model-based clustering represents one of the fundamental procedures in a statistician's toolbox. Within the model-based clustering framework, we consider the case where the kernel distribution of nonparametric mixture models is available only up to an intractable normalizing constant, a setting in which most of the commonly used Markov chain Monte Carlo methods fail to provide posterior inference. To overcome this problem, we propose an approximate Bayesian computational strategy, whereby we approximate the posterior to avoid the intractability of the kernel. By exploiting the structure of the nonparametric prior, our proposal combines the use of predictive distributions as a proposal with transport maps to obtain an efficient and flexible sampling strategy. Further, we illustrate how the specification of our proposal can be relaxed by introducing an adaptive scheme on the degree of approximation of the posterior distribution. Empirical evidence from simulation studies shows that our proposal outperforms its main competitors in terms of computational time while preserving comparable accuracy of the estimates.
Ricardo Cunha Pedroso. Universidade Federal de Minas Gerais. Multipartition model for multiple change point identification. Room 1. Abstract: The product partition model (PPM) is widely used for detecting multiple change points. Because changes in different parameters may occur at different times, the PPM fails to identify which parameters experienced the changes. To address this limitation, we introduce a multipartition model to detect multiple change points occurring in several parameters. It assumes that the changes experienced by each parameter generate a different random partition along the time axis, which facilitates identifying which parameters changed and when they did so. We apply our model to detect multiple change points in Normal means and variances. Simulations and data illustrations show that the proposed model is competitive and enriches the analysis of change point problems.
Victor Hugo Lachos Davila. Department of Statistics, University of Connecticut. Lasso regularization for censored regression and high-dimensional predictors. Room 1. Abstract: The censored regression model, also known as the Tobit model, is designed to estimate linear relationships between variables in cases where the dependent variable is either left- or right-censored. In this study, we propose a heuristic expectation–maximization (EM) algorithm for handling censored regression models with Lasso regularization for variable selection and for accommodating high-dimensional predictors. The procedure is computationally efficient, and we describe how it can be easily implemented using available R packages. The proposed methods are assessed using simulated data and two real datasets, in cases where p is less than or equal to n and where p is greater than n.
Cristóbal Guzmán. PUC. Advances in Differentially Private Stochastic Saddle Points. Room 2.
Matteo Gianella. Dipartimento di Matematica, Politecnico di Milano. (Gaussian) graphical models: two examples in Bayesian Statistics. Room 2. Abstract: Graphical models are a powerful tool when the aim is to study dependencies between random variables. In this talk, we give two examples of their use under the Bayesian paradigm. The first is motivated by the analysis of spectrometric data. We introduce a Gaussian graphical model for learning the dependence structure among frequency bands of the infrared absorbance spectrum. The spectra are modeled as continuous functional data through a B-spline basis expansion, while a Gaussian graphical model is assumed as a prior specification for the smoothing coefficients to induce sparsity in the associated precision matrix. The second example focuses on areal data, where the neighbouring structure is modeled via an undirected graph. We extend the model in Beraha et al. (2021) to jointly perform density estimation and boundary detection.
Francisco Cuevas. Universidad Técnica Federico Santa María. Modeling the intensity function of a spatial point pattern using a log-convolution model. Room 2, Rolando Chuaqui. Abstract:
In point pattern analysis, estimating the first-order intensity function is very important, which is why several approaches have been adopted. In this work, motivated by data on ocular saccades, we introduce a model for the logarithm of the intensity based on the convolution between a covariate and a function to be estimated, β(·). Using Fourier analysis, we show that the problem of estimating β(·) is equivalent to estimating infinitely many parameters. After truncation, we propose a penalized estimation method to solve this problem.