-
Christensen, Dennis; Stoltenberg, Emil Aas & Hjort, Nils Lid
(2024)
Theory for adaptive designs in regression
arXiv.org.
-
Lee, Adam & Mesters, Geert
(2024)
Locally robust inference for non-Gaussian linear simultaneous equations models
-
Moss, Jonas
(2024)
Measures of Agreement with Multiple Raters: Fréchet Variances and Inference
-
Foldnes, Njål; Moss, Jonas & Grønneberg, Steffen
(2024)
Improved Goodness of Fit Procedures for Structural Equation Models
-
Yang, Wei-Ting; Tamssaouet, Karim & Dauzère-Pérès, Stéphane
(2024)
Bayesian network structure learning using scatter search
-
Hoesch, Lukas; Lee, Adam & Mesters, Geert
(2024)
Locally Robust Inference for Non-Gaussian SVAR models
-
Lee, Daesoo; Malacarne, Sara & Aune, Erlend
(2024)
Explainable time series anomaly detection using masked latent generative modeling
-
Miroshnychenko, Ivan; Vocalelli, Giorgio, De Massis, Alfredo, Grassi, Stefano & Ravazzolo, Francesco
(2023)
The COVID-19 pandemic and family business performance
Abstract
This study examines the impact of the COVID-19 pandemic on corporate financial performance using a unique, cross-country, and longitudinal sample of 3350 listed firms worldwide. We find that the financial performance of family firms has been significantly higher than that of nonfamily firms during the COVID-19 pandemic, accounting for pre-pandemic business conditions. This effect is pertinent to firms with strong family involvement in management or in both management and ownership. We also identify the role of firm-, industry-, and country-level contingencies for family business financial performance during the COVID-19 pandemic. This study offers a novel understanding of the financial resilience across different types of family business and sets an agenda for future research on the drivers of resilience of family firms to adverse events. It also provides important and novel evidence for policymakers, particularly for firms with different ownership and management structures.
-
Høst, Anders Mølmen; Lison, Pierre & Moonen, Leon
(2023)
Constructing a Knowledge Graph from Textual Descriptions of Software Vulnerabilities in the National Vulnerability Database
Abstract
Knowledge graphs have shown promise for several cybersecurity tasks, such as vulnerability assessment and threat analysis. In this work, we present a new method for constructing a vulnerability knowledge graph from information in the National Vulnerability Database (NVD). Our approach combines named entity recognition (NER), relation extraction (RE), and entity prediction using a combination of neural models, heuristic rules, and knowledge graph embeddings. We demonstrate how our method helps to fix missing entities in knowledge graphs used for cybersecurity and evaluate the performance.
-
Asimakopoulos, Stylianos; Lorusso, Marco & Ravazzolo, Francesco
(2023)
A Bayesian DSGE approach to modelling cryptocurrency
Abstract
We develop and estimate a DSGE model to evaluate the economic repercussions of cryptocurrency. In our model, cryptocurrency offers an alternative currency option to government currency, with endogenous supply and demand. We uncover a substitution effect between the real balances of government currency and cryptocurrency in response to technology, preferences and monetary policy shocks. We find that an increase in cryptocurrency productivity induces a rise in the relative price of government currency with respect to cryptocurrency. Since cryptocurrency and government currency are highly substitutable, the demand for the former increases whereas it drops for the latter. Our historical decomposition analysis shows that fluctuations in the cryptocurrency price are mainly driven by shocks in cryptocurrency demand, whereas changes in the real balances for government currency are mainly attributed to government currency and cryptocurrency demand shocks.
-
Juelsrud, Ragnar Enger & Larsen, Vegard Høghaug
(2023)
Macroeconomic uncertainty and bank lending
Abstract
We investigate the impact of macro-related uncertainty on bank lending in Norway. We show that an increase in general macroeconomic uncertainty reduces bank lending. Importantly, however, we show that this effect is largely driven by monetary policy uncertainty, suggesting that uncertainty about the monetary policy stance is key for understanding why macro-related uncertainty impacts bank lending.
-
Moss, Jonas & Grønneberg, Steffen
(2023)
Partial Identification of Latent Correlations with Ordinal Data
Abstract
The polychoric correlation is a popular measure of association for ordinal data. It estimates a latent correlation, i.e., the correlation of a latent vector. This vector is assumed to be bivariate normal, an assumption that cannot always be justified. When bivariate normality does not hold, the polychoric correlation will not necessarily approximate the true latent correlation, even when the observed variables have many categories. We calculate the sets of possible values of the latent correlation when latent bivariate normality is not necessarily true, but at least the latent marginals are known. The resulting sets are called partial identification sets, and are shown to shrink to the true latent correlation as the number of categories increases. Moreover, we investigate partial identification under the additional assumption that the latent copula is symmetric, and calculate the partial identification set when one variable is ordinal and another is continuous. We show that little can be said about latent correlations, unless we have impractically many categories or we know a great deal about the distribution of the latent vector. An open-source R package is available for applying our results.
-
Foroni, Claudia; Ravazzolo, Francesco & Rossini, Luca
(2023)
Are low frequency macroeconomic variables important for high frequency electricity prices?
Abstract
Recent research finds that forecasting electricity prices is very relevant. In many applications, it might be interesting to predict daily electricity prices by using their own lags or renewable energy sources. However, the recent turmoil of energy prices and the Russian–Ukrainian war increased attention to evaluating the relevance of industrial production and the Purchasing Managers' Index output survey in forecasting the daily electricity prices. We develop a Bayesian reverse unrestricted MIDAS model which accounts for the mismatch in frequency between the daily prices and the monthly macro variables in Germany and Italy. We find that the inclusion of macroeconomic low-frequency variables is more important at short than at medium-term horizons, according to both point and density measures. In particular, accuracy increases by combining hard and soft information, while using only surveys gives less accurate forecasts than using only industrial production data.
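MIDAS-type models bridge the frequency mismatch by tying many high-frequency lag coefficients to a low-dimensional weight function, commonly the exponential Almon polynomial. The sketch below illustrates only this generic aggregation idea in pure Python, not the paper's Bayesian reverse-unrestricted specification; the lag length and theta values are illustrative.

```python
import math

def exp_almon_weights(n_lags, theta1, theta2):
    """Exponential Almon lag weights, normalised to sum to one."""
    raw = [math.exp(theta1 * k + theta2 * k ** 2) for k in range(1, n_lags + 1)]
    total = sum(raw)
    return [r / total for r in raw]

def midas_aggregate(high_freq, weights):
    """Weighted sum of the most recent high-frequency observations;
    the newest observation receives the first weight."""
    recent = list(reversed(high_freq[-len(weights):]))
    return sum(w * x for w, x in zip(weights, recent))

# 22 daily observations collapsed into one monthly regressor.
w = exp_almon_weights(22, -0.05, -0.01)
daily_prices = [float(d) for d in range(1, 31)]
x_t = midas_aggregate(daily_prices, w)
```

With negative thetas the weights decay smoothly in the lag, so recent days dominate while the whole lag profile is governed by just two parameters.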
-
Casarin, Roberto; Grassi, Stefano, Ravazzolo, Francesco & van Dijk, Herman K.
(2023)
A flexible predictive density combination for large financial data sets in regular and crisis periods
Abstract
A flexible predictive density combination is introduced for large financial data sets which allows for model set incompleteness. Dimension reduction procedures that include learning allocate the large sets of predictive densities and combination weights to relatively small subsets. Given the representation of the probability model in extended nonlinear state-space form, efficient simulation-based Bayesian inference is proposed using parallel dynamic clustering as well as nonlinear filtering, implemented on graphics processing units. The approach is applied to combine predictive densities based on a large number of individual US stock returns of daily observations over a period that includes the Covid-19 crisis period. Evidence on dynamic cluster composition, weight patterns and model set incompleteness gives valuable signals for improved modelling. This enables higher predictive accuracy and better assessment of uncertainty and risk for investment fund management.
-
Langguth, Johannes; Schroeder, Daniel Thilo, Filkukova, Petra, Brenner, Stefan, Phillips, Jesper & Pogorelov, Konstantin
(2023)
COCO: an annotated Twitter dataset of COVID-19 conspiracy theories
Abstract
The COVID-19 pandemic has been accompanied by a surge of misinformation on social media which covered a wide range of different topics and contained many competing narratives, including conspiracy theories. To study such conspiracy theories, we created a dataset of 3495 tweets with manual labeling of the stance of each tweet w.r.t. 12 different conspiracy topics. The dataset thus contains almost 42,000 labels, each of which was determined by majority among three expert annotators. The dataset was selected from COVID-19 related Twitter data spanning from January 2020 to June 2021 using a list of 54 keywords. The dataset can be used to train machine learning based classifiers for both stance and topic detection, either individually or simultaneously. BERT was used successfully for the combined task. The dataset can also be used to further study the prevalence of different conspiracy narratives. To this end we qualitatively analyze the tweets, discussing the structure of conspiracy narratives that are frequently found in the dataset. Furthermore, we illustrate the interconnection between the conspiracy categories as well as the keywords.
-
Iwaszkiewicz-Eggebrecht, Elzbieta; Ronquist, Fredrik, Łukasik, Piotr, Granqvist, Emma, Buczek, Mateusz, Prus, Monika, Kudlicka, Jan, Roslin, Tomas, Tack, Ayco J. M., Andersson, Anders F. & Miraldo, Andreia
(2023)
Optimizing insect metabarcoding using replicated mock communities
Abstract
Metabarcoding (high-throughput sequencing of marker gene amplicons) has emerged as a promising and cost-effective method for characterizing insect community samples. Yet, the methodology varies greatly among studies and its performance has not been systematically evaluated to date. In particular, it is unclear how accurately metabarcoding can resolve species communities in terms of presence-absence, abundance and biomass.
Here we use mock community experiments and a simple probabilistic model to evaluate the effect of different DNA extraction protocols on metabarcoding performance. Specifically, we ask four questions: (Q1) How consistent are the recovered community profiles across replicate mock communities?; (Q2) How does the choice of lysis buffer affect the recovery of the original community?; (Q3) How are community estimates affected by differing lysis times and homogenization? and (Q4) Is it possible to obtain adequate species abundance estimates through the use of biological spike-ins?
We show that estimates are quite variable across community replicates. In general, a mild lysis protocol is better at reconstructing species lists and approximate counts, while homogenization is better at retrieving biomass composition. Small insects are more likely to be detected in lysates, while some tough species require homogenization to be detected. Results are less consistent across biological replicates for lysates than for homogenates. Some species are associated with strong PCR amplification bias, which complicates the reconstruction of species counts. Yet, with adequate spike-in data, species abundance can be determined with roughly 40% standard error for homogenates, and with roughly 50% standard error for lysates, under ideal conditions. In the latter case, however, this often requires species-specific reference data, while spike-in data generalize better across species for homogenates.
We conclude that a non-destructive, mild lysis approach shows the highest promise for the presence/absence description of the community, while also allowing future morphological or molecular work on the material. However, homogenization protocols perform better for characterizing community composition, in particular in terms of biomass.
-
Bashiri Behmiri, Niaz; Fezzi, Carlo & Ravazzolo, Francesco
(2023)
Incorporating air temperature into mid-term electricity load forecasting models using time-series regressions and neural networks
Abstract
One of the most controversial issues in the mid-term load forecasting literature is the treatment of weather. Because of the difficulty in obtaining precise weather forecasts for a few weeks ahead, researchers have, so far, implemented three approaches: a) excluding weather from load forecasting models altogether, b) assuming future weather to be perfectly known and c) including weather forecasts in their load forecasting models. This article provides the first systematic comparison of how the different treatments of weather affect load forecasting performance. We incorporate air temperature into short- and mid-term load forecasting models, comparing time-series methods and feed-forward neural networks. Our results indicate that models including future temperature always significantly outperform models excluding temperature, at all time horizons. However, when future temperature is replaced with its prediction, these results become weaker.
-
Billé, Anna Gloria; Tomelleri, Alessio & Ravazzolo, Francesco
(2023)
Forecasting regional GDPs: a comparison with spatial dynamic panel data models
Abstract
The monitoring of the regional (provincial) economic situation is of particular importance due to the high level of heterogeneity and interdependences among different territories. Although econometric models allow for spatial and serial correlation of various kinds, the limited availability of territorial data restricts the set of relevant predictors at a more disaggregated level, especially for gross domestic product (GDP). Combining data from different sources at NUTS-3 level, this paper evaluates the predictive performance of a spatial dynamic panel data model with individual fixed effects and some relevant exogenous regressors, by using data on total gross value added (GVA) for 103 Italian provinces over the period 2000–2016. A comparison with nested panel sub-specifications as well as pure temporal autoregressive specifications has also been included. The main finding is that the spatial dynamic specification increases forecast accuracy more than its competitors throughout the out-of-sample period, recognising an important role played by both space and time. However, when temporal cointegration is detected, the random-walk specification is still to be preferred in some cases even in the presence of short panels.
-
Galdi, Giulio; Casarin, Roberto, Ferrari, Davide, Fezzi, Carlo & Ravazzolo, Francesco
(2023)
Nowcasting industrial production using linear and non-linear models of electricity demand
Abstract
This article proposes different modelling approaches which exploit electricity market data to nowcast industrial production. Our models include linear, mixed-data sampling (MIDAS), Markov-Switching (MS) and MS-MIDAS regressions. Comparisons against autoregressive approaches and other commonly used macroeconomic predictors show that electricity market data combined with an MS model significantly improve nowcasting performance, especially during turbulent economic states, such as those generated by the recent COVID-19 pandemic. The most promising results are provided by an MS model which identifies two volatility regimes. These results confirm that electricity market data provide timely and easy-to-access information for nowcasting macroeconomic variables, especially when it is most valuable, i.e. during times of crisis and uncertainty.
-
Haugsdal, Espen; Aune, Erlend & Ruocco, Massimiliano
(2023)
Persistence Initialization: a novel adaptation of the Transformer architecture for time series forecasting
Abstract
Time series forecasting is an important problem, with many real world applications. Transformer models have been successfully applied to natural language processing tasks, but have received relatively little attention for time series forecasting. Motivated by the differences between classification tasks and forecasting, we propose PI-Transformer, an adaptation of the Transformer architecture designed for time series forecasting, consisting of three parts: First, we propose a novel initialization method called Persistence Initialization, with the goal of increasing training stability of forecasting models by ensuring that the initial outputs of an untrained model are identical to the outputs of a simple baseline model. Second, we use ReZero normalization instead of Layer Normalization, in order to further tackle issues related to training stability. Third, we use Rotary positional encodings to provide a better inductive bias for forecasting. Multiple ablation studies show that the PI-Transformer is more accurate, learns faster, and scales better than regular Transformer models. Finally, PI-Transformer achieves competitive performance on the challenging M4 dataset, both when compared to the current state of the art, and to recently proposed Transformer models for time series forecasting.
-
Fronzetti Colladon, Andrea; Grippa, Francesca, Guardabascio, Barbara, Costante, Gabriele & Ravazzolo, Francesco
(2023)
Forecasting consumer confidence through semantic network analysis of online news
Abstract
This research studies the impact of online news on social and economic consumer perceptions through semantic network analysis. Using over 1.8 million online articles from Italian media covering four years, we calculate the semantic importance of specific economic-related keywords to see if words appearing in the articles could anticipate consumers' judgments about the economic situation and the Consumer Confidence Index. We use an innovative approach to analyze big textual data, combining methods and tools of text mining and social network analysis. Results show a strong predictive power for the judgments about the current households and national situation. Our indicator offers a complementary approach to estimating consumer confidence, lessening the limitations of traditional survey-based methods.
-
Andrade Mancisidor, Rogelio; Kampffmeyer, Michael Christian, Aas, Kjersti & Jenssen, Robert
(2023)
Discriminative multimodal learning via conditional priors in generative models
Abstract
Deep generative models with latent variables have been used lately to learn joint representations and generative processes from multi-modal data, which depict an object from different viewpoints. These two learning mechanisms can, however, conflict with each other and representations can fail to embed information on the data modalities. This research studies the realistic scenario in which all modalities and class labels are available for model training, e.g. images or handwriting, but where some modalities and labels required for downstream tasks are missing, e.g. text or annotations. We show, in this scenario, that the variational lower bound limits mutual information between joint representations and missing modalities. To counteract these problems, we introduce a novel conditional multi-modal discriminative model that uses an informative prior distribution and optimizes a likelihood-free objective function that maximizes mutual information between joint representations and missing modalities. Extensive experimentation demonstrates the benefits of our proposed model; empirical results show that our model achieves state-of-the-art results in representative problems such as downstream classification, acoustic inversion, and image and annotation generation.
-
Larsen, Vegard Høghaug; Maffei-Faccioli, Nicolo & Pagenhardt, Laura
(2023)
Where do they care? The ECB in the media and inflation expectations
Abstract
This paper examines how news coverage of the European Central Bank (ECB) affects consumer inflation expectations in the four largest euro area countries. Utilizing a unique dataset of multilingual European news articles, we measure the impact of ECB-related inflation news on inflation expectations. Our results indicate that German and Italian consumers are more attentive to this news, whereas in Spain and France, we observe no significant response. The research underscores the role of national media in disseminating ECB messages and the diverse reactions among consumers in different euro area countries.
-
Al-Bataineh, Omar; Moonen, Leon & Vidziunas, Linas
(2023)
Extending the range of bugs that automated program repair can handle
-
Durante, F.; Gatto, A. & Ravazzolo, Francesco
(2023)
Understanding relationships with the Aggregate Zonal Imbalance using copulas
-
Grishina, Anastasiia; Hort, Max & Moonen, Leon
(2023)
The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification
-
Malik, Sehrish; Naqvi, Moeen & Moonen, Leon
(2023)
CHESS: A Framework for Evaluation of Self-adaptive Systems based on Chaos Engineering
-
Liventsev, Vadim; Grishina, Anastasiia, Härmä, Aki & Moonen, Leon
(2023)
Fully Autonomous Programming with Large Language Models
-
Moss, Jonas
(2023)
Measuring Agreement Using Guessing Models and Knowledge Coefficients
Abstract
Several measures of agreement, such as the Perreault–Leigh coefficient, the AC1, and the recent coefficient of van Oest, are based on explicit models of how judges make their ratings. To handle such measures of agreement under a common umbrella, we propose a class of models called guessing models, which contains most models of how judges make their ratings. Every guessing model has an associated measure of agreement we call the knowledge coefficient. Under certain assumptions on the guessing models, the knowledge coefficient will be equal to the multi-rater Cohen's kappa, Fleiss' kappa, the Brennan–Prediger coefficient, or other less-established measures of agreement. We provide several sample estimators of the knowledge coefficient, valid under varying assumptions, and their asymptotic distributions. After a sensitivity analysis and a simulation study of confidence intervals, we find that the Brennan–Prediger coefficient typically outperforms the others, with much better coverage under unfavorable circumstances.
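The Brennan–Prediger coefficient named in the abstract has a particularly simple form: observed pairwise agreement corrected by a fixed chance level of 1/q, where q is the number of categories. A minimal pure-Python sketch of that coefficient (an illustration of the classical formula, not the paper's knowledge-coefficient estimators):

```python
def pairwise_agreement(ratings, categories):
    """Mean proportion of agreeing rater pairs per rated item."""
    total = 0.0
    for item in ratings:  # item = the ratings one subject received
        n = len(item)
        agree = sum(item.count(c) * (item.count(c) - 1) for c in categories)
        total += agree / (n * (n - 1))
    return total / len(ratings)

def brennan_prediger(ratings, categories):
    """Observed agreement corrected by a fixed 1/q chance level."""
    q = len(categories)
    pa = pairwise_agreement(ratings, categories)
    return (pa - 1.0 / q) / (1.0 - 1.0 / q)

# Three raters, two items, perfect agreement -> coefficient 1.0.
perfect = brennan_prediger([[1, 1, 1], [2, 2, 2]], categories=[1, 2])
```

Because the chance term is a constant 1/q rather than an estimate of the raters' marginal distributions, the coefficient avoids the base-rate sensitivity that affects kappa-type statistics.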
-
Baltodano López, Ovielt; Bulfone, Giacomo, Casarin, Roberto & Ravazzolo, Francesco
(2023)
Modeling Corporate CDS Spreads Using Markov Switching Regressions
-
Lee, Daesoo; Malacarne, Sara & Aune, Erlend
(2023)
Vector Quantized Time Series Generation with a Bidirectional Prior Model
Proceedings of Machine Learning Research (PMLR), 206, pp. 7665–7693.
Abstract
Time series generation (TSG) studies have mainly focused on the use of Generative Adversarial Networks (GANs) combined with recurrent neural network (RNN) variants. However, the fundamental limitations and challenges of training GANs still remain. In addition, the RNN family typically has difficulties with temporal consistency between distant timesteps. Motivated by the successes in the image generation (IMG) domain, we propose TimeVQVAE, the first work, to our knowledge, that uses vector quantization (VQ) techniques to address the TSG problem. Moreover, the priors of the discrete latent spaces are learned with bidirectional transformer models that can better capture global temporal consistency. We also propose VQ modeling in a time-frequency domain, separated into low-frequency (LF) and high-frequency (HF) components. This allows us to retain important characteristics of the time series and, in turn, generate new synthetic signals of better quality, with sharper changes in modularity, than those of competing TSG methods. Our experimental evaluation is conducted on all datasets from the UCR archive, using well-established metrics from the IMG literature, such as the Fréchet inception distance and inception score.
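The vector quantization step the abstract refers to is, at its core, a nearest-neighbour lookup into a learned codebook: each continuous latent vector is replaced by the closest codebook entry and its index. A minimal sketch of that lookup alone (toy codebook, no learning; not the TimeVQVAE implementation):

```python
def quantize(latents, codebook):
    """Map each latent vector to (index, entry) of its nearest
    codebook vector under squared Euclidean distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    out = []
    for vec in latents:
        idx = min(range(len(codebook)), key=lambda i: sqdist(vec, codebook[i]))
        out.append((idx, codebook[idx]))
    return out

# Two latent vectors snapped onto a two-entry codebook.
codes = quantize([[0.1, 0.2], [0.9, 1.2]], [[0.0, 0.0], [1.0, 1.0]])
```

The resulting index sequence is discrete, which is what makes it possible to model the latent space with a (bidirectional) transformer prior, as the abstract describes.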
-
Lee, Daesoo; Ovanger, Oscar, Eidsvik, Jo, Aune, Erlend, Skauvold, Jacob & Hauge, Ragnar
(2023)
Latent Diffusion Model for Conditional Reservoir Facies Generation
arXiv.
https://arxiv.org/pdf/2311.01968.pdf
-
Hort, Max; Grishina, Anastasiia & Moonen, Leon
(2023)
An Exploratory Literature Study on Sharing and Energy Use of Language Models for Source Code
-
Stoltenberg, Emil Aas; Mykland, Per A. & Zhang, Lan
(2022)
A CLT for second difference estimators with an application to volatility and intensity
Abstract
In this paper, we introduce a general method for estimating the quadratic covariation of one or more spot parameter processes associated with continuous time semimartingales, and present a central limit theorem that has this class of estimators as one of its applications. The class of estimators we introduce, which we call Two-Scales Quadratic Covariation (TSQC) estimators, is based on sums of increments of second differences of the observed processes, and the intervals over which the differences are computed are rolling and overlapping. This latter feature lets us take full advantage of the data, and, by sufficiency considerations, the resulting estimators ought to outperform those based on only one partition of the observational window. Moreover, a two-scales approach is employed to deal with asymptotic bias terms in a systematic manner, thus automatically giving consistent estimators without having to work out the form of the bias term on a case-by-case basis. We highlight the versatility of our central limit theorem by applying it to a novel leverage effect estimator that does not belong to the class of TSQC estimators. The principal empirical motivation for the present study is that the discrete times at which a continuous time semimartingale is observed might depend on features of the observable process other than its level, such as its spot-volatility process. As an application of the TSQC estimators, we therefore show how they may be used to estimate the quadratic covariation between the spot-volatility process and the intensity process of the observation times, when both of these are taken to be semimartingales. The finite sample properties of this estimator are studied by way of a simulation experiment, and we also apply this estimator in an empirical analysis of the Apple stock. Our analysis indicates a rather strong correlation between the spot-volatility process of the log-price process and the times at which the stock is traded and hence observed.
-
Fronzetti Colladon, Andrea; Grassi, Stefano, Ravazzolo, Francesco & Violante, Francesco
(2022)
Forecasting financial markets with semantic network analysis in the COVID-19 crisis
Abstract
This paper uses a new textual data index for predicting stock market data. The index is applied to a large set of news to evaluate the importance of one or more general economic-related keywords appearing in the text. The index assesses the importance of the economic-related keywords, based on their frequency of use and semantic network position. We apply it to the Italian press and construct indices to predict Italian stock and bond market returns and volatilities in a recent sample period, including the COVID-19 crisis. The evidence shows that the index captures the different phases of financial time series well. Moreover, results indicate strong evidence of predictability for bond market data, both returns and volatilities, short and long maturities, and stock market volatility.
-
Huber, Andreas; Schröder, Daniel Thilo, Pogorelov, Konstantin, Griwodz, Carsten & Langguth, Johannes
(2022)
A Streaming System for Large-scale Temporal Graph Mining of Reddit Data
-
Lundén, Daniel; Öhman, Joey, Kudlicka, Jan, Senderov, Viktor, Ronquist, Fredrik & Broman, David
(2022)
Compiling Universal Probabilistic Programming Languages with Efficient Parallel Sequential Monte Carlo Inference
Abstract
Probabilistic programming languages (PPLs) allow users to encode arbitrary inference problems, and PPL implementations provide general-purpose automatic inference for these problems. However, constructing inference implementations that are efficient enough is challenging for many real-world problems. Often, this is due to PPLs not fully exploiting available parallelization and optimization opportunities. For example, handling probabilistic checkpoints in PPLs through continuation-passing style transformations or non-preemptive multitasking—as is done in many popular PPLs—often disallows compilation to low-level languages required for high-performance platforms such as GPUs. To solve the checkpoint problem, we introduce the concept of PPL control-flow graphs (PCFGs)—a simple and efficient approach to checkpoints in low-level languages. We use this approach to implement RootPPL: a low-level PPL built on CUDA and C++ with OpenMP, providing highly efficient and massively parallel SMC inference. We also introduce a general method of compiling universal high-level PPLs to PCFGs and illustrate its application when compiling Miking CorePPL—a high-level universal PPL—to RootPPL. The approach is the first to compile a universal PPL to GPUs with SMC inference. We evaluate RootPPL and the CorePPL compiler through a set of real-world experiments in the domains of phylogenetics and epidemiology, demonstrating up to 6× speedups over state-of-the-art PPLs implementing SMC inference.
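The SMC inference that RootPPL parallelizes follows the standard bootstrap scheme: propagate every particle through the model, weight it by the observation likelihood, then resample. A minimal sequential pure-Python sketch of that loop (a toy illustration of generic bootstrap SMC, not the paper's CUDA/OpenMP implementation; the model below is a made-up random walk):

```python
import math
import random

def smc(init, transition, log_weight, observations, n_particles=500):
    """Bootstrap SMC: propagate particles, weight by the observation
    log-likelihood, then resample in proportion to the weights."""
    particles = [init() for _ in range(n_particles)]
    for y in observations:
        particles = [transition(p) for p in particles]
        logw = [log_weight(p, y) for p in particles]
        m = max(logw)  # subtract the max to stabilise the exponentials
        w = [math.exp(lw - m) for lw in logw]
        particles = random.choices(particles, weights=w, k=n_particles)
    return particles

# Toy model: latent random walk observed with Gaussian noise (sd 0.5).
random.seed(0)
posterior = smc(
    init=lambda: random.gauss(0.0, 5.0),
    transition=lambda x: x + random.gauss(0.0, 0.5),
    log_weight=lambda x, y: -(x - y) ** 2 / (2 * 0.5 ** 2),
    observations=[5.0] * 10,
)
estimate = sum(posterior) / len(posterior)
```

Each particle's propagation and weighting is independent of the others, which is exactly what makes the weighting step embarrassingly parallel on GPUs; the checkpoint problem the abstract discusses arises because the resampling step must synchronize all particles.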
-
Gianfreda, Angelica; Ravazzolo, Francesco & Rossini, Luca
(2022)
Large Time-Varying Volatility Models for Hourly Electricity Prices
Abstract
We study the importance of time-varying volatility in modelling hourly electricity prices when fundamental drivers are included in the estimation. This allows us to contribute to the literature of large Bayesian VARs by using well-known time series models in a large dimension for the matrix of coefficients. Based on novel Bayesian techniques, we exploit the importance of both Gaussian and non-Gaussian error terms in stochastic volatility. We find that using regressors such as fuel prices, forecasted demand and forecasted renewable energy is essential to properly capture the volatility of these prices. Moreover, we show that the time-varying volatility models outperform the constant volatility models in both the in-sample model fit and the out-of-sample forecasting performance.
-
Ivanovska, Magdalena & Slavkovik, Marija
(2022)
Probabilistic Judgement Aggregation by Opinion Update
Abstract
We consider a situation where agents are updating their probabilistic opinions on a set of issues with respect to the confidence they have in each other’s judgements. We adapt the framework for reaching a consensus introduced in [2] and modified in [1] to our case of uncertain probabilistic judgements on logically related issues. We discuss possible alternative solutions for the instances where the requirements for reaching a consensus are not satisfied.
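Consensus frameworks of the kind the abstract cites typically iterate a trust-weighted averaging update (the DeGroot model is the classic example). A minimal sketch of that style of update, using made-up trust weights and without the logical constraints the paper adds:

```python
def consensus_update(opinions, trust, n_rounds=50):
    """Each round, every agent replaces its opinion with a trust-weighted
    average of all agents' opinions (each row of `trust` sums to one)."""
    n = len(opinions)
    for _ in range(n_rounds):
        opinions = [sum(trust[i][j] * opinions[j] for j in range(n))
                    for i in range(n)]
    return opinions

# Two agents with asymmetric trust converge to a common value.
consensus = consensus_update([0.9, 0.1], [[0.5, 0.5], [0.3, 0.7]])
```

Under mild conditions on the trust matrix (e.g. irreducibility and aperiodicity of the corresponding Markov chain) all opinions converge to the same limit; the abstract's point is that with logically related issues and uncertain probabilistic judgements, those conditions can fail, so alternative solutions are needed.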
-
Straume, Hans-Martin; Asche, Frank, Oglend, Atle, Abrahamsen, Eirik Bjorheim, Birkenbach, Anna M., Langguth, Johannes, Lanquepin, Guillaume & Roll, Kristin Helen
(2022)
Impacts of Covid-19 on Norwegian salmon exports: A firm-level analysis
Abstract
A rapidly growing literature investigates how the recent Covid-19 pandemic has affected international seafood trade along multiple dimensions, creating opportunities as well as challenges. This suggests that many of the impacts of the Covid measures are subtle and require disaggregated data to allow the impacts in different supply chains to be teased out. In aggregate, Norwegian salmon exports have not been significantly impacted by Covid-related measures. Using firm-level data to all export destinations to examine the effects of lockdowns in different destination countries in 2020, we show that the Covid-related lockdown measures significantly impacted trade patterns for four product forms of salmon. The results also illustrate how the Covid measures create opportunities, as increased stringency of the measures increased trade for two of the product forms. We also find significant differences among firms' responses, with large firms with larger trade networks reacting more strongly to the Covid measures. The limited overall impacts and the significant dynamics at the firm level clearly show the resiliency of the salmon supply chains.
-
Yang, Wei-Ting; Reis, Marco, Borodin, Valeria, Juge, Michel & Roussy, Agnès
(2022)
An interpretable unsupervised Bayesian network model for fault detection and diagnosis
Abstract
Process monitoring is a critical activity in manufacturing industries. A wide variety of data-driven approaches have been developed and employed for fault detection and fault diagnosis. In existing process monitoring schemes, prediction accuracy of the process status is usually the primary focus, while the explanation (diagnosis) of a detected fault is relegated to a secondary role. In this paper, an interpretable unsupervised machine learning model based on Bayesian Networks (BN) is proposed as the fundamental model supporting the process monitoring scheme. The proposed methodology is aligned with the recent efforts of eXplainable Artificial Intelligence (XAI) for knowledge induction and decision making, now brought to the scope of advanced process monitoring. A BN is capable of combining data-driven induction with existing domain knowledge about the process and of displaying the underlying causal interactions of a process system in an easily interpretable graphical form. The proposed fault detection scheme consists of two levels of monitoring. In the first level, a global index is computed and monitored to detect any deviation from normal operating conditions. In the second level, two local indices are proposed to examine the fine structure of the fault, once it is signaled at the first level. These local indices support the diagnosis of the fault, and are based on the individual unconditional and conditional distributions of the monitored variables. A new labeling procedure is also proposed to narrow down the search and identify the fault type. Unlike many existing diagnosis methods that require access to faulty data (supervised diagnosis methods), the proposed diagnosis methodology belongs to the class that only requires data under normal conditions (unsupervised diagnosis methods). The effectiveness of the proposed monitoring scheme is demonstrated and validated through simulated datasets and an industrial dataset from semiconductor manufacturing.
-
Hougen, Conrad D.; Kaplan, Lance M., Ivanovska, Magdalena, Cerutti, Federico, Mishra, Kumar Vijay & Hero III, Alfred O.
(2022)
SOLBP: Second-Order Loopy Belief Propagation for Inference in Uncertain Bayesian Networks
Abstract
In second-order uncertain Bayesian networks, the conditional probabilities are only known within distributions, i.e., probabilities over probabilities. The delta-method has been applied to extend exact first-order inference methods to propagate both means and variances through sum-product networks derived from Bayesian networks, thereby characterizing epistemic uncertainty, or the uncertainty in the model itself. Alternatively, second-order belief propagation has been demonstrated for polytrees but not for general directed acyclic graph structures. In this work, we extend Loopy Belief Propagation to the setting of second-order Bayesian networks, giving rise to Second-Order Loopy Belief Propagation (SOLBP). For second-order Bayesian networks, SOLBP generates inferences consistent with those generated by sum-product networks, while being more computationally efficient and scalable.
-
Iacopini, Matteo; Ravazzolo, Francesco & Rossini, Luca
(2022)
Proper Scoring Rules for Evaluating Density Forecasts with Asymmetric Loss Functions
Abstract
This article proposes a novel asymmetric continuous probabilistic score (ACPS) for evaluating and comparing density forecasts. A weighted version of the score is also defined, which emphasizes regions of interest, such as the tails or the center of a variable’s range. The (weighted) ACPS extends the symmetric (weighted) CRPS by allowing for asymmetries in the preferences underlying the scoring rule. A test is used to statistically compare the predictive ability of different forecasts. The ACPS is of general use in any situation where the decision-maker has asymmetric preferences in the evaluation of the forecasts. In an artificial experiment, the implications of varying the level of asymmetry in the ACPS are illustrated. Then, the proposed score and test are applied to assess and compare density forecasts of macroeconomically relevant datasets (U.S. employment growth) and of commodity prices (oil and electricity prices), with particular focus on the recent COVID-19 crisis period.
-
Durante, Fabrizio; Gianfreda, Angelica, Ravazzolo, Francesco & Rossini, Luca
(2022)
A multivariate dependence analysis for electricity prices, demand and renewable energy sources
Abstract
This paper examines the dependence between electricity prices, demand, and renewable energy sources by means of a multivariate copula model, focusing on Germany, the most widely studied market in Europe. The inter-dependencies are investigated in depth and monitored over time, with particular emphasis on the tail behavior. To this end, suitable tail dependence measures are introduced to take into account a multivariate extreme scenario appropriately identified through Kendall’s distribution function. The empirical evidence demonstrates a strong association between electricity prices, renewable energy sources, and demand within a day and over the studied years. Hence, this analysis provides guidance for further and different incentives for promoting green energy generation while considering the time-varying dependencies of the involved variables.
-
Avesani, Diego; Zanfei, Ariele, Di Marco, Nicola, Galletti, Andrea, Ravazzolo, Francesco, Righetti, Maurizio & Majone, Bruno
(2022)
Short-term hydropower optimization driven by innovative time-adapting econometric model
Abstract
The ongoing transformation of the electricity market has reshaped the hydropower production paradigm for storage reservoir systems, with a shift from strategies oriented towards maximizing regional energy production to strategies aimed at the revenue maximization of individual systems. Indeed, hydropower producers bid their energy production schedule one day in advance, attempting to align the operational plan with the hours where the expected electricity prices are higher. As a result, the accuracy of day-ahead price forecasts has started to play a key role in the short-term optimization of storage reservoir systems. This paper aims to contribute to the topic by presenting a comparative assessment of revenues provided by short-term optimizations driven by two econometric models. Both models are autoregressive time-adapting hourly forecasting models, which exploit the information provided by past values of electricity prices, with one model, referred to as Autoarimax, additionally considering exogenous variables related to electricity demand and production. The benefit of using the innovative Autoarimax model is exemplified in two selected hydropower systems with different storage capacities. The enhanced accuracy of electricity price forecasting is not constant across the year due to the large uncertainties characterizing the electricity market. Our results also show that the adoption of Autoarimax leads to larger revenues than the use of a standard model, with increases that depend strongly on the hydropower system characteristics. Our results may be beneficial for hydropower companies seeking to enhance the expected revenues from storage hydropower systems, especially those characterized by large storage capacity.
-
Moss, Jonas
(2022)
Infinite diameter confidence sets in Hedges’ publication bias model
Abstract
Meta-analysis, the statistical analysis of results from separate studies, is a fundamental building block of science. But the assumptions of classical meta-analysis models are not satisfied whenever publication bias is present, which causes inconsistent parameter estimates. Hedges’ selection function model takes publication bias into account, but estimating and inferring with this model is tough for some datasets. Using a generalized Gleser–Hwang theorem, we show there is no confidence set of guaranteed finite diameter for the parameters of Hedges’ selection model. This result provides a partial explanation for why inference with Hedges’ selection model is fraught with difficulties.
-
Andrade Mancisidor, Rogelio; Kampffmeyer, Michael, Aas, Kjersti & Jenssen, Robert
(2022)
Generating customer's credit behavior with deep generative models
-
Billé, Anna Gloria; Gianfreda, Angelica, Del Grosso, Filippo & Ravazzolo, Francesco
(2022)
Forecasting electricity prices with expert, linear, and nonlinear models
Abstract
This paper compares several models for forecasting regional hourly day-ahead electricity prices, while accounting for fundamental drivers. Forecasts of demand, in-feed from renewable energy sources, fossil fuel prices, and physical flows are all included in linear and nonlinear specifications, ranging over the class of ARFIMA-GARCH models, hence including parsimonious autoregressive specifications (known as expert-type models). The results support the adoption of a simple structure that is able to adapt to market conditions. Indeed, we include forecasted demand, wind and solar power, actual generation from hydro, biomass, and waste, weighted imports, and traditional fossil fuels. Including these exogenous regressors, in both the conditional mean and variance equations, improves point and, especially, density forecasting performance when the superior set of models is considered. Indeed, using the model confidence set and considering northern Italian prices, predictions indicate the strong predictive power of the regressors, in particular in an expert model augmented with GARCH-type time-varying volatility. Finally, we find that using professional and more timely predictions of consumption and renewable energy sources improves the forecast accuracy of electricity prices more than using predictions publicly available to researchers.
-
Langguth, Johannes; Filkukova, Petra, Brenner, Stefan, Schroeder, Daniel Thilo & Pogorelov, Konstantin
(2022)
COVID-19 and 5G conspiracy theories: long term observation of a digital wildfire
-
Langguth, Johannes; Tumanis, Aigar & Azad, Ariful
(2022)
Incremental Clustering Algorithms for Massive Dynamic Graphs
Abstract
We consider the problem of incremental graph clustering where the graph to be clustered is given as a sequence of disjoint subsets of the edge set. The problem appears when dealing with graphs that are created over time, such as online social networks where new users appear continuously, or protein interaction networks when new proteins are discovered. For very large graphs, it is computationally too expensive to repeatedly apply standard clustering algorithms. Instead, algorithms whose time complexity only depends on the size of the incoming subset of edges in every step are needed. At the same time, such algorithms should find clusterings whose quality is close to that produced by offline algorithms. In this paper, we discuss the computational model and present an incremental clustering algorithm. We test the algorithm performance and quality on a wide variety of instances. Our results show that the algorithm far outperforms offline algorithms while retaining a large fraction of their clustering quality.
-
Yazidi, Anis; Ivanovska, Magdalena, Zennaro, Fabio Massimo, Lind, Pedro & Viedma, Enrique Herrera
(2021)
A new decision making model based on Rank Centrality for GDM with fuzzy preference relations
Abstract
Preference aggregation in Group Decision Making (GDM) is a substantial problem that has received a lot of research attention. Decision problems involving fuzzy preference relations constitute an important class within GDM. Legacy approaches dealing with the latter type of problems can be classified into indirect approaches, which involve deriving a group preference matrix as an intermediate step, and direct approaches, which deduce a group preference ranking based on individual preference rankings. Although the work on indirect approaches has been extensive in the literature, there is still a scarcity of research dealing with direct approaches. In this paper we present a direct approach towards aggregating several fuzzy preference relations on a set of alternatives into a single weighted ranking of the alternatives. By mapping the pairwise preferences into transition probabilities, we are able to derive a preference ranking from the stationary distribution of a stochastic matrix. Interestingly, the ranking of the alternatives obtained with our method corresponds to the optimizer of the Maximum Likelihood Estimation of a particular Bradley-Terry-Luce model. Furthermore, we perform a theoretical sensitivity analysis of the proposed method supported by experimental results and illustrate our approach towards GDM with a concrete numerical example. This work opens avenues for solving GDM problems using elements of probability theory, and thus provides a sound theoretical foundation as well as a plausible statistical interpretation for the aggregation of expert opinions in GDM.
-
Caporin, Massimiliano; Gupta, Rangan & Ravazzolo, Francesco
(2021)
Contagion between real estate and financial markets: A Bayesian quantile-on-quantile approach
Abstract
We study contagion between Real Estate Investment Trusts (REITs) and the equity market in the U.S. over four sub-samples covering January, 2003 to December, 2017, by using Bayesian nonparametric quantile-on-quantile (QQ) regressions with heteroskedasticity. We find that the spillovers from the REITs onto the equity market have varied over time and across the quantiles defining the states of these two markets in the four sub-samples, thus providing evidence of shift-contagion. Further, contagion from REITs on the stock market went up particularly during the global financial crisis, and also over the period corresponding to the European sovereign debt crisis, relative to the pre-crisis period. Our main findings are robust to alternative model specifications of the benchmark Bayesian QQ model, especially when we control for omitted variable bias using the heteroskedastic error structure. Our results have important implications for various agents in the economy, namely academics, investors and policymakers.
-
Agudze, Komla M.; Billio, Monica, Casarin, Roberto & Ravazzolo, Francesco
(2021)
Markov switching panel with endogenous synchronization effects
Abstract
This paper introduces a new dynamic panel model with multi-layer network effects. Series-specific latent Markov chain processes drive the dynamics of the observable processes, and several types of interaction effects among the hidden chains allow for various degrees of endogenous synchronization of both latent and observable processes. The interaction is driven by a multi-layer network with exogenous and endogenous connectivity layers. We provide some theoretical properties of the model, develop a Bayesian inference framework and an efficient Markov Chain Monte Carlo algorithm for estimating parameters, latent states, and endogenous network layers. An application to the US-state coincident indicators shows that the synchronization in the US economy is generated by network effects among the states. The inclusion of a multi-layer network provides a new tool for measuring the effects of the public policies that impact the connectivity between the US states, such as mobility restrictions or job support schemes. The proposed new model and the related inference are general and may find application in a wide spectrum of datasets where the extraction of endogenous interaction effects is relevant and of interest.
-
Ferrari, Davide; Ravazzolo, Francesco & Vespignani, Joaquin
(2021)
Forecasting energy commodity prices: A large global dataset sparse approach
Abstract
This paper focuses on forecasting quarterly nominal global energy prices of commodities, such as oil, gas and coal, using the Global VAR dataset proposed by Mohaddes and Raissi (2018). This dataset includes a number of potentially informative quarterly macroeconomic variables for the 33 largest economies, overall accounting for more than 80% of the global GDP. To deal with the information in this large database, we apply dynamic factor models based on a penalized maximum likelihood approach that allows shrinking parameters to zero and estimating sparse factor loadings. The estimated latent factors show considerable sparsity and heterogeneity in the selected loadings across variables. When the model is extended to predict energy commodity prices up to four periods ahead, results indicate larger predictability relative to the benchmark random walk model at the 1-quarter horizon for all energy commodities and up to 4 quarters ahead for gas prices. Our model also provides forecasts superior to machine learning techniques, such as elastic net, LASSO and random forest, applied to the same database.
-
Burchard, Luk; Cai, Xing & Langguth, Johannes
(2021)
iPUG for Multiple Graphcore IPUs: Optimizing Performance and Scalability of Parallel Breadth-First Search
-
Hjort, Nils Lid & Stoltenberg, Emil Aas
(2021)
The partly parametric and partly nonparametric additive risk model
Abstract
Aalen’s linear hazard rate regression model is a useful and increasingly popular alternative to Cox’ multiplicative hazard rate model. It postulates that an individual has hazard rate function h(s) = z_1 α_1(s) + ⋯ + z_r α_r(s) in terms of his covariate values z_1, …, z_r. These are typically levels of various hazard factors, and may also be time-dependent. The hazard factor functions α_j(s) are the parameters of the model and are estimated from data. This is traditionally accomplished in a fully nonparametric way. This paper develops methodology for estimating the hazard factor functions when some of them are modelled parametrically while the others are left unspecified. Large-sample results are reached inside this partly parametric, partly nonparametric framework, which also enables us to assess the goodness of fit of the model’s parametric components. In addition, these results are used to pinpoint how much precision is gained, using the parametric-nonparametric model, over the standard nonparametric method. A real-data application is included, along with a brief simulation study.
-
Burchard, Luk; Moe, Johannes Sellæg, Schroeder, Daniel Thilo, Pogorelov, Konstantin & Langguth, Johannes
(2021)
iPUG: Accelerating Breadth-First Graph Traversals Using Manycore Graphcore IPUs
Abstract
The Graphcore Intelligence Processing Unit (IPU) is a newly developed processor type whose architecture does not rely on the traditional caching hierarchies. Developed to meet the need for more and more data-centric applications, such as machine learning, IPUs combine a dedicated portion of SRAM with each of its numerous cores, resulting in high memory bandwidth at the price of capacity. The proximity of processor cores and memory makes the IPU a promising field of experimentation for graph algorithms since it is the unpredictable, irregular memory accesses that lead to performance losses in traditional processors with pre-caching.
This paper aims to test the IPU’s suitability for algorithms with hard-to-predict memory accesses by implementing a breadth-first search (BFS) that complies with the Graph500 specifications. Precisely because of its apparent simplicity, BFS is an established benchmark that is not only a subroutine for a variety of more complex graph algorithms, but also allows comparability across a wide range of architectures.
We benchmark our IPU code on a wide range of instances and compare its performance to state-of-the-art CPU and GPU codes. The results indicate that the IPU delivers speedups of up to 4× over the fastest competing result on an NVIDIA V100 GPU, with typical speedups of about 1.5× on most test instances.
-
Ravazzolo, Francesco & Vespignani, Joaquin
(2020)
World steel production: A new monthly indicator of global real economic activity
-
Vassøy, Bjørnar; Ruocco, Massimiliano, de Souza da Silva, Eliezer & Aune, Erlend
(2019)
Time is of the essence: A joint Hierarchical RNN and Point Process model for time and item predictions
-
Limongi Concetto, Chiara & Ravazzolo, Francesco
(2019)
Optimism in Financial Markets: Stock Market Returns and Investor Sentiments
-
Yang, Wei-Ting; Blue, Jakey, Roussy, Agnès, Reis, Marco & Pinaton, Jacques
(2018)
Virtual metrology modeling based on gaussian bayesian network
-
Bassetti, Federico; Casarin, Roberto & Ravazzolo, Francesco
(2018)
Bayesian Nonparametric Calibration and Combination of Predictive Distributions
-
Clark, Todd E. & Ravazzolo, Francesco
(2015)
Macroeconomic Forecasting Performance under Alternative Specifications of Time-Varying Volatility