Can quantum physics inspire new machine learning methods? In the preprint « An Explainable Probabilistic Classifier for Categorical Data Inspired to Quantum Physics », we show how an analogy with the superposition of states in quantum physics can be successfully exploited in machine learning, opening the door to future research in this direction. The project has been awarded Google Cloud research credits.

The research project

In quantum physics, the superposition principle is the idea that a system exists in all of its possible states at the same time until it is measured. For instance, in Schrödinger's thought experiment, a cat may be considered simultaneously alive and dead [1]. The superposition principle explains the « quantum weirdness » observed in many real-world experiments. A classic example is the double-slit experiment [2], in which two slits in a barrier allow particles such as electrons to pass through. The result is an interference pattern that classical mechanics does not predict [3].
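The difference between the quantum and the classical prediction in the double-slit experiment can be sketched numerically. The snippet below is a minimal illustration, not part of the preprint: the function names and the geometry (slit separation, screen distance, wavelength) are hypothetical values chosen for readability. In the quantum case the complex amplitudes of the two paths are added before squaring, which produces bright and dark fringes; in the classical case the two intensities are simply added, which gives a flat pattern.

```python
import cmath
import math


def path_amplitude(path_length: float, wavelength: float) -> complex:
    """Complex amplitude of a unit-strength wave after travelling path_length."""
    return cmath.exp(2j * math.pi * path_length / wavelength)


def intensities(x, slit_separation=1.0, screen_distance=100.0, wavelength=0.1):
    """Return (quantum, classical) relative intensities at screen position x.

    All default values are illustrative assumptions, in arbitrary units.
    """
    # Distance from each slit to the point x on the screen.
    r1 = math.hypot(screen_distance, x - slit_separation / 2)
    r2 = math.hypot(screen_distance, x + slit_separation / 2)
    a1 = path_amplitude(r1, wavelength)
    a2 = path_amplitude(r2, wavelength)
    quantum = abs(a1 + a2) ** 2              # amplitudes add, then square
    classical = abs(a1) ** 2 + abs(a2) ** 2  # intensities add directly
    return quantum, classical


if __name__ == "__main__":
    for x in (0.0, 2.5, 5.0, 7.5, 10.0):
        q, c = intensities(x)
        print(f"x = {x:5.1f}  quantum = {q:.3f}  classical = {c:.3f}")
```

With these assumed values, the quantum intensity oscillates between roughly 0 and 4 as x varies, while the classical sum stays at 2 everywhere.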

We argue that the quantum paradigm can be successfully exploited in machine learning. For instance, in natural language processing and text classification, a text may be regarded as a superposition of words. By regarding a generic data instance as a superposition of features, we develop a novel supervised classification algorithm for categorical data. In the preprint, we show that this methodology has a wide range of desirable properties not available in most other machine learning methods and achieves state-of-the-art performance compared to both standard classifiers and deep learning. Assessing the scalability of the algorithm on very large data (on the order of terabytes) requires specialized hardware and software that were not available to us. For this reason, we applied for the Google Cloud research credits.
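To make the idea of "an instance as a superposition of features" more concrete, here is a toy sketch. It is not the algorithm from the preprint: the functions `fit` and `predict_proba`, the choice of amplitudes, and the scoring rule are all assumptions made for illustration. Each categorical row is treated as an equal superposition of its (column, value) features, each class is given an amplitude per feature equal to the square root of that feature's relative frequency within the class, and the class score is the squared overlap between the two, in the spirit of the Born rule.

```python
from collections import Counter, defaultdict
from math import sqrt


def fit(instances, labels):
    """Count how often each (column, value) feature occurs within each class."""
    counts = defaultdict(Counter)  # counts[label][feature]
    for row, label in zip(instances, labels):
        for feature in enumerate(row):  # feature = (column index, value)
            counts[label][feature] += 1
    return counts


def predict_proba(counts, row):
    """Toy score: squared overlap between the instance and each class state."""
    features = list(enumerate(row))
    scores = {}
    for label, feature_counts in counts.items():
        total = sum(feature_counts.values())
        # Instance state: amplitude 1/sqrt(m) on each of its m features.
        # Class state: amplitude sqrt(relative frequency) on each feature
        # (zero for features never seen with this class).
        overlap = sum(sqrt(feature_counts[f] / total) for f in features)
        overlap /= sqrt(len(features))
        scores[label] = overlap ** 2
    norm = sum(scores.values())
    if norm == 0:  # no feature seen in any class: fall back to uniform
        return {label: 1 / len(scores) for label in scores}
    return {label: score / norm for label, score in scores.items()}


# Hypothetical toy data: (weather, temperature) -> activity
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
y = ["out", "out", "in", "in"]
model = fit(X, y)
probabilities = predict_proba(model, ("sunny", "cool"))
```

In this toy setting, each feature's term in the overlap sum shows how much it contributes to a class score, which gives a simple form of explanation; for the row above, "out" receives a higher probability than "in".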

The Google Cloud research credits

The Google Cloud research credits program is an academic research grant « giving access to computing power that will make the next big thing possible ». In particular, we applied for access to an NVIDIA A100 GPU, which offers over 2 terabytes per second (TB/s) of memory bandwidth to run the largest models and datasets. The algorithm is implemented in SQL, and we are now working to parallelize the SQL queries using PG-Strom, a PostgreSQL extension that accelerates SQL workloads by offloading the processing of large datasets to the GPU. All results will be made publicly available as part of the open source software we are developing.

See Guidotti Emanuele/Ferrara Alfio, An Explainable Probabilistic Classifier for Categorical Data Inspired to Quantum Physics, 2021. The paper was presented at the XXXII IUPAP Conference on Computational Physics, parallel session on Machine Learning and Algorithms [4].

Author(s) of this contribution:

PhD student in Finance at the University of Neuchâtel. Emanuele is a partner at Algo Finance Sagl, a software start-up developing financial algorithms for the asset management industry. He is passionate about interdisciplinary research at the intersection of finance, data science, and statistics.