Tensor canonical polyadic decomposition (CPD) has recently emerged as a promising mathematical tool for multidimensional data analytics. Traditionally, the alternating least-squares method has been the workhorse for tensor CPD, but it requires the tensor rank to be known in advance. A probabilistic approach overcomes this limitation by making tensor rank determination an integral part of the CPD process. However, the existing probabilistic tensor CPD method is derived for batch-mode operation, meaning that it must process the whole dataset at once, which makes it unsuitable for large datasets. To enable tensor CPD in the massive-data regime, this paper introduces stochastic optimization into probabilistic tensor CPD, yielding a scalable algorithm that processes only one mini-batch of data at a time. Numerical studies on synthetic data and real-world applications demonstrate that the proposed scalable tensor CPD algorithm performs almost identically to the corresponding batch-mode algorithm while saving a significant amount of computation time.
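To make the rank requirement of alternating least squares concrete, the following is a minimal NumPy sketch of plain ALS for a three-way tensor. It is not the probabilistic or stochastic algorithm proposed in the paper. The function names (`khatri_rao`, `cpd_als`, `reconstruct`), the random initialization, and the fixed iteration count are illustrative assumptions. Note that `rank` must be supplied by the caller, and that every update touches the full tensor, which is the batch-mode cost the abstract refers to.

```python
import numpy as np


def khatri_rao(X, Y):
    """Column-wise Kronecker product: (I, R) and (J, R) -> (I*J, R)."""
    R = X.shape[1]
    return np.einsum("ir,jr->ijr", X, Y).reshape(-1, R)


def reconstruct(A, B, C):
    """Rebuild the tensor sum_r A[:, r] o B[:, r] o C[:, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def cpd_als(T, rank, n_iter=50, seed=0):
    """Plain ALS for a 3-way CPD; `rank` has to be chosen in advance."""
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Mode-n unfoldings whose column order matches khatri_rao above.
    T1 = T.reshape(I, J * K)
    T2 = np.moveaxis(T, 1, 0).reshape(J, I * K)
    T3 = np.moveaxis(T, 2, 0).reshape(K, I * J)
    for _ in range(n_iter):
        # Each step is an exact least-squares solve for one factor,
        # holding the other two fixed, and uses the whole tensor.
        A = T1 @ np.linalg.pinv(khatri_rao(B, C)).T
        B = T2 @ np.linalg.pinv(khatri_rao(A, C)).T
        C = T3 @ np.linalg.pinv(khatri_rao(A, B)).T
    return A, B, C


if __name__ == "__main__":
    # Synthetic rank-2 tensor (illustrative data, not from the paper).
    rng = np.random.default_rng(1)
    true_factors = [rng.standard_normal((n, 2)) for n in (6, 5, 4)]
    T = reconstruct(*true_factors)
    A, B, C = cpd_als(T, rank=2)
    rel_err = np.linalg.norm(T - reconstruct(A, B, C)) / np.linalg.norm(T)
    print(f"relative reconstruction error: {rel_err:.3e}")
```

If `rank` is set too low the fit degrades, and if it is set too high the factors can overfit noise, which is why a method that infers the rank during decomposition is attractive.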
All Science Journal Classification (ASJC) codes
- Signal Processing
- Electrical and Electronic Engineering
Keywords
- Large-scale tensor decomposition
- automatic rank determination
- scalable algorithm
- variational inference