FHE processing tends to use a significant amount of resources because of its matrix-matrix computations. We have found that offloading matrix-matrix multiplication to a GPU allows FHE processing to complete in a relatively timely manner.

Making this work with our FHE implementations is a bit tricky. Because we use FFLAS-FFPACK to perform matrix operations over a finite field, simply dropping in a GPU library does not work. We worked around this by using a special library that implements an xgemm function, which runs the matrix-matrix multiplication on the GPU. This requires keeping the values of the matrix within the modulus of the finite field. The same approach extends easily to different setups that run multiple matrix-matrix computations.
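
The modulus requirement can be illustrated with a minimal sketch. It is not our implementation and does not use the FFLAS-FFPACK or GPU library APIs: `gemm_float` and `gemm_mod` are hypothetical names, and `gemm_float` is a plain CPU stand-in for the floating-point multiply that an xgemm-style routine would run on the GPU. The sketch assumes the field is the integers modulo a prime `p`, and that entries are reduced into `[0, p)` before the multiply so that every intermediate sum stays exactly representable in a double.

```python
# Hypothetical sketch: these names are illustrative and are not taken from
# FFLAS-FFPACK or from any GPU library.

def gemm_float(a, b):
    """Stand-in for a floating-point GEMM (the step a GPU would perform)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for t in range(inner):
                acc += a[i][t] * b[t][j]
            c[i][j] = acc
    return c


def gemm_mod(a, b, p):
    """Multiply integer matrices a and b over the integers modulo p."""
    inner = len(b)
    # Each output entry is a sum of `inner` products, each at most (p-1)^2.
    # A double represents integers exactly only below 2^53, so refuse
    # moduli that could push an intermediate sum past that limit.
    if inner * (p - 1) ** 2 >= 2 ** 53:
        raise ValueError("modulus too large for exact floating-point GEMM")
    # Reduce the inputs into [0, p) before handing them to the float GEMM.
    a_f = [[float(x % p) for x in row] for row in a]
    b_f = [[float(x % p) for x in row] for row in b]
    c_f = gemm_float(a_f, b_f)
    # Reduce the exact floating-point results back into the field.
    return [[int(x) % p for x in row] for row in c_f]


print(gemm_mod([[1, 2], [3, 4]], [[5, 6], [7, 8]], 7))  # prints [[5, 1], [1, 1]]
```

The size check is the important part of the sketch: if the inputs are not kept within the modulus, or the modulus is too large for the inner dimension, the floating-point sums round silently and the reduced result is wrong.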

