A highly efficient and modular implementation of Gaussian Processes in PyTorch


GPyTorch is a Gaussian process library implemented using PyTorch. GPyTorch is designed for creating scalable, flexible, and modular Gaussian process models with ease.

Internally, GPyTorch differs from many existing approaches to GP inference by performing all inference operations using modern numerical linear algebra techniques like preconditioned conjugate gradients. Implementing a scalable GP method is as simple as providing a matrix multiplication routine with the kernel matrix and its derivative via our LazyTensor interface, or by composing many of our already existing LazyTensors. This allows not only for easy implementation of popular scalable GP techniques, but often also for significantly improved utilization of GPU computing compared to solvers based on the Cholesky decomposition.
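To illustrate why a matrix multiplication routine is enough, here is a minimal sketch of plain (unpreconditioned) conjugate gradients in pure Python. The names `conjugate_gradients` and `kernel_matvec` are illustrative and not part of GPyTorch's API; GPyTorch's own solver is preconditioned and batched, which this sketch does not attempt.

```python
def conjugate_gradients(matvec, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A, given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)  # current search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# A small SPD "kernel matrix" that the solver only touches through matvec.
K = [[4.0, 1.0], [1.0, 3.0]]


def kernel_matvec(v):
    return [sum(k_ij * v_j for k_ij, v_j in zip(row, v)) for row in K]


solution = conjugate_gradients(kernel_matvec, [1.0, 2.0])
```

The solver never indexes into `K` directly, so any structured kernel that can compute `K @ v` cheaply can be dropped in without changing the solver.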

GPyTorch provides (1) significant GPU acceleration (through MVM based inference); (2) state-of-the-art implementations of the latest algorithmic advances for scalability and flexibility (SKI/KISS-GP, stochastic Lanczos expansions, LOVE, SKIP, stochastic variational deep kernel learning, ...); (3) easy integration with deep learning frameworks.

Examples, Tutorials, and Documentation

See our numerous examples and tutorials on how to construct all sorts of models in GPyTorch.



Requirements:

  • Python >= 3.6
  • PyTorch >= 1.7

Install GPyTorch using pip or conda:

pip install gpytorch
conda install gpytorch -c gpytorch

(To use packages globally but install GPyTorch as a user-only package, use pip install --user above.)

Latest (unstable) version

To upgrade to the latest (unstable) version, run

pip install --upgrade git+https://github.com/cornellius-gp/gpytorch.git

ArchLinux Package

Note: Experimental AUR package. For most users, we recommend installation by conda or pip.

GPyTorch is also available on the ArchLinux User Repository (AUR). You can install it with an AUR helper, like yay, as follows:

yay -S python-gpytorch

To discuss any issues related to this AUR package refer to the comments section of python-gpytorch.

Citing Us

If you use GPyTorch, please cite the following papers:

Gardner, Jacob R., Geoff Pleiss, David Bindel, Kilian Q. Weinberger, and Andrew Gordon Wilson. "GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration." In Advances in Neural Information Processing Systems (2018).

@inproceedings{gardner2018gpytorch,
  title={GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration},
  author={Gardner, Jacob R and Pleiss, Geoff and Bindel, David and Weinberger, Kilian Q and Wilson, Andrew Gordon},
  booktitle={Advances in Neural Information Processing Systems},
  year={2018}
}


To run the unit tests:

python -m unittest

By default, the random seeds are locked down for some of the tests. If you want to run the tests without locking down the seed, run

UNLOCK_SEED=true python -m unittest

If you plan on submitting a pull request, please make use of our pre-commit hooks to ensure that your commits adhere to the general style guidelines enforced by the repo. To do this, navigate to your local repository and run:

pip install pre-commit
pre-commit install

From then on, this will automatically run flake8, isort, black and other tools over the files you commit each time you commit to gpytorch or a fork of it.

The Team

GPyTorch is primarily maintained by:

We would like to thank our other contributors including (but not limited to) David Arbour, Eytan Bakshy, David Eriksson, Jared Frank, Sam Stanton, Bram Wallace, Ke Alexander Wang, Ruihan Wu.


Development of GPyTorch is supported by funding from the Bill and Melinda Gates Foundation, the National Science Foundation, and SAP.

  • Add priors [WIP]

    This is an early attempt at adding priors. Lots of callsites in the code aren't updated yet, so this will fail spectacularly.

    The main thing we need to figure out is how to properly do the optimization using standard gpytorch optimizers that don't support bounds. We should probably modify the smoothed uniform prior so it has full support and is differentiable everywhere but decays rapidly outside the given bounds. Does this sound reasonable?
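One possible shape for such a prior, sketched under assumptions: flat inside the bounds, with Gaussian tails of width `sigma` outside. The function name and the `sigma` parameter are hypothetical, the density is left unnormalized, and this is not the implementation from this PR.

```python
def smoothed_uniform_log_prob(x, lower, upper, sigma=0.01):
    """Unnormalized log-density: 0 inside [lower, upper], Gaussian decay outside."""
    center = 0.5 * (lower + upper)
    radius = 0.5 * (upper - lower)
    # Distance by which x lies outside the box (zero when x is inside).
    excess = max(abs(x - center) - radius, 0.0)
    return -0.5 * (excess / sigma) ** 2


inside = smoothed_uniform_log_prob(0.3, -1.0, 1.0)
just_outside = smoothed_uniform_log_prob(1.02, -1.0, 1.0)
```

The log-density and its first derivative are both continuous at the bounds, so an unconstrained optimizer sees a gradient that pushes parameters back toward the box instead of hitting a hard wall.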

  • Using batch-GP for learning single common GP over multiple experiments

    Howdy folks,

    Reading the docs, I understand that batch-GP is meant to learn k independent GPs, from k independent labels y over a common data set x.

    y1 = f1(x), y2 = f2(x), ..., yk = fk(x) , for k independent GPs.

    But how would one go about using batch-GP to learn a single common GP, from k independent experiments of the same underlying process?

    y1=f(x1), y2 = f(x2), ..., yk=f(xk) for one and the same GP

    For instance, I have k sets of data and labels (y) representing measurements of how the temperature changes over altitude (x) (e.g. from weather balloons launched at k different geographical locations), and I want to induce a GP prior that represents the temperature change over altitude between mean sea level and some maximum altitude, marginalized over all geographical areas.

    Thanks in advance


  • Ensure compatibility with breaking changes in pytorch master branch

    This is a run of the simple_gp_regression example notebook on the current alpha_release branch. Running kissgp_gp_regression_cuda yields similar errors.

    import math
    import torch
    import gpytorch
    from matplotlib import pyplot as plt
    %matplotlib inline
    %load_ext autoreload
    %autoreload 2
    from torch.autograd import Variable
    # Training data is 11 points in [0,1] inclusive regularly spaced
    train_x = Variable(torch.linspace(0, 1, 11))
    # True function is sin(2*pi*x) with Gaussian noise N(0,0.04)
    train_y = Variable(torch.sin(train_x.data * (2 * math.pi)) + torch.randn(train_x.size()) * 0.2)
    from torch import optim
    from gpytorch.kernels import RBFKernel
    from gpytorch.means import ConstantMean
    from gpytorch.likelihoods import GaussianLikelihood
    from gpytorch.random_variables import GaussianRandomVariable
    # We will use the simplest form of GP model, exact inference
    class ExactGPModel(gpytorch.models.ExactGP):
        def __init__(self, train_x, train_y, likelihood):
            super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
            # Our mean function is constant in the interval [-1,1]
            self.mean_module = ConstantMean(constant_bounds=(-1, 1))
            # We use the RBF kernel as a universal approximator
            self.covar_module = RBFKernel(log_lengthscale_bounds=(-5, 5))
        def forward(self, x):
            mean_x = self.mean_module(x)
            covar_x = self.covar_module(x)
            # Return model output as GaussianRandomVariable
            return GaussianRandomVariable(mean_x, covar_x)
    # initialize likelihood and model
    likelihood = GaussianLikelihood(log_noise_bounds=(-5, 5))
    model = ExactGPModel(train_x.data, train_y.data, likelihood)
    # Find optimal model hyperparameters
    # Use adam optimizer on model and likelihood parameters
    optimizer = optim.Adam(list(model.parameters()) + list(likelihood.parameters()), lr=0.1)
    optimizer.n_iter = 0
    training_iter = 50
    for i in range(training_iter):
        # Zero gradients from previous iteration
        optimizer.zero_grad()
        # Output from model
        output = model(train_x)
        # Calc loss and backprop gradients
        loss = -model.marginal_log_likelihood(likelihood, output, train_y)
        loss.backward()
        optimizer.n_iter += 1
        print('Iter %d/%d - Loss: %.3f   log_lengthscale: %.3f   log_noise: %.3f' % (
            i + 1, training_iter, loss.data[0],
            model.covar_module.log_lengthscale.data[0, 0],
            likelihood.log_noise.data[0],
        ))
        optimizer.step()

    TypeError                                 Traceback (most recent call last)
    <ipython-input-8-bdcf88774fd0> in <module>()
         14     output = model(train_x)
         15     # Calc loss and backprop gradients
    ---> 16     loss = -model.marginal_log_likelihood(likelihood, output, train_y)
         17     loss.backward()
         18     optimizer.n_iter += 1
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/models/exact_gp.py in marginal_log_likelihood(self, likelihood, output, target, n_data)
         43             raise RuntimeError('You must train on the training targets!')
    ---> 45         mean, covar = likelihood(output).representation()
         46         n_data = target.size(-1)
         47         return gpytorch.exact_gp_marginal_log_likelihood(covar, target - mean).div(n_data)
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/module.py in __call__(self, *inputs, **kwargs)
        158                 raise RuntimeError('Input must be a RandomVariable or Variable, was a %s' %
        159                                    input.__class__.__name__)
    --> 160         outputs = self.forward(*inputs, **kwargs)
        161         if isinstance(outputs, Variable) or isinstance(outputs, RandomVariable) or isinstance(outputs, LazyVariable):
        162             return outputs
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/likelihoods/gaussian_likelihood.py in forward(self, input)
         14         assert(isinstance(input, GaussianRandomVariable))
         15         mean, covar = input.representation()
    ---> 16         noise = gpytorch.add_diag(covar, self.log_noise.exp())
         17         return GaussianRandomVariable(mean, noise)
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/__init__.py in add_diag(input, diag)
         36         return input.add_diag(diag)
         37     else:
    ---> 38         return _add_diag(input, diag)
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/functions/__init__.py in add_diag(input, diag)
         18                        component added.
         19     """
    ---> 20     return AddDiag()(input, diag)
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/functions/add_diag.py in forward(self, input, diag)
         12         if input.ndimension() == 3:
         13             diag_mat = diag_mat.unsqueeze(0).expand_as(input)
    ---> 14         return diag_mat.mul_(val).add_(input)
         16     def backward(self, grad_output):
    TypeError: mul_ received an invalid combination of arguments - got (Variable), but expected one of:
     * (float value)
          didn't match because some of the arguments have invalid types: (!Variable!)
     * (torch.FloatTensor other)
          didn't match because some of the arguments have invalid types: (!Variable!)
  • import gpytorch error

    $ sudo python setup.py install [sudo] password for ubuntu: running install running bdist_egg running egg_info writing dependency_links to gpytorch.egg-info/dependency_links.txt writing top-level names to gpytorch.egg-info/top_level.txt writing requirements to gpytorch.egg-info/requires.txt writing gpytorch.egg-info/PKG-INFO reading manifest file 'gpytorch.egg-info/SOURCES.txt' writing manifest file 'gpytorch.egg-info/SOURCES.txt' installing library code to build/bdist.linux-x86_64/egg running install_lib running build_py copying gpytorch/libfft/init.py -> build/lib.linux-x86_64-3.5/gpytorch/libfft running build_ext generating cffi module 'build/temp.linux-x86_64-3.5/gpytorch.libfft._libfft.c' already up-to-date creating build/bdist.linux-x86_64/egg creating build/bdist.linux-x86_64/egg/gpytorch creating build/bdist.linux-x86_64/egg/gpytorch/means copying build/lib.linux-x86_64-3.5/gpytorch/means/init.py -> build/bdist.linux-x86_64/egg/gpytorch/means copying build/lib.linux-x86_64-3.5/gpytorch/means/mean.py -> build/bdist.linux-x86_64/egg/gpytorch/means copying build/lib.linux-x86_64-3.5/gpytorch/means/constant_mean.py -> build/bdist.linux-x86_64/egg/gpytorch/means copying build/lib.linux-x86_64-3.5/gpytorch/gp_model.py -> build/bdist.linux-x86_64/egg/gpytorch copying build/lib.linux-x86_64-3.5/gpytorch/init.py -> build/bdist.linux-x86_64/egg/gpytorch creating build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/init.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/constant_random_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/independent_random_variables.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/samples_random_variable.py -> 
build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/gaussian_random_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/batch_random_variables.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/random_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/categorical_random_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/bernoulli_random_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables creating build/bdist.linux-x86_64/egg/gpytorch/likelihoods copying build/lib.linux-x86_64-3.5/gpytorch/likelihoods/init.py -> build/bdist.linux-x86_64/egg/gpytorch/likelihoods copying build/lib.linux-x86_64-3.5/gpytorch/likelihoods/likelihood.py -> build/bdist.linux-x86_64/egg/gpytorch/likelihoods copying build/lib.linux-x86_64-3.5/gpytorch/likelihoods/gaussian_likelihood.py -> build/bdist.linux-x86_64/egg/gpytorch/likelihoods copying build/lib.linux-x86_64-3.5/gpytorch/likelihoods/bernoulli_likelihood.py -> build/bdist.linux-x86_64/egg/gpytorch/likelihoods creating build/bdist.linux-x86_64/egg/gpytorch/lazy copying build/lib.linux-x86_64-3.5/gpytorch/lazy/init.py -> build/bdist.linux-x86_64/egg/gpytorch/lazy copying build/lib.linux-x86_64-3.5/gpytorch/lazy/kronecker_product_lazy_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/lazy copying build/lib.linux-x86_64-3.5/gpytorch/lazy/toeplitz_lazy_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/lazy copying build/lib.linux-x86_64-3.5/gpytorch/lazy/lazy_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/lazy copying build/lib.linux-x86_64-3.5/gpytorch/module.py -> build/bdist.linux-x86_64/egg/gpytorch creating 
build/bdist.linux-x86_64/egg/gpytorch/inference copying build/lib.linux-x86_64-3.5/gpytorch/inference/init.py -> build/bdist.linux-x86_64/egg/gpytorch/inference creating build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models copying build/lib.linux-x86_64-3.5/gpytorch/inference/posterior_models/init.py -> build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models copying build/lib.linux-x86_64-3.5/gpytorch/inference/posterior_models/gp_posterior.py -> build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models copying build/lib.linux-x86_64-3.5/gpytorch/inference/posterior_models/exact_gp_posterior.py -> build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models copying build/lib.linux-x86_64-3.5/gpytorch/inference/posterior_models/variational_gp_posterior.py -> build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models copying build/lib.linux-x86_64-3.5/gpytorch/inference/inference.py -> build/bdist.linux-x86_64/egg/gpytorch/inference creating build/bdist.linux-x86_64/egg/gpytorch/functions copying build/lib.linux-x86_64-3.5/gpytorch/functions/init.py -> build/bdist.linux-x86_64/egg/gpytorch/functions copying build/lib.linux-x86_64-3.5/gpytorch/functions/log_normal_cdf.py -> build/bdist.linux-x86_64/egg/gpytorch/functions copying build/lib.linux-x86_64-3.5/gpytorch/functions/normal_cdf.py -> build/bdist.linux-x86_64/egg/gpytorch/functions copying build/lib.linux-x86_64-3.5/gpytorch/functions/dsmm.py -> build/bdist.linux-x86_64/egg/gpytorch/functions copying build/lib.linux-x86_64-3.5/gpytorch/functions/add_diag.py -> build/bdist.linux-x86_64/egg/gpytorch/functions creating build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/toeplitz.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/interpolation.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/init.py -> 
build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/lincg.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/fft.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/lanczos_quadrature.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/function_factory.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/kronecker_product.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/circulant.py -> build/bdist.linux-x86_64/egg/gpytorch/utils creating build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/init.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/grid_interpolation_kernel.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/kernel.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/rbf_kernel.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/spectral_mixture_kernel.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/index_kernel.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels creating build/bdist.linux-x86_64/egg/gpytorch/libfft copying build/lib.linux-x86_64-3.5/gpytorch/libfft/init.py -> build/bdist.linux-x86_64/egg/gpytorch/libfft copying build/lib.linux-x86_64-3.5/gpytorch/libfft/_libfft.abi3.so -> build/bdist.linux-x86_64/egg/gpytorch/libfft byte-compiling build/bdist.linux-x86_64/egg/gpytorch/means/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/means/mean.py to mean.cpython-35.pyc byte-compiling 
build/bdist.linux-x86_64/egg/gpytorch/means/constant_mean.py to constant_mean.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/gp_model.py to gp_model.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/constant_random_variable.py to constant_random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/independent_random_variables.py to independent_random_variables.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/samples_random_variable.py to samples_random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/gaussian_random_variable.py to gaussian_random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/batch_random_variables.py to batch_random_variables.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/random_variable.py to random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/categorical_random_variable.py to categorical_random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/bernoulli_random_variable.py to bernoulli_random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/likelihoods/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/likelihoods/likelihood.py to likelihood.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/likelihoods/gaussian_likelihood.py to gaussian_likelihood.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/likelihoods/bernoulli_likelihood.py to bernoulli_likelihood.cpython-35.pyc byte-compiling 
build/bdist.linux-x86_64/egg/gpytorch/lazy/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/lazy/kronecker_product_lazy_variable.py to kronecker_product_lazy_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/lazy/toeplitz_lazy_variable.py to toeplitz_lazy_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/lazy/lazy_variable.py to lazy_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/module.py to module.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models/gp_posterior.py to gp_posterior.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models/exact_gp_posterior.py to exact_gp_posterior.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models/variational_gp_posterior.py to variational_gp_posterior.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/inference.py to inference.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/functions/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/functions/log_normal_cdf.py to log_normal_cdf.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/functions/normal_cdf.py to normal_cdf.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/functions/dsmm.py to dsmm.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/functions/add_diag.py to add_diag.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/toeplitz.py to toeplitz.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/interpolation.py to interpolation.cpython-35.pyc byte-compiling 
build/bdist.linux-x86_64/egg/gpytorch/utils/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/lincg.py to lincg.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/fft.py to fft.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/lanczos_quadrature.py to lanczos_quadrature.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/function_factory.py to function_factory.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/kronecker_product.py to kronecker_product.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/circulant.py to circulant.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/grid_interpolation_kernel.py to grid_interpolation_kernel.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/kernel.py to kernel.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/rbf_kernel.py to rbf_kernel.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/spectral_mixture_kernel.py to spectral_mixture_kernel.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/index_kernel.py to index_kernel.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/libfft/init.py to init.cpython-35.pyc creating stub loader for gpytorch/libfft/_libfft.abi3.so byte-compiling build/bdist.linux-x86_64/egg/gpytorch/libfft/_libfft.py to _libfft.cpython-35.pyc creating build/bdist.linux-x86_64/egg/EGG-INFO copying gpytorch.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO copying gpytorch.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying gpytorch.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying gpytorch.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying 
gpytorch.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt zip_safe flag not set; analyzing archive contents... gpytorch.libfft.pycache._libfft.cpython-35: module references file creating 'dist/gpytorch-0.1-py3.5-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it removing 'build/bdist.linux-x86_64/egg' (and everything under it) Processing gpytorch-0.1-py3.5-linux-x86_64.egg removing '/usr/local/lib/python3.5/dist-packages/gpytorch-0.1-py3.5-linux-x86_64.egg' (and everything under it) creating /usr/local/lib/python3.5/dist-packages/gpytorch-0.1-py3.5-linux-x86_64.egg Extracting gpytorch-0.1-py3.5-linux-x86_64.egg to /usr/local/lib/python3.5/dist-packages gpytorch 0.1 is already the active version in easy-install.pth

    Installed /usr/local/lib/python3.5/dist-packages/gpytorch-0.1-py3.5-linux-x86_64.egg
    Processing dependencies for gpytorch==0.1
    Searching for cffi==1.10.0
    Best match: cffi 1.10.0
    Adding cffi 1.10.0 to easy-install.pth file

    Using /usr/local/lib/python3.5/dist-packages
    Searching for pycparser==2.18
    Best match: pycparser 2.18
    Adding pycparser 2.18 to easy-install.pth file

    Using /usr/local/lib/python3.5/dist-packages
    Finished processing dependencies for gpytorch==0.1

    $ python
    Python 3.5.2 (default, Nov 17 2016, 17:05:23)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import gpytorch
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/ubuntu/gpytorch-master/gpytorch/__init__.py", line 3, in <module>
        from .lazy import LazyVariable, ToeplitzLazyVariable
      File "/home/ubuntu/gpytorch-master/gpytorch/lazy/__init__.py", line 2, in <module>
        from .toeplitz_lazy_variable import ToeplitzLazyVariable
      File "/home/ubuntu/gpytorch-master/gpytorch/lazy/toeplitz_lazy_variable.py", line 4, in <module>
        from gpytorch.utils import toeplitz
      File "/home/ubuntu/gpytorch-master/gpytorch/utils/toeplitz.py", line 2, in <module>
        import gpytorch.utils.fft as fft
      File "/home/ubuntu/gpytorch-master/gpytorch/utils/fft.py", line 1, in <module>
        from .. import libfft
      File "/home/ubuntu/gpytorch-master/gpytorch/libfft/__init__.py", line 3, in <module>
        from ._libfft import lib as _lib, ffi as _ffi
    ImportError: No module named 'gpytorch.libfft._libfft'

  • Heteroskedastic likelihoods and log-noise models

    Allows specifying generic (log-)noise models that are used to obtain out-of-sample noise estimates. For example, a GP fit on the (log-)measured standard errors of the observed data can be plugged into the GaussianLikelihood and then fit jointly with the GP on the data itself.
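As a rough illustration of the idea (not the code in this PR), a log-noise model can be any callable that maps an input to a log noise variance; the exponentiated values are then added to the diagonal of the kernel matrix. The names `add_heteroskedastic_noise` and `log_noise_model` below are hypothetical, and a fixed function stands in for the second GP.

```python
import math


def add_heteroskedastic_noise(kernel_matrix, inputs, log_noise_model):
    """Return K + diag(exp(g(x_i))), where g is any callable log-noise model."""
    noisy = [row[:] for row in kernel_matrix]  # copy so K itself is untouched
    for i, x in enumerate(inputs):
        noisy[i][i] += math.exp(log_noise_model(x))
    return noisy


def log_noise_model(x):
    # Hypothetical stand-in for a learned model: noise variance grows with x.
    return math.log(0.1) + x


K = [[1.0, 0.5], [0.5, 1.0]]
K_noisy = add_heteroskedastic_noise(K, [0.0, 1.0], log_noise_model)
```

Because the noise model is evaluated at arbitrary inputs, the same function yields noise estimates at test points that were never observed.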

  • Arbitrary number of batch dimensions for LazyTensors

    Major refactors

    • [x] Refactor _get_indices from all LazyTensors
    • [x] Simplify _getitem to handle all cases - including tensor indices
    • [x] Write efficient _getitem for (most) LazyTensors
      • [x] CatLazyTensor
      • [x] BlockDiagLazyTensor
      • [x] ToeplitzLazyTensor
    • [x] Write efficient _get_indices for all LazyTensors
    • [x] Add a custom _expand_batch method for certain LazyTensors
    • [x] Add a custom _unsqueeze_batch method for certain LazyTensors
    • [x] BlockDiagLazyTensor and SumBatchLazyTensor use an explicit batch dimension (rather than implicit one) for the block structure. Also they can sum/block along any batch dimension.
    • [x] Custom _sum_batch and _prod_batch methods
      • [x] NonLazyTensor
      • [x] DiagLazyTensor
      • [x] InterpolatedLazyTensor
      • [x] ZeroLazyTensor

    New features

    • [x] LazyTensors now handle multiple batch dimensions
    • [x] LazyTensors have squeeze and unsqueeze methods
    • [x] Replace sum_batch with sum (can accept arbitrary dimensions)
    • [x] Replace mul_batch with prod (can accept arbitrary dimensions)
    • [x] LazyTensor.mul now expects a tensor of size *constant_size, 1, 1 for constant mul. (More consistent with the Tensor api).
    • [x] Add broadcasting capabilities to remaining LazyTensors


    Tests

    • [x] Add MultiBatch tests for all LazyTensors
    • [x] Add unit tests for BlockDiagLazyTensor and SumBatchLazyTensor using any batch dimension for summing/blocking
    • [x] Add tests for sum and prod methods
    • [x] Add tests for constant mul
    • [x] Add tests for permuting dimensions
    • [x] Add tests for LazyEvaluatedKernelTensor

    Miscellaneous todos (as part of the whole refactoring process)

    • [x] Ensure that InterpolatedLazyTensor.diag did not become less efficient
    • [x] Make CatLazyTensor work on batch dimensions
    • [x] Add to LT docs that users might have to override _getitem, _get_indices, _unsqueeze_batch, _expand_batch, and transpose.
    • [x] Fix #573


    The new __getitem__ method reduces all possible indices to two cases:

    • The row and/or column of the LT is absorbed into one of the batch dimensions (this happens when a batch dimension is tensor indexed and the row/column are as well). This calls the sub-method _get_indices, in which all dimensions are indexed by Tensor indices. The output is a Tensor.
    • Neither the row nor column are absorbed into one of the batch dimensions. In this case, the _getitem sub-method is called, and the resulting output will be an LT with a reduced row and column.

    Closes #369 Closes #490 Closes #533 Closes #532 Closes #573

  • Add TriangularLazyTensor

    Adds a new TriangularLazyTensor abstraction. This tensor can be upper or lower (default) triangular. This simplifies a bunch of stuff with solves, dets, logprobs etc.

    Some of the changes with larger blast radius in this PR are:

    1. CholLazyTensor now takes in a TriangularLazyTensor
    2. The _cholesky method is expected to return a TriangularLazyTensor
    3. The _cholesky method now takes an upper kwarg (which allows working with both lower and upper variants of TriangularLazyTensor)
    4. DiagLazyTensor is now subclassed from TriangularLazyTensor
    5. The memoization functionality is updated to allow caching results depending on args/kwargs (required for dealing with the upper/lower kwargs). By setting ignore_args=False in the @cached decorator, the existing behavior can be replicated.
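The argument-dependent caching in item 5 can be pictured with a small sketch. The decorator name `cached_by_args` and the `ToyMatrix` class are hypothetical and only illustrate keying a per-instance cache on args and kwargs; they do not reproduce the helpers in gpytorch.utils.memoize.

```python
import functools


def cached_by_args(method):
    """Cache a method's results per instance, keyed on its args and kwargs."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        cache = self.__dict__.setdefault("_memoize_cache", {})
        key = (method.__name__, args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = method(self, *args, **kwargs)
        return cache[key]
    return wrapper


class ToyMatrix:
    """Stand-in for a lazy tensor; counts how often the factorization runs."""

    def __init__(self):
        self.num_calls = 0

    @cached_by_args
    def cholesky(self, upper=False):
        self.num_calls += 1
        return "upper factor" if upper else "lower factor"


m = ToyMatrix()
results = [m.cholesky(), m.cholesky(upper=True), m.cholesky(), m.cholesky(upper=True)]
```

Here the factorization runs once per distinct `upper` setting, so a cached lower factor is never returned when the upper one is requested.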

    Some improvements:

    1. CholLazyTensor now has more efficient inv_matmul and inv_quad methods that use the factorization of the matrix.
    2. KroneckerProductLazyTensor now returns a Cholesky decomposition that itself uses a Kronecker product representation [previously suggested in #1086]
    3. Added a test_cholesky test to the LazyTensorTestCase (this covers some previously uncovered cases explicitly)
    4. There were a number of hard-to-spot issues due to hacky manual cache handling - I replaced all these call sites with the cache helpers from gpytorch.utils.memoize, which is the correct way to go about this.
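The Kronecker improvement in item 2 rests on the identity chol(A ⊗ B) = chol(A) ⊗ chol(B) for symmetric positive-definite A and B. The sketch below checks it numerically in pure Python on two small matrices; the helper names are illustrative and unrelated to the LazyTensor classes.

```python
import math


def cholesky(a):
    """Lower-triangular Cholesky factor of a small SPD matrix (list of lists)."""
    n = len(a)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(a[i][i] - s)
            else:
                chol[i][j] = (a[i][j] - s) / chol[j][j]
    return chol


def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


A = [[4.0, 2.0], [2.0, 3.0]]
B = [[2.0, 1.0], [1.0, 2.0]]

direct = cholesky(kron(A, B))                # factorize the full 4 x 4 matrix
structured = kron(cholesky(A), cholesky(B))  # factorize only the 2 x 2 factors

max_diff = max(
    abs(d - s)
    for d_row, s_row in zip(direct, structured)
    for d, s in zip(d_row, s_row)
)
```

Factorizing the small factors costs far less than factorizing the full product, and the result can stay in Kronecker form instead of being materialized.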
  • Replicating results presented in Doubly Stochastic Variational Inference for Deep Gaussian Processes

    Hi, has anybody succeeded in replicating the results of the paper Doubly Stochastic Variational Inference for Deep Gaussian Processes by Salimbeni and Deisenroth in GPyTorch? There is an example DeepGP notebook referring to the paper, but when I tried to run it on the datasets used by the paper, I often observed divergence in the test log-likelihood (this is the example for training on the kin8nm dataset).

    The divergence does not occur every time, but I am not sure what causes it, and I see no way to control it...

    I am attaching my modified notebook with reading of the datasets, a model without residual connections, batch size and layer dimensions as in the paper. Any idea what is happening here?


    Thanks, Jan

  • [Feature Request] Missing data likelihoods

    🚀 Feature Request

    We'd like to use GPs in settings where some observations may be missing. My understanding is that, in these circumstances, missing observations do not contribute anything to the likelihood of the observation model.
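As a concrete statement of that assumption: if the observations are independent given the latent function, the log-likelihood is a sum over the observed entries only. A minimal pure-Python sketch follows; the function name `masked_gaussian_log_lik` is hypothetical, and it uses a plain Gaussian with a fixed mean and standard deviation rather than a GPyTorch likelihood.

```python
import math

def masked_gaussian_log_lik(y, mean, std):
    """Sum of Gaussian log-densities over the non-NaN entries of y."""
    total = 0.0
    for obs, mu in zip(y, mean):
        if math.isnan(obs):
            continue  # a missing observation contributes nothing
        z = (obs - mu) / std
        total += -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)
    return total

# The same two observations, once without and once with a missing entry between them
full = masked_gaussian_log_lik([0.1, -0.3], [0.0, 0.0], 0.5)
with_gap = masked_gaussian_log_lik([0.1, float("nan"), -0.3], [0.0, 9.9, 0.0], 0.5)
```

Both calls give the same value, since the NaN entry is skipped regardless of the mean predicted at that position.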

    Initial Attempt

    My initial attempt to write such a likelihood is as follows:

    from gpytorch.likelihoods import GaussianLikelihood
    from torch.distributions import Normal
    class GaussianLikelihoodWithMissingObs(GaussianLikelihood):
        def __init__(self, **kwargs):
            super().__init__(**kwargs)
        @staticmethod
        def _get_masked_obs(x):
            # Replace NaNs with a placeholder so the parent computation stays finite
            missing_idx = x.isnan()
            x_masked = x.masked_fill(missing_idx, -999.)
            return missing_idx, x_masked
        def expected_log_prob(self, target, input, *params, **kwargs):
            missing_idx, target = self._get_masked_obs(target)
            res = super().expected_log_prob(target, input, *params, **kwargs)
            # Zero out the contribution of the missing entries
            return res * ~missing_idx
        def log_marginal(self, observations, function_dist, *params, **kwargs):
            missing_idx, observations = self._get_masked_obs(observations)
            res = super().log_marginal(observations, function_dist, *params, **kwargs)
            # Zero out the contribution of the missing entries
            return res * ~missing_idx


    import torch
    import numpy as np
    from tqdm import trange
    from gpytorch.distributions import MultivariateNormal
    from gpytorch.constraints import Interval
    mu = torch.zeros(2, 3)
    sigma = torch.tensor([[
            [ 1,  1-1e-7, -1+1e-7],
            [ 1-1e-7,  1, -1+1e-7],
            [-1+1e-7, -1+1e-7,  1] ]]*2).float()
    mvn = MultivariateNormal(mu, sigma)
    x = mvn.sample_n(10000)
    # x[np.random.binomial(1, 0.1, size=x.shape).astype(bool)] = np.nan
    x += np.random.normal(0, 0.5, size=x.shape)
    LikelihoodOfChoice = GaussianLikelihood#WithMissingObs
    likelihood = LikelihoodOfChoice(noise_constraint=Interval(1e-6, 2))
    opt = torch.optim.Adam(likelihood.parameters(), lr=0.5)
    bar = trange(1000)
    for _ in bar:
        opt.zero_grad()
        loss = -likelihood.log_marginal(x, mvn).sum()
        loss.backward()
        opt.step()
        bar.set_description("nll: " + str(int(loss.data)))
    print(likelihood.noise.sqrt()) # Test 1
    likelihood.expected_log_prob(x[0], mvn) == likelihood.log_marginal(x[0], mvn) # Test 2

    Test 1 outputs 0.5 as expected, and Test 2 is False with both LikelihoodOfChoice = GaussianLikelihood and LikelihoodOfChoice = GaussianLikelihoodWithMissingObs.

    Any further tests and suggestions are appreciated. Can I open a PR for this?

  • [Docs] Pointer to get started with (bayesian) GPLVM

    I am in the process of exploring gpytorch for some of my GP applications. Currently I use pyro for GPLVM tasks (i.e. https://pyro.ai/examples/gplvm.html). I am always interested in trying out various approaches, so I would like to see how I can do similar things in gpytorch.

    Specifically, I am interested in the bayesian GPLVM as described in Titsias et al 2010.

    I have found some documentation on handling uncertain inputs, so I am guessing that would be a good place to start, but I would love to hear some thoughts from any of the gpytorch developers.

  • [Bug] Upstream changes to tensor comparisons breaks things

    🐛 Bug

    After https://github.com/pytorch/pytorch/pull/21113 a bunch of tests are failing b/c of the change in tensor comparison behavior (return type from uint8 to bool). Creating this issue to track the fix.

  • [Bug] VNNGP Example throwing an Error

    🐛 Bug

    Hello there. I am trying to run the Variational Nearest Neighbor Gaussian Process (VNNGP) example from this webpage: VNNGP.

    When I run this example, it throws an error which is given below. It throws the same error on my dataset as well. There are two training modes on the examples webpage to train VNNGP; both are throwing the same error. I downloaded the elevator dataset from one of the posts here since the link that you provided in the example is broken.

    A little bit about the error: the line nearest_neighbor_indices = self.nn_xinduce_idx[..., kl_indices - self.k, :].to(inducing_points.device) seems to be causing it. As far as I can tell, kl_indices and self.nn_xinduce_idx are both on the CPU, and I was able to put them on the GPU by explicitly calling inside the GPModel class

    The code is given below:

    import tqdm
    import math
    import torch
    import gpytorch
    from matplotlib import pyplot as plt
    from torch.utils.data import TensorDataset, DataLoader
    # Make plots inline
    %matplotlib inline 
    import urllib.request
    import os
    from scipy.io import loadmat
    from math import floor
    data = torch.Tensor(loadmat('elevators.mat')['data'])
    X = data[:1000, :-1]
    X = X - X.min(0)[0]
    X = 2 * (X / X.max(0)[0].clamp_min(1e-6)) - 1
    y = data[:1000, -1]
    y = y.sub(y.mean()).div(y.std())
    train_n = int(floor(0.8 * len(X)))
    train_x = X[:train_n, :].contiguous()
    train_y = y[:train_n].contiguous()
    test_x = X[train_n:, :].contiguous()
    test_y = y[train_n:].contiguous()
    if torch.cuda.is_available():
        train_x, train_y, test_x, test_y = train_x.cuda(), train_y.cuda(), test_x.cuda(), test_y.cuda()
    from gpytorch.models import ApproximateGP
    from gpytorch.variational.nearest_neighbor_variational_strategy import NNVariationalStrategy
    class GPModel(ApproximateGP):
        def __init__(self, inducing_points, likelihood, k=256, training_batch_size=256):
            m, d = inducing_points.shape
            self.m = m
            self.k = k
            variational_distribution = gpytorch.variational.MeanFieldVariationalDistribution(m)
            if torch.cuda.is_available():
                inducing_points = inducing_points.cuda()
            variational_strategy = NNVariationalStrategy(self, inducing_points, variational_distribution, k=k,
                                                         training_batch_size=training_batch_size)
            kl_indices1 = variational_strategy._get_training_indices()
            print(variational_strategy.nn_xinduce_idx[..., kl_indices1 - variational_strategy.k, :].to(variational_strategy.inducing_points.device))
            super(GPModel, self).__init__(variational_strategy)
            self.mean_module = gpytorch.means.ZeroMean()
            self.covar_module = gpytorch.kernels.MaternKernel(nu=2.5, ard_num_dims=d)
            self.likelihood = likelihood
        def forward(self, x):
            mean_x = self.mean_module(x)
            covar_x = self.covar_module(x)
            return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
        def __call__(self, x, prior=False, **kwargs):
            if x is not None:
                if x.dim() == 1:
                    x = x.unsqueeze(-1)
            return self.variational_strategy(x=x, prior=False, **kwargs)
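    smoke_test = "CI" in os.environ  # assumed definition (missing from the paste), as in the GPyTorch example notebooks
    if smoke_test:
        k = 32
        training_batch_size = 32
    else:
        k = 256
        training_batch_size = 64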
    likelihood = gpytorch.likelihoods.GaussianLikelihood()
    # Note: one should use full training set as inducing points!
    model = GPModel(inducing_points=train_x, likelihood=likelihood, k=k, training_batch_size=training_batch_size)
    if torch.cuda.is_available():
        likelihood = likelihood.cuda()
        model = model.cuda()

    OUTPUT of the above cell

    tensor([[457, 475, 435,  ..., 264, 400, 204],
            [266,  79, 309,  ..., 139, 129, 280],
            [228, 269, 150,  ..., 199, 144, 101],
            [153,  60, 573,  ..., 403,  57, 564],
            [ 32, 342, 715,  ..., 382, 336, 469],
            [656, 522, 352,  ..., 668, 313, 131]], device='cuda:0')
    num_epochs = 1 if smoke_test else 20
    num_batches = model.variational_strategy._total_training_batches
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    # Our loss object. We're using the VariationalELBO
    mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=train_y.size(0))
    epochs_iter = tqdm.notebook.tqdm(range(num_epochs), desc="Epoch")
    for epoch in epochs_iter:
        minibatch_iter = tqdm.notebook.tqdm(range(num_batches), desc="Minibatch", leave=False)
        for i in minibatch_iter:
            optimizer.zero_grad()
            output = model(x=None)
            # Obtain the indices for mini-batch data
            current_training_indices = model.variational_strategy.current_training_indices
            # Obtain the y_batch using indices. It is important to keep the same order of train_x and train_y
            y_batch = train_y[...,current_training_indices]
            if torch.cuda.is_available():
                y_batch = y_batch.cuda()
            loss = -mll(output, y_batch)
            loss.backward()
            optimizer.step()

    The ERROR is:

    RuntimeError                              Traceback (most recent call last)
    Input In [12], in <cell line: 15>()
         18 for i in minibatch_iter:
         19     optimizer.zero_grad()
    ---> 20     output = model(x=None)
         21     # Obtain the indices for mini-batch data
         22     current_training_indices = model.variational_strategy.current_training_indices
    Input In [11], in GPModel.__call__(self, x, prior, **kwargs)
         32     if x.dim() == 1:
         33         x = x.unsqueeze(-1)
    ---> 34 return self.variational_strategy(x=x, prior=False, **kwargs)
    File /work/flemingc/belal/anaconda3/envs/pytorch/lib/python3.10/site-packages/gpytorch/variational/nearest_neighbor_variational_strategy.py:131, in NNVariationalStrategy.__call__(self, x, prior, **kwargs)
        129 if self.training:
        130     self._clear_cache()
    --> 131     return self.forward(x, self.inducing_points, None, None)
        132 else:
        133     # Ensure inducing_points and x are the same size
        134     inducing_points = self.inducing_points
    File /work/flemingc/belal/anaconda3/envs/pytorch/lib/python3.10/site-packages/gpytorch/variational/nearest_neighbor_variational_strategy.py:168, in NNVariationalStrategy.forward(self, x, inducing_points, inducing_values, variational_inducing_covar, **kwargs)
        165     # sample a different indices for stochastic estimation of kl
        166     kl_indices = self._get_training_indices()
    --> 168 kl = self._kl_divergence(kl_indices)
        169 add_to_cache(self, "kl_divergence_memo", kl)
        171 return MultivariateNormal(predictive_mean, DiagLinearOperator(predictive_var))
    File /work/flemingc/belal/anaconda3/envs/pytorch/lib/python3.10/site-packages/gpytorch/variational/nearest_neighbor_variational_strategy.py:327, in NNVariationalStrategy._kl_divergence(self, kl_indices, compute_full, batch_size)
        325         kl = self._firstk_kl_helper() * self.M / self.k
        326     else:
    --> 327         kl = self._stochastic_kl_helper(kl_indices) * self.M / len(kl_indices)
        328 return kl
    File /work/flemingc/belal/anaconda3/envs/pytorch/lib/python3.10/site-packages/gpytorch/variational/nearest_neighbor_variational_strategy.py:265, in NNVariationalStrategy._stochastic_kl_helper(self, kl_indices)
        263 # Select a mini-batch of inducing points according to kl_indices, and their k-nearest neighbors
        264 inducing_points = self.inducing_points[..., kl_indices, :]
    --> 265 nearest_neighbor_indices = self.nn_xinduce_idx[..., kl_indices - self.k, :].to(inducing_points.device)
        266 expanded_inducing_points_all = self.inducing_points.unsqueeze(-2).expand(
        267     *self._inducing_batch_shape, self.M, self.k, self.D
        268 )
        269 expanded_nearest_neighbor_indices = nearest_neighbor_indices.unsqueeze(-1).expand(
        270     *self._inducing_batch_shape, kl_bs, self.k, self.D
        271 )
    RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
  • [Bug] Use of zip leads to silent incomplete evaluation of IndependentModelList

    🐛 Bug

    Due to the use of zip, when fewer inputs than models are passed to IndependentModelList.__call__/forward (or even fantasize), only the first len(inputs) models are evaluated and the rest are silently ignored.

    To reproduce

    This is extracted from the unit tests for IndependentModelList, which are also affected by this bug.

    ** Code snippet to reproduce **

    import torch
    import gpytorch
    from gpytorch.models import ExactGP, IndependentModelList
    class ExactGPModel(ExactGP):
        def __init__(self, train_x, train_y, likelihood):
            super().__init__(train_x, train_y, likelihood)
            self.mean_module = gpytorch.means.ConstantMean()
            self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
        def forward(self, x):
            mean_x = self.mean_module(x)
            covar_x = self.covar_module(x)
            return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
    N_PTS = 5
    def create_test_data():
        return torch.randn(N_PTS, 1)
    def create_likelihood_and_labels():
        likelihood = gpytorch.likelihoods.GaussianLikelihood()
        labels = torch.randn(N_PTS) + 2
        return likelihood, labels
    def create_model():
        data = create_test_data()
        likelihood, labels = create_likelihood_and_labels()
        return ExactGPModel(data, labels, likelihood)
    models = [create_model() for _ in range(2)]
    model = IndependentModelList(*models)
    model.eval()  # assumed: evaluate in eval mode, since the inputs differ from the training inputs
    # This outputs [MultivariateNormal(loc: torch.Size([3]))], which is the evaluation of the first model only.
    model(torch.rand(3))
    # This outputs [MultivariateNormal(loc: torch.Size([3])), MultivariateNormal(loc: torch.Size([3]))], which is both models.
    model(torch.rand(3), torch.rand(3))

    Expected Behavior

    Either error out when the # of inputs does not match # of models, or repeat the input if only a single tensor input is given. I'd be happy to patch this if we agree on a solution.
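Both the failure mode and the first proposed fix can be shown without GPyTorch. In this sketch, `evaluate_all` and `evaluate_all_checked` are hypothetical stand-ins for the model-list call, and the "models" are plain callables:

```python
def evaluate_all(models, *inputs):
    """Mimics the reported behavior: zip stops at the shorter sequence."""
    return [model(x) for model, x in zip(models, inputs)]

def evaluate_all_checked(models, *inputs):
    """Raises instead of silently skipping models."""
    if len(inputs) != len(models):
        raise ValueError(
            f"Expected {len(models)} inputs (one per model), got {len(inputs)}."
        )
    return [model(x) for model, x in zip(models, inputs)]

models = [lambda x: x + 1, lambda x: x * 10]

# One input for two models: only the first model runs, with no warning
silent = evaluate_all(models, 3)

try:
    evaluate_all_checked(models, 3)
    raised = False
except ValueError:
    raised = True
```

An explicit length check is used here rather than `zip(..., strict=True)`, since the latter is only available from Python 3.10 onward.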

    System information

    Please complete the following information:

    • GPyTorch: Latest
    • PyTorch: Latest
    • OS: CentOS8

    Additional context

    Originally surfaced in https://github.com/pytorch/botorch/issues/1467

  • [Feature Request] gpytorch on mps

    🚀 Feature Request

    Enabling gpytorch to run on Apple Silicon mps devices


    Apple silicon has been on the scene for a while and looks like it is going to stay, and many devs have it as their primary device. Local development on such devices would benefit from gpytorch being able to run on mps devices and not only cuda.


    We could replace hardcoded calls to cuda methods in PyTorch with equivalents in the mps module. https://pytorch.org/docs/stable/notes/mps.html

    I am not sure about the extent of rewriting involved, if any is required, though.

  • [Bug] MultiDeviceKernel fails to put tensors on the same device

    🐛 Bug

    I was experimenting with the tutorial of Exact GP multiple GPUs here. However, when the base kernel was changed from RBF kernel to piecewise polynomial kernel, an error showed up that tensors are not on the same device.

    To reproduce

    ** Code snippet to reproduce **

    import torch
    import gpytorch
    from LBFGS import FullBatchLBFGS
    import os
    import numpy as np
    import urllib.request
    from scipy.io import loadmat
    dataset = 'protein'
    if not os.path.isfile(f'../../datasets/UCI/{dataset}.mat'):
        print(f'Downloading \'{dataset}\' UCI dataset...')
    data = torch.Tensor(loadmat(f'../../datasets/UCI/{dataset}.mat')['data'])
    n_train = 4000
    train_x, train_y = data[:n_train, :-1], data[:n_train, -1]
    n_devices = torch.cuda.device_count()
    output_device = torch.device('cuda:0')
    train_x, train_y = train_x.contiguous().to(output_device), train_y.contiguous().to(output_device)
    class ExactGPModel(gpytorch.models.ExactGP):
        def __init__(self, train_x, train_y, likelihood, n_devices):
            super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
            self.mean_module = gpytorch.means.ConstantMean()
            # change kernel here ----------------------------------------------------|
            base_covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.PiecewisePolynomialKernel())
            self.covar_module = gpytorch.kernels.MultiDeviceKernel(
                base_covar_module, device_ids=range(n_devices),
                output_device=output_device
            )
        def forward(self, x):
            mean_x = self.mean_module(x)
            covar_x = self.covar_module(x)
            return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
    def train(train_x, train_y, n_devices, output_device, checkpoint_size, preconditioner_size):
        likelihood = gpytorch.likelihoods.GaussianLikelihood().to(output_device)
        model = ExactGPModel(train_x, train_y, likelihood, n_devices).to(output_device)
        optimizer = FullBatchLBFGS(model.parameters(), lr=0.1)
        mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
        with gpytorch.beta_features.checkpoint_kernel(checkpoint_size), \
             gpytorch.settings.max_preconditioner_size(preconditioner_size):
            def closure():
                optimizer.zero_grad()
                output = model(train_x)
                loss = -mll(output, train_y)
                return loss
            loss = closure()
            loss.backward()
            options = {'closure': closure, 'current_loss': loss, 'max_ls': 10}
            loss, _, _, _, _, _, _, fail = optimizer.step(options)
        return model, likelihood
    _, _ = train(train_x, train_y,
                 n_devices=n_devices, output_device=output_device,
                 checkpoint_size=0, preconditioner_size=100)

    ** Stack trace/error message **

    RuntimeError                              Traceback (most recent call last)
    Input In [4], in <cell line: 1>()
    ----> 1 _, _ = train(train_x, train_y,
          2              n_devices=n_devices, output_device=output_device,
          3              checkpoint_size=0, preconditioner_size=100)
    Input In [3], in train(train_x, train_y, n_devices, output_device, checkpoint_size, preconditioner_size)
         41     return loss
         43 loss = closure()
    ---> 44 loss.backward()
         46 options = {'closure': closure, 'current_loss': loss, 'max_ls': 10}
         47 loss, _, _, _, _, _, _, fail = optimizer.step(options)
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/torch/_tensor.py:396, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
        387 if has_torch_function_unary(self):
        388     return handle_torch_function(
        389         Tensor.backward,
        390         (self,),
        394         create_graph=create_graph,
        395         inputs=inputs)
    --> 396 torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/torch/autograd/__init__.py:173, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
        168     retain_graph = create_graph
        170 # The reason we repeat same the comment below is that
        171 # some Python versions print out the first line of a multi-line function
        172 # calls in the traceback and some print out the last line
    --> 173 Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
        174     tensors, grad_tensors_, retain_graph, create_graph, inputs,
        175     allow_unreachable=True, accumulate_grad=True)
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cpu!

    System information

    Please complete the following information:

    • GPyTorch Version: 1.9.0
    • PyTorch Version: 1.12.1
    • Computer OS: Ubuntu 16.04.5 LTS
    • CUDA version: 11.3
    • CUDA devices: two NVIDIA 3090

    Additional context

    I further experimented with the training size, and a similar issue showed up with n_train = 100 using the RBF kernel. Please see the error message below.

    RuntimeError                              Traceback (most recent call last)
    Input In [4], in <cell line: 1>()
    ----> 1 _, _ = train(train_x, train_y,
          2              n_devices=n_devices, output_device=output_device,
          3              checkpoint_size=0, preconditioner_size=100)
    Input In [3], in train(train_x, train_y, n_devices, output_device, checkpoint_size, preconditioner_size)
         40     loss = -mll(output, train_y)
         41     return loss
    ---> 43 loss = closure()
         44 loss.backward()
         46 options = {'closure': closure, 'current_loss': loss, 'max_ls': 10}
    Input In [3], in train.<locals>.closure()
         38 optimizer.zero_grad()
         39 output = model(train_x)
    ---> 40 loss = -mll(output, train_y)
         41 return loss
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/gpytorch/module.py:30, in Module.__call__(self, *inputs, **kwargs)
         29 def __call__(self, *inputs, **kwargs):
    ---> 30     outputs = self.forward(*inputs, **kwargs)
         31     if isinstance(outputs, list):
         32         return [_validate_module_outputs(output) for output in outputs]
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/gpytorch/mlls/exact_marginal_log_likelihood.py:64, in ExactMarginalLogLikelihood.forward(self, function_dist, target, *params)
         62 # Get the log prob of the marginal distribution
         63 output = self.likelihood(function_dist, *params)
    ---> 64 res = output.log_prob(target)
         65 res = self._add_other_terms(res, params)
         67 # Scale by the amount of data we have
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/gpytorch/distributions/multivariate_normal.py:169, in MultivariateNormal.log_prob(self, value)
        167 # Get log determininant and first part of quadratic form
        168 covar = covar.evaluate_kernel()
    --> 169 inv_quad, logdet = covar.inv_quad_logdet(inv_quad_rhs=diff.unsqueeze(-1), logdet=True)
        171 res = -0.5 * sum([inv_quad, logdet, diff.size(-1) * math.log(2 * math.pi)])
        172 return res
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/linear_operator/operators/_linear_operator.py:1594, in LinearOperator.inv_quad_logdet(self, inv_quad_rhs, logdet, reduce_inv_quad)
       1592             will_need_cholesky = False
       1593     if will_need_cholesky:
    -> 1594         cholesky = CholLinearOperator(TriangularLinearOperator(self.cholesky()))
       1595     return cholesky.inv_quad_logdet(
       1596         inv_quad_rhs=inv_quad_rhs,
       1597         logdet=logdet,
       1598         reduce_inv_quad=reduce_inv_quad,
       1599     )
       1601 # Short circuit to inv_quad function if we're not computing logdet
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/linear_operator/operators/_linear_operator.py:1229, in LinearOperator.cholesky(self, upper)
       1221 @_implements(torch.linalg.cholesky)
       1222 def cholesky(self, upper: bool = False) -> "TriangularLinearOperator":  # noqa F811
       1223     """
       1224     Cholesky-factorizes the LinearOperator.
       1226     :param upper: Upper triangular or lower triangular factor (default: False).
       1227     :return: Cholesky factor (lower or upper triangular)
       1228     """
    -> 1229     chol = self._cholesky(upper=False)
       1230     if upper:
       1231         chol = chol._transpose_nonbatch()
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/linear_operator/utils/memoize.py:59, in _cached.<locals>.g(self, *args, **kwargs)
         57 kwargs_pkl = pickle.dumps(kwargs)
         58 if not _is_in_cache(self, cache_name, *args, kwargs_pkl=kwargs_pkl):
    ---> 59     return _add_to_cache(self, cache_name, method(self, *args, **kwargs), *args, kwargs_pkl=kwargs_pkl)
         60 return _get_from_cache(self, cache_name, *args, kwargs_pkl=kwargs_pkl)
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/linear_operator/operators/_linear_operator.py:483, in LinearOperator._cholesky(self, upper)
        480 if any(isinstance(sub_mat, KeOpsLinearOperator) for sub_mat in evaluated_kern_mat._args):
        481     raise RuntimeError("Cannot run Cholesky with KeOps: it will either be really slow or not work.")
    --> 483 evaluated_mat = evaluated_kern_mat.to_dense()
        485 # if the tensor is a scalar, we can just take the square root
        486 if evaluated_mat.size(-1) == 1:
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/linear_operator/utils/memoize.py:59, in _cached.<locals>.g(self, *args, **kwargs)
         57 kwargs_pkl = pickle.dumps(kwargs)
         58 if not _is_in_cache(self, cache_name, *args, kwargs_pkl=kwargs_pkl):
    ---> 59     return _add_to_cache(self, cache_name, method(self, *args, **kwargs), *args, kwargs_pkl=kwargs_pkl)
         60 return _get_from_cache(self, cache_name, *args, kwargs_pkl=kwargs_pkl)
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/linear_operator/operators/sum_linear_operator.py:68, in SumLinearOperator.to_dense(self)
         66 @cached
         67 def to_dense(self):
    ---> 68     return (sum(linear_op.to_dense() for linear_op in self.linear_ops)).contiguous()
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/linear_operator/operators/sum_linear_operator.py:68, in <genexpr>(.0)
         66 @cached
         67 def to_dense(self):
    ---> 68     return (sum(linear_op.to_dense() for linear_op in self.linear_ops)).contiguous()
    File ~/anaconda3/envs/pyg/lib/python3.8/site-packages/linear_operator/operators/cat_linear_operator.py:378, in CatLinearOperator.to_dense(self)
        377 def to_dense(self):
    --> 378     return torch.cat([to_dense(L) for L in self.linear_ops], dim=self.cat_dim)
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument tensors in method wrapper_cat)
  • Convolutional Kernel

    🚀 Feature Request

    Implementation of a convolutional kernel in pytorch like this paper.


    Is your feature request related to a problem? Please describe. No convolutional kernel in gpytorch, only gpflow!


    Describe the solution you'd like Implementation of convolutional kernel

    Describe alternatives you've considered Using GPflow

    Are you willing to open a pull request? (We LOVE contributions!!!)

A Python implementation of global optimization with gaussian processes.

Bayesian Optimization Pure Python implementation of bayesian global optimization with gaussian processes. PyPI (pip): $ pip install bayesian-optimizat

Dec 1, 2022
Supplementary code for the AISTATS 2021 paper "Matern Gaussian Processes on Graphs".

Matern Gaussian Processes on Graphs This repo provides an extension for gpflow with Matérn kernels, inducing variables and trainable models implemente

Nov 29, 2022
A Tensorflow based library for Time Series Modelling with Gaussian Processes

Markovflow Documentation | Tutorials | API reference | Slack What does Markovflow do? Markovflow is a Python library for time-series analysis via prob

Nov 26, 2022
Paddle implementation for "Highly Efficient Knowledge Graph Embedding Learning with Closed-Form Orthogonal Procrustes Analysis" (NAACL 2021)

ProcrustEs-KGE Paddle implementation for Highly Efficient Knowledge Graph Embedding Learning with Orthogonal Procrustes Analysis A more detailed re

Jun 9, 2021
A highly efficient, fast, powerful and light-weight anime downloader and streamer for your favorite anime.

AnimDL - Download & Stream Your Favorite Anime AnimDL is an incredibly powerful tool for downloading and streaming anime. Core features Abuses the dev

Dec 2, 2022
Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds (CVPR 2022, Oral)

Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds (CVPR 2022, Oral) This is the official implementat

Dec 1, 2022
Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.

Restormer: Efficient Transformer for High-Resolution Image Restoration Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan,

Nov 28, 2022
Robust, modular and efficient implementation of advanced Hamiltonian Monte Carlo algorithms

AdvancedHMC.jl AdvancedHMC.jl provides a robust, modular and efficient implementation of advanced HMC algorithms. An illustrative example for Advanced

Nov 29, 2022
Official Pytorch implementation of ICLR 2018 paper Deep Learning for Physical Processes: Integrating Prior Scientific Knowledge.

Deep Learning for Physical Processes: Integrating Prior Scientific Knowledge: Official Pytorch implementation of ICLR 2018 paper Deep Learning for Phy

Nov 6, 2022
Official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch.

Multi-speaker DGP This repository provides official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch. O

Sep 7, 2022
PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Study-CSRNet-pytorch This is the PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Mar 1, 2022
Implementation of "Fast and Flexible Temporal Point Processes with Triangular Maps" (Oral @ NeurIPS 2020)

Fast and Flexible Temporal Point Processes with Triangular Maps This repository includes a reference implementation of the algorithms described in "Fa

Aug 24, 2022
Efficient-GlobalPointer - Pytorch Efficient GlobalPointer

Introduction Thanks to Su Shen for the model; original post: https://spaces.ac.cn/archives/8877 How to run Corresponding model EfficientGlobalPoi

Nov 15, 2022
Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.

Faster R-CNN and Mask R-CNN in PyTorch 1.0 maskrcnn-benchmark has been deprecated. Please see detectron2, which includes implementations for all model

Nov 30, 2022
PyTorch implementation of Value Iteration Networks (VIN): Clean, Simple and Modular. Visualization in Visdom.

VIN: Value Iteration Networks This is an implementation of Value Iteration Networks (VIN) in PyTorch to reproduce the results.(TensorFlow version) Key

Sep 9, 2022
QuakeLabeler is a Python package to create and manage your seismic training data, processes, and visualization in a single place — so you can focus on building the next big thing.

QuakeLabeler Quake Labeler was born from the need for seismologists and developers who are not AI specialists to easily, quickly, and independently bu

Nov 4, 2022
Everything you want about DP-Based Federated Learning, including Papers and Code. (Mechanism: Laplace or Gaussian, Dataset: femnist, shakespeare, mnist, cifar-10 and fashion-mnist. )

Differential Privacy (DP) Based Federated Learning (FL) Everything about DP-based FL you need is here. Code Tip: the code o

Nov 26, 2022
A Python package for faster, safer, and simpler ML processes

Bender A Python package for faster, safer, and simpler ML processes. Why use bender? Bender will make your machine learning processes, faster, safe

Jan 7, 2022