cuDF - GPU DataFrame Library


NOTE: For the latest stable README.md ensure you are on the main branch.

Resources

  • cuDF Reference Documentation: Python API reference

Overview

Built on the Apache Arrow columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data.

cuDF provides a pandas-like API that will be familiar to data engineers and data scientists, so they can easily accelerate their workflows without going into the details of CUDA programming.

For example, the following snippet downloads a CSV, then uses the GPU to parse it into rows and columns and run calculations:

import cudf
import requests
from io import StringIO

url = "https://github.com/plotly/datasets/raw/master/tips.csv"
content = requests.get(url).content.decode('utf-8')

tips_df = cudf.read_csv(StringIO(content))
tips_df['tip_percentage'] = tips_df['tip'] / tips_df['total_bill'] * 100

# display average tip by dining party size
print(tips_df.groupby('size').tip_percentage.mean())

Output:

size
1    21.729201548727808
2    16.571919173482897
3    15.215685473711837
4    14.594900639351332
5    14.149548965142023
6    15.622920072028379
Name: tip_percentage, dtype: float64

For additional examples, browse our complete API documentation, or check out our more detailed notebooks.

Quick Start

Please see the Demo Docker Repository, choosing a tag based on the NVIDIA CUDA version you're running. This provides a ready-to-run Docker container with example notebooks and data, showcasing how you can utilize cuDF.

Installation

CUDA/GPU requirements

  • CUDA 10.1+
  • NVIDIA driver 418.39+
  • Pascal architecture or better (Compute Capability >=6.0)

Conda

cuDF can be installed with conda (miniconda, or the full Anaconda distribution) from the rapidsai channel:

For cudf version == 0.18 :

# for CUDA 10.1
conda install -c rapidsai -c nvidia -c numba -c conda-forge \
    cudf=0.18 python=3.7 cudatoolkit=10.1

# or, for CUDA 10.2
conda install -c rapidsai -c nvidia -c numba -c conda-forge \
    cudf=0.18 python=3.7 cudatoolkit=10.2

For the nightly version of cudf :

# for CUDA 10.1
conda install -c rapidsai-nightly -c nvidia -c numba -c conda-forge \
    cudf python=3.7 cudatoolkit=10.1

# or, for CUDA 10.2
conda install -c rapidsai-nightly -c nvidia -c numba -c conda-forge \
    cudf python=3.7 cudatoolkit=10.2

Note: cuDF is supported only on Linux, and with Python versions 3.7 and later.

See the Get RAPIDS version picker for more OS and version info.

Build/Install from Source

See build instructions.

Contributing

Please see our guide for contributing to cuDF.

Contact

Find out more details on the RAPIDS site.

Open GPU Data Science

The RAPIDS suite of open source software libraries aims to enable execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, while exposing that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

Apache Arrow on GPU

The GPU version of Apache Arrow is a common API that enables efficient interchange of tabular data between processes running on the GPU. End-to-end computation on the GPU avoids unnecessary copying and converting of data off the GPU, reducing compute time and cost for high-performance analytics common in artificial intelligence workloads. As the name implies, cuDF uses the Apache Arrow columnar data format on the GPU. Currently, a subset of the features in Apache Arrow are supported.
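
Since cuDF speaks Arrow natively, moving tabular data between host Arrow memory and the GPU is a single call in each direction. A minimal sketch (the column names are illustrative):

import cudf
import pyarrow as pa

# Host-side Arrow table -> GPU DataFrame (data is copied into device memory)
table = pa.table({"x": [1, 2, 3], "y": [0.1, 0.2, 0.3]})
gdf = cudf.DataFrame.from_arrow(table)

# GPU DataFrame -> host-side Arrow table
print(gdf.to_arrow())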

Comments
  • Make a plan for sort_values/set_index

    It would be nice to be able to use the set_index method to sort the dataframe by a particular column.

    There are currently two implementations for this: one in dask.dataframe, and one in dask-cudf which uses a batcher sorting net. While most dask-cudf code has been removed in favor of the dask.dataframe implementations, this sorting code has remained, mostly because I don't understand it fully and don't know if there was a reason for this particular implementation.

    Why was this implementation chosen? Was this discussed somewhere? Alternatively @sklam, do you have any information here?

    cc @kkraus14 @randerzander
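
    A minimal sketch of the desired behavior, assuming dask_cudf's from_cudf (the column names are illustrative): setting the index should leave the frame globally sorted by that column.

    import cudf
    import dask_cudf

    df = cudf.DataFrame({"a": [3, 1, 2], "b": ["x", "y", "z"]})
    ddf = dask_cudf.from_cudf(df, npartitions=2)

    # Desired behavior: set_index sorts the frame by column "a" across partitions
    print(ddf.set_index("a").compute())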

  • [WIP] Update cudf.to_parquet to use new GPU accelerated Parquet writer

    Update cudf.to_parquet to use the new GPU-accelerated Parquet writer. This includes creating the appropriate C++ interface in io_writers and io_functions, along with modifications to the Parquet pyx and pxd files.

    This closes #3574
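
    The user-facing Python call is unchanged by this PR; only the backend switches to the GPU writer. A usage sketch (file name illustrative):

    import cudf

    df = cudf.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
    df.to_parquet("out.parquet")            # now backed by the GPU Parquet writer
    print(cudf.read_parquet("out.parquet"))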

  • [QST] cuDF performance with gridsearchcv

    In a conversation with @kkraus14 about cudf usage with cuml+gridsearch we looked at cudf performance. Attached is profile plot of running gridsearch+cuml+cudf.

    Folks can download the full dask profile here: https://gist.github.com/quasiben/1da49c5aa6e61d979dd42ce6c50e79b3

    In the profile above you can see that the computation spends ~80% of its time in the iloc call. My initial thought was that iloc/_prepare_series_for_add could be improved. I believe @kkraus14 suggested we look at host_to_device transfers and see if we can build the requisite indices in cuda/cupy/numba_cuda instead of numpy (this is required during splitting/kfold calls).
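
    A sketch of that suggestion, assuming cuDF can consume device arrays for row gathers (names are illustrative, not the actual gridsearch code):

    import cupy as cp
    import cudf

    df = cudf.DataFrame({"x": range(10)})

    # Build the train/test row indices on the device with CuPy rather than on
    # the host with NumPy, so no host_to_device transfer is needed when the
    # indices are used to gather rows from the GPU DataFrame.
    idx = cp.random.permutation(len(df))
    train = df.take(idx[:8])
    test = df.take(idx[8:])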

  • [DISCUSSION] libcudf column abstraction redesign

    Creating and interfacing with the gdf_column C struct is rife with problems. We need better abstraction(s) to make libcudf a safer and more pleasant library to use and develop.

    In lieu of adding support for any new column types, the initial goal of a cudf::column design is to ease the current pain points in creating and interfacing with the column data structure in libcudf.

    Goals:

    • Identify pain points with existing gdf_column structure
    • Derive requirements for an abstraction or set of abstractions to ease those pain points
    • Define an API design that satisfies the requirements
    • Provide a working implementation of the design

    Non-Goals

    • Derive requirements to support new column types, e.g., variable width elements, compressed columns, etc.
    • Support delayed materialization or lazy evaluation

    Note that a “Non-Goal” is not something that we want to expressly forbid in our redesign, but rather something that is not the focus of the current effort. Whenever possible, we can make design decisions that will enable these “Non-Goals” sometime in the future, so long as those decisions do not compromise the timely accomplishment of the above “Goals”.

    Process

    1. Gather pain points

    • Those who wish to participate should list 3-5 pain points (in priority order) that they would like to solve with the column redesign.
      • Note that choosing to participate implies a commitment to putting in the effort to derive requirements and provide feedback on designs, i.e., if you want something to change, you’re expected to put in the work to make it happen.
    • Pain points should be submitted by responding to this issue.
    • @jrhemstad will take responsibility for gathering pain points and distilling/organizing based on functional area.
    • Proposed Deadline: 0.7 release

    2. Derive requirements

    • Distill pain points into satisfiable requirements
    • @jrhemstad will take responsibility for providing an initial draft of requirements from pain points and distributing for feedback.
    • Stakeholders will provide feedback on requirements and iterate until consensus is reached on initial requirements
    • Proposed Deadline: 0.8 Release

    3. Design Mock Up

    • Create draft interface of class(es) that attempt to satisfy requirements.
    • APIs should be fully Doxymented.
    • Code does not need to function or even compile
    • Design should be submitted via a PR to cuDF
    • TBD will take responsibility for providing an initial interface design
    • Stakeholders will provide feedback and iterate until consensus is reached on design
    • Proposed Deadline: 0.8 Release

    4. Implementation

    • Implement the agreed upon interface
    • Should provide Google Test unit tests
    • Implementation/testing will likely expose necessary design changes
    • Implementation should be submitted as a PR to cuDF
    • TBD will take responsibility for implementing/testing the design
    • Stakeholders will review implementation PR until consensus is reached
    • Proposed Deadline 0.8 Release

    5. Initial Refactor

    • Two candidate libcudf features shall be chosen for refactoring to use the new cudf::column abstraction
    • Two developers (TBD) will take responsibility for refactoring the features (one each) to use the newly designed abstraction(s) and submitting a cuDF PR for review. At least one of the developers shall be different from the developer who designed and implemented the column abstraction.
    • Any required design changes exposed in refactoring shall be discussed in the PR
    • Stakeholders will review refactored feature until consensus is reached
    • TBD will be responsible for creating/amending a style guide with lessons learned and best practices for refactoring a feature using gdf_column to the new abstraction(s)
    • Proposed Deadline: 0.9 Release

    6. Full Refactor

    • Remaining libcudf features will be refactored one at a time to use the new column abstraction(s)
    • The style guide mentioned above will be distributed to all libcudf developers to provide guidance in this refactoring effort
    • This will be an ongoing process that likely will not be fully complete for several releases
  • [FEA] Make cudf::size_type 64-bit

    Is your feature request related to a problem? Please describe.

    cudf::size_type is currently an int32_t, which limits column size to two billion elements (MAX_INT). Moreover, it limits child column size to the same. This causes problems, for example, for string columns, where there may be fewer than 2B strings, but the character data to represent them could easily exceed 2B characters.

    A 32-bit size was originally chosen to ensure compatibility with Apache Arrow, which dictates that Arrow arrays have a 32-bit size, and that larger arrays are made by chunking into individual Arrays.

    Describe the solution you'd like

    • Change size_type to be an int64_t.

    • Handle compatibility with Arrow by creating Arrow chunked arrays in the libcudf to_arrow interface (not yet created), and combining Arrow chunked arrays in the libcudf from_arrow interface (see the sketch at the end of this item). This can be dealt with when we create these APIs.

    Describe alternatives you've considered

    Chunked columns. This would be very challenging -- supporting chunked columns in every algorithm would result in complex distributed algorithms and implementations, where libcudf currently aims to be communication agnostic / ignorant. In other words, a higher level library handles distributed algorithms.

    Additional context

    A potential downside: @felipeblazing called us brave for considering supporting chunked columns. If we implement this feature request, perhaps he will not consider us quite so brave. :(
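
    For reference, this is how Arrow represents data past the 32-bit limit today; a to_arrow/from_arrow pair could translate between one 64-bit column and such chunks. A sketch, with small arrays standing in for >2B-element ones:

    import pyarrow as pa

    # Arrow caps a single array at ~2B elements; larger data is a ChunkedArray.
    chunks = pa.chunked_array([[1, 2, 3], [4, 5]])
    combined = chunks.combine_chunks()        # stitch chunks into one array
    print(chunks.num_chunks, len(combined))   # 2 chunks in, 5 elements out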

  • [Discussion] Requirements for schema/column names

    There have been a number of requests related to adding column names, either to the columns themselves and/or to tables and their views.

    libcudf internals don't use column names, so we need requirements to be driven by users that will make use of the names (cuIO/Spark/cuDF).

    For those who need column names, please discuss what you would like to see for column names.

    CC @kkraus14 @revans2 @jlowe @j-ieong @shwina

  • [BUG] nan_as_null parameter affects output of sort_values.

    Describe the bug: the nan_as_null parameter affects the output of sort_values.

    Steps/Code to reproduce bug

    In [22]: df = cudf.DataFrame({'a': cudf.Series([np.nan, 1.0, np.nan, 2.0, np.nan, 0.0], nan_as_null=True)})
    
    In [23]: print(df.sort_values(by='a'))
         a
    5  0.0
    1  1.0
    3  2.0
    0
    2
    4
    In [19]: df = cudf.DataFrame({'a': cudf.Series([np.nan, 1.0, np.nan, 2.0, np.nan, 0.0], nan_as_null=False)})
    
    In [20]: print(df.sort_values(by='a'))
         a
    0  nan
    1  1.0
    2  nan
    3  2.0
    4  nan
    5  0.0
    

    Similar issues occur with other methods that use libcudf APIs, e.g. drop_duplicates (which uses sorting):

    df = cudf.DataFrame({'a': cudf.Series([1.0, np.nan, 0, np.nan, 0, 1], nan_as_null=False)})
    
    In [10]: print(df)
         a
    0  1.0
    1  nan
    2  0.0
    3  nan
    4  0.0
    5  1.0
    
    In [11]: print(df.drop_duplicates())
         a
    0  1.0
    1  nan
    2  0.0
    3  nan
    4  0.0
    5  1.0
    

    Expected behavior: for sorting and drop_duplicates, NaN values should be considered equal.

    Environment overview (please complete the following information)

    • Environment location: Bare-metal
    • Method of cuDF install: from source
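
    A workaround that gives the expected grouping today is to normalize NaNs to nulls up front with Series.nans_to_nulls, a sketch:

    import numpy as np
    import cudf

    s = cudf.Series([np.nan, 1.0, np.nan, 2.0, np.nan, 0.0], nan_as_null=False)

    # Convert NaNs to nulls so sort_values/drop_duplicates reuse the existing
    # null-handling logic instead of IEEE NaN comparisons.
    print(s.nans_to_nulls().sort_values())
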
  • [DISCUSSION] Behavior for NaN comparisons in libcudf

    Recent issues (https://github.com/rapidsai/cudf/issues/4753 https://github.com/rapidsai/cudf/issues/4752) have called into question how libcudf handles NaN floating point values. We've only ever addressed this issue on an ad hoc basis as opposed to having a larger conversation about the issue.

    C++

    C++ follows the IEEE 754 standard for floating point values, which for comparisons with NaN has the following behavior:

    | Comparison | NaN ≥ x      | NaN ≤ x      | NaN > x      | NaN < x      | NaN = x      | NaN ≠ x     |
    |------------|--------------|--------------|--------------|--------------|--------------|-------------|
    | Result     | Always False | Always False | Always False | Always False | Always False | Always True |

    https://en.wikipedia.org/wiki/NaN

    Spark

    Spark is non-conforming with the IEEE 754 standard:

    | Comparison | NaN ≥ x     | NaN ≤ x               | NaN > x     | NaN < x      | NaN = x               | NaN ≠ x              |
    |------------|-------------|-----------------------|-------------|--------------|-----------------------|----------------------|
    | Result     | Always True | False unless x is NaN | Always True | Always False | True only if x is NaN | True unless x is NaN |

    See https://spark.apache.org/docs/latest/sql-reference.html#nan-semantics

    Python/Pandas

    Python is a bit of a grey area because, prior to 1.0, Pandas did not have the concept of "null" values and used NaNs in their stead.

    In most regards, Python does respect IEEE 754. For example, see how numpy conforms with the expected IEEE 754 behavior in binary ops https://github.com/rapidsai/cudf/issues/4752#issuecomment-606649251 (where Spark does not).
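
    A quick check of the default semantics in plain Python, which follows IEEE 754 for floats:

    nan = float("nan")

    print(nan > 1.0)   # False
    print(nan < 1.0)   # False
    print(nan == nan)  # False
    print(nan != nan)  # True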

    However, there are some cases where Pandas is non-conforming due to the pseudo-null behavior. For example, in sort_values there is a na_position argument to control where NaN values are placed. This requires specializing the libcudf comparator used for sorting to special case floating point values and deviate from the IEEE 754 behavior of NaN < x == false and NaN > x == false. See https://github.com/rapidsai/cudf/issues/2191 and https://github.com/rapidsai/cudf/issues/3226 where this was done previously.

    That said, I believe Python's requirements could be satisfied by always converting NaN values to nulls, but @shwina @kkraus14 will need to confirm. Prior to Pandas 1.0, it wasn't possible to have both NaN and NULL values in a floating point column. We should see what the expected behavior of NaNs vs NULLs will be in 1.0.

    Discussion

    We need to have a conversation and make decisions on what libcudf will and will not do with respect to NaN behavior.

    My stance is that libcudf should adhere to IEEE 754. Spark's semantics redefine a core concept of the C++ language/IEEE standard and satisfying those semantics would require extremely invasive changes that negatively impact both performance and code maintainability.

    Even worse, because Spark differs from C++/Pandas, we need to provide separate code paths for all comparison based operations: a "Spark" path, and a "C++/Pandas" path. This further increases code bloat and maintenance costs.

    Furthermore, for consistency, I think we should roll back the non-conformant changes introduced for comparators in https://github.com/rapidsai/cudf/issues/3226.

    In conclusion, we already have special logic for handling NULLs everywhere in libcudf. Users should leverage that logic by converting NaNs to NULLs. I understand that vanilla Spark treats NaNs and NULLs independently, but I believe trying to imitate that behavior in libcudf comes at too high a cost.

  • [DOC] [BUG] Building from source fails as deps are not fetched

    Describe the bug: building v0.9.0 from source fails as some dependencies are missing or not fetched.

    Steps/Code to reproduce bug

    • git clone and checkout v0.9.0.
    • update submodules
    • bash build.sh libcudf
    -- RMM: RMM_LIBRARY set to RMM_LIBRARY-NOTFOUND
    -- RMM: RMM_INCLUDE set to RMM_INCLUDE-NOTFOUND
    -- DLPACK: DLPACK_INCLUDE set to DLPACK_INCLUDE-NOTFOUND
    -- NVSTRINGS: NVSTRINGS_INCLUDE set to NVSTRINGS_INCLUDE-NOTFOUND
    -- NVSTRINGS: NVSTRINGS_LIBRARY set to NVSTRINGS_LIBRARY-NOTFOUND
    -- NVSTRINGS: NVCATEGORY_LIBRARY set to NVCATEGORY_LIBRARY-NOTFOUND
    -- NVSTRINGS: NVTEXT_LIBRARY set to NVTEXT_LIBRARY-NOTFOUND
    

    Expected behavior: build succeeds without missing deps.

    Environment overview (please complete the following information)

    • Environment location: Centos 7, avx512
    • Method of cuDF install: source

    Additional context

    • The documentation does not list dlpack, rmm or nvstrings as dependencies.
    • According to the RMM README

    RMM currently must be built from source. This happens automatically in a submodule when you build or install cuDF or RAPIDS containers.

    Users would therefore expect these deps to be pulled automatically.

  • [FEA] CUDA versions between PyTorch and RAPIDS

    Hi Developers,

    Thanks for the great tools you have made. Our group would like to use cudf for deep learning; however, PyTorch currently only supports CUDA 10.2 and CUDA 11.1, while the nightly version of RAPIDS supports CUDA 11.0 and 11.2. This is a pain for users (mostly scientists) who need to compile either PyTorch or RAPIDS from source. Would it be possible for RAPIDS to support CUDA 11.1 so users can install it from conda?

    I noticed that #8224 just removed the CUDA 11.1 related files.

    Thanks! Richard

  • Use cuFile for Parquet IO when available

    Adds optional cuFile integration:

    • cufile.h is included in the build when available.
    • libcufile.so is loaded at runtime if LIBCUDF_CUFILE_POLICY environment variable is set to "ALWAYS" or "GDS".
    • cuFile compatibility mode is set through the same policy variable - "ALWAYS" means on, "GDS" means off.
    • cuFile is currently only used in Parquet R/W and in the CSV writer.
    • device_read/write API can be used with file datasource/data_sink.
    • Added CUDA stream to device_read.
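
    Usage is opt-in through the environment; a sketch (the policy variable comes from this PR, the file name is illustrative):

    import os

    # Set the policy before cudf loads, since libcufile.so is picked up at runtime.
    # "ALWAYS" turns cuFile compatibility mode on; "GDS" turns it off.
    os.environ["LIBCUDF_CUFILE_POLICY"] = "GDS"

    import cudf
    df = cudf.read_parquet("data.parquet")   # reads through cuFile when available
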
  • Raise warnings as errors in the test suite

    Description

    This PR fixes a final few warnings introduced in PRs merged after #12406 or with changes that weren't merged into #12406. More importantly, this PR updates the testing suite to always raise uncaught warnings as errors. This is a substantial change to the testing suite that will allow us to proactively react to deprecations in our dependencies, identify areas where pandas/pyarrow/some other dependency handles input data in different ways than us, etc. I've opted for the strictest possible setting, but am open to reviewer requests to be a little more lax with certain classes of warnings.
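
    The same idea in plain Python terms (the test suite wires this up through pytest configuration; the exact settings are not shown here):

    import warnings

    # Escalate every warning to an exception so deprecations fail loudly.
    warnings.simplefilter("error")

    warnings.warn("deprecated API", DeprecationWarning)   # now raises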

    Checklist

    • [x] I am familiar with the Contributing Guidelines.
    • [x] New or existing tests cover these changes.
    • [x] The documentation is up to date with these changes.
  • [REVIEW] Remove `int32` hard-coding in python

    Description

    Resolves #12303. This PR removes int32 hard-coding related to size_type in libcudf. Some of the hard-coded values have been left because they are unrelated to size_type.

    Checklist

    • [x] I am familiar with the Contributing Guidelines.
    • [ ] New or existing tests cover these changes.
    • [x] The documentation is up to date with these changes.
  • Use cudaMemcpyDefault.

    Description

    This uses cudaMemcpyDefault instead of cudaMemcpy{Host,Device}To{Host,Device}. After consulting with @vuule on another PR and asking the CUDA team, we confirmed there is no additional cost to using the default value over specifying the host/device residency.

    The only potential advantage of specifying the direction is that an error is raised if the direction does not match the source/destination parameters' location. @vuule and I are both +1 for this change but neither of us feel strongly if others feel differently. On the other hand, there is less to think about if we always use cudaMemcpyDefault.

    Checklist

    • [x] I am familiar with the Contributing Guidelines.
    • [x] New or existing tests cover these changes.
    • [x] The documentation is up to date with these changes.
  • [BUG] For an arrow table that contains string columns and is converted from pandas, the from_arrow fails after slice because the length does not match.

    Describe the bug: for an Arrow table that contains string columns and is converted from pandas, from_arrow fails after slice because the length does not match.

    Steps/Code to reproduce bug

    import cudf
    import pyarrow
    import pandas as pd
    
    cdf = pd.DataFrame.from_dict({'a': ['aa', 'bb', 'cc'], 'b': [1, 2, 3]})
    print(cdf)
    
    tbl = pyarrow.Table.from_pandas(cdf)
    print(tbl)
    
    tbl_slice = tbl.slice(0, 2)
    print(tbl_slice)
    
    gdf = cudf.DataFrame.from_arrow(tbl_slice)
    
    >>> import cudf
    >>> import pyarrow
    >>> import pandas as pd
    >>>
    >>>
    >>> cdf = pd.DataFrame.from_dict({'a': ['aa', 'bb', 'cc'], 'b': [1, 2, 3]})
    >>> print(cdf)
        a  b
    0  aa  1
    1  bb  2
    2  cc  3
    >>>
    >>>
    >>> tbl = pyarrow.Table.from_pandas(cdf)
    >>> print(tbl)
    pyarrow.Table
    a: string
    b: int64
    ----
    a: [["aa","bb","cc"]]
    b: [[1,2,3]]
    >>>
    >>>
    >>> tbl_slice = tbl.slice(0, 2)
    >>> print(tbl_slice)
    pyarrow.Table
    a: string
    b: int64
    ----
    a: [["aa","bb"]]
    b: [[1,2]]
    >>>
    >>>
    >>> gdf = cudf.DataFrame.from_arrow(tbl_slice)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/huawei/release/bi/uxdf/envPkg/Miniconda3/envs/uxdf_server/lib/python3.9/contextlib.py", line 79, in inner
        return func(*args, **kwds)
      File "/opt/huawei/release/bi/uxdf/envPkg/Miniconda3/envs/uxdf_server/lib/python3.9/site-packages/cudf/core/dataframe.py", line 4458, in from_arrow
        out = out.set_index(
      File "/opt/huawei/release/bi/uxdf/envPkg/Miniconda3/envs/uxdf_server/lib/python3.9/contextlib.py", line 79, in inner
        return func(*args, **kwds)
      File "/opt/huawei/release/bi/uxdf/envPkg/Miniconda3/envs/uxdf_server/lib/python3.9/site-packages/cudf/core/dataframe.py", line 2453, in set_index
        df.index = idx
      File "/opt/huawei/release/bi/uxdf/envPkg/Miniconda3/envs/uxdf_server/lib/python3.9/site-packages/cudf/core/dataframe.py", line 1027, in __setattr__
        super().__setattr__(key, col)
      File "/opt/huawei/release/bi/uxdf/envPkg/Miniconda3/envs/uxdf_server/lib/python3.9/site-packages/cudf/core/indexed_frame.py", line 341, in index
        raise ValueError(
    ValueError: Length mismatch: Expected axis has 2 elements, new values have 3 elements
    >>>
    

    Expected behavior: the sliced table converts correctly.

    Environment overview (please complete the following information)

    • Environment location: Cloud(HuaweiCloud)
    • Method of cuDF install: conda

    Environment details: not found.
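
    A blunt workaround sketch until the offset handling is fixed: rebuild the sliced table so its buffers match the sliced length before handing it to cuDF.

    import cudf
    import pyarrow

    # tbl_slice as constructed in the repro above
    rebuilt = pyarrow.table(tbl_slice.to_pydict())   # copies, trimming the string buffers
    gdf = cudf.DataFrame.from_arrow(rebuilt)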

  • [FEA] Support for min_periods in DataFrame correlation

    Hi. Pip installation on Google Colab with !pip install cudf-cu11 --extra-index-url=https://pypi.ngc.nvidia.com seems to work but results in missing functionality. For example, trying to compute column correlations in a dataframe with a min_periods argument specified raises NotImplementedError: Unsupported argument 'min_periods'. Other general functionality seems to be missing as well, with various errors raised. Interestingly, everything seemed to be working yesterday. Any thoughts on what could be going on? Having a working pip installation on Colab would be a game changer!
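
    Until min_periods lands, one workaround is to fall back to pandas for the correlation, at the cost of a device-to-host copy. A sketch:

    import cudf

    gdf = cudf.DataFrame({"a": [1.0, 2.0, None, 4.0], "b": [2.0, 4.0, 6.0, 8.0]})

    # cudf's DataFrame.corr() does not accept min_periods yet; pandas' does.
    print(gdf.to_pandas().corr(min_periods=3))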

  • Refactor jni writer data sink

    Description

    Fixes #12456.

    This is purely a JNI change. The implementation of jni_writer_data_sink has been streamlined a little:

    1. The commonality in device_write() and host_write() has been moved to a common function.
    2. rotate_buffer() has been renamed to handle_buffer_and_reallocate().
    3. jni_writer_data_sink has been moved to the cudf::jni::io namespace.
    4. jni_writer_data_sink is now named writer_data_sink.

    Checklist

    • [x] I am familiar with the Contributing Guidelines.
    • [x] New or existing tests cover these changes.
    • [x] The documentation is up to date with these changes.

    Note: This is purely a JNI change. This is a follow-up to #12425, which has yet to be merged. The changes in that PR are included in this one. Once #12425 is merged, the changes here will be revealed as minuscule.
